OK, I admit it. I’m a hoarder. My tendency to over-accumulate is mainly focused on books, because I love to read: History, science, mysteries, thrillers, even a bit of sci-fi. Books to me are like friends, you want to keep the ones you like within reach over the long run.
While I can’t bear to part with my growing library, I simply don’t have room for another bookcase in my living space. But finding boxes, packing up volumes of fiction and non-fiction, renting a truck to move the books to a storage facility, all that takes planning, time, and effort.
Rapidly Growing Data
The quandary I find myself in is not unsimilar to data warehouse administrators who are facing tough decision about how to manage rapidly growing volumes of data inside of their database engine. But in those cases, availability and cost-effectiveness of object storage in the cloud makes cloud vendor storage a compelling place to move less critical data that is accumulating in your Teradata Vantage system. Just like ferrying my overflow books into a cheap rental storage unit, figuring out how to get your data off the database and onto cloud storage quickly and effortlessly can be a challenge.
If you are one of those users, NOS WRITE is the solution you have been waiting for. This capability in Vantage provides a simple and direct way to offload relational data onto less costly cloud storage. And it does this all in a single SQL statement. NOS WRITE works equally well from Teradata on-premises systems or from Teradata cloud environments.
NOS WRITE is the latest offering within the Native Object Store feature, which was released in early 2020. The first version of Native Object Store introduced NOS READ, a means of reading and transforming external data residing in cloud vendor storage and bringing it into the Vantage SQL Engine. When using NOS READ, no new platforms or unfamiliar tools are required. It allows analysts and data scientists to use simple SQL to reach out, filter, and pull in object store data, wherever it resides.
Parallelism and Performance
In the current version of Native Object Store you can combine the NOS READ and WRITE capabilities. This means your data is never out of reach, even when stored outside of the database. Native Object store has been designed take advantage of the scalability, performance, and workload management that is embedded in the Vantage platform. Each parallel unit, known as an AMP, will act on its share of the data that is either coming from or going into object storage. In NOS WRITE this parallelism translates to each AMP sharing the effort, both the preparation work and the load itself. All of the behind the scenes activity is completely invisible to the user, whose only job is to submit the SQL statement that triggers it.
Figure 1: Native Object Store enables reading from and writing to objects in cloud vendor storage.
NOS WRITE doesn’t just dump the data blindly into cloud storage. It has been designed to easily and automatically organize data in the objects that it is loading in a way that is optimized for subsequent read access via NOS READ. A few optional parameters that feed into the NOS WRITE offload action can influence how the data will be grouped within the resulting objects. That can set the stage for efficient path filtering and other benefits at query time.
For example, data with common access characteristics can be grouped and stored within the same objects by specifying partitioning. A good example might be SalesDate. If you partition by SalesDate, then prior to offloading the data NOS WRITE will re-shuffle rows across AMPs so that rows with the same SalesDate value will end up being loaded into the same object or set of objects. This clustering of similar data can improve performance of queries that access object data with selection criteria on SalesDate.
Using similar options during NOS WRITE, you can influence how object path names are formulated during the offload process, ensuring that the values of frequently referenced columns will appear as part of the path. This provides a hook for path filtering at query time, so that unneeded objects can be bypassed based solely on their assigned name.
To understand why I am so enthusiastic over this new NOS WRITE feature, consider these benefits it offers:
- Offloading infrequently accessed cold data from a Teradata relational table. For example, if you have a large and perpetually growing fact table, NOS WRITE can be used to easily copy all rows older than, say one year, to an external object store. Those rows can then be deleted from the SQL Engine, freeing up significant space inside the database.
- Saving dollars by placing non- critical data on less expensive object storage, as opposed to purchasing more database block storage. This can be helpful when you need to keep history data around for compliancy reasons, but you are not planning on actively using it.
- Copying external object data that is in one format to a differently formatted external object store. For example, you can use NOS WRITE to read from a CSV-formatted object store as the source, then transform and write the data to a Parquet object store. This transference of data from one object store to another can be done in a single NOS WRITE SQL statement, without the data having to be staged as a relational table inside the database at any point in the process.
While I will probably continue my guilty pleasure of hoarding books, it is not something I would recommend that you take up. However, I do
strongly suggest that you give NOS WRITE a serious try.
More Information on this Topic
See an earlier posting of mine
at for a brief introduction to Native Object Store.
For implementation detail, go to the Native Object Store Getting Started Guide.
Or if you are a Teradata customer, there’s some very helpful how-it-works information on both NOS READ and WRITE in the Orange Book Native Object Store