When you think back on the famous cities you have visited, like Paris or New York, what usually comes to mind are the buildings, the parks, the boulevards, and maybe the museums. We tend to ignore the part of the city that thrives under the pavement--the tunnels, pipes, and wiring--even though this infrastructure plays a vital role in keeping the city alive and functioning.
Powerful things sometimes lie below the surface, easily escaping our attention. One often unnoticed and frequently under-appreciated component within Teradata Vantage is the BYNET.
What is the BYNET?
The BYNET is the system interconnect that allows the various components of the Vantage database to communicate. This is important because Vantage is composed of many self-contained virtual processors with no inherent connection among them. Just like underground cables, messages travel along the BYNET between the parsing engine modules, which do all logon and query pre-processing activities, and the AMPs. AMPs are the virtual processes that run in parallel and do the database work involved in query executions. Typically, the data from a relational table is spread across all AMPs in the configuration, with each AMP owning a subset of a table's rows.
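To make the idea of each AMP owning a subset of a table's rows concrete, here is a minimal Python sketch. It assumes rows are assigned to AMPs by hashing a key column; the function names (amp_for_row, distribute), the choice of CRC32, and the modulo mapping are illustrative assumptions, not a description of how Vantage actually assigns rows.

```python
import zlib


def amp_for_row(key, num_amps):
    """Map a row key to one AMP with a deterministic hash (illustrative assumption)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_amps


def distribute(rows, num_amps, key_column):
    """Give each hypothetical AMP its own subset of the table's rows."""
    amps = {amp_id: [] for amp_id in range(num_amps)}
    for row in rows:
        amps[amp_for_row(row[key_column], num_amps)].append(row)
    return amps


rows = [{"id": i, "name": f"customer-{i}"} for i in range(100)]
amps = distribute(rows, num_amps=4, key_column="id")
```

Every row lands on exactly one AMP, and the same key always maps to the same AMP, so each AMP can work on its own share of the table independently.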
When I first started at Teradata, the YNet was the interconnect that glued all the physical components within the database together. The YNet was architected to offer a wide range of efficiencies, all of which have been inherited by today's BYNET, Teradata's second-generation interconnect.
Although primarily a message delivery mechanism, the BYNET is so much more.
Three Key Capabilities of the BYNET that Are Critical to Overall Performance of Vantage
Here are three key capabilities of the BYNET that I consider critical to the overall performance delivered by Vantage today.
1. Multi-AMP Query Coordination
If a group of us are going to meet for lunch, we need to coordinate our plans to arrive at the same time and agree on how we are going to split up the check. The BYNET imposes similar coordination among AMPs that are working in parallel on the same query. To do this, the BYNET performs an on-the-spot association of just the AMPs working on the same piece of work, making sure that all the involved AMPs have a special communication channel set up just for them. This is just like co-workers sharing a Slack or Microsoft Teams channel for direct and easy communication.
The BYNET oversees query step completion and error handling, so that if one AMP fails during the execution of its share of query work, all other AMPs on the same channel will be notified immediately. This avoids a situation where one AMP hits a problem while the other AMPs continue their part of the work, wasting platform resources. To do this coordination, the BYNET uses lightweight communications and signaling behind the scenes, rather than sending more complex messages. Tying this back to our lunch example, we may send a text message instead of making a phone call to adjust for last-minute changes to our reservation time or location.
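The coordination described above can be pictured with a small Python sketch. It is only an analogy under stated assumptions: StepChannel and run_step are hypothetical names, and a threading.Event stands in for the lightweight signal. The real BYNET mechanism is not documented here.

```python
import threading


class StepChannel:
    """Hypothetical channel linking only the AMPs working on one query step."""

    def __init__(self, amp_ids):
        self.amp_ids = set(amp_ids)
        self.done = set()
        self.abort = threading.Event()  # lightweight flag, not a full message
        self.failed_amp = None

    def report_done(self, amp_id):
        self.done.add(amp_id)

    def report_failure(self, amp_id):
        self.failed_amp = amp_id
        self.abort.set()  # visible to every AMP on the channel at once

    def step_complete(self):
        return not self.abort.is_set() and self.done == self.amp_ids


def run_step(channel, amp_id, work_units, fail_at=None):
    """Process work units one at a time, checking the abort flag before each."""
    processed = 0
    for unit in range(work_units):
        if channel.abort.is_set():
            break  # another AMP failed, so stop spending resources
        if fail_at is not None and unit == fail_at:
            channel.report_failure(amp_id)
            return processed
        processed += 1
    else:
        channel.report_done(amp_id)  # loop finished without an abort
    return processed
```

In this sketch, once one AMP calls report_failure, any other AMP on the same channel stops at its next check instead of finishing work whose result would be discarded.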
2. Final Answer Set Ordering
One of the things that impresses me the most about the BYNET is its role in returning very large sorted answer sets in an effortless manner. In most database systems, sorting a large final answer set is costly because it often involves several sub-sorts and data merges. This can be I/O-intensive and time-consuming and usually involves writing and reading intermediate data sets. Think about what it would take to reorder all of the books in your local public library so that instead of being grouped by the Dewey Decimal System they were all ordered alphabetically by title. You'd have to lift a lot of books off the shelves and probably reorder a subset at a time in temporary locations. Afterward, you would have to move all of the books back to their new shelf locations.
The BYNET knows about the parallelism of the AMPs and recognizes that each AMP has built up a small sorted answer set in a buffer for its portion of the data at the end of a query. The BYNET simply reads the data from all AMPs simultaneously while maintaining the specified sorted order. Like sucking on a straw that is forked into multiple milkshakes, the BYNET pulls data off the AMPs. The answer set emerges in sorted order and is returned to the client without ever having to land anywhere for one big sort/merge operation (See Figure 2). This is an elegant and efficient compilation of the final answer set across parallel units which bypasses I/O-intensive routines and speeds up query completion.
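The effect described above resembles a streaming k-way merge: each source is already sorted, and the merge repeatedly takes the smallest available row without building one large intermediate result. The Python sketch below illustrates that general technique with the standard library's heapq.merge. The buffer contents and the name merge_sorted_buffers are made up for illustration, and this is not a claim about how the BYNET is implemented internally.

```python
import heapq


def merge_sorted_buffers(amp_buffers):
    """Lazily yield rows in global sorted order from per-AMP sorted buffers."""
    return heapq.merge(*amp_buffers)


# Each hypothetical AMP has already sorted its own portion of the answer set.
amp_buffers = [
    [3, 9, 14],
    [1, 8, 20],
    [2, 5, 7],
]

result = list(merge_sorted_buffers(amp_buffers))
print(result)  # [1, 2, 3, 5, 7, 8, 9, 14, 20]
```

Because heapq.merge returns an iterator, rows can be handed to the client as they emerge, and the only state it holds is the current head of each buffer.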
3. Congestion Control
If you pour too much water in a glass, it overflows; if roads have too much traffic, you get stuck in gridlock. Everything comes with a ceiling on capacity. Vantage can avoid the serious consequences of database congestion because of the intelligence of the BYNET. Each AMP is working independently on its share of the work, but sometimes query work can be uneven across AMPs. An AMP that is experiencing a heavier load may sometimes fall behind.
When an individual AMP approaches congestion, it will signal to the BYNET that it has more work than it can handle. In response, the BYNET temporarily stops delivering messages to the AMP. Then as soon as the AMP has worked off its backlog, BYNET messages automatically begin to flow once again. The BYNET regulates message-sending to prevent spilled water or gridlock on any single AMP, thus protecting the throughput of the entire platform. The key advantage this approach offers is scalability. Control over the flow of work takes place deep inside the database, with each AMP independently working with the BYNET to manage itself, with very little overhead and no coordination with other AMPs. The number of AMPs can increase 10-fold, or 1000-fold, and congestion control works just as efficiently.
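This flow control can be sketched in Python as simple backpressure. The sketch rests on assumptions: the Amp and Interconnect classes, the high_water threshold, and the rule that an AMP clears its congestion flag once its queue is empty are illustrative choices, not Vantage internals.

```python
from collections import deque


class Amp:
    """Hypothetical AMP that signals congestion when its queue fills up."""

    def __init__(self, high_water=3):
        self.high_water = high_water  # assumed congestion threshold
        self.queue = deque()
        self.congested = False

    def accept(self, message):
        self.queue.append(message)
        if len(self.queue) >= self.high_water:
            self.congested = True  # "I have more work than I can handle"

    def work(self, n=1):
        for _ in range(min(n, len(self.queue))):
            self.queue.popleft()
        if not self.queue:
            self.congested = False  # backlog worked off


class Interconnect:
    """Holds back messages for a congested AMP; other AMPs are unaffected."""

    def __init__(self, amps):
        self.amps = amps
        self.pending = {amp_id: deque() for amp_id in amps}

    def send(self, amp_id, message):
        self.pending[amp_id].append(message)
        self.flush(amp_id)

    def flush(self, amp_id):
        amp = self.amps[amp_id]
        held = self.pending[amp_id]
        while held and not amp.congested:
            amp.accept(held.popleft())
```

Each AMP's state is tracked separately, so a busy AMP never requires a check against any other AMP. In this sketch, that independence is why adding more AMPs adds no coordination cost.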
Transformation from YNet to BYNET
The original Teradata hardware was composed of separate physical structures: parsing engines, AMPs, and an interconnect. As part of Teradata's evolution, all of these physical hardware structures have been converted into software capabilities, including the BYNET. One advantage of this transformation to software is portability. The BYNET today can be installed on many different hardware devices, rather than only running on proprietary hardware, as was the case in the past. Today's "virtualized" BYNET allows Vantage to embrace whatever is the optimal general-purpose interconnect functionality at any point in time. Today the BYNET runs equally well on InfiniBand or on standard Ethernet, without requiring any changes to the database or any loss of functionality.
The next time you visit Paris, appreciate the art museums and the view from the Eiffel Tower. However, I hope you now will also take a moment to recognize and appreciate the active world below the pavement: the amazing infrastructure that holds the city together.