Some of the best concerts I attended back in the day were sold-out shows at large venues with tens of thousands of other fans: Metallica at the Rose Bowl, the Rolling Stones at the LA Coliseum. But not all concert locations are suitable for big crowds. Many sites have had to restrict audience size because of a lack of floor space, inadequate acoustics, or limited infrastructure such as parking spaces, restrooms, or electrical capacity.
Like small clubs in Hollywood with their single entrances, many database vendors struggle to accommodate large numbers of concurrently active users. Consequently, they are forced to put a one-size-fits-all limit on how many queries may start at one time, with no consideration of the resource needs of incoming requests.
Teradata’s Advanced SQL Engine is an exception. By default, it has an open-door policy and, according to some Teradata customers I have worked with, the database offers their end users the freedom to submit “any query any time.” And like the Rose Bowl or the Coliseum, where fans can enter and pass through security from many different gates, Teradata provides multiple entry and security enforcement points for incoming work.
In general, a system’s ability to function under a stressful load is a more illuminating attribute than the system’s behavior when only a handful of users are active. Advancements in technology can mask limitations of non-scalable platforms when running at low concurrency. When you test drive a new car you want to take it out on the freeway and put a little load on the engine, rather than simply cruising around the block.
There are many reasons why the Advanced SQL Engine is capable of handling high demand gracefully. Let’s consider a few of the key ones:
Query plan building is one of the first things a query encounters when it enters the database. The query plan is the roadmap of how a query will construct its answer set. In the Advanced SQL Engine, query plans are designed to utilize multiple different types of parallelism while at the same time minimizing communication back and forth between the optimizer and the different parallel units (called AMPs) working on the query.
The optimizer converts query plans into multiple large chunks of work called “steps” that are dispatched from the optimizer to the AMPs. Sending out large steps, rather than small ones, cuts down on the communication overhead between components, while expanding opportunities for pipelining functions within each step.
Teradata’s Adaptive Optimizer has taken plan building even further by introducing the equivalent of a rear-view mirror. Adaptive optimization interleaves database processing and query plan building iteratively, allowing a query plan to be adjusted in-flight based on demographics discovered earlier in the query’s execution.
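The idea of adjusting a plan from what earlier execution revealed can be shown with a toy model. This is only a sketch under stated assumptions, not how the Adaptive Optimizer is implemented: the function names, the two join strategies, and the 1,000-row threshold are all hypothetical.

```python
def plan_next_step(observed_rows, small_table_threshold=1000):
    """Pick a join strategy for the next step from a row count.

    The strategy names and the threshold are illustrative assumptions.
    """
    if observed_rows <= small_table_threshold:
        return "duplicate_small_side_to_all_units"
    return "redistribute_both_sides_by_join_key"


def run_query(filter_step, estimated_rows):
    # A static plan commits to a strategy from the estimate alone.
    static_choice = plan_next_step(estimated_rows)
    # An adaptive plan runs the first fragment, then plans from real counts.
    intermediate = filter_step()
    adaptive_choice = plan_next_step(len(intermediate))
    return static_choice, adaptive_choice


# The estimate says 50,000 rows survive the filter; far fewer actually do.
static_choice, adaptive_choice = run_query(
    filter_step=lambda: [n for n in range(50_000) if n % 417 == 0],
    estimated_rows=50_000,
)
print(static_choice)
print(adaptive_choice)
```

In this toy case the estimate and the actual row count lead to different strategies, which is the kind of mid-query correction the rear-view mirror makes possible.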
The SQL Engine interconnect, the BYNET, converts each query step coming from the optimizer into a message which is packaged and sent to the AMPs. The BYNET makes huge contributions to query efficiency by opening up low-level channels for AMPs working on the same query to signal to each other and make group decisions in flight. These signaling channels, which are only instantiated among AMPs working on the same query step, further reduce the need to circulate more messages, conserving BYNET resources and speeding up query completion.
When lots of users are hitting the system, the BYNET takes on an equally important role, similar to that of a guardian angel. While it delivers work to the AMPs, it is also sensitive to AMPs that might be struggling. If demand on one AMP becomes excessive, that AMP will tell the BYNET to stop sending it new work temporarily, sometimes only for a few milliseconds. This allows a busy AMP to catch its breath while other AMPs continue to work as usual.
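The push-back behavior described above can be pictured with a small model. This is a minimal sketch, not Teradata's implementation: the `Amp` class, the `max_queue` threshold, and the `dispatch` function are hypothetical names chosen for illustration.

```python
from collections import deque


class Amp:
    """Toy stand-in for a parallel unit with a bounded work queue (hypothetical)."""

    def __init__(self, amp_id, max_queue=3):
        self.amp_id = amp_id
        self.max_queue = max_queue  # assumed threshold, not a real setting
        self.queue = deque()

    def accepting(self):
        # A busy AMP signals "stop sending" once its queue is full.
        return len(self.queue) < self.max_queue

    def work(self):
        # Finish one queued message, if any, freeing capacity.
        if self.queue:
            self.queue.popleft()


def dispatch(messages, amp):
    """Deliver messages to one AMP and return those it refused for now."""
    held = []
    for msg in messages:
        if amp.accepting():
            amp.queue.append(msg)
        else:
            held.append(msg)  # held for a later retry, never dropped
    return held


amp = Amp(amp_id=0, max_queue=3)
pending = dispatch(["m1", "m2", "m3", "m4", "m5"], amp)

rounds = 0
while pending:
    amp.work()                        # the AMP catches its breath
    pending = dispatch(pending, amp)  # the interconnect tries again
    rounds += 1
```

The point of the sketch is that refused work waits at the sender instead of being lost, and delivery resumes as soon as the receiver has room.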
The Teradata Advanced SQL Engine is able to operate at or near its resource limits without exhausting any of them by controlling the flow of work at the lowest possible levels inside the system, rather than from the outside. This decentralized approach to managing congestion is central to Teradata’s ability to accept more work than it can process at once without shutting down or dropping queries.
Other Embedded Techniques
Other techniques inside the AMPs are in place to accommodate high concurrency when it happens. Synchronized table scan (sync scan) is a database feature that allows multiple scans of the same table to piggy-back on each other and share I/Os, freeing up resources for other work. The more concurrent users scanning the same table, the greater the benefit from sync scan.
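A rough way to see why shared scanning helps more as concurrency grows is to count physical block reads. The model below is a simplified sketch with assumed behavior, not a description of Teradata internals: one block is read per tick on a circular pass over the table, and a query that arrives mid-pass joins at the current block and finishes once it has seen every block. The function names are hypothetical.

```python
def independent_scan_reads(num_blocks, num_queries):
    """Physical reads when every query scans the table on its own."""
    return num_blocks * num_queries


def shared_scan_reads(num_blocks, start_ticks):
    """Physical reads when scans piggy-back on one circular pass.

    start_ticks[i] is the tick at which query i arrives.
    """
    remaining = {}  # query index -> blocks it still needs to see
    arrivals = sorted(enumerate(start_ticks), key=lambda pair: pair[1])
    next_arrival = 0
    reads = 0
    tick = 0
    while next_arrival < len(arrivals) or remaining:
        # Attach any query that has arrived by this tick.
        while next_arrival < len(arrivals) and arrivals[next_arrival][1] <= tick:
            remaining[arrivals[next_arrival][0]] = num_blocks
            next_arrival += 1
        if remaining:
            reads += 1  # one physical read serves every attached scan
            for query in list(remaining):
                remaining[query] -= 1
                if remaining[query] == 0:
                    del remaining[query]
        tick += 1
    return reads


solo = independent_scan_reads(100, 3)
shared = shared_scan_reads(100, [0, 10, 50])
print(solo, shared)
```

With three queries arriving at ticks 0, 10, and 50 against a 100-block table, the shared pass needs 150 reads where independent scans need 300, and the gap widens as more scans overlap.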
A variety of secondary access options also contribute to query efficiency and therefore to higher supportable concurrency. Direct, single-AMP access to a row is used automatically when a query supplies a value for the table’s primary index. Secondary indexes of different flavors (ordered, hashed, non-hashed), materialized views, and partitioning techniques have been shortening query times and boosting concurrency in Teradata for many years.
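The difference between single-AMP access and a search that involves every AMP can be sketched as follows. This is an illustrative model only: the SHA-256 based `amp_for` function stands in for whatever hashing the database really uses, and the four-AMP system, table contents, and function names are assumptions.

```python
import hashlib

NUM_AMPS = 4  # assumed small system for illustration


def amp_for(pi_value, num_amps=NUM_AMPS):
    """Map a primary index value to an AMP number with a stable hash."""
    digest = hashlib.sha256(str(pi_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


# Each AMP owns a private slice of the table, keyed by primary index value.
amps = [dict() for _ in range(NUM_AMPS)]


def insert_row(pi_value, row):
    amps[amp_for(pi_value)][pi_value] = row


def lookup_by_pi(pi_value):
    """Single-AMP access: only the owning AMP is consulted."""
    amps_touched = 1
    return amps[amp_for(pi_value)].get(pi_value), amps_touched


def lookup_by_other_column(column, value):
    """With no index on the column, every AMP searches its own slice."""
    matches = [row for amp in amps for row in amp.values()
               if row.get(column) == value]
    return matches, len(amps)


for cust_id in range(1, 21):
    insert_row(cust_id, {"cust_id": cust_id,
                         "city": "LA" if cust_id % 2 else "SF"})

row, touched_pi = lookup_by_pi(7)
rows, touched_scan = lookup_by_other_column("city", "LA")
```

Because the primary index value alone determines where a row lives, the first lookup occupies one AMP and leaves the rest free for other users, while the second keeps all of them busy.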
Just like those VIP entrances and front-row seats at a big concert, Teradata’s workload management supports its own kind of A-list privileges and special benefits for more important queries. And just as it is easier to get on the road after the show when you’ve parked in the VIP lot, critical queries can be endowed with faster turnaround times in both getting into and getting out of the system, elevating overall concurrency.
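The VIP idea can be reduced to a priority queue. The sketch below is a generic illustration rather than Teradata's workload management: the request names, the numeric priority tiers, and the convention that a lower number means more important are all assumptions.

```python
import heapq
import itertools

# Tie-breaker so requests with equal priority stay first-in, first-out.
_arrival = itertools.count()


def submit(queue, name, priority):
    """Queue a request; a lower priority number means more important."""
    heapq.heappush(queue, (priority, next(_arrival), name))


def run_order(queue):
    """Drain the queue in the order a priority-aware scheduler would start work."""
    order = []
    while queue:
        _, _, name = heapq.heappop(queue)
        order.append(name)
    return order


pending = []
submit(pending, "monthly_report", priority=3)
submit(pending, "ad_hoc_analysis", priority=2)
submit(pending, "call_center_lookup", priority=1)  # the VIP request
submit(pending, "nightly_batch", priority=3)

started = run_order(pending)
print(started)
```

The critical lookup was submitted third but starts first, while the two lowest-priority requests keep their arrival order.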
Teradata is blessed with an infrastructure designed like a Mack Truck, ready to meet the unexpected, take a beating, and keep on running. Rather than just throwing more hardware at a system as work starts to ramp up, Teradata is able to maximize the resources it already has available by producing economical query plans that whittle a query’s needs down to a bare minimum, followed up by numerous embedded techniques that reduce demand and weed out inefficiencies once the queries begin to execute.
At first glance, it’s hard to know whether a platform is really able to scale or not. When entering a product evaluation, careful consideration of what data volumes and concurrency levels will be needed now, and in the future, will help establish whether scalable performance can be realized and sustained. Once our current social isolation has come to an end, we’ll all be able to celebrate many things once again, including live music. If you want VIP parking, short security lines, easy access to beverages and bathrooms, guaranteed great seats, and you want to be surrounded by fans like yourself, when that time comes, plan to head for the big concert venues, which, like Teradata, will not disappoint.