Overview - Teradata Secondary Indexes

RETAIL & CPG


Preetam Padhy
Preetam.Padhy@tcs.com

Introduction:

Overview Teradata Secondary Indexes 2012

Secondary Indexes (SIs) are a feature of Teradata, generally defined to provide faster set
selection. The Teradata RDBMS allows up to 32 SIs per table. There are two types of
secondary indexes:
1. Unique Secondary Indexes (USIs)
2. Non-Unique Secondary Indexes (NUSIs)
The system maintains a separate subtable for each secondary index. Each subtable row stores
the row hash of the secondary index value, the index column values, and the RowID(s) that
point to the base table row(s) with that value. Users cannot access subtables directly.
Secondary indexes can be defined for a new table using CREATE TABLE or for an existing table
using CREATE INDEX.
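
As a rough illustration of what a subtable row carries, the sketch below models an index row in
Python. The names (IndexRow, row_hash, base_row_ids), the sample values, and the CRC-based hash
are assumptions made for this example; they are not Teradata's internal row format or hashing
algorithm.

```python
import zlib
from dataclasses import dataclass, field


def row_hash(value) -> int:
    """Stand-in hash (CRC32). Teradata's real hashing algorithm differs."""
    return zlib.crc32(repr(value).encode())


@dataclass
class IndexRow:
    """One secondary index subtable row (hypothetical layout)."""
    index_value: tuple                                  # indexed column value(s)
    base_row_ids: list = field(default_factory=list)    # RowIDs of matching base rows

    @property
    def hash_value(self) -> int:
        # Row hash of the secondary index value, stored with the index row.
        return row_hash(self.index_value)


# A USI row points to exactly one base table row ...
usi_row = IndexRow(("E1001",), base_row_ids=[(778, 1)])
# ... while a NUSI row may point to several base rows sharing the same value.
nusi_row = IndexRow(("Sales",), base_row_ids=[(778, 1), (912, 1), (912, 2)])
```

Each RowID here is written as a (row hash, uniqueness value) pair, matching the Row ID
description used in the USI access discussion.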

Fig 1: Access mechanism of primary index and secondary index

TYPES OF SECONDARY INDEXES:


Unique Secondary Index (USI):
For access using a single value, a USI is preferable to a NUSI. The usual criterion for
choosing between them is the intended application, because the data to be indexed tends to be
either inherently unique or not. USIs are useful both for base table access (because USI access
is, at worst, a two-AMP operation) and for enforcing data integrity by applying a uniqueness
constraint on a column set. Like a unique primary index, a unique secondary index can be used
to guarantee the uniqueness of each value in a column set.

USI ACCESS:
USI access is usually a two-AMP operation because a USI typically hashes to a different AMP
than the PI for the same row. If the USI subtable row hashes to the same AMP as the base table
row it points to, then only one AMP is accessed. The following stages are involved in a USI base
table row access:
1. The requested USI value is accessed by hashing to its subtable.
2. The pointer to the base table row is read and used to access the stored row directly.

TCS Confidential


The Subtable ID portion of the Table ID references the USI subtable, not the data table. Using
the DSW for the Row Hash, the Message Passing Layer (a.k.a. Communication Layer) directs the
message to the correct AMP, which uses the Table ID and Row Hash as a logical index block
identifier and the Row Hash and USI value as the logical index row identifier. If the AMP
succeeds in locating the index row, it extracts the base table Row ID. The Subtable ID portion of
the Table ID is then modified to refer to the base table, and a new three-part message is put
onto the Communication Layer. Once again, the Message Passing Layer uses the DSW to identify
the correct AMP. That AMP uses the Table ID and Row Hash to locate the correct data block, and
then uses the Row Hash and Uniqueness Value (together, the Row ID) to locate the correct row.
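
The two-stage USI access can be sketched as a small Python simulation. This is a minimal model
under stated assumptions: NUM_AMPS, the Amp class, the CRC-based hash, and the modulo mapping
from hash to AMP are all hypothetical stand-ins for the real hashing and DSW routing.

```python
import zlib

NUM_AMPS = 4  # assumed system size for this sketch


def row_hash(value) -> int:
    """Stand-in hash; not Teradata's real hashing algorithm."""
    return zlib.crc32(repr(value).encode())


def amp_for(hash_value: int) -> int:
    """Simplified stand-in for routing a row hash to an AMP."""
    return hash_value % NUM_AMPS


class Amp:
    def __init__(self):
        self.base = {}  # Row ID (row hash, uniqueness value) -> base table row
        self.usi = {}   # (row hash, USI value) -> base table Row ID


amps = [Amp() for _ in range(NUM_AMPS)]


def insert(pi_value, usi_value, row, uniqueness=1):
    # The base row lives on the AMP chosen by the primary index hash.
    row_id = (row_hash(pi_value), uniqueness)
    amps[amp_for(row_id[0])].base[row_id] = row
    # The USI subtable row lives on the AMP chosen by the USI value hash.
    h = row_hash(usi_value)
    amps[amp_for(h)].usi[(h, usi_value)] = row_id


def usi_lookup(usi_value):
    """Return (base row or None, set of AMP numbers touched)."""
    h = row_hash(usi_value)
    index_amp = amp_for(h)                      # stage 1: hash to the USI subtable
    row_id = amps[index_amp].usi.get((h, usi_value))
    if row_id is None:
        return None, {index_amp}
    base_amp = amp_for(row_id[0])               # stage 2: follow the Row ID
    return amps[base_amp].base[row_id], {index_amp, base_amp}


insert(pi_value=1001, usi_value="E-77", row={"emp": 1001, "badge": "E-77"})
row, touched = usi_lookup("E-77")
```

The set of touched AMPs contains two entries in the usual case and one entry when the index row
and the base row happen to hash to the same AMP.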

Fig 2: USI Access

Non-Unique Secondary Index (NUSI):


NUSIs are particularly useful for range access and for equality and non-equality conditions.
Highly selective NUSIs are useful for reducing the cost of frequently made selections and joins
on non-unique columns, and provide extremely fast access for equality conditions. However,
NUSIs with low selectivity can be less efficient than a full-table scan.
NUSIs are implemented on an AMP-local basis. Each AMP is responsible for maintaining only
those NUSI subtable rows that correspond to base table rows located on that AMP. Since NUSIs
allow duplicate index values and are based on different columns than the PI, data rows
matching the supplied NUSI value could appear on any AMP. Any AMP that does not have an
index row for the NUSI value will not access the base table to extract rows.
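
The AMP-local arrangement can be sketched as follows. This is an illustrative model only:
NUM_AMPS, the Amp class, the sample employee data, and the CRC-based placement of rows are
assumptions, and base rows are keyed by their primary index value for brevity.

```python
import zlib
from collections import defaultdict

NUM_AMPS = 4  # assumed system size for this sketch


def amp_for(pi_value) -> int:
    """Stand-in for primary index hashing; not Teradata's real algorithm."""
    return zlib.crc32(repr(pi_value).encode()) % NUM_AMPS


class Amp:
    def __init__(self):
        self.base = {}                  # PI value -> base table row (simplified key)
        self.nusi = defaultdict(list)   # NUSI value -> keys of rows on THIS AMP only


amps = [Amp() for _ in range(NUM_AMPS)]


def insert(pi_value, row, nusi_col):
    amp = amps[amp_for(pi_value)]
    amp.base[pi_value] = row
    # The index row is kept on the same AMP as the base row it describes.
    amp.nusi[row[nusi_col]].append(pi_value)


def nusi_lookup(value):
    """Ask every AMP; only AMPs holding an index row read their base table."""
    rows, amps_reading_base = [], 0
    for amp in amps:
        keys = amp.nusi.get(value)      # .get avoids creating empty entries
        if not keys:
            continue                    # no index row here: base table untouched
        amps_reading_base += 1
        rows.extend(amp.base[k] for k in keys)
    return rows, amps_reading_base


for emp, dept in [(1, "Sales"), (2, "Sales"), (3, "HR"), (4, "Sales"), (5, "IT")]:
    insert(emp, {"emp": emp, "dept": dept}, "dept")

rows, used = nusi_lookup("Sales")
```

Every AMP receives the request, but the count of AMPs that go on to read base table rows
depends on where the matching rows happen to live.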



NUSIs are a less preferable secondary index choice for other applications for several reasons:
• NUSI access is always an all-AMPs operation.
• Because NUSI subtable access is not hashed, the subtables must be scanned in order to locate
the relevant pointers to base table rows. This is a fast lookup process when a NUSI is
specified in an equality condition because the NUSI rows are hash-ordered on each AMP.

NUSI ACCESS:
Because NUSIs are AMP-local indexes, the request message is broadcast to all AMPs. Each AMP
uses the values to search the appropriate index block for a corresponding NUSI row. Only those
AMPs with one or more of the desired rows use the base table Row IDs to access the proper data
blocks and data rows.
By definition, there are multiple rows per value in a NUSI. If a NUSI is not correlated with the
primary index of its base table, those rows are distributed among the AMPs in a way that does
not favour the likelihood of the Optimizer selecting the NUSI to access them. In any case, when
the number of rows per NUSI value approaches or exceeds the number of AMPs in the system,
multiple AMPs must be accessed. The usefulness of a NUSI is correlated with the number of
rows per value: the more rows per value, the less useful the index.
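
To make the relationship between rows per value and AMPs accessed concrete, the sketch below
estimates how many AMPs hold at least one matching row. It assumes the matching rows are spread
uniformly and independently across AMPs, which is a simplification introduced for this example
and not a statement about any particular table; the function name is hypothetical.

```python
def expected_amps_with_rows(num_amps: int, rows_per_value: int) -> float:
    """Expected number of AMPs holding at least one of the matching rows.

    Assumes each row lands on any AMP with equal probability 1/num_amps,
    independently of the other rows (a simplifying assumption).
    """
    # Probability that one given AMP receives none of the rows:
    p_empty = (1 - 1 / num_amps) ** rows_per_value
    return num_amps * (1 - p_empty)


# Compare a few rows-per-value counts on an assumed 100-AMP system.
estimates = {k: expected_amps_with_rows(100, k) for k in (1, 10, 100, 1000)}
```

Under this assumption, with 100 AMPs and 100 rows per value roughly 63 AMPs hold a matching row,
and with 1000 rows per value nearly every AMP does.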
1. Single NUSI Access (Between, Less Than, or Greater Than)

The Teradata RDBMS accesses data from a NUSI-defined column in one of two ways:
• Use the NUSI and do a Full Table Scan (FTS) of the NUSI subtable. In this
case, the Row IDs of the qualifying base table rows are retrieved into
spool. The Teradata RDBMS then uses those Row IDs in spool to access the
base table rows themselves.
o If the NUSI is not value-ordered, the system may do an FTS of the NUSI subtable.
o If the NUSI is ordered by values, the NUSI subtable is much more likely to be used
to locate matching base table rows.
• Ignore the NUSI and do an FTS of the base table itself.
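
The difference between the two subtable orderings for a range condition can be sketched in
Python. The index rows and Row ID labels below are made-up sample data, and the binary search
stands in for whatever positioning the system actually performs on a value-ordered subtable.

```python
from bisect import bisect_left, bisect_right

# (index value, base table Row ID) pairs; the Row IDs are placeholder labels.
index_rows = [(30, "r1"), (10, "r2"), (50, "r3"), (20, "r4"), (40, "r5")]


def range_scan_unordered(rows, low, high):
    """Subtable not ordered by value: every index row must be examined."""
    spool = [rid for value, rid in rows if low <= value <= high]
    return spool, len(rows)             # (Row IDs in spool, index rows examined)


# The same subtable kept in value order.
value_ordered = sorted(index_rows)
values = [v for v, _ in value_ordered]


def range_scan_value_ordered(low, high):
    """Value-ordered subtable: position on the range, then read only that slice."""
    lo = bisect_left(values, low)
    hi = bisect_right(values, high)
    spool = [rid for _, rid in value_ordered[lo:hi]]
    return spool, hi - lo               # (Row IDs in spool, index rows examined)


unordered_spool, unordered_examined = range_scan_unordered(index_rows, 20, 40)
ordered_spool, ordered_examined = range_scan_value_ordered(20, 40)
```

Both scans put the same Row IDs into spool, but the value-ordered scan examines only the index
rows that fall inside the requested range.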
2. Dual NUSI Access:

Two NUSIs are created on separate columns of the table. The Teradata RDBMS
decides how to use these NUSIs based on their selectivity.
a) AND with Equality Conditions:

• If one of the two indexes is strongly selective, the system uses it alone for
access.
• If both indexes are weakly selective, but together they are strongly selective, the
system does a bit-map intersection.
• If both indexes are weakly selective separately and together, the system does an
FTS.
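
The idea behind a bit-map intersection can be sketched with Python integers used as bit maps.
The row positions, column names, and helper functions are hypothetical; the source does not
describe the internal bit-map layout, so this only illustrates the intersection step.

```python
def bitmap(row_positions):
    """Build an integer bit map with one bit set per qualifying row position."""
    bits = 0
    for p in row_positions:
        bits |= 1 << p
    return bits


def positions(bits):
    """Return the row positions whose bits are set, in ascending order."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


# Ten rows on one AMP, numbered 0..9 (made-up data).
# Each condition alone is weakly selective: it qualifies six of ten rows.
dept_is_sales = bitmap([0, 1, 2, 3, 4, 5])
city_is_pune = bitmap([4, 5, 6, 7, 8, 9])

# ANDing the bit maps keeps only rows that satisfy both conditions.
both = dept_is_sales & city_is_pune
qualifying = positions(both)
```

Only the rows left in the combined bit map need to be read from the base table, which is why two
weakly selective indexes can still be worthwhile when they are strongly selective together.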


b) OR with Equality Conditions:
When accessing data with two NUSI equality conditions joined by OR,
the Teradata RDBMS may do one of the following:

• Do an FTS of the base table.
• If each of the NUSIs is strongly selective, use each of the NUSIs to return the
appropriate rows.
• Do an FTS of the two NUSI subtables and then perform the following steps:
o Retrieve the Row IDs of qualifying base table rows into two
separate spools.
o Eliminate duplicates from the two spools of Row IDs.
o Access the base rows from the resulting spool of Row IDs.
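
The three spool steps for the OR case can be sketched as below. The Row IDs, base rows, and
function name are made-up sample data for illustration, and Python lists and dictionaries stand
in for spool files and the base table.

```python
# Base table rows keyed by Row ID (row hash, uniqueness value); sample data only.
base_table = {
    (778, 1): {"emp": 1, "dept": "Sales", "city": "Pune"},
    (912, 1): {"emp": 2, "dept": "Sales", "city": "Delhi"},
    (912, 2): {"emp": 3, "dept": "HR", "city": "Pune"},
    (415, 1): {"emp": 4, "dept": "IT", "city": "Delhi"},
}

# Step 1: each NUSI subtable scan puts qualifying Row IDs into its own spool.
spool_dept_sales = [(778, 1), (912, 1)]
spool_city_pune = [(778, 1), (912, 2)]


def merge_spools(spool_a, spool_b):
    """Step 2: combine the spools, dropping duplicate Row IDs (order preserved)."""
    return list(dict.fromkeys(spool_a + spool_b))


merged = merge_spools(spool_dept_sales, spool_city_pune)

# Step 3: fetch each base row once using the de-duplicated spool.
result = [base_table[row_id] for row_id in merged]
```

Without the duplicate elimination, a row that satisfies both conditions, such as employee 1
here, would be returned twice.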

Covering Indexes:
If a query references only columns of a table that are fully contained within a given
index, the index is said to "cover" the table in the query. In these cases, it is often more
efficient to access only the index subtable and avoid accessing the base table rows
altogether. Covering is considered for any table for which the query references only
columns defined in a given NUSI.
These columns can be specified anywhere in the query, including the:
• SELECT list
• WHERE clause
• Aggregate functions
• GROUP BY expressions
The presence of a WHERE condition on each indexed column is not a prerequisite for using
the index to cover the query. The optimizer will consider the legality and cost of covering
versus other alternative access paths and choose the optimal plan. Many of the potential
performance gains from index covering require no user intervention and will be transparent
except for the execution plan returned by EXPLAIN.
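
The covering test itself amounts to a subset check, sketched below. The function name and the
column names are hypothetical, and the check says nothing about cost: the optimizer still
weighs covering against the other access paths.

```python
def index_covers(index_columns, referenced_columns) -> bool:
    """True when every column the query references for the table is in the index."""
    return set(referenced_columns) <= set(index_columns)


# Assumed NUSI defined on (dept, city).
nusi_columns = ["dept", "city"]

# A query that selects city and groups by dept references only indexed columns.
covered = index_covers(nusi_columns, ["city", "dept"])

# Referencing salary as well means base table rows must be read.
not_covered = index_covers(nusi_columns, ["city", "dept", "salary"])
```

Note that the check takes referenced columns from anywhere in the query, so a column used only
in an aggregate or a GROUP BY expression counts in the same way as one in the WHERE clause.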

Secondary Index Considerations:


• SIs require additional storage to hold their subtables. For a Fallback table, the SI
subtables are also Fallback, so twice the additional storage space is required.
• SIs require additional I/O to maintain these subtables.
