Teradata - Architecture II - v1 1

| HIGH PERFORMANCE INFORMATION SOLUTIONS
www.compactsolutionsllc.com
2013 Compact Solutions LLC. All rights reserved.
Compact Solutions Confidential and Proprietary DO NOT DISCLOSE OUTSIDE OF THE COMPANY
Indexes
Introduction
Indexing is one of the most important features of the Teradata RDBMS.
In the Teradata RDBMS, an index is used to define row uniqueness and
retrieve data rows. It can also be used to enforce the primary key and unique constraints for a
table.
The Teradata RDBMS supports five types of indexes:
Unique Primary Index (UPI)
Unique Secondary Index (USI)
Non-Unique Primary Index (NUPI)
Non-Unique Secondary Index (NUSI)
Join Index
Indexes (Contd.)
The following rules apply to the indexes used in the Teradata
relational database:
An index is a scheme used to distribute and retrieve rows of a data table. It can be
based on the values in one or more columns of the table.
A table can have a number of indexes, including one primary index, and up to 32
secondary indexes.
An index for a relational table may be primary or secondary, and may be unique or
non-unique. Each kind of index affects system performance, and can be important
to data integrity.
An index is usually defined on a table column whose values are frequently used in
specifying WHERE constraints or join conditions.
An index is used to enforce PRIMARY KEY and UNIQUE constraints.
Primary Index
Each table in Teradata is required to have a Primary Index.
The Primary Index will determine on which AMP a row will reside.
It is a physical mechanism used to store and access rows.
The Primary Index plays 3 roles:
Data Distribution
Fastest Way to Retrieve Data
Incredibly important for Joins
It may consist of a single column or a combination of up to 64 columns.
Defined in CREATE TABLE statement.
Primary Index (Contd.)
Changing the choice of Primary Index requires dropping and recreating
the table.
May be unique or non-unique. Values may be changed.
Primary Index access is a one-AMP operation.
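As a minimal sketch of how a Primary Index is declared in CREATE TABLE, the statements below create one table with a UPI and one with a NUPI. The database, table and column names (Demo_DB, Employee_UPI, Employee_NUPI, Emp_No, Dept_No and so on) are hypothetical and are not taken from the slides; Demo_DB is assumed to exist already.

```sql
-- All object names below are hypothetical.

-- UPI: Emp_No is unique, so each value hashes to exactly one row.
CREATE SET TABLE Demo_DB.Employee_UPI
( Emp_No     INTEGER NOT NULL
, Dept_No    INTEGER
, First_Name VARCHAR(30)
, Last_Name  VARCHAR(30)
)
UNIQUE PRIMARY INDEX (Emp_No);

-- NUPI: many employees share a Dept_No, so all rows with the
-- same Dept_No value are stored on the same AMP.
CREATE SET TABLE Demo_DB.Employee_NUPI
( Emp_No     INTEGER NOT NULL
, Dept_No    INTEGER
, First_Name VARCHAR(30)
, Last_Name  VARCHAR(30)
)
PRIMARY INDEX (Dept_No);

-- Primary Index access: an equality condition on the PI column
-- lets the request be sent to a single AMP.
SELECT *
FROM   Demo_DB.Employee_UPI
WHERE  Emp_No = 1001;
```

Because the Primary Index is part of the table definition, choosing a different one means creating the table again with a different PRIMARY INDEX clause.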
Primary Index (Contd.)

Types of PI
Unique Primary Index (UPI)
Does not allow duplicates.
Provides even distribution of rows of the table across all AMPs.
Does not require duplicate row checking.
Retrieves 0 or 1 row
Non-Unique Primary Index (NUPI)
Allows duplicates
Does not guarantee even distribution of rows
A NUPI may be chosen over a UPI because it may be more efficient for query
access and joins
Retrieves 0 to many rows
Both UPI and NUPI access is always one AMP operation.
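One way to check how evenly a candidate Primary Index column would distribute rows is to count rows per AMP with Teradata's hash functions. The query below is a sketch only: Demo_DB.Employee and Dept_No are hypothetical names, and the table is assumed to exist and to contain data.

```sql
-- Rows per AMP if Dept_No were used as the Primary Index.
-- HASHROW computes the row hash, HASHBUCKET maps it to a hash
-- bucket, and HASHAMP maps the bucket to an AMP number.
-- Demo_DB.Employee and Dept_No are hypothetical names.
SELECT HASHAMP(HASHBUCKET(HASHROW(Dept_No))) AS Amp_No
     , COUNT(*)                              AS Row_Count
FROM   Demo_DB.Employee
GROUP  BY 1
ORDER  BY 2 DESC;
```

A roughly equal Row_Count on every AMP suggests even distribution; a few AMPs with far more rows than the rest suggests a skewed NUPI.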
Secondary Index
Secondary Indexes provide an alternate path to the data.
So far we have learned that every table has one and only one Primary Index,
and that Primary Index access is much faster than a Full Table
Scan.
Secondary Indexes are not as fast as the Primary Index, but they can be pretty
fast, and they can be much faster than a Full Table Scan.
There can be up to 32 Secondary Indexes on a table.
Every Secondary Index creates a Subtable on every AMP designed to point to
the real Primary Index Row-ID.
There are two types of Secondary Index: Unique Secondary
Indexes, called USIs, and Non-Unique Secondary Indexes, called
NUSIs.
A USI is always a two-AMP operation, so it is almost as fast as a Primary
Index. A NUSI is an all-AMP operation, but not a Full Table Scan.
Secondary Index (Contd.)

The Secondary Index Subtable
If users want another access path to the data, a secondary index is the
option.
As soon as a secondary index is created with the SQL syntax, Teradata
creates a Subtable on every AMP. This is true for both the USI and
the NUSI.
Let's say, for example, the DBA created the maximum of 32 secondary indexes
on a table. Then there would be 32 Subtables created, each taking up PERM
Space.
The entire purpose of the Secondary Index Subtable is to point back to
the real row in the base table via the Row-ID.
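The statements below sketch how a USI and a NUSI might be created and dropped. The table Demo_DB.Employee_SI and its columns are hypothetical names chosen for illustration, and Demo_DB is assumed to exist.

```sql
-- Hypothetical base table; Emp_No is the Unique Primary Index.
CREATE SET TABLE Demo_DB.Employee_SI
( Emp_No     INTEGER NOT NULL
, Badge_No   CHAR(6) NOT NULL
, First_Name VARCHAR(30)
, Last_Name  VARCHAR(30)
)
UNIQUE PRIMARY INDEX (Emp_No);

-- USI: Badge_No values must be unique.
CREATE UNIQUE INDEX (Badge_No) ON Demo_DB.Employee_SI;

-- NUSI: First_Name values may repeat.
CREATE INDEX (First_Name) ON Demo_DB.Employee_SI;

-- Dropping a secondary index removes its Subtable and
-- releases the PERM space the Subtable was using.
DROP INDEX (First_Name) ON Demo_DB.Employee_SI;
```

Unlike the Primary Index, a secondary index can be added or dropped at any time without recreating the base table.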

USI Always a Two-AMP Operation
Secondary Indexes provide an alternate path to the data.
Teradata runs extremely well without Secondary Indexes, but since secondary indexes
use up space and overhead, they should only be used on KNOWN QUERIES or
queries that are run over and over again.
Every time the Parsing Engine sees the USI column in the WHERE clause, it comes up
with a plan that involves only two AMPs. A USI query is a two-AMP operation.

The NUSI Subtable is always AMP-Local.
A NUSI is a Non-Unique Secondary Index, meaning that the indexed value is
non-unique and there could be thousands, millions or even billions of duplicates.
So the Parsing Engine takes a different strategy when building the NUSI Subtable.
Each row in the Subtable only tracks the base rows on the same AMP. This is what is
meant by AMP-Local.
USI Subtable rows are hash-distributed, while NUSI Subtable rows are AMP-Local.

NUSI Subtable is AMP-Local
The figure shows that the AMP labeled "A Typical AMP" holds two
base rows of the Employee Table.
The First_Name column, on which we created the NUSI, holds two values on this
AMP: Rakish and Vu.
So in this typical AMP's Subtable there will be two rows, tracking
Rakish and Vu.
A NUSI Subtable is always created on each AMP, but each AMP's
Subtable holds only the values found in the base rows on that AMP.

A USI query is always a two-AMP operation.
A NUSI query is an All-AMP operation, but not a full table scan.
A USI query is much faster than a NUSI query.
The Parsing Engine will use a USI at a moment's notice, but it will not always
choose to use a NUSI. Sometimes it will choose a Full Table Scan over a NUSI.
The Parsing Engine will never choose a Full Table Scan over a USI.
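To see which access path the Parsing Engine actually picks, a query can be prefixed with EXPLAIN. The sketch below assumes a hypothetical table Demo_DB.Employee_SI that already has a USI on Badge_No and a NUSI on First_Name; none of these names come from the slides.

```sql
-- Assumed (hypothetical) setup: Demo_DB.Employee_SI exists with
-- a USI on Badge_No and a NUSI on First_Name.

-- Equality on the USI column: the plan should describe a
-- two-AMP retrieve by way of the unique index.
EXPLAIN
SELECT *
FROM   Demo_DB.Employee_SI
WHERE  Badge_No = 'B00042';

-- Equality on the NUSI column: the plan is either an all-AMP
-- retrieve by way of the index or an all-rows scan, depending
-- on what the optimizer estimates to be cheaper.
EXPLAIN
SELECT *
FROM   Demo_DB.Employee_SI
WHERE  First_Name = 'Vu';

-- Statistics on the NUSI column give the optimizer the
-- information it needs to make that choice.
COLLECT STATISTICS ON Demo_DB.Employee_SI INDEX (First_Name);
```

Reading the EXPLAIN text before and after collecting statistics is a simple way to check whether a NUSI is being used at all.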
Summary of Index and Keys

Primary Key              | Foreign Key                              | Primary Index                                      | Secondary Index
One PK                   | Multiple FKs                             | One PI                                             | 0 to 32 SIs
Unique values            | Unique or non-unique                     | Unique or non-unique                               | Unique or non-unique
No NULLs                 | NULLs allowed                            | NULLs allowed                                      | NULLs allowed
Values should not change | Values may be changed                    | Values may be changed (redistributes row)          | Values may be changed
Column should not change | Column should not change                 | Column cannot be changed (drop and recreate table) | Index may be changed (drop and recreate index)
No column limit          | No column limit                          | 64-column limit                                    | 64-column limit
n/a                      | FK must exist as PK in the related table | n/a                                                | n/a
Join Index
The Join Index is an index structure that contains columns from one or more
tables. Once created, it becomes an option available to the optimizer but is never
directly accessed by the user.
It actually creates a new physical table, and hence requires Permanent Space.
Join Indexes are automatically updated when the base table changes.
A Join Index can have a different Primary Index than the base table.
Types of Join Index
Single Table Join Index
Distributes the rows of a single table on a foreign key hash value.
Multi-Table Join Index
Pre-Joins multiple tables and stores and maintains the results with the
base tables.
Aggregate Join Index
Aggregates one or more columns into a summary table and maintains the
results with the base tables.
Example of Join Index

Customer Table (UPI): Cust_ID, Cust_Name
Order Table (UPI): Order_ID, Cust_ID, Order_Date, Total
Single Table Join Index: Retrieve data based on Customer Name
Single Table Join Index (NUPI): Cust_ID, Cust_Name
Multi Table Join Index: Retrieve Order details by Customer ID
Multi Table Join Index (NUPI): Cust_ID, Cust_Name, Order_ID, Order_Date, Total
Aggregate Join Index: Retrieve Total Order Amount by Customer ID and Year
Aggregate Join Index (UPI): Cust_ID, Order_Date_Year, TOTL_YER_AMT
Syntax of Join Index
CREATE JOIN INDEX <join_index_name> AS
<single-table SELECT query>
<OR multi-table join query>
<OR aggregate query>
PRIMARY INDEX ( <col_name> ) ;
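The three kinds of Join Index can be sketched against the Customer and Order tables from the example. Everything below is illustrative: the database Demo_DB, the table names Customer_Table and Order_Table, the join index names, and the column name Order_Total (used here in place of Total) are assumptions. Order_Total is declared FLOAT purely to keep the aggregate simple.

```sql
-- Hypothetical base tables modeled on the example slide.
CREATE SET TABLE Demo_DB.Customer_Table
( Cust_ID   INTEGER NOT NULL
, Cust_Name VARCHAR(60)
)
UNIQUE PRIMARY INDEX (Cust_ID);

CREATE SET TABLE Demo_DB.Order_Table
( Order_ID    INTEGER NOT NULL
, Cust_ID     INTEGER NOT NULL
, Order_Date  DATE
, Order_Total FLOAT
)
UNIQUE PRIMARY INDEX (Order_ID);

-- Single-table join index: the same customer rows, hashed on
-- Cust_Name so lookups by name can avoid a full table scan.
CREATE JOIN INDEX Demo_DB.Cust_Name_JI AS
SELECT Cust_Name, Cust_ID
FROM   Demo_DB.Customer_Table
PRIMARY INDEX (Cust_Name);

-- Multi-table join index: pre-joins customers and orders.
CREATE JOIN INDEX Demo_DB.Cust_Order_JI AS
SELECT c.Cust_ID, c.Cust_Name, o.Order_ID, o.Order_Date, o.Order_Total
FROM   Demo_DB.Customer_Table c
INNER JOIN Demo_DB.Order_Table o
ON     c.Cust_ID = o.Cust_ID
PRIMARY INDEX (Cust_ID);

-- Aggregate join index: yearly order total per customer.
CREATE JOIN INDEX Demo_DB.Cust_Year_AJI AS
SELECT Cust_ID
     , EXTRACT(YEAR FROM Order_Date) AS Order_Date_Year
     , SUM(Order_Total)              AS TOTL_YER_AMT
FROM   Demo_DB.Order_Table
GROUP  BY Cust_ID, Order_Date_Year
PRIMARY INDEX (Cust_ID);
```

Queries are still written against the base tables; the optimizer decides whether one of these join indexes can answer them more cheaply.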
Introduction to Data Protection

Several types of data protection are available with the Teradata Database:
Locks
Fallbacks
Fallback Cluster
Cliques
Journals
RAID
Locks
Locking prevents multiple users who are trying to access or change the same data
simultaneously from violating data integrity.
This concurrency control is implemented by locking the target data.
Locks are automatically acquired during the processing of a request and released
when the request is terminated.
Teradata has four types of locks, which can be acquired at three
different levels.
Lock types are automatically applied based on the SQL command:
SELECT: applies a Read lock.
UPDATE: applies a Write lock.
CREATE TABLE: applies an Exclusive lock.
Locks (Contd.)
Levels of Locking
Locks may be applied at three levels:
Database Locks: Apply to all tables and views in the database.
Table Locks: Apply to all rows in the table or view.
Row Hash Locks: Apply to a group of one or more rows in a table.
Locks (Contd.)
There are 4 types of Locks
Access Lock: An access lock allows data to be read while modifications are
in process. Access locks are designed for decision support on tables that
are updated only by small, single-row changes. Access locks are not
concerned with data consistency. Access locks prevent other users from
obtaining Exclusive locks on the locked data.
Read Lock: Read locks are used to ensure consistency during read operations.
Several users may hold concurrent read locks on the same data; during
this time no data modification is permitted. Read locks prevent other users
from obtaining Exclusive locks and Write locks on the locked data.
Locks (Contd.)
Write Lock: Write locks enable users to modify data while maintaining data
consistency. While the data has a write lock on it, other users can
only obtain an access lock. During this time, all other lock requests are
held in a queue until the write lock is released.
Exclusive Lock: Exclusive locks are applied to databases or tables and not to
rows. When an exclusive lock is applied, no other user can access
the database or table. Exclusive locks are used when a DDL
command is executed. An exclusive lock on a database or table
prevents other users from obtaining any lock on the locked object.
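The default lock for a request can be overridden with the LOCKING request modifier. The sketch below shows a default Read lock next to table-level and row-hash-level Access locks; Demo_DB.Employee, Emp_No and Dept_No are hypothetical names, and the table is assumed to exist with Emp_No as its Primary Index.

```sql
-- Hypothetical table Demo_DB.Employee, Primary Index on Emp_No.

-- Default behavior: this SELECT requests a Read lock, so it
-- waits if another request holds a Write lock on the table.
SELECT Dept_No, COUNT(*)
FROM   Demo_DB.Employee
GROUP  BY 1;

-- Table-level Access lock: the SELECT can run while the table
-- is being updated, at the cost of possibly reading rows that
-- are in the middle of a change.
LOCKING TABLE Demo_DB.Employee FOR ACCESS
SELECT Dept_No, COUNT(*)
FROM   Demo_DB.Employee
GROUP  BY 1;

-- Row-hash-level Access lock for a Primary Index lookup.
LOCKING ROW FOR ACCESS
SELECT *
FROM   Demo_DB.Employee
WHERE  Emp_No = 1001;
```

Access locks suit reporting queries that can tolerate slightly inconsistent data; requests that need a consistent answer should keep the default Read lock.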
Fallback
Fallback is a Teradata database feature that protects data in the case of an AMP
vproc failure.
Fallback guarantees the maximum availability of data.
We can specify Fallback protection at the table or database level. It ensures high
availability of the applications.
Fallback protects our data by storing a second copy of each row of a table on a
different AMP in the same cluster (a cluster contains more than one AMP).
Fallback provides AMP fault tolerance at the table level. With Fallback tables, if one
AMP fails, all data is still available and we can continue to use Fallback tables without
any loss of access to data.
During table creation or after a table is created, we may specify whether or not the
system should keep a Fallback copy of the table.
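Fallback can be requested when a table is created, changed afterwards, or set as a database-level default. The following is a sketch with hypothetical names (Demo_DB, Demo_FB_DB, Orders_FB); the PERM figure is arbitrary, and Demo_DB is assumed to exist with enough space to give away.

```sql
-- Table created with Fallback protection from the start.
CREATE SET TABLE Demo_DB.Orders_FB, FALLBACK
( Order_ID   INTEGER NOT NULL
, Cust_ID    INTEGER
, Order_Date DATE
)
UNIQUE PRIMARY INDEX (Order_ID);

-- Fallback can be switched off and on again after creation.
ALTER TABLE Demo_DB.Orders_FB, NO FALLBACK;
ALTER TABLE Demo_DB.Orders_FB, FALLBACK;

-- Database-level default: tables created in Demo_FB_DB get
-- Fallback unless their own definition says otherwise.
CREATE DATABASE Demo_FB_DB FROM Demo_DB AS
  PERM = 1000000000,
  FALLBACK;
```

Turning Fallback on for an existing table makes the system build the second copy of every row, so it is best done when the extra disk space and I/O are acceptable.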
Fallback (Contd.)
Benefits of fallback
Permits access to table data during AMP off-line period
Adds a level of data protection beyond disk array RAID
Automatically restores data changed during AMP off-line
Critical for high availability applications
Cost of fallback
Twice the disk space for table storage is needed
Twice the I/O for INSERTs, UPDATEs and DELETEs is needed
Fallback (Contd.)
Fallback: How It Works

Fallback is accomplished by grouping AMPs into clusters. When a table is defined as Fallback-protected, the system stores a second copy of each row in the table on a "Fallback AMP" in the
AMP cluster.
Below is a cluster of four AMPs. Each AMP has a combination of Primary and Fallback data rows:
Primary Data Row: A record in a database table that is used in normal system operation.
Fallback Data Row: The online backup copy of a Primary data row that is used in the case of an
AMP failure.
Fallback Clusters
A defined number of AMPs treated as a fault-tolerant unit.
Fallback rows for AMPs in a cluster reside in the cluster.
Loss of an AMP in the Cluster permits continued table access.
Loss of two AMPs in the cluster causes the RDBMS to halt.
Cliques
A clique (pronounced, "kleek") is a group of nodes that share access to the same disk arrays.
Each multi-node system has at least one clique. The cabling determines which nodes are in
which cliques -- the nodes of a clique are connected to the disk array controllers of the same
disk arrays.
Teradata cliques are a method of system protection against the failure of an entire node.
Each node runs AMP vprocs in memory.
Each AMP is attached to one virtual disk (Vdisk), and that AMP is the only vproc allowed access
to its Vdisk.
A clique gives each node access to the disks used by the other nodes in the clique. If a node fails, its AMP vprocs can
migrate to a node that has backup access to their virtual disks.
A migrated AMP can continue to read and write to its Vdisk while its home node is down.
When the home node is fixed and available again, the vprocs return home.
Cliques (Contd.)
Vprocs are distributed across all nodes in the system. Multiple cliques in the system should have
the same number of nodes.
The diagram below shows three cliques. The nodes in each clique are cabled to the same disk
arrays. The overall system is connected by the BYNET. If one node goes down in a clique, the
vprocs will migrate to the other nodes in the clique, so data remains available. However, system
performance decreases due to the loss of a node. System performance degradation is
inversely proportional to clique size: the more nodes in the clique, the less extra work each surviving node takes on.
Cliques (Contd.)
Clique provides protection against the failure of an entire node.
[Figure labels: BYNET; AMPs; CLIQUE-1; CLIQUE-2; Disk Arrays]
Journals In Teradata
Journaling is a data protection mechanism in Teradata. Journals maintain
before-images and after-images of the rows changed by a DML transaction, relative to a
checkpoint. When a DML transaction fails, the table is restored to the last available
checkpoint using the journal images.
A journal is simply a record of some kind of processing or activity.
Journal image options:
1) Single image: One copy of the journal image is kept.
2) Dual image: Two copies of the journal image are kept.
3) Before image: A copy of the row is taken before the change is applied.
4) After image: A copy of the row is taken after the change is applied.
Journals In Teradata (Contd.)

There are three types of Journals
(1) Transient Journal
(2) Permanent Journal
(3) Down AMP recovery Journal (DARJ)
Transient Journal:
Is an automatic feature that provides Data Integrity
Automatic rollback of changed rows in the event of transaction failure
Data is always returned to its original state after a transaction failure
Takes Before Image (BI) of changes for rollback purpose
BI is stored in the AMP's transient journal
The AMPs' transient journals are maintained in user DBC's Perm Space
When the transaction is committed, the BI in transient journal is purged automatically
Journals In Teradata (Contd..)

Transient Journal (Contd..)
When a transaction fails:
User receives failure message
Transaction is rolled back
Locks are released
Spool files are discarded
Permanent Journal
The Permanent Journal is an optional, user specified, system-maintained journal
which is used for recovery of a database to a specified point in time.
Is used for recovery from unexpected hardware or software disasters.
May be specified for one or more tables.
Permits capture of Before Images for database rollback.

Permanent Journal (Contd.)
Permits capture of After Images for database roll forward.
Permits archiving change images during table maintenance.
Reduces need for full table backups.
Provides a means of recovering NO FALLBACK tables.
Requires additional disk space for change images.
Requires user intervention for archive and recovery activity.
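A Permanent Journal is named at the database level and then selected per table, together with the image options. The following is a sketch under stated assumptions: the names Demo_DB, Demo_PJ_DB, Demo_Jrnl and Account are hypothetical, the PERM figure is arbitrary, and the choice of a single before image plus dual after images is only one possible combination.

```sql
-- Database that owns a permanent journal table named Demo_Jrnl.
CREATE DATABASE Demo_PJ_DB FROM Demo_DB AS
  PERM = 1000000000,
  DEFAULT JOURNAL TABLE = Demo_PJ_DB.Demo_Jrnl;

-- Table without Fallback that relies on the journal instead:
-- single before images support rollback, dual after images
-- support roll forward.
CREATE SET TABLE Demo_PJ_DB.Account, NO FALLBACK,
  BEFORE JOURNAL,
  DUAL AFTER JOURNAL,
  WITH JOURNAL TABLE = Demo_PJ_DB.Demo_Jrnl
( Acct_No INTEGER NOT NULL
, Balance DECIMAL(12,2)
)
UNIQUE PRIMARY INDEX (Acct_No);
```

The journal images take up space in the database that owns the journal table, which is the additional disk space cost listed above.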

Down AMP Recovery Journal (DARJ)
DARJ is started on all AMPs in a cluster when an AMP is down.
DARJ keeps track of all changes that would have been written to the
failed AMP.
When the AMP comes back online, the DARJ catches the AMP up by
applying the missed changes.
Once everything is caught up, the DARJ is dropped.
After the loss of any AMP, a Down-AMP Recovery Journal is started
automatically. Its purpose is to log any changes to rows which reside
on the down AMP. Any inserts, updates or deletes affecting rows on the
down AMP are applied to the Fallback copy within the cluster. The AMP
which holds the Fallback copy logs the Row-ID in its Recovery Journal.
RAID
RAID (Redundant Array of Independent Disks) provides protection against a
disk failure.
Teradata uses RAID-1
RAID 1
Transparent Mirroring.
Provides high data availability and performance, but storage costs are high.
Characteristics:
Data is fully replicated.
Mirrored striping is possible with multiple pairs of disks in a drive group.
Transparent to operating system.
Advantages:
Maximum data availability, read performance gains.
No performance penalty with write operations.
Fast recovery and restoration.
Disadvantages:
50% of disk space for mirrored data.
RAID (Contd.)
RAID 5
Data Parity Protection, Interleaved Parity.
Characteristics
Data and parity are striped and interleaved across multiple disks.
XOR logic is used to calculate parity.
Data is reconstructed after a disk failure.
Transparent to operating system.
Advantages
Provides high availability with minimum disk space (e.g., 25%) used for
parity overhead.
Disadvantages
Write performance penalty.
Performance degradation during data recovery and reconstruction.
RAID (Contd.)
Spaces
Perm Space
Perm Space is the maximum amount of space available for storing:
Tables, Secondary Indexes (SI), Permanent Journals
Perm Space defines an upper limit; it is not allocated at table creation time
Perm Space is released when data is deleted or when objects are dropped
The following require no Perm Space:
Views, Triggers, Macros.
All Perm Space specifications are subtracted from the creator's Perm Space
Perm Space is a zero-sum game: the total of all Perm Space allocations must equal
the total amount of disk space available
Spaces (Contd.)
Spool Space and Temp Space
Spool Space is the maximum amount of work space available for requests
Spool Space is used to hold intermediate and final query result sets
Spool Space is literally unused Perm Space
The Spool Space specified is the upper limit for a query's answer set
If a query exceeds the limit, the query is aborted immediately
Spool Space given to someone else is not subtracted from the giver
Temp Space is also unused Perm Space, used for Global Temporary Tables
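The three space limits are set when a user or database is created. The sketch below uses hypothetical names (Demo_User, Demo_DB), a placeholder password, and arbitrary byte counts; the second statement reads the limits and current usage from the DBC.DiskSpaceV dictionary view.

```sql
-- Hypothetical user carved out of Demo_DB. Only PERM is
-- subtracted from Demo_DB; SPOOL and TEMPORARY are upper limits.
CREATE USER Demo_User FROM Demo_DB AS
  PERM      = 10000000000,
  SPOOL     = 20000000000,
  TEMPORARY = 5000000000,
  PASSWORD  = Placeholder_Pw1;

-- Limits and usage, summed over all AMPs.
SELECT DatabaseName
     , SUM(MaxPerm)     AS Max_Perm
     , SUM(CurrentPerm) AS Current_Perm
     , SUM(MaxSpool)    AS Max_Spool
     , SUM(PeakSpool)   AS Peak_Spool
FROM   DBC.DiskSpaceV
WHERE  DatabaseName = 'Demo_User'
GROUP  BY 1;
```

Comparing Peak_Spool with Max_Spool shows how close a user's queries have come to being aborted for exceeding the spool limit.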
Types of Tables
SET Table                                                    | MULTISET Table
Keeps table definition in Data Dictionary                    | Keeps table definition in Data Dictionary
No duplicate rows are allowed                                | Duplicate rows are allowed
Functionality is a Teradata extension, not an ANSI standard  | Functionality is as per ANSI standard
Teradata does an additional check to identify duplicate rows | No additional checks required before inserting data
Uses Permanent Space                                         | Uses Permanent Space
Syntax: CREATE SET TABLE ...                                 | Syntax: CREATE MULTISET TABLE ...
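The difference between the two table types shows up when the same row is inserted twice. This sketch uses hypothetical tables Demo_DB.Dept_Set and Demo_DB.Dept_Multi, each with a non-unique Primary Index so that only the SET/MULTISET rule decides whether the duplicate is accepted.

```sql
-- Hypothetical tables; Demo_DB is assumed to exist.
CREATE SET TABLE Demo_DB.Dept_Set
( Dept_No   INTEGER
, Dept_Name VARCHAR(30)
)
PRIMARY INDEX (Dept_No);

CREATE MULTISET TABLE Demo_DB.Dept_Multi
( Dept_No   INTEGER
, Dept_Name VARCHAR(30)
)
PRIMARY INDEX (Dept_No);

-- MULTISET: both inserts succeed and the table holds two
-- identical rows.
INSERT INTO Demo_DB.Dept_Multi VALUES (10, 'Sales');
INSERT INTO Demo_DB.Dept_Multi VALUES (10, 'Sales');

-- SET: the first insert succeeds; the second fails with a
-- duplicate row error because an identical row already exists.
INSERT INTO Demo_DB.Dept_Set VALUES (10, 'Sales');
INSERT INTO Demo_DB.Dept_Set VALUES (10, 'Sales');
```

On a SET table with a NUPI, every insert is compared with the existing rows that share its row hash, which is the additional check noted in the comparison above.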
Types of Temporary Tables

GLOBAL TEMPORARY Table | VOLATILE Table | DERIVED Table
Local to user's session, but each session has its own instance | Local to user's session | Local to single query
Keeps table definition in Data Dictionary | Table definition is not stored in Data Dictionary | Table definition is not stored in Data Dictionary
Table definition can be shared by multiple user sessions; however, each user session can materialize its own local copy | Table definition and data are local to user's session | Table definition and data are local to single query
The materialized instance of the table is discarded at the end of user session | Table is discarded at the end of user session | Table is discarded when query finishes
Uses Temp Space to store data | Uses Spool Space | Uses Spool Space
Definition survives system restart | Does not survive system restart | Does not survive system restart
Syntax: CREATE GLOBAL TEMPORARY TABLE ... | Syntax: CREATE VOLATILE TABLE ... | Created as part of a query; every sub-query in a FROM clause creates a Derived Table
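The three kinds of temporary table can be sketched side by side. All object names here (Demo_DB, Order_Stage_GT, Order_Stage_VT, Order_Table, Cust_ID, Order_Total) are hypothetical, and the derived-table query assumes that Demo_DB.Order_Table already exists with those columns.

```sql
-- Global temporary table: the definition is stored in the Data
-- Dictionary; each session that uses the table gets its own
-- instance, held in Temp Space.
CREATE GLOBAL TEMPORARY TABLE Demo_DB.Order_Stage_GT
( Order_ID    INTEGER
, Order_Total DECIMAL(12,2)
)
PRIMARY INDEX (Order_ID)
ON COMMIT PRESERVE ROWS;

-- Volatile table: definition and data exist only in this
-- session and are held in Spool Space.
CREATE VOLATILE TABLE Order_Stage_VT
( Order_ID    INTEGER
, Order_Total DECIMAL(12,2)
)
PRIMARY INDEX (Order_ID)
ON COMMIT PRESERVE ROWS;

-- Derived table: d exists only while this query runs.
SELECT d.Cust_ID, d.Cust_Total
FROM ( SELECT Cust_ID, SUM(Order_Total) AS Cust_Total
       FROM   Demo_DB.Order_Table
       GROUP  BY Cust_ID ) AS d
WHERE  d.Cust_Total > 1000;
```

ON COMMIT PRESERVE ROWS keeps the rows after each transaction ends; without it, the default is to delete the rows at the end of the transaction.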
