Agenda
Technical Summary of Teradata Database
Our Development Priorities
Teradata Database
Is a Relational Database Management System
Client Server architecture
Support for open standards (ODBC, OLE-DB, ANSI)
Support for emerging interoperability
Built-in automatic parallel processing
Enables SHARED NOTHING architecture
Special purpose data loads
Special purpose backup utilities
Teradata is a DB Server
[Figure: the server stack, with labels Hardware Platform, Operating System, User Data, Journal]
[Figure: an INTEL node running UNIX SVR4 or NT. Labels: PE1, PE2, Cache, VNET, VAMP1 to VAMP6, each VAMP with its own User Data]
Note how the VNET is used for message passing - to pass the row to its destination VAMP
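The slide does not say how a row's destination VAMP is chosen. As a minimal sketch, assuming the destination is picked by hashing a key column, the routing could look like the following. The MD5-based hash, the function names `amp_for` and `distribute`, and the row layout are illustrative assumptions, not Teradata's actual hash function or API.

```python
import hashlib
from collections import defaultdict


def amp_for(key: str, num_amps: int) -> int:
    """Map a key to an AMP number (stand-in hash, not Teradata's own)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


def distribute(rows, key_field, num_amps):
    """Route each row to the AMP that owns its key, as the VNET would."""
    amps = defaultdict(list)
    for row in rows:
        amps[amp_for(str(row[key_field]), num_amps)].append(row)
    return amps


if __name__ == "__main__":
    # Hypothetical rows; "order_no" is an assumed key column.
    rows = [{"order_no": f"ORD{i:05d}", "qty": i} for i in range(20)]
    for amp, owned in sorted(distribute(rows, "order_no", 4).items()):
        print(f"VAMP{amp + 1}: {len(owned)} rows")
```

Because the same key always hashes to the same AMP, any node can work out where a row lives without a central directory, which fits the shared-nothing design described earlier.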
Each AMP qualifies its rows autonomously
PE waits on each AMP to broadcast completion
> PE issues an All-Amps send broadcast to the VNET
> Each AMP sends a row to the VNET
> VNET merges the qualifying rows
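The steps above can be sketched as a small scatter/gather routine. This is only an illustration under stated assumptions: threads stand in for AMPs, `heapq.merge` stands in for the VNET merge, and the names `qualify` and `all_amps_query` are hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def qualify(amp_rows, predicate, sort_key):
    """One AMP filters its own rows and returns them in sorted order."""
    return sorted((r for r in amp_rows if predicate(r)), key=sort_key)


def all_amps_query(amps, predicate, sort_key):
    """Broadcast the request to every AMP, wait for all, then merge."""
    with ThreadPoolExecutor(max_workers=max(1, len(amps))) as pool:
        futures = [pool.submit(qualify, rows, predicate, sort_key)
                   for rows in amps]
        # The PE waits until every AMP has reported completion.
        partials = [f.result() for f in futures]
    # Merge the already-sorted per-AMP streams into one ordered answer set.
    return list(heapq.merge(*partials, key=sort_key))


if __name__ == "__main__":
    amps = [[5, 1, 8], [7, 2], [9, 4, 6]]
    print(all_amps_query(amps, lambda x: x > 3, lambda x: x))
```

Each AMP sorts only its own qualifying rows, so the final step is a merge of sorted streams rather than a full re-sort.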
[Figure: several nodes running UNIX SVR4 or NT, hosting PE1 to PE4]
Purpose
Fast initial data load into new table. Secondary indexes built later.
Fast Update, Insert, Upsert, Delete into 1-5 tables for 1 input pass.
Continuous Update, Insert, Upsert, Delete
Fast Data Unload of data from tables.
More traditional execution of SQL for creating tables, reports, tiny update.
Teradata is Teradata
Automatic self balancing data placement
Automatic load balancing of client sessions
Automatic parallelism for data load/update/archive
Automatic transaction back-out and control
Automatic checkpoint/restart of load/update/archive
Automatic RAID disk transparency
Automatic node recovery transparency
Automatic workload management
Automatic re-start of database after abort
Automatic data connectors for pipes, message queuing
No Files, no TableSpaces, no Extents, no Datasets
No single point of failure = Very High RAS
[Figure: clients connected over the LAN to the Teradata Warehouse (PE1, PE2, VNET, VAMP1 to VAMP3)]
Desktop: Windows 9x, NT, XP, W2K
Internet: Network Computers, MS Internet Explorer, Netscape, Java
UNIX: NCR, Solaris, HP, AIX
Mainframes: IBM, Bull and more...
Tools: Queryman, TeraMiner
FastLoad
Empty, single target table only
TCP/IP Call Level Interface
LOGON TDP0/Vic, Winch;
DROP TABLE INVOICELINE_ERROR1;
DROP TABLE INVOICELINE_ERROR2;
BEGIN LOADING INVOICELINE
  ERRORFILES INVOICELINE_ERROR1, INVOICELINE_ERROR2;
DEFINE ORDERNO    (CHAR(08)),
       ORDERQTY   (DEC(05)),
       CUSTOMERNO (CHAR(08)),
       ITEMNO     (CHAR(08))
  FILE = /Custdata;
SHOW;
INSERT INTO INVOICELINE (OrderNumber, OrderQuantity, CustomerId, ProductId)
  VALUES (:ORDERNO, :ORDERQTY, :CUSTOMERNO, :ITEMNO);
END LOADING;
[Figure: FastLoad on UNIX SVR4 sending data through PE1, PE2 and the VNET to AMP1, AMP2, AMP3]
FastLoad
Disables transient journals for this job (= fast)
BIG History loads (several files = several jobs)
Do Checkpoint
Can re-start a job
Do check the Error Tables as you go
Each job is moving a file's worth to Teradata
Table is not usable until END LOADING (initiates Step 2). After that the table is usable
Can abort a single job (Drop all Tables) and start again
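The checkpoint and re-start points above can be illustrated with a generic sketch. It is not FastLoad's internal mechanism: the `state` dictionary, the `fail_at` switch used to simulate an interruption, and the function name are all assumptions made for the example.

```python
class LoadInterrupted(Exception):
    """Raised to simulate a job that dies part-way through a file."""


def load_with_checkpoints(records, target, state,
                          checkpoint_every=2, fail_at=None):
    """Load records into target, remembering progress in state."""
    committed = state.get("committed", 0)
    # On re-start, discard anything written after the last checkpoint.
    del target[committed:]
    for i in range(committed, len(records)):
        if fail_at is not None and i == fail_at:
            raise LoadInterrupted(f"stopped before record {i}")
        target.append(records[i])
        if (i + 1) % checkpoint_every == 0:
            state["committed"] = i + 1  # take a checkpoint
    state["committed"] = len(records)
    return len(records) - committed  # records loaded in this run


if __name__ == "__main__":
    records, target, state = list("abcdefg"), [], {}
    try:
        load_with_checkpoints(records, target, state, fail_at=5)
    except LoadInterrupted as exc:
        print("interrupted:", exc, "checkpoint at", state["committed"])
    print("re-start loaded", load_with_checkpoints(records, target, state))
```

A smaller `checkpoint_every` means less rework after a failure at the cost of more frequent checkpoint writes.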
MultiLoad
Multiple input files
Multiple target tables
Logic for control of SQL processing
.BEGIN IMPORT MLOAD
  TABLES ACC_DATA
  WORKTABLES ACC_LOAD_DELTA_WT
  ERRORTABLES ACC_LOAD_DELTA_ET ACC_LOAD_DELTA_UV;
.DML LABEL INSACC;
  INSERT INTO ACC_DATA (.
.DML LABEL UPDACC DO INSERT FOR MISSING UPDATE ROWS;
  UPDATE ACC_DATA SET.
  INSERT INTO ACC_DATA SET ..
.IMPORT INFILE MLOADIN LAYOUT ACCDELTA
  APPLY INSACC WHERE CONTROL_CDE = 'I '
  APPLY UPDACC WHERE CONTROL_CDE = 'U';
.END MLOAD;
[Figure: MultiLoad on UNIX SVR4 sending data through PE1, PE2 and the VNET to AMP1, AMP2, AMP3]
MultiLoad
Uses purpose built MLOAD journals (not Transient Journal)
Sorts to the sequence processed from the input file(s)
UPSERT processing
Must think in SET processing terms
MultiLoad places an MLOAD lock on the Table
The table is not accessible (dirty read only)
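The UPSERT behaviour selected by DO INSERT FOR MISSING UPDATE ROWS, together with the control-code routing in the script, can be sketched in a few lines. This models the logic only, not MultiLoad itself: the record fields `acc_no` and `balance` are hypothetical, and sending a duplicate insert to a uniqueness-violation list is an assumption about how the UV error table is used.

```python
def apply_delta(table, delta):
    """Apply insert/update records to table (a dict keyed by account)."""
    uv_errors = []  # stand-in for the uniqueness-violation error table
    for rec in delta:
        code = rec["control_cde"].strip()
        key = rec["acc_no"]
        if code == "I":
            if key in table:
                uv_errors.append(rec)  # duplicate insert is rejected
            else:
                table[key] = rec["balance"]
        elif code == "U":
            if key in table:
                table[key] = rec["balance"]  # normal update
            else:
                table[key] = rec["balance"]  # missing row: insert instead
    return uv_errors


if __name__ == "__main__":
    table = {"A1": 100}
    delta = [
        {"control_cde": "I ", "acc_no": "A2", "balance": 50},
        {"control_cde": "U", "acc_no": "A1", "balance": 120},
        {"control_cde": "U", "acc_no": "A3", "balance": 7},
    ]
    print(apply_delta(table, delta), table)
```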
[Figure: operational systems around the warehouse. Labels: CRM (Front-Office Operational); ERP / SCM (Back-Office Operational); Marketing, E-Commerce, Sales, Customer Service, Service Provisioning]
A single view of the business
Analysis of detail-level data
Unlimited ability to grow
Real-time access to the data from front or back office operational systems
Near real-time data feeds from operational systems
Eliminate expensive, inefficient data marts and Operational Data Stores
[Figure labels: Query Manager, Business Users, Business Question, Data Warehouse, SQL, action, Streams of information, Data]
and optimized by the Data Warehouse
Application continuously queries the Data Warehouse to analyze real-time information (which is continuously refreshed)
Results are compared to trigger points
If a threshold is reached, an automatic action is initiated, or a User is alerted
[Figure labels: Data Warehouse, ACTION, Basis of Decision, TRIGGER POINT?, Information, Data, Automatic Action]
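The query, compare and act loop described on this slide can be sketched as follows. This is a minimal illustration, assuming the query returns a dictionary of named measures and that a trigger point fires when a measure reaches or exceeds its threshold. The names `evaluate_trigger_points`, `monitor`, `run_query` and `on_trigger` are hypothetical.

```python
import time


def evaluate_trigger_points(results, trigger_points):
    """Return (name, value, threshold) for every measure that fired."""
    fired = []
    for name, value in results.items():
        threshold = trigger_points.get(name)
        if threshold is not None and value >= threshold:
            fired.append((name, value, threshold))
    return fired


def monitor(run_query, trigger_points, on_trigger, polls,
            interval_seconds=0.0):
    """Query repeatedly and invoke on_trigger when a threshold is reached."""
    for _ in range(polls):
        results = run_query()  # stand-in for a warehouse query
        for name, value, threshold in evaluate_trigger_points(
                results, trigger_points):
            on_trigger(name, value, threshold)  # automatic action or alert
        time.sleep(interval_seconds)


if __name__ == "__main__":
    snapshots = iter([{"returns": 3}, {"returns": 12}])
    monitor(lambda: next(snapshots), {"returns": 10},
            lambda n, v, t: print(f"ALERT {n}: {v} >= {t}"), polls=2)
```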
Teradata Scalability
Amount of Detailed Data
Concurrent Users
Query Complexity
Simple Direct at the start
Moderate Multi-table Join
Regression analysis
Query tool support
Complex, 58-way table join: 15 Pages, 37 From Clauses, 7 UNIONs (Largest table >1 B rows)
ORDER ITEM SHIPPED: QUANTITY, SHIP DATE
ITEM: ITEM NUMBER, QUANTITY, DESCRIPTION