Agenda
Introduction
Basic Concepts
Extraction, Transformation and Loading
Schema Modeling
SQL for Aggregation
Introduction
A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing. It separates the analysis workload from the transactional workload and enables an organization to consolidate data from several sources.
Introduction
Who was the best customer last quarter?
How did the P/E of a stock move over the whole year?
Which department produced the maximum profit in the current financial year? (Variable allowance and bonus calculation)
How can a control chart be produced to check adherence to the company process on the basis of the defect log mechanism?
The answer to all of the above: a managed data warehouse.
Introduction: Architecture
Basic Concepts
End users are mainly interested in aggregate data rather than individual transactions, so an effective logical and physical design is the first requirement.
Entity-Relationship (ER) modeling involves identifying the things of importance (entities), the properties of these things (attributes) and the relationships between them.
Tools used for ER modeling include Oracle Warehouse Builder and Oracle Designer.
Basic Concepts
Three basic schemas are used:
Third Normal Form (3NF) schema
Star schema
Snowflake schema
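As a rough illustration of the star schema named above, the sketch below puts one fact table at the centre and joins it to two dimension tables. All table and column names (customers, products, sales_fact and their columns) are assumptions made for this example and do not come from the slides.

```sql
-- Hypothetical dimension tables: one row per customer / per product.
create table customers (
  cust_id   number primary key,
  cust_name varchar2(50)
);

create table products (
  prod_id   number primary key,
  prod_name varchar2(50)
);

-- Hypothetical fact table: one row per sale, keyed by the dimensions.
create table sales_fact (
  cust_id   number references customers (cust_id),
  prod_id   number references products (prod_id),
  sale_date date,
  cost      number,
  revenue   number
);

-- A typical warehouse query aggregates the facts by a dimension attribute.
select c.cust_name, sum(f.revenue - f.cost) as profit
from sales_fact f
join customers c on c.cust_id = f.cust_id
group by c.cust_name
order by profit desc;
```

A snowflake schema would further normalize the dimension tables (for example, moving a product category into its own table), while a 3NF schema normalizes the whole model.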
[Figure: ETL flow with the stages Extract, Load, Transform and Store]
External tables
Multitable insert
Merging of SQL statements (SQL MERGE)
Partitioning
Materialized view enhancements
Index enhancements
Memory enhancements
Extraction, Transformation and Loading (contd.)
External tables
Access to data stored in flat files on the OS.
Data can be queried directly using SQL as if it were in the database.
No DML operations and no indexing are possible.
Data does not have to be loaded into a temporary table and then into the main DB table, which reduces the space required during ETL.
Database tables
Access to data stored in the database.
SQL is used for data retrieval.
Indexing is possible.
2. Create a directory
create directory test_dir as '/ora/stage/bibhu/external';
grant read on directory test_dir to bibhu;
grant write on directory test_dir to bibhu;
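Once the directory object exists, an external table can be defined on top of a flat file in it. The following is a minimal sketch: only the directory test_dir comes from the slide, while the table name cost_revenue_ext, its columns and the file name cost_revenue.csv are assumptions for illustration.

```sql
-- Hypothetical external table over a comma-separated file in test_dir.
-- Table name, column list and file name are illustrative assumptions.
create table cost_revenue_ext (
  inv_id  number,
  prod_id number,
  cust_id number,
  cost    number,
  revenue number
)
organization external (
  type oracle_loader
  default directory test_dir
  access parameters (
    records delimited by newline
    fields terminated by ','
    missing field values are null
  )
  location ('cost_revenue.csv')
)
reject limit unlimited;

-- The file can now be queried like a table, but not updated or indexed.
select prod_id, sum(revenue) as total_revenue
from cost_revenue_ext
group by prod_id;
```

The write grant on the directory matters here because the access driver writes its log and bad files to the default directory unless told otherwise.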
Multitable insert
Three kinds:
Unconditional
Conditional ALL
Conditional FIRST
Unconditional insert
For each row returned by the subquery, every INTO clause is executed without restriction.
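A minimal sketch of an unconditional multitable insert follows. The source table cr_source appears in the MERGE example elsewhere in this deck; the target tables cost_fact and revenue_fact and their column lists are assumptions for illustration.

```sql
-- Every row of the subquery is inserted into BOTH target tables.
-- cost_fact and revenue_fact are hypothetical targets.
insert all
  into cost_fact    (inv_id, prod_id, cost)
  values            (inv_id, prod_id, cost)
  into revenue_fact (inv_id, prod_id, revenue)
  values            (inv_id, prod_id, revenue)
select inv_id, prod_id, cost, revenue
from cr_source;
```

The source table is scanned once, where two separate INSERT ... SELECT statements would scan it twice.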
Insert FIRST
Only the first WHEN clause that evaluates to TRUE will be executed.
Insert ALL
Oracle checks all the WHEN clauses and executes every one that evaluates to TRUE.
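The difference between the two conditional forms can be sketched as follows. The target tables loss_sales, big_sales and other_sales and the thresholds are assumptions chosen for illustration; cr_source is the source table used in the MERGE example in this deck.

```sql
-- Conditional ALL: every WHEN clause is evaluated for each row, so a row
-- with cost 1500 and revenue 1200 lands in both loss_sales and big_sales.
insert all
  when revenue - cost < 0 then
    into loss_sales (inv_id, prod_id, cost, revenue)
    values (inv_id, prod_id, cost, revenue)
  when revenue >= 1000 then
    into big_sales (inv_id, prod_id, cost, revenue)
    values (inv_id, prod_id, cost, revenue)
select inv_id, prod_id, cost, revenue
from cr_source;

-- Conditional FIRST: evaluation stops at the first TRUE clause, so the
-- same row lands only in loss_sales. Rows matching no clause take ELSE.
insert first
  when revenue - cost < 0 then
    into loss_sales (inv_id, prod_id, cost, revenue)
    values (inv_id, prod_id, cost, revenue)
  when revenue >= 1000 then
    into big_sales (inv_id, prod_id, cost, revenue)
    values (inv_id, prod_id, cost, revenue)
  else
    into other_sales (inv_id, prod_id, cost, revenue)
    values (inv_id, prod_id, cost, revenue)
select inv_id, prod_id, cost, revenue
from cr_source;
```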
SQL Merge
Advantages:
Before Oracle9i, the same work needed a number of DML statements or PL/SQL blocks.
Overall loading performance is improved because the number of table scans is reduced.
See the text attached to this slide for a practical use.
merge into cost_revenue d
using cr_source s
on (d.inv_id = s.inv_id)
when matched then
  update set d.prod_id = s.prod_id,
             d.cust_id = s.cust_id,
             d.cost    = s.cost,
             d.revenue = s.revenue
when not matched then
  insert (prod_id, cost, revenue, cust_id, inv_id)
  values (s.prod_id, s.cost, s.revenue, s.cust_id, s.inv_id);
Partitioning
What is partitioning?
Partitioning breaks up one large table into several more manageable pieces called partitions.
Tables and indexes can be partitioned.
Use it when you have large tables.
Advantages: manageability and performance.
[Figure: an application issues SQL against a Sales table that is split into the partitions Jan, Feb and Mar]
Range partitioning
[Figure: rows with dates from JAN2004 to MAY2004 mapped to the monthly partitions JAN2004, FEB2004, MAR2004, APR2004 and MAY2004]
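A range-partitioned table matching the monthly layout in the figure might be declared as sketched below. The table name sales_by_month and its columns are assumptions; only the partition names follow the slide.

```sql
-- Hypothetical sales table, one partition per month of early 2004.
create table sales_by_month (
  sale_id   number,
  sale_date date,
  amount    number
)
partition by range (sale_date) (
  partition jan2004 values less than (to_date('2004-02-01', 'YYYY-MM-DD')),
  partition feb2004 values less than (to_date('2004-03-01', 'YYYY-MM-DD')),
  partition mar2004 values less than (to_date('2004-04-01', 'YYYY-MM-DD')),
  partition apr2004 values less than (to_date('2004-05-01', 'YYYY-MM-DD')),
  partition may2004 values less than (to_date('2004-06-01', 'YYYY-MM-DD'))
);

-- A query restricted to one month only has to read that partition.
select sum(amount)
from sales_by_month
where sale_date >= to_date('2004-02-01', 'YYYY-MM-DD')
  and sale_date <  to_date('2004-03-01', 'YYYY-MM-DD');
```

Each bound is exclusive, so a row dated 1 February 2004 goes to feb2004, not jan2004.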
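Hash Partitioning
Hash partitioning maps data to partitions by applying a hashing algorithm to the partitioning key.
[Figure: a key value passes through the HASH function and is assigned to one of the partitions PART1, PART2, PART3 or PART4]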
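The four-partition layout in the figure could be declared roughly as follows. The table name user_events and its columns are assumptions for illustration; the partition names follow the figure.

```sql
-- Hypothetical table spread over four hash partitions on the key nr.
-- Oracle decides which partition a row goes to; the DBA only names them.
create table user_events (
  nr      number,
  user_id number,
  value   number
)
partition by hash (nr) (
  partition part1,
  partition part2,
  partition part3,
  partition part4
);
```

Hash partitioning is typically chosen when the data has no natural ranges and the goal is an even spread of rows; Oracle recommends a power-of-two partition count for the most even distribution.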
Composite Partitioning
create table cost_revenue (
  nr         number,
  logofftime date,
  logon_time date,
  user_id    number,
  name_id    number,
  value      number
)
partition by range (user_id)
subpartition by hash (nr) subpartitions 4 (
  partition p1 values less than (11),
  partition p2 values less than (21),
  partition p3 values less than (31),
  partition p4 values less than (41)
);
[Figure: four RANGE (user_id) partitions with upper values 10, 20, 30 and 40, each split by HASH (nr) into four subpartitions: PART1a to PART1d, PART2a to PART2d, PART3a to PART3d and PART4a to PART4d]
List Partitioning
Gives precise control over which data maps to which partition.
You specify a list of discrete values for the partitioning column and assign a group of those values to each partition.
Each partition in a list partitioning scheme corresponds to a list of discrete values.
List Partitioning
Useful along a column with discrete values.
Range Partitioning
Useful along a continuous column. Most often, tables are range partitioned by time, so that each range partition contains the data for a given range of time values.
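A list-partitioned table along a discrete column might look like the sketch below. The table name sales_by_region, its columns and the state codes are assumptions chosen only to show the syntax.

```sql
-- Hypothetical table partitioned by an explicit list of state codes.
create table sales_by_region (
  sale_id number,
  state   varchar2(2),
  amount  number
)
partition by list (state) (
  partition p_east values ('NY', 'NJ', 'CT'),
  partition p_west values ('CA', 'OR', 'WA')
);

-- A predicate on a listed value touches only the matching partition.
select sum(amount)
from sales_by_region
where state = 'CA';
```

With this definition, a row whose state is not in any list is rejected, which is the price of the precise control the slide describes.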
Materialized View Enhancements
Fresh MV
MV_CR (materialized view)
cust_id  prod_id  profit
88230    536       120
88230    537       230
88230    538       -15
88231    536       248
88231    537        36
88231    538       150
88232    536       -96
88232    537       250
Detail table
cr_id    prod_id  cost  revenue
145698   537      125   211
145699   538      241   125
145700   537      124   263
145701   538      147   365
145702   536       85   185
145703   537      200   230
Customer row: 88232 Stevens
Query rewrite
Stale MV
[Figure: MV_CR (cust_id, prod_id) is marked STALE after a new record is added to the detail table]
Detail table
cr_id    prod_id  cost  revenue
145698   537      125   211
145699   538      241   125
145700   537      124   263
145701   538      147   365
145702   536       85   185
145703   537      200   230
145704   538      180   169   (new record)
Customer row: 88232 Stevens
No query rewrite in 8i
Staleness is tracked at a finer grain than the entire materialized view.
Oracle can identify which rows are affected by a certain detail table partition.
At least one of the master tables needs to be partitioned.
Stale MV
[Figure: MV_CR (cust_id, prod_id) over the Cost_revenue detail table (cr_id, prod_id, cust_id, revenue), whose rows are split into Part. 1, Part. 2 and Part. 3; customer row 88232 Stevens]
Query rewrite in 9i
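A materialized view like MV_CR could be defined roughly as sketched below. This assumes a cost_revenue detail table with the columns cust_id, prod_id, cost and revenue (as in the MERGE example); the refresh options chosen here are one reasonable combination, not necessarily the one used for the slides.

```sql
-- Allow the optimizer to rewrite queries against materialized views.
alter session set query_rewrite_enabled = true;

-- Sketch of MV_CR: profit per customer and product.
create materialized view mv_cr
build immediate
refresh on demand
enable query rewrite
as
select cust_id, prod_id, sum(revenue - cost) as profit
from cost_revenue
group by cust_id, prod_id;

-- A query of this shape is a candidate for rewrite: the optimizer may
-- answer it from mv_cr instead of scanning cost_revenue.
select cust_id, sum(revenue - cost) as profit
from cost_revenue
group by cust_id;
```

Whether a rewrite actually happens depends on the freshness of the view and on the session's rewrite integrity settings, which is exactly where the partition-level staleness tracking described above helps.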
Bitmap Join Indexes
Bitmap index: each distinct value is stored with its own bitmap.
value  bitmap
M      0 0 1 0 1 1 0
F      1 1 0 1 0 0 1
[Figure: row ranges 1-3, 4-6 and 7-8]
Bitmap join index (BJI): each value of a column in one table is stored with the associated rowids of the matching rows in the other table.
A BJI contains data from two or more tables.
There is no need to access the dimension tables and calculate the join.
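A bitmap join index could be created as in the following sketch. It assumes a dimension table customers with a gender column and a primary key on cust_id, and a fact table cost_revenue with a cust_id column; these names are illustrative assumptions.

```sql
-- Hypothetical BJI: indexes cost_revenue rows by the gender stored in
-- the customers dimension. customers.cust_id must be unique (for
-- example a primary key) for the index to be allowed.
create bitmap index cr_cust_gender_bjx
on cost_revenue (customers.gender)
from cost_revenue, customers
where cost_revenue.cust_id = customers.cust_id;

-- A query of this shape can be answered from the index without
-- joining to customers at run time.
select count(*)
from cost_revenue cr, customers c
where cr.cust_id = c.cust_id
  and c.gender = 'F';
```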
Memory enhancements
1. SGA: Dynamic Memory Management
SGA_MAX_SIZE
2. PGA: Automatic Memory Tuning
DBA task: choosing the correct amount of memory each server process may allocate
sort_area_size
hash_area_size
bitmap_merge_size
create_bitmap_area_size
Advantage?
Reduces the time and effort required to tune memory parameters.
Can compensate for low or high memory usage while controlling the maximum amount of memory the PGAs can use.
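The two features might be switched on as sketched below. PGA_AGGREGATE_TARGET, WORKAREA_SIZE_POLICY and DB_CACHE_SIZE are Oracle9i initialization parameters that the slide does not name explicitly, and the sizes are placeholder values, not recommendations.

```sql
-- PGA: set one overall target instead of tuning sort_area_size,
-- hash_area_size, bitmap_merge_size and create_bitmap_area_size.
-- 200M is an illustrative value.
alter system set pga_aggregate_target = 200M;
alter system set workarea_size_policy = auto;

-- SGA: components such as the buffer cache can be resized while the
-- instance is running, as long as the total stays within SGA_MAX_SIZE.
-- 256M is an illustrative value.
alter system set db_cache_size = 256M;

-- Check the current settings.
select name, value
from v$parameter
where name in ('sga_max_size', 'db_cache_size',
               'pga_aggregate_target', 'workarea_size_policy');
```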
Any Questions?
Thank you