Information Systems
Chapter 8: Data Warehousing

Outline
- DW Architecture
- Multidimensional Data Model
- Relational Mapping
- OLAP Operations
- Star Joins
- SQL Extensions for DW
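Database Schema
[Figure: ER diagram. Entities: Product, Supplier, Customer. Relationships: supplies, buys, with cardinalities (0,*). Further labels: Revenue (Umsatz), Portfolio, Sales, Quantity, Marketing, Advertising (Werbung).]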
06.02.15
Typical Questions:
Overview
[Figure: data warehouse overview. Components shown: Operational Databases, External Sources, ETL, Data Warehouse, Metadata Repository, Monitoring & Administration, Data Marts, OLAP Server, and the front-end tools Query/Reporting, Analysis, and Data Mining.]
Application Example
Problems
Example: Query
Analysis questions
Example: Report
Measure: Sales

Year  State      Beer  Red Wine  Total
2010  Hesse        45        32     77
2010  Thuringia    52        21     73
2010  Total        97        53    150
2011  Hesse        60        37     97
2011  Thuringia    58        20     78
2011  Total       118        57    175

[Figure: the same data as a cube with the axes Year (2010, 2011), States (Hesse, Thuringia, Total), and product (Beer, Red wine, Total).]
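The report in this example can be derived from the eight base values by adding subtotals per state and per product. The following Python sketch shows one way to do that; the cell values are those of the report, while the helper name `build_report` and the tuple layout are assumptions made for illustration.

```python
from collections import defaultdict

# Base figures from the example report: (year, state, product) -> sales.
SALES = {
    (2010, "Hesse", "Beer"): 45, (2010, "Hesse", "Red Wine"): 32,
    (2010, "Thuringia", "Beer"): 52, (2010, "Thuringia", "Red Wine"): 21,
    (2011, "Hesse", "Beer"): 60, (2011, "Hesse", "Red Wine"): 37,
    (2011, "Thuringia", "Beer"): 58, (2011, "Thuringia", "Red Wine"): 20,
}

def build_report(sales):
    """Add 'Total' rows and columns per year (hypothetical helper)."""
    report = defaultdict(int)
    for (year, state, product), value in sales.items():
        # Each base value counts towards its own cell and three totals.
        for s in (state, "Total"):
            for p in (product, "Total"):
                report[(year, s, p)] += value
    return dict(report)

report = build_report(SALES)
for year in (2010, 2011):
    for state in ("Hesse", "Thuringia", "Total"):
        row = [report[(year, state, p)] for p in ("Beer", "Red Wine", "Total")]
        print(year, state, row)
```

Totals are computed within each year only, which matches the layout of the report.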
More Definitions
- Data Warehousing: the data warehouse process, i.e. all steps of collecting and integrating data (extraction, transformation, loading) as well as storing and analysing it
- Data Mart
- Business Intelligence
- subject-oriented: data is organized so that all information relating to the same real-world event or object is linked together
- time-variant data
[Figure: benchmark database schema with the tables REGION, NATION, CUSTOMER, SUPPLIER, ORDERS, PARTSUPP, LINEITEM, and PART (the table names of the TPC-H schema).]
22 queries
-- Orders whose total line-item quantity exceeds the parameter :1,
-- listed with customer data, most expensive orders first.
SELECT c_name, c_custkey, o_orderkey, o_orderdate,
       o_totalprice, SUM(l_quantity)
FROM customer, orders, lineitem
WHERE o_orderkey IN (
        SELECT l_orderkey
        FROM lineitem
        GROUP BY l_orderkey
        HAVING SUM(l_quantity) > :1)
  AND c_custkey = o_custkey AND o_orderkey = l_orderkey
GROUP BY c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
ORDER BY o_totalprice DESC, o_orderdate
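To see what the query above returns, it can be run against a tiny in-memory database. The sketch below uses Python's `sqlite3` module; the three tables contain only the columns the query needs, the rows are made up, and the placeholder `:1` is renamed to `:min_quantity` (an assumption made so that it can be bound by name).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (c_custkey INTEGER PRIMARY KEY, c_name TEXT);
    CREATE TABLE orders   (o_orderkey INTEGER PRIMARY KEY, o_custkey INTEGER,
                           o_orderdate TEXT, o_totalprice REAL);
    CREATE TABLE lineitem (l_orderkey INTEGER, l_quantity REAL);
    -- Made-up toy rows, not benchmark data.
    INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, '2015-01-10', 500.0),
                              (11, 2, '2015-01-12', 900.0);
    INSERT INTO lineitem VALUES (10, 5), (10, 7), (11, 200), (11, 150);
""")

QUERY = """
    SELECT c_name, c_custkey, o_orderkey, o_orderdate,
           o_totalprice, SUM(l_quantity)
    FROM customer, orders, lineitem
    WHERE o_orderkey IN (
            SELECT l_orderkey
            FROM lineitem
            GROUP BY l_orderkey
            HAVING SUM(l_quantity) > :min_quantity)
      AND c_custkey = o_custkey AND o_orderkey = l_orderkey
    GROUP BY c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
    ORDER BY o_totalprice DESC, o_orderdate
"""

# Only order 11 has a total quantity (350) above the threshold of 300.
rows = conn.execute(QUERY, {"min_quantity": 300}).fetchall()
print(rows)
```

The subquery selects the qualifying orders first; the outer query then joins customer, orders, and line items to report each such order with its summed quantity.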
DW Market
- OLAP tools/servers
- ETL tools
Reference Architecture
[Figure: reference architecture. Components shown: Data Sources, Monitor, Extraction, Staging Area, Transformation, Loading, ODS, Data Warehouse DB, Loading, Data Cube, Analysis, Data Warehouse Manager, Metadata Manager, Repository. Legend: data flow, control flow, events.]
Staging Area
Basic Concepts
- Dimensions
- Facts and Measures
[Figure: cube with the measure Sales and three dimensions with their levels: Product (Category, Article), Time (Year, Quarter, Month), and Store (City, Region).]
Dimensions
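Hierarchies in Dimensions
- Dimension element:
[Figure: example hierarchies. Levels of the store dimension: Store, City, State, Country. Levels of the time dimension: Day, Week, Month, Quarter, Year. Both end in the generic level Top.]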
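A hierarchy in a dimension lets values at a fine level be aggregated step by step to coarser levels. The following Python sketch illustrates this for the levels Store, City, State, and Top; the store names, the store-to-city assignment, and the sales figures are made up, and `roll_up` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical classification: store -> city -> state.
CITY_OF = {"Store A": "Ilmenau", "Store B": "Erfurt", "Store C": "Kassel"}
STATE_OF = {"Ilmenau": "Thuringia", "Erfurt": "Thuringia", "Kassel": "Hesse"}

# Made-up sales per store (the finest level of the hierarchy).
SALES_BY_STORE = {"Store A": 10, "Store B": 25, "Store C": 40}

def roll_up(values, parent_of):
    """Aggregate values to the next higher level via a child -> parent map."""
    result = defaultdict(int)
    for child, value in values.items():
        result[parent_of[child]] += value
    return dict(result)

by_city = roll_up(SALES_BY_STORE, CITY_OF)
by_state = roll_up(by_city, STATE_OF)
top = {"Top": sum(by_state.values())}  # the single element at level Top

print(by_city)
print(by_state)
print(top)
```

Because every store belongs to exactly one city and every city to exactly one state, each value is counted once per level and the grand total stays the same at every level.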
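Dimension Schema
- Categorical Attributes
- Every categorical attribute determines the generic top element: $\forall i,\ 1 \le i \le n : D_i \rightarrow \mathit{Top}_D$
- One attribute (the primary attribute) determines all other categorical attributes: $\exists i,\ 1 \le i \le n,\ \forall j,\ 1 \le j \le n,\ i \ne j : D_i \rightarrow D_j$
- Classification attributes
- Dimensional Attributes

Structure of a Dimension
[Figure: structure of a dimension. Roles shown: Top, classification attributes, primary attribute, dimensional attributes. Attribute labels shown: Status, Article, Order, Inventory location, Product category, Order price, Brand.]

Facts / measures
- Facts:
- Measures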
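The two conditions of a dimension schema, every attribute determines Top and one primary attribute determines all others, can be checked on sample data. The sketch below is a minimal Python illustration; the rows, the attribute names, and the helper `determines` are assumptions and not part of the slides.

```python
def determines(rows, a, b):
    """Return True if attribute a functionally determines attribute b."""
    seen = {}
    for row in rows:
        # setdefault stores the first b-value seen for each a-value.
        if seen.setdefault(row[a], row[b]) != row[b]:
            return False
    return True

# Made-up dimension elements; "Top" has a single value for all rows.
ROWS = [
    {"Article": "A1", "Brand": "B1", "Category": "Wine", "Top": "ALL"},
    {"Article": "A2", "Brand": "B1", "Category": "Wine", "Top": "ALL"},
    {"Article": "A3", "Brand": "B2", "Category": "Beer", "Top": "ALL"},
]
ATTRS = ["Article", "Brand", "Category"]

# Condition 1: every D_i determines Top_D.
all_reach_top = all(determines(ROWS, a, "Top") for a in ATTRS)

# Condition 2: find the attribute(s) that determine all other attributes.
primary = [a for a in ATTRS
           if all(determines(ROWS, a, b) for b in ATTRS if b != a)]

print(all_reach_top, primary)
```

In this sample only Article qualifies as primary attribute, since a brand or a category can belong to several articles.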
Measures: Schema
- Measure Type
  - STOCK
  - VALUE-PER-UNIT (VPU)
- Orthogonality: $\forall i,\ 1 \le i \le k,\ \forall j,\ 1 \le j \le k,\ i \ne j : G_i \not\rightarrow G_j$

Measures: Calculation
- Scalar functions
  - +, -, *, /, mod
  - Example: sales tax = quantity * price * tax rate
- Aggregate functions
- Order-based functions

Cube
- Schema C of a cube: $C = (DS, M) = (\{D_1, \dots, D_n\}, \{M_1, \dots, M_m\})$
- 2 dimensions = table
- 3 dimensions = cube
- >3 dimensions = hypercube = multi-dimensional domain structure
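The difference between scalar and aggregate functions can be shown with the sales tax example: the scalar function derives one value per fact, and an aggregate function then combines the values of a group. The following Python sketch assumes made-up fact rows and column names; `aggregate` is a hypothetical helper.

```python
from collections import defaultdict

# Made-up fact rows; the column names are illustrative assumptions.
FACTS = [
    {"store": "S1", "quantity": 2, "price": 10.0, "tax_rate": 0.19},
    {"store": "S1", "quantity": 1, "price": 5.0, "tax_rate": 0.19},
    {"store": "S2", "quantity": 3, "price": 10.0, "tax_rate": 0.19},
]

def sales_tax(fact):
    """Scalar function: sales tax = quantity * price * tax rate."""
    return fact["quantity"] * fact["price"] * fact["tax_rate"]

def aggregate(facts, key, measure_fn, agg=sum):
    """Aggregate function: group facts by key and combine the measure."""
    groups = defaultdict(list)
    for fact in facts:
        groups[fact[key]].append(measure_fn(fact))
    return {k: agg(v) for k, v in groups.items()}

tax_by_store = aggregate(FACTS, "store", sales_tax)  # SUM per store
max_qty_by_store = aggregate(FACTS, "store", lambda f: f["quantity"], max)

print(tax_by_store)
print(max_qty_by_store)
```

Passing the aggregate function as a parameter keeps the grouping logic independent of whether SUM, MAX, or another function is applied.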
ME/R: Notation
[Figure: ME/R notation elements: fact name, level name, attribute name.]

ME/R: Example
[Figure: ME/R diagram. Labels shown: Sales, Quantity, Costs; Articles, Product group, Brand; Store, City, State; Day, Week, Month, Quarter, Year.]

Multidimensional view
- Modelling of data
- Query formulation
- Omission of transformation
- Issues
  - Storage
  - Query formulation and execution
Information Systems | K. Sattler | TU Ilmenau 06.02.15
Relational Mapping
[Figure: a cube with the axes Product (Wine, Soft drinks), Month (11/2011, 12/2011), and Region (Ilmenau, Magdeburg) and the cell value Sales is mapped to a table with the columns Product, Month, Region, and Sales; sample values shown: 145, 98, 245.]

Snowflake Schema
- Mapping of classification hierarchies: a separate table for each classification level (e.g. article, product group, etc.)
- Dimension tables contain

[Figure: snowflake schema of the Sales example]
Sales(Product_ID, Time_ID, Store_ID, Quantities, Revenue)
Article(Product_ID, Description, Group_ID)
ProductGroup(Group_ID, Description, Brand_ID)
Brand(Brand_ID, Description)
Day(Time_ID, Date, Month_ID, Week_ID)
Month(Month_ID, Description)
Week(Week_ID, Description)
Year(Year_ID, Description)
Store(Store_ID, Name, City_ID)
City(City_ID, Name, State_ID)
State(State_ID, Name)
Relationships are 1:* from each referenced table to the referencing table, e.g. State to City, City to Store, and Store to Sales.
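In a snowflake schema, aggregating along a classification hierarchy needs one join per level. The sketch below builds the Store, City, and State tables of the example in an in-memory SQLite database via Python's `sqlite3` module and sums the revenue per state; the inserted rows are made up, and only the store branch of the schema is created to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE State (State_ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE City  (City_ID INTEGER PRIMARY KEY, Name TEXT,
                        State_ID INTEGER REFERENCES State);
    CREATE TABLE Store (Store_ID INTEGER PRIMARY KEY, Name TEXT,
                        City_ID INTEGER REFERENCES City);
    CREATE TABLE Sales (Product_ID INTEGER, Time_ID INTEGER,
                        Store_ID INTEGER REFERENCES Store,
                        Quantities INTEGER, Revenue REAL);
    -- Made-up sample rows.
    INSERT INTO State VALUES (1, 'Thuringia'), (2, 'Saxony-Anhalt');
    INSERT INTO City  VALUES (1, 'Ilmenau', 1), (2, 'Magdeburg', 2);
    INSERT INTO Store VALUES (1, 'Store A', 1), (2, 'Store B', 2);
    INSERT INTO Sales VALUES (1, 1, 1, 3, 30.0), (2, 1, 1, 1, 15.0),
                             (1, 1, 2, 2, 20.0);
""")

# One join per classification level: Sales -> Store -> City -> State.
rows = conn.execute("""
    SELECT State.Name, SUM(Sales.Revenue)
    FROM Sales
    JOIN Store ON Sales.Store_ID = Store.Store_ID
    JOIN City  ON Store.City_ID  = City.City_ID
    JOIN State ON City.State_ID  = State.State_ID
    GROUP BY State.Name
    ORDER BY State.Name
""").fetchall()
print(rows)
```

The normalized level tables avoid redundancy, at the price of a longer join chain for queries on coarse levels.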
Star Schema
[Figure: generic star schema]
Fact Table(Dim1_Key, Dim2_Key, Dim3_Key, Dim4_Key, ..., Measure1, Measure2, Measure3, ...)
Dimension Table #1(Dim1_Key, Dim1_Attribute1, Dim1_Attribute2)
Dimension Table #2(Dim2_Key, Dim2_Attribute1, Dim2_Attribute2)
Dimension Table #3(Dim3_Key, Dim3_Attribute1, Dim3_Attribute2)
Dimension Table #4(Dim4_Key, Dim4_Attribute1, Dim4_Attribute2)

Star Schema:
[Figure: star schema of the Sales example; each dimension table is linked 1:* to the fact table]
Sales(Product_ID, Time_ID, Store_ID, Quantities, Revenue)
Products(Product_ID, Article, ProductGroup, Brand)
Region(Store_ID, Store, City, State)
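In a star schema each dimension is a single denormalized table, so a query joins the fact table directly with the dimension tables it needs. The following Python sketch uses `sqlite3` with the Sales, Products, and Region tables named in the example; the rows are made up, and the time dimension is omitted because its table is not given here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (Product_ID INTEGER PRIMARY KEY, Article TEXT,
                           ProductGroup TEXT, Brand TEXT);
    CREATE TABLE Region   (Store_ID INTEGER PRIMARY KEY, Store TEXT,
                           City TEXT, State TEXT);
    CREATE TABLE Sales    (Product_ID INTEGER REFERENCES Products,
                           Time_ID INTEGER,
                           Store_ID INTEGER REFERENCES Region,
                           Quantities INTEGER, Revenue REAL);
    -- Made-up sample rows.
    INSERT INTO Products VALUES (1, 'Riesling', 'Wine', 'B1'),
                                (2, 'Cola', 'Soft drinks', 'B2');
    INSERT INTO Region VALUES (1, 'Store A', 'Ilmenau', 'Thuringia'),
                              (2, 'Store B', 'Magdeburg', 'Saxony-Anhalt');
    INSERT INTO Sales VALUES (1, 1, 1, 3, 30.0), (1, 2, 1, 2, 20.0),
                             (2, 1, 1, 5, 10.0), (1, 1, 2, 1, 10.0);
""")

# Star join: the fact table is joined with each needed dimension table,
# restricted on a dimension attribute and grouped by dimension attributes.
rows = conn.execute("""
    SELECT p.ProductGroup, r.State, SUM(s.Revenue) AS Revenue
    FROM Sales s
    JOIN Products p ON s.Product_ID = p.Product_ID
    JOIN Region r   ON s.Store_ID   = r.Store_ID
    WHERE p.ProductGroup = 'Wine'
    GROUP BY p.ProductGroup, r.State
    ORDER BY r.State
""").fetchall()
print(rows)
```

Compared with a snowflake schema, the state is read directly from the dimension table, so no additional joins along the hierarchy are needed.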
Characteristics of DW applications
Non-compulsory constraints
- Correctness is not ensured by the DBMS
- Used only for query rewriting
- Clauses

Products(Product_ID, Brand_ID, ProductName)
Brand(Brand_ID, Name)
Sales(Day_ID, Product_ID, Store_ID, Quantities, Revenue)
Conclusions
Solution: