
MRC&I-CIS/S/Dat

DATA WAREHOUSE
3/10/15

How could we implement a DWH in BeDef?

Goal of this briefing

To identify the questions to tackle when implementing a DWH within BeDef and to analyze a possible Proof of Concept.
Data Warehouse Design

Table of Contents

Scope of this briefing
Types of reports
Data Warehouse
Performance
Maintenance
Challenges for BeDef
Reporting tool
How to handle the project

Scope of this briefing

DWH design principles
What do we need?
How can we implement it?

Types of reports

Reports have different goals

Discovery
Analysis
Monitoring
Prediction

Types of reports

Discovery: extraction, sorting, presentation of selected data

Current implementation: 90%

Types of reports

Analysis: pivot tables, spreadsheets, graphs

Current implementation: minimal, mostly some graphs or exports to spreadsheet

Types of reports

Monitoring: KPIs, dashboards, balanced scorecards

Current implementation: MrMgt started with reports like this
High needs

Types of reports

Prediction: data mining, predictive modelling

Current implementation: 0%
Need is unknown

Data Warehouse

What

The single organizational repository of enterprise-wide data across many or all lines of business and subject areas.
All data of BeDef (MR, HR, ...) in one repository

Data Warehouse

Inputs

Data coming from different data sources
Structured (from DB)
Unstructured (files/SharePoint)

Data Warehouse

Outputs

Structured information
Adapted to reporting needs
Performance of data delivery

DWH Performance

BI performance depends mainly on the DWH

Storage model
Database model
Informational model

DWH Storage Model

How the data is stored on logical drives

In memory vs on disk
Row based vs column based

DWH Storage Model

In memory vs on disk: which medium is used for the storage

In memory
High I/O performance
High cost
Volatile => not durable

On disk
Low I/O performance
Low cost
Durable
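
The durability difference can be sketched with Python's built-in sqlite3 module, which can open either an in-memory database or a file on disk. This is only an illustration under stated assumptions: the table name `payment`, the helper `write_and_reopen` and the file name `dwh_demo.db` are hypothetical, and SQLite stands in for whatever DWH engine would actually be used. It shows durability only, not the I/O performance difference.

```python
import os
import sqlite3
import tempfile


def write_and_reopen(path):
    """Insert one row, close the connection, reopen it and count the rows."""
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE IF NOT EXISTS payment (user_id INTEGER, amount INTEGER)")
    con.execute("INSERT INTO payment VALUES (1, 20)")
    con.commit()
    con.close()

    # Reopen the same path. For ":memory:" this is a brand-new, empty database.
    con = sqlite3.connect(path)
    # Recreate the table if it is gone so the count query always works.
    con.execute("CREATE TABLE IF NOT EXISTS payment (user_id INTEGER, amount INTEGER)")
    count = con.execute("SELECT COUNT(*) FROM payment").fetchone()[0]
    con.close()
    return count


# In memory: the data disappears when the connection is closed (volatile).
rows_in_memory = write_and_reopen(":memory:")

# On disk: the data survives closing and reopening (durable).
with tempfile.TemporaryDirectory() as tmp:
    rows_on_disk = write_and_reopen(os.path.join(tmp, "dwh_demo.db"))

print(rows_in_memory, rows_on_disk)
```

The in-memory count is 0 after reopening, while the on-disk count is 1.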

DWH Storage Model

Row vs column based: how is the data grouped on storage?

Row based
UserId  FirstName  LastName        Payment
1       Jos        Van Pimperzele  20
2       Jef        Oostergast      50
3       Koen       Demeester       30

Column based
UserId:    1, 2, 3
FirstName: Jos, Jef, Koen
LastName:  Van Pimperzele, Oostergast, Demeester
Payment:   20, 50, 30


DWH Storage Model

Row vs column based, two queries on the same example table

1. Retrieve Payment for Jos
2. Retrieve avg Payment
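
The two queries can be sketched in plain Python to show which data each layout has to touch. This is a toy illustration, not how a real storage engine works: the UserId value 1 for Jos is an assumption (it is missing from the extracted table), and the function names are hypothetical.

```python
from statistics import mean

# Row based: all fields of one record are stored together.
rows = [
    (1, "Jos", "Van Pimperzele", 20),
    (2, "Jef", "Oostergast", 50),
    (3, "Koen", "Demeester", 30),
]

# Column based: all values of one column are stored together.
columns = {
    "UserId": [1, 2, 3],
    "FirstName": ["Jos", "Jef", "Koen"],
    "LastName": ["Van Pimperzele", "Oostergast", "Demeester"],
    "Payment": [20, 50, 30],
}


def payment_row_store(first_name):
    # Query 1 on rows: the matching record already holds every field.
    for _user_id, first, _last, payment in rows:
        if first == first_name:
            return payment
    return None


def payment_column_store(first_name):
    # Query 1 on columns: find the position in one column, read another.
    if first_name not in columns["FirstName"]:
        return None
    position = columns["FirstName"].index(first_name)
    return columns["Payment"][position]


def avg_payment_row_store():
    # Query 2 on rows: every full record is visited for a single field.
    return mean(record[3] for record in rows)


def avg_payment_column_store():
    # Query 2 on columns: only the Payment column is read.
    return mean(columns["Payment"])


print(payment_row_store("Jos"), payment_column_store("Jos"))
print(avg_payment_row_store(), avg_payment_column_store())
```

Both layouts return the same answers. The difference is in how much unrelated data is read: the single-record lookup suits the row layout, the aggregate over one field suits the column layout.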

DWH Database model

How the data is structured

Relational model
Dimensional model (diagram labels: CI, Process, Date, Nbr Days Call open, MatMan)

DWH Database model

Relational
Transactional
All queries possible
Aggregate

Dimensional
Analytical
Specific per query
Aggregate
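
A dimensional model is commonly laid out as a star schema: one fact table holding the measure, surrounded by dimension tables. The sketch below uses Python's sqlite3 and assumes that "Nbr Days Call open" is the measure and that CI, Process and Date are dimensions, as the labels of the previous slide suggest. All table names, column names and sample values are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Dimension tables (names taken from the slide labels, content invented)
CREATE TABLE dim_ci      (ci_id INTEGER PRIMARY KEY, ci_name TEXT);
CREATE TABLE dim_process (process_id INTEGER PRIMARY KEY, process_name TEXT);
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);

-- Fact table: one row per call, holding the measure and the dimension keys
CREATE TABLE fact_call (
    ci_id              INTEGER REFERENCES dim_ci(ci_id),
    process_id         INTEGER REFERENCES dim_process(process_id),
    date_id            INTEGER REFERENCES dim_date(date_id),
    nbr_days_call_open INTEGER
);

INSERT INTO dim_ci      VALUES (1, 'CI-A'), (2, 'CI-B');
INSERT INTO dim_process VALUES (1, 'Process-X'), (2, 'Process-Y');
INSERT INTO dim_date    VALUES (1, 2015, 1), (2, 2015, 2);
INSERT INTO fact_call   VALUES (1, 1, 1, 4), (1, 2, 1, 6), (2, 1, 2, 10), (2, 2, 2, 2);
""")

# Analytical query: average number of days a call stays open, per CI.
avg_days_per_ci = con.execute("""
    SELECT d.ci_name, AVG(f.nbr_days_call_open)
    FROM fact_call AS f
    JOIN dim_ci AS d ON d.ci_id = f.ci_id
    GROUP BY d.ci_name
    ORDER BY d.ci_name
""").fetchall()

print(avg_days_per_ci)
con.close()
```

The aggregate needs a single join from the fact table to one dimension, which is what makes this layout suited to analytical queries.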

DWH Informational Model

How the data is analyzed

OLAP cubes
Not a database model
Metadata model on top of the dimensional model
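
One way to picture a cube is as a set of pre-computed aggregates, one per combination of dimension values. The sketch below is a simplified illustration in plain Python, not the mechanism of any particular OLAP product. The dimension names, the measure `days_open`, the sample facts and the helper `build_cube` are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("ci", "process", "month")

# Hypothetical fact rows, as they could come out of a dimensional model.
facts = [
    {"ci": "CI-A", "process": "Process-X", "month": 1, "days_open": 4},
    {"ci": "CI-A", "process": "Process-Y", "month": 1, "days_open": 6},
    {"ci": "CI-B", "process": "Process-X", "month": 2, "days_open": 10},
    {"ci": "CI-B", "process": "Process-Y", "month": 2, "days_open": 2},
]


def build_cube(fact_rows, dimensions, measure):
    """Sum the measure for every subset of the dimensions."""
    cube = defaultdict(int)
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            for fact in fact_rows:
                # The key lists (dimension, value) pairs in dimension order.
                key = tuple((dim, fact[dim]) for dim in dims)
                cube[key] += fact[measure]
    return dict(cube)


cube = build_cube(facts, DIMENSIONS, "days_open")

grand_total = cube[()]                                         # no dimension fixed
total_ci_a = cube[(("ci", "CI-A"),)]                           # one dimension fixed
total_cell = cube[(("ci", "CI-B"), ("process", "Process-X"))]  # two dimensions fixed

print(grand_total, total_ci_a, total_cell)
```

Once the cube is built, each report question becomes a lookup instead of a scan over the facts, at the price of recomputing the aggregates when the data changes.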

DWH Maintenance

Maintaining a DWH comes at a cost
New data sources
New needs in reporting
Changes in data sources

The architecture of a DWH influences this cost

DWH Architecture Approaches

Inmon
Kimball

DWH Architecture Approaches

Inmon
Organizational
Normalized
Top down
Easy to add projects

Kimball
Departmental
Not normalized
Bottom up
Difficult to add projects

BI reporting tool requirements

Management of reports
Creation of reports
Can handle analytical DBs (columnar/dimensional)
Price

Challenges in BI within BeDef

Shift from discovery to analytical/monitoring reports
Shift from relational to dimensional model

Architecture choice
[pers opinion] => Inmon (or hybrid solution)

Knowledge
Normalized relational model exists (XDE)
Favours discovery reports

Comments on handling project

Who chooses the path to follow
XDE = Cross Domain Data Exploitation
All clients (e.g. MR) should have their say
An initial choice is hard to override (e.g. Inmon/Kimball)
Bottom-up approach Cross Domain
BI reporting tool should not be the initial goal

Proposal action plan project

Needs
Current datamart
Current reporting

Steps: Identify needs

Proposal action plan project

Architecture

Steps: Identify needs, Choose architecture

Proposal action plan project

Proof of Concept DWH

Steps: Identify needs, Choose architecture, Test DWH on existing infrastructure

Proposal action plan project

Proof of Concept reporting

Steps: Identify needs, Choose architecture, Test DWH on existing infrastructure, Investigate BI tool

Proposal TimeLine PoC



Data Warehouse future Project
