DWH / ODI

ETL Activity (Extract Transform Load)
ETL is used for Data Migration and Data Integration.
In a DWH project, 70-80% of the work is ETL activity.

ETL Tools (ETL Approach): Informatica PowerCenter, IBM DataStage, SAP BODS, Talend, Pentaho
ELT Tools (ELT Approach): ODI
(Diagram: DEPT and EMP source files (E, D) combined through a mapping into the targets emp_dept and t_emp.)
Dimensional Modeling
Create the DW database and design its tables.
Data modeling is the process of designing tables.
Normalisation -> E-R Modeling
De-normalisation -> Dimensional Modeling
In dimensional modeling (de-normalisation), tables are created in the form of dimensions and facts.
Dimensional modeling is preferred in DWH because most of the operations in a DWH are select operations.
(Example: exam statistics - number of applications 5 lakh, passed 30k, pass percentage 10 - analysed across dimensions such as year (1999, 2000, ...), gender, teacher, college, and subject/section, with marks as the measures.)
Dimensional Modeling

Dimension table
Stores descriptive information (textual attributes / dimensional information).

Fact table
Contains measurable information, known as measures.

Star Schema
In a star schema every dimension table is fully de-normalised; due to this, when we relate the dimensions and facts the diagram looks like a star.

Snowflake Schema
In a snowflake schema one or more dimensions are normalised partially (normalised to some extent).

Multi star / hybrid / galaxy
A group of star and snowflake schemas.
DWH Types
Passive - data is loaded to the DWH once daily.
Real time - up-to-date data is loaded to the DWH continuously.
Near real-time - data is loaded to the DWH hourly or at regular short intervals.
Type-1
In Type-1, any new record => insert; any changed record => update.
Due to this, no history is maintained in the dimension table.

Type-2
New record => insert. Changed record => also insert, and versionize the old records.
Maintains history.
In Type-2 tables, to understand and represent the data properly we create additional columns:

Surrogate Key
A surrogate key is the primary key of the dimension table. We create a dummy column in the dimension table and keep inserting unique serial numbers into it to act like a primary key.

Current flag
Indicates which record is the latest and which is older. We update this column with Y/N or 1/0 (Y or 1 = latest record, N or 0 = older records).

Effective-to date
The date on which a record / piece of information expired. For the latest record the effective-to date is always null, or a date far bigger in comparison (a far-future date).

Natural Key
A natural key is the primary key column in the OLTP / transactional system. Natural keys cannot be primary keys in a Type-2 dimension table because we insert duplicates of them while versioning.
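To make the Type-2 mechanics concrete, here is a minimal SQL sketch, assuming a dim_customer table with a surrogate key fed by a hypothetical sequence dim_customer_seq (names are illustrative, not from the notes):

-- expire the current version of the changed customer
UPDATE dim_customer
   SET current_flag = 'N',
       effective_to = SYSDATE
 WHERE cust_id = 'c1'
   AND current_flag = 'Y';

-- insert the new version with a fresh surrogate key
INSERT INTO dim_customer (cust_sk, cust_id, name, income, current_flag, effective_to)
VALUES (dim_customer_seq.NEXTVAL, 'c1', 'xxxx', 1900, 'Y', NULL);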
Type-3
Type-3 maintains history in separate columns; it maintains partial history.
example:

city_id  city_name    prev_city_name
ct1      hyd          (null)
ct2      new chennai  chennai
Types of Dimensions

Slowly Changing Dimension (SCD)
A dimension in which data changes very slowly.
(Diagram: one dimension Dim shared by fact tables fac1 and fac2.)

Fast changing / Rapidly growing / Monster dimension
A dimension in which data changes rapidly, e.g. exchange rates etc.

Role Playing (alias)
A single dimension referred to by a single fact table multiple times, (or) a single dimension joined with a fact multiple times using aliases.
example: the teachers dimension joins to the student fact twice - once as class teacher, once (via an alias) as temp class teacher.

student
student_id  name  class_teacher_id  temp_class_teacher_id
s1          john  t1                t2

teachers / teachers(alias)
teacher_id  name
t1          reema
t2          jinger
Mini Dimension (smaller or subset)
A mini dimension is a subset of a main/big dimension.
Whenever there are a few columns changing slowly and a few columns changing fast, we can create a separate table (mini dimension) to hold only the fast-changing columns.
Whenever a few columns are accessed very frequently by reporting and a few columns are accessed very rarely, we can create a separate table (mini dimension) to hold only the frequently accessed columns.
Bridge Dimension (joins two dimensions)
A dimension table which joins two other dimensions which are in a many-to-many relation.
example: students and courses are many-to-many; a bridge table resolves this into two one-to-many relations.

student                      course
student_id  name             course_id  course_name
s1          xyz              1          odi
s2          pqr              2          obiee
                             3          inf

bridge (student_id - course_id)
s1  1
s1  2
s1  3
s2  1
s2  2
Junk Dimension (consolidated codes)
A dimension table which stores a consolidation of smaller code descriptions. It also stores unknown code descriptions.
example:

source column  code  description
cust_status    A     Active
cust_status    I     Inactive
accnt_status   C     Closed
accnt_status   A     Re-Activated
accnt_status   B     Unknown
Audit Dimension (statistics)
An audit dimension stores statistical information like number of tables, rows, job run times, average run time, etc.

De-generate Dimension
A dimension column which doesn't hold any business meaning.
Fact tables (ETL / ETL Arch)

Semi Additive
A measure which can be summarised/aggregated across a few dimensions and cannot be summarised across the remaining dimensions.

Types of Facts

Detailed fact / transactional fact
Stores detailed (transaction-level) information.
example: order1, day1, 10 booked; order1, day2, 2 cancelled.

Normal / Regular fact
Stores some consolidated and calculated information.
example: order_num order1, booked_date Day1, cancelled_date Day2, net qty.

Aggregate fact
An aggregate table stores pre-calculated, summarised data (e.g. business performance across dimensions) to get the reports very quickly.

Fact Loading
Fact tables are most of the time insert-only; in some cases we update and insert.
Dimensions always have to be loaded first and then the facts.
Every dimension table primary key will be a foreign key in the fact table.
To ensure the primary/foreign key relationships from dimension to fact, we look up the dimensions to get the latest surrogate keys and insert them into the fact table.
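A minimal sketch of that lookup-and-insert step, with illustrative staging/fact/dimension names (stg_sales and fact_sales are assumptions, not from the notes):

INSERT INTO fact_sales (cust_sk, order_num, qty)
SELECT d.cust_sk, s.order_num, s.qty
  FROM stg_sales s
  JOIN dim_customer d
    ON d.cust_id = s.cust_id
   AND d.current_flag = 'Y';   -- pick only the latest version of each customer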
ETL Architecture / DWH Architecture
example flow for one customer record:

oltp.customer
cust_id  name  income
c1       xxxx  1289.88889

staging (cleansed/rounded; later the income changes to 1900)
cust_id  name  income
c1       xxxx  1289.889
c1       xxxx  1900

dim_customer (both versions kept, Type-2 style)
cust_id  name  income
c1       xxxx  1289.889
c1       xxxx  1900
ETL Architecture
Top-down approach / Bottom-up approach

Layers of ETL
oltp system -> Staging Area -> EDW -> Datamart

Reading the full data from the OLTP system is called Full Load or Initial Load.
Generally we do this only one time, when running the ETL for the first time.
On a daily basis we have to extract only the records changed since the last run;
this is called delta extraction, incremental extraction, or change data capture.
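A minimal sketch of a delta extraction, assuming the source carries a last_upd_dt column and runs are tracked in a hypothetical etl_run_log table (both names illustrative):

SELECT *
  FROM customer
 WHERE last_upd_dt > (SELECT MAX(run_dt) FROM etl_run_log);  -- only rows changed since the last run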
ODS (Operational Data Store)
ODS is integrated, volatile, current-valued data.
Whenever we need to consolidate all OLTP data to create operational reports, we can integrate the OLTP data and keep it in another database, which is known as the ODS.

Difference between ODS and DWH:
ods - integrated, subject oriented, volatile, current (one day)
dwh - integrated, subject oriented, non volatile, historical
Vendor - Tool
Informatica - PowerCenter
IBM - Infosphere DataStage
SAP - BODS
Abinitio - Abinitio
SAS - SAS
Talend - Talend Studio
Pentaho - Pentaho ETL
Microsoft - SSIS
All the above tools follow the ETL approach (Extract, Transform, Load).
Oracle - ODI (Oracle Data Integrator) follows the ELT approach, which can be customised.
ODI Architecture
Server Components and Client Components.

Repository
The repository manages metadata. Metadata is data about data: any programs, code, structures, or anything that explains data.
Repository types: Master and Work (Development / Execution).

ODI Client
ODI Studio (default super user: SUPERVISOR) with four navigators: Designer, Operator, Topology, Security.

Server
ODI Agent (Standalone, Colocated, J2EE), Oracle Enterprise Manager, WebLogic Server (web components).
History: ODI 10g -> 11g (integration with Fusion Middleware) -> 12c.

Installation
software files:
fmw_12.1.3.0.0_infrastructure.jar
fmw_12.1.3.0.0_odi.jar
cd C:\Java\jdk1.7.0_51\bin
ODI Studio location: E:\ODI12c\Oracle\Middleware\Oracle_Home\odi\studio

Create Repository
There are two ways to create repositories:
1) Using ODI Studio - this used to be the old method before ODI 12c.
2) Using RCU from <<E:\ODI12c\Oracle\Middleware\Oracle_Home>>\oracle_common\bin:
   open a command prompt in Admin mode
   cd E:\ODI12c\Oracle\Middleware\Oracle_Home\oracle_common\bin
   rcu
Note: set the environment variable JAVA_HOME to the Java bin directory
(My Computer -> Properties -> Advanced settings -> Advanced tab -> Environment variables), then click Next and OK.
JDBC URL format: jdbc:oracle:thin:@<host>:<port>:<sid>

(Analogy used later for scenarios: like Java, where a.java source code with import/main() is compiled to a.class and executed with "java a".)
(First exercise: read data from EMP_SRC78 and load it to EMP_T78, both on Oracle, localhost, schema scott.)
Topology
1) Configure Topology
Configure all the database connections from which we need to extract or load data.

Physical Architecture
Create Data Server
Here we provide the database drivers, database name, and user/password to connect to the database.
Create Physical Schema
Here we select the schema name from which we need to access tables; these schemas can be used for loading or reading tables.
Note: any object which we create in Physical Architecture is called a physical object.
In a physical schema we need to choose two schema names:
schema - the actual schema from where we need to extract/load tables (e.g. SCOTT).
work schema - the schema where work tables / temp tables are created.
For now we can use the same name for schema and work schema.

Logical Architecture
Any object created in Logical Architecture is called a logical object.
Here we need to create logical schemas and map every logical schema to a physical schema through an intermediate object called a context.
By default in ODI there is one context, named Global.
We can map one logical schema to more than one physical schema using different contexts.
example: logical schema Logical1 -> context -> physical schema (SCOTT) on a data server; the same Logical1 through another context -> a different physical schema.
Model
Create a data model under Models and map it to a logical schema created above.
Modeling is the process of designing tables.

Reverse Engineering process
Importing the table structures from the database into ODI.
Note: when we reverse-engineer tables, only metadata (the table structure) is imported, not the data.

Hierarchy: Technology (Oracle) -> Data Server (database name, user/pwd, e.g. system) -> Physical Schema (scott, sh, bisample, system) -> tables (emp, dept).
First mapping: load emp_src to emp_tgt (both in scott).
Because ODI works ELT-style, when the source and target are in the same database the KM generates a single statement (14 rows inserted, 0 errors):

insert into emp_tgt
select * from emp_src

When the sources are files (an emp file and a dept file), an LKM first loads each file into a work table (c$_emp, c$_dept) in the staging area, and the rest of the mapping runs from those work tables.
Other knowledge modules:
RKM - Reverse Engineer KM
SKM - Services KM
JKM - Journal KM

(Diagram: in a mapping loading emp_dept on an Oracle database, the LKMs fill the work tables c$_emp and c$_dept, an IKM integrates them into the target emp_dept, and a CKM validates constraints (e.g. the pk on empno) and moves invalid rows into the error table e$_emp_dept. The c$/i$/e$ work tables / temp tables live in the staging area - this is the ELT style.)
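A minimal sketch of the integration step once the work tables are loaded, with assumed column names (the real KM-generated SQL is more elaborate):

INSERT INTO emp_dept (empno, ename, dname, loc)
SELECT e.empno, e.ename, d.dname, d.loc
  FROM c$_emp e
  JOIN c$_dept d
    ON d.deptno = e.deptno;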
Mapping
Every mapping has the below sections:
logical diagram - here we design the actual code using sources, targets, and components.
physical diagram - here we choose the knowledge modules which need to be used.
mapping execution.

Components
Components are newly introduced in mappings to perform data transformations:
sort, filter, split (multiple filters), lookup, join, aggregate, expression, etc.

Knowledge Module
A KM is a ready code template which is used to generate code while running a mapping.
Below are the types of KMs:
LKM, IKM, CKM, JKM (journalizing/CDC process, e.g. with Oracle GoldenGate), SKM (Service Knowledge Module - processes data based on web services), RKM.
Filter
The Filter component filters the data based on the given filter condition.
We can write any valid ANSI SQL WHERE-condition statement as the filter; since ODI works ELT-style, it supports SQL WHERE conditions as filters.
example
DEPTNO=10
DEPTNO=10 and SAL>0
Additionally we can also write subquery filters.
example:
DEPTNO in (select distinct DEPTNO from DEPT)
SAL in (select max(sal) from EMP)
From ODI 12.1.3 there is a new component introduced to create a subquery filter.
Sort
The Sort component sorts the data based on the ORDER BY clause provided in its condition.
example
DEPTNO DESC, SAL ASC
DEPTNO DESC, SAL NULLS FIRST
ODI knowledge modules generate ORDER BY clauses in the SQL statements wherever a sort component is used.
Split
The Split component routes one input to multiple targets using multiple filter conditions.
example: emp_src is split as
dept=10 -> emp_t10
dept=20 -> emp_t20
deptno not in (10,20) -> emp_toth
With IKM Oracle Multi Table Insert on the targets, one integration sets DEFINE_QUERY=true, the intermediate ones set IS_TARGET=true, and the last one sets EXECUTE=true, so all targets are loaded by a single multi-table insert statement; a sketch of that statement follows below.
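A sketch of the multi-table insert such a mapping produces, simplified to a few assumed columns (the KM-generated statement lists all of them):

INSERT ALL
  WHEN deptno = 10 THEN INTO emp_t10 (empno, ename, deptno) VALUES (empno, ename, deptno)
  WHEN deptno = 20 THEN INTO emp_t20 (empno, ename, deptno) VALUES (empno, ename, deptno)
  WHEN deptno NOT IN (10, 20) THEN INTO emp_toth (empno, ename, deptno) VALUES (empno, ename, deptno)
SELECT empno, ename, deptno FROM emp_src;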
Distinct
(example data: EMP_SRC78 contains duplicates - empno 7369 appears as SMITH, APPLE, and APPLE1 (column duplicates), and empno 7499 ALLEN appears three times identically (row duplicates).)

To identify the duplicates we can use ROW_NUMBER:
SELECT * FROM
(
SELECT E.*
,ROW_NUMBER() OVER(PARTITION BY EMPNO ORDER BY SAL DESC) SNO
--,RANK() OVER(ORDER BY SAL DESC) RK
--,DENSE_RANK() OVER(PARTITION BY DEPTNO ORDER BY SAL DESC) DRK
FROM EMP_SRC78 E
)T
WHERE SNO>1
Types of Components
Projected Components
Projected components add the column structure (metadata) from the left-side link.
example: expression, distinct, set, etc.

Distinct
The Distinct component eliminates the row duplicates of its input.
It adds a DISTINCT clause on top of the input query and eliminates the row duplicates.
Note: distinct will not eliminate column duplicates.
In general we can see two types of duplicates:

row duplicates - all the column values are identical
example
e1  a   100
e1  a   100
e1  a   100

column duplicates - one column value is the same, but the other column values are different
example
e1  a   100
e1  a1  200
e1  a2  300

To eliminate both column and row duplicates we can use the queries below:
1)
SELECT * FROM EMP_SRC78
WHERE EMP_SRC78.ROWID IN (SELECT MAX(ROWID) FROM scott.EMP_SRC78 GROUP BY EMPNO)
(the GROUP BY is on the key column - presumably EMPNO here - keeping one row per key)
2) the ROW_NUMBER query shown above, deleting the rows WHERE SNO>1.
Other projected components: Aggregate, Expression, Set.

Joins / Lookup
Join types: natural join, inner join, left outer, right outer, full outer, cross join.
(Worked example data: EMP_SRC78 with 15 rows - including 1212 APPLE in deptno 99, which has no matching dept - joined to DEPT_SRC78; the inner join returns only the 14 matched rows, while the outer joins also return the unmatched rows from one or both sides.)
inner join
SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E,DEPT_SRC78 D
WHERE E.DEPTNO=D.DEPTNO

SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E INNER JOIN DEPT_SRC78 D
ON ( E.DEPTNO=D.DEPTNO)

left outer join (all rows of EMP_SRC78)
SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E,DEPT_SRC78 D
WHERE E.DEPTNO=D.DEPTNO(+)
(equivalently: WHERE D.DEPTNO(+)=E.DEPTNO)

SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E LEFT OUTER JOIN DEPT_SRC78 D
ON ( E.DEPTNO=D.DEPTNO)

right outer join (all rows of DEPT_SRC78)
SELECT E.*,D.DEPTNO,DNAME,LOC
FROM EMP_SRC78 E,DEPT_SRC78 D
WHERE E.DEPTNO(+)=D.DEPTNO

SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E RIGHT OUTER JOIN DEPT_SRC78 D
ON ( E.DEPTNO=D.DEPTNO)

full outer join
(Oracle does not allow (+) on both sides - WHERE E.DEPTNO(+)=D.DEPTNO(+) is invalid - so in Oracle syntax we union the two outer joins:)
SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E,DEPT_SRC78 D
WHERE E.DEPTNO(+)=D.DEPTNO
UNION
SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E,DEPT_SRC78 D
WHERE E.DEPTNO=D.DEPTNO(+)

SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E FULL OUTER JOIN DEPT_SRC78 D
ON ( E.DEPTNO=D.DEPTNO)

cross join (no join condition)
SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E,DEPT_SRC78 D
Joins
The Join component joins two or more input tables; these tables can be in the same database or in different databases.
Below are the types of joins:
inner join - matched data of both tables.
cross join - every row of one table combined with every row of the other; we can use a cross join when there is a single record in one of the tables.
natural join - joins two tables based on the primary and foreign keys without giving a join condition; supported only using ANSI SQL.
We can choose whether the join component generates ANSI SQL syntax or database-specific syntax.

ANSI SQL
ANSI SQLs are standard SQLs, certified by ANSI, which are supported in any database.
example: SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E FULL OUTER JOIN DEPT_SRC78 D
ON ( E.DEPTNO=D.DEPTNO)

Join Order
We can have one or more join components in a single mapping; we can choose a join order for every join component to specify the order in which the SQLs are generated.

Note:
During all types of joins, if any record of the first table matches multiple records in the second table, then the first-table record is repeated against all the matching records of the second table.
example

emp                             job
emp_id  ename  job_id           job_id  desc
e1      xxx    j1               j1      manager
                                j1      manager

result: e1 xxx j1 manager appears twice - once for each matching job row.
Lookup
The Lookup component is similar to a join but has more options.
A lookup takes two inputs to join the tables:
driving table
lookup table
Unmatched policy - when no matched record is found in the lookup table for a record of the driving table, we can choose one of the below options:
drop records - equal to an inner join.
take default values - equal to a left outer join, with default values for the unmatched columns.
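A minimal sketch of the "take default values" policy as its equivalent SQL, with an assumed default of 'UNKNOWN' for the unmatched lookup column:

SELECT e.empno,
       e.ename,
       NVL(d.dname, 'UNKNOWN') AS dname   -- default value when the lookup finds no match
  FROM emp_src e
  LEFT OUTER JOIN dept_src d
    ON d.deptno = e.deptno;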
Files
Files can be structured, semi-structured, or unstructured.
Structured files: fixed width or delimited.
example (delimited):
10|ACCOUNTING|NEW YORK
20|RESEARCH|DALLAS
30|SALES|CHICAGO
40|OPERATIONS|BOSTON
File properties to configure: delimiter character, optional quotes, header, footer, and positions (for fixed width).
The LKM extracts data from the source technology to the ODI staging / ODI work schema on the target technology; that is the reason an LKM is required only when the source and target technologies are different.
LKM naming follows the convention: LKM <source technology> To <target technology>
example: LKM File To Oracle

For every RDBMS technology we can use two types of LKMs:
one specific to the technology, using the advanced features of that technology;
a second with SQL technology, where SQL means any ANSI-SQL-supported environment.
The SQL-technology KMs can be used for any RDBMS, however they generate only ANSI SQL syntax, whereas technology-specific knowledge modules (Oracle, Teradata, ...) use the advanced features of that database.

LKM File to SQL
This creates a work table with the prefix C$ and loads data from the file to the C$ table using Jython; it is not recommended for large volumes.
steps: drop the work table C$_0DEPT_FILE79; create table SCOTT.C$_0DEPT_FILE79; insert the file data into the C$ table using Jython programs and ANSI SQL.

LKM File to Oracle (external table)
An external table is a database concept where the data of the table is organised externally. Since the external table is mapped to a file sitting on top of the OS platform, we can directly read the data from the file using SQL syntax and load it to the final target.

LKM File to Oracle (SQL*Loader)
This KM uses the Oracle SQL*Loader utility to read from the file and load to the work table.
It supports only date, decimal and character data types; for other data types we need to choose other knowledge modules.
The KM creates a control file to run SQL*Loader and runs it through a Java program.
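To make the external-table idea concrete, here is a minimal sketch of an Oracle external table over the dept file; the directory object src_dir and file name dept.txt are assumptions for illustration:

CREATE TABLE dept_ext (
  deptno NUMBER,
  dname  VARCHAR2(14),
  loc    VARCHAR2(13)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY src_dir            -- assumed database directory object
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY '|'
  )
  LOCATION ('dept.txt')
);

-- the file can then be read and loaded with plain SQL:
INSERT INTO dept_tgt SELECT * FROM dept_ext;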
Loading multiple files to a single target:
Option 1) merge the files at OS level (cat *.txt > final.txt) and load the combined file.
Option 2) drag all sources into the mapping, combine them using a Set component (UNION ALL), and load to the target.
Option 3) create a variable to read the multiple file names one after another and load them to the target in a loop (using a package).
Sequences

Native Sequence
A native sequence is a database sequence: it starts with the number given in START WITH and increments by the number given in INCREMENT BY.
create sequence test start with 1 increment by 1
To call the sequence we need to use the NEXTVAL keyword in the database.
In ODI we use the syntax #ProjectName.seqName_NEXTVAL, e.g. :PROJDEV78.seqNative_NEXTVAL (bind form).
The CURRVAL keyword gives the current value of the sequence.

Standard Sequence
A standard sequence increments the number when it is triggered on the ODI Agent.
If a mapping runs in the ELT method, where data moves from source to target directly, there will be only a one-time increment of the number; if it is used in an ETL-style method, then a new number is generated for every record.
A standard sequence always starts with 1; we cannot specify the starting number.

Specific Sequence
CREATE TABLE SPEC_SEQ78(PROCESS VARCHAR2(20),SN NUMBER)
A specific sequence uses the number stored in the above table, located with the given information:
table name: spec_seq78
column name: sn
where condition: process='HR'
While incrementing the numbers it follows the standard-sequence behaviour (ELT vs ETL).
When a new number is incremented, ODI updates the number in the above table.

Project Sequence
Any sequence created within a project is called a project sequence; these sequences can be used only within that project.
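A short usage sketch of a native sequence feeding a surrogate key during a load (table names assumed for illustration):

CREATE SEQUENCE dim_customer_seq START WITH 1 INCREMENT BY 1;

INSERT INTO dim_customer (cust_sk, cust_id, name)
SELECT dim_customer_seq.NEXTVAL, cust_id, name
  FROM stg_customer;   -- NEXTVAL is evaluated once per inserted row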
Variable
A variable in ODI can hold a value, which can be used further in mappings.
To assign a value to a variable, we write a database SQL in the ODI refresh command; the output of the SQL is assigned to the variable.
A variable can hold only one value at a time (it cannot hold an array of values).
We can refresh, set, increment, and evaluate variables through packages.
We can use variables to:
implement incremental/delta extraction;
read multiple files and load them to a single target;
hold database names, user names, passwords, and any other values.
example: a source filter deptno=#PROJDEV78.varDEPTNO in a src -> tgt mapping.
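A minimal sketch of a variable refresh query for delta extraction; varLASTRUN is an illustrative variable name, not from the notes:

-- refresh command of the variable PROJDEV78.varLASTRUN:
SELECT MAX(last_upd_dt) FROM emp_src

-- later used as a mapping filter:
-- last_upd_dt > #PROJDEV78.varLASTRUN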
IKM (Integration Knowledge Modules)

IKM SQL Insert
Generates ANSI SQL syntax to insert data into the target.
It will not load data in bulk mode, so processing is slow; it can be used for any ANSI-certified database.

IKM Oracle Insert
Similar to IKM SQL Insert, however it supports hints for the insert and select.
example: append, parallel

IKM SQL Incremental Update
INSERT and UPDATE; the commit is configurable; supports data validation (flow control).
Incremental update performs insert and update operations at the same time:
all new records are inserted, and existing changed records are updated.
During the mapping execution the IKM creates one temporary table named I$,
then updates a flag in the I$ table to identify and indicate the new records and the changed records,
then inserts the new records into the target,
and updates only the changed records on the target.

IKM Oracle Incremental Update
The Oracle-specific variant of the incremental update, with Oracle SELECT hints and control options.

IKM SQL Merge
Performs the incremental update operations (new record insert / existing record update) using the SQL MERGE command.
The MERGE command inserts the new records and updates all existing records (including changed and unchanged records);
this is not recommended when there are many unchanged records in the source.

IKM Oracle Merge
Similar to SQL Merge, however we can add Oracle hints to improve the performance;
again not recommended when there are many unchanged records in the source.
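A sketch of the MERGE such a KM generates, reduced to a few assumed columns:

MERGE INTO emp_tgt t
USING emp_src s
   ON (t.empno = s.empno)
 WHEN MATCHED THEN
   UPDATE SET t.ename = s.ename,
              t.sal   = s.sal          -- updates every matched row, changed or not
 WHEN NOT MATCHED THEN
   INSERT (empno, ename, sal)
   VALUES (s.empno, s.ename, s.sal);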
(Worked example, condensed: the source emp_src (~1m rows) is compared with the target; the I$ flow table flags every row I (new - e.g. empnos 1111 and 9999), U (changed - e.g. 7782 CLARK with a new salary) or N (no change); only the I rows are inserted and the U rows updated on the target. In the SCD2 variant, the changed record 7782 CLARK -> APPLE is instead inserted as a new version with the next surrogate key while the old version is expired.)
For SCD2, select "Add row on change" for all columns whose history must be kept.
In case for some columns we need to overwrite the change (no history), choose "Overwrite on change".
If we choose all columns as "Add row on change", it is a complete SCD2.
If we choose some columns as "Add row on change" and some columns as "Overwrite on change", it is a hybrid.
(The SCD2 target then carries, per version, the current flag (1/0) and the effective dates - e.g. old versions starting 1-Jan-00 and expiring 17-Apr-15 or 21-Apr-15 when the new versions begin.)
Check KM (CKM)
The CKM validates the flow data against the constraints defined on the target (primary key, conditions, ...).
flow control: the IKM first loads the flow into I$_EMP_T78 (insert into SCOTT.I$_EMP_T78 ...); the CKM then checks the I$ rows, deletes the invalid ones from the flow, inserts them into the error table E$_EMP_T78, and records the check statistics in SNP_CHECK_TAB. Only the valid rows continue to the target.
recycle errors: rows parked in E$ by previous runs are re-checked on the next run and loaded if they have become valid.
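A minimal sketch of what flow control does for one foreign-key check; the E$ column list is assumed to mirror I$ plus an error-description column:

-- move FK-violating rows to the error table
INSERT INTO e$_emp_t78 (empno, ename, deptno, err_cond)
SELECT i.empno, i.ename, i.deptno, 'FK emp -> dept violated'
  FROM i$_emp_t78 i
 WHERE NOT EXISTS (SELECT 1 FROM dept_src78 d WHERE d.deptno = i.deptno);

-- and remove them from the flow so only valid rows reach the target
DELETE FROM i$_emp_t78 i
 WHERE NOT EXISTS (SELECT 1 FROM dept_src78 d WHERE d.deptno = i.deptno);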
(Example run: rows with invalid or unknown deptno values (0, 40, 55, 60, 78, 80, 99) and duplicate empnos fail the checks and land in E$; the remaining rows load to emp_t; SNP_CHECK_TAB holds the counts.)
CDC (Change Data Capture)
example: a customer source of 1m rows in which only 1k records changed since the last run (e.g. c2 abc moved to hyd on the 23rd, extracted on the 24th); we want to extract only those changed records.
Ways to capture the changes:
1) a last_upd_dt column on the source plus an ODI variable (extract rows with last_upd_dt greater than the last run date);
2) log-based CDC software such as Informatica PowerExchange, IBM CDC;
3) ODI journalizing (JKM): ODI creates journal (J$) tables that record every change (flags I/D) per subscriber.

(Journal example: the J$ table stores, per subscriber - here SUB78 and the default SUNOPSIS - the changed empnos (7499, 7521, 7654, 7698, 7844, 7900) with the journal flag I, a consumed flag 0/1, and the journalizing date 24-Apr-15.)
CDC / journalizing modes:
simple - each table is journalized independently.
consistent set - related tables (e.g. emp and dept) are journalized as one consistent set, so changes to the parent and child tables (and the joined combination feeding SCD-1/2 loads) stay in sync.
example: emp (empid e1, ename abc, deptno 10) joined with dept (deptno 10, loc hyd) gives the changed row e1/abc/10/hyd for the downstream SCD-1/2 load.
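A sketch of consuming journalized rows, assuming the standard J$ layout ODI creates (JRN_SUBSCRIBER, JRN_CONSUMED, JRN_DATE columns plus the source key):

SELECT e.*
  FROM emp e
 WHERE e.empid IN (SELECT j.empid
                     FROM j$emp j
                    WHERE j.jrn_subscriber = 'SUB78'
                      AND j.jrn_consumed   = '0');   -- only changes not yet consumed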
Procedure
A procedure is an ODI object containing a sequence of tasks (task1, task2, ...), each with a command on source and/or a command on target, executed through the agent.

Isolation levels: uncommitted, committed, repeatable, serializable.

Transactions: each task can be attached to a named transaction with commit or no-commit, so several tasks share one session and commit together.
example:
insert into emp  (5 rows)   transaction 0   no commit
insert into emp  (10 rows)  transaction 0   no commit
insert into emp  (3 rows)   transaction 0   commit
insert into dept (2 rows)   transaction 1   no commit
insert into dept (4 rows)   transaction 1   commit
The commit on the last task of a transaction commits all the previous uncommitted work on that transaction.
Scenario
A scenario is the compiled, executable form of a mapping, procedure, package, or variable, with a version number (001, 002, ...) - like a.class compiled from a.java and executed with "java a".
In real time we always generate scenarios and execute them, so that any accidental change on a mapping will not impact the tested/certified code.
If the work repository type is Development, we can create source code (mappings, procedures, ...) and also keep scenarios; an Execution repository keeps only scenarios.

Versioning
Versioning creates copies of the source in the development repository.
We can compare the code from one version to another, and also restore an older version of the code.

Packages
We can add source-code components like mappings, procedures, variables, and sequences.
We can add any scenario created for a mapping, procedure, variable, sequence, or package.
We cannot add a package to another package; but we can create a scenario for the package and then add the package scenario to another package.
When we add any scenario, by default it is configured for synchronous mode, which means the steps all run serially; by changing them to asynchronous we can run them in parallel.
Every package must have a starting step (indicated with a green arrow).
In addition to placing ODI-developed components in a package, we can add ODI built-in tools.
ODI Tools
OdiOutFile (1) - writes output content to a file, like: echo "empno,ename,loc" > test.txt
OdiFileCopy (2) - copies files within one server (can copy sub-directories too), like: cp a.txt b.txt
OdiFileDelete (3) - deletes files from a directory, like: rm *.txt; it can also delete only the files modified between time x and time y (in unix this requires the find command).
OdiFileMove (4) - moves a file from one directory to another, like: mv /a/a.txt /b/a.txt; if the file is moved within the same directory it works like a rename: mv a.txt b.txt
OdiMkDir (5) - creates a directory, like: mkdir test
OdiFileAppend (6) - concatenates files into one, like: cat file1.txt file2.txt file3.txt > finalfile.txt, or cat f*.txt > finalfile.txt
OdiSqlUnload (7) - unloads the result of a SQL query to a file.

File transfer (through the agent):
unix remote <-> unix local: scp
windows <-> unix, windows <-> windows, unix <-> windows: ftp, using get/put for single files and mget/mput for multiple files.

OdiDeleteScen - deletes a scenario.
OdiExportAllScen - exports one or all scenarios; we can use this to take a regular backup of all scenarios, or to export all scenarios to migrate code to another repository.
OdiExportEnvironmentInformation - exports the ODI environment information to a csv file.
OdiExportLog - exports the execution logs.
Agents and Load plans

Agents
local (No Agent) - the built-in agent of ODI Studio.

Agent creation
The RCU repository must be configured first.
cd E:\ODI12c\Oracle\Middleware\Oracle_Home\oracle_common\common\bin
(run the domain configuration from here)

start the Agent
start nodemanager:
cd E:\ODI12c\Oracle\Middleware\Oracle_Home\user_projects\domains\odi12_domain\bin
startnodemanager
start WebLogic:
cd E:\ODI12c\Oracle\Middleware\Oracle_Home\user_projects\domains\odi12_domain\bin
startWebLogic
verify the below link:
http://localhost:7001/console/
then create an Agent in Topology.

(Diagram: ODI Studio with its LocalAgent and the RCU repository; Agent1/Agent2 deployed on WebLogic clusters cluster1/cluster2 under the node manager; WebLogic console at localhost:7001/console, Enterprise Manager at localhost:7001/em.)
Topics: Load plan, debug, solutions, smart imp/exp, context.

Physical designs
In each physical design of a mapping we can configure different knowledge modules and different configurations.
example:
in the initial-load design select Oracle insert with truncate;
in the daily-load design select incremental update without truncate.

Context
A context helps one single logical schema point to multiple physical schemas.
example: one logical schema can point to three physical schemas through different contexts -
one for the development database, using the dev context;
one for the testing database, using the test context;
one for the production database, using the prod context.
While executing the mapping we can choose the appropriate context to run on a specific environment.
(Diagram: model -> logical schema -> dev/test/prod contexts -> dev/test/prod physical schemas; the mapping itself stays unchanged.)
Debug
Debug helps troubleshoot the mapping during execution.
When we run the mapping in a debug session, it first gives the blueprint of the complete mapping; here we can add a breakpoint at any step to see, check, and troubleshoot the code.

Import / Export
We can perform import and export in two ways:
regular import/export - only the selected objects/components are exported and imported; choose this when only certain selected components are to be exported.
smart import/export - also exports any dependent objects required to run the mapping/package; when exporting a mapping for the first time we can choose the smart option so that all the dependent objects are exported.
We use export/import to move the code from the development environment to test and from test to production; in this case we may have to export all the code from the development repository to the testing repository and from test to the production repository.
When we export the code, all the components are saved as xml files.
Import modes: duplicate, synonym mode insert, synonym mode update, synonym mode insert/update.
Solutions
A solution is a group of scenarios which needs to be migrated from one repository to another.

Loadplans
A load plan is a collection of scenarios with a planned execution path and dependencies.
Load plans help execute scenarios in serial and in parallel.
We can schedule load plans to run automatically at given times and intervals.
Additionally we can set exceptions and advanced dependency calculation.
Load plan steps: root step, run-scenario step, serial, parallel, case/when/else, exception steps.

Restart Type
A restart can take some time when it executes all the scenarios, including the ones which were already successful; in this case choose the restart type "execute all children".
Sometimes we need to start only from the last failure point; in this case choose to run from the failed tasks.

Schedule
Load plans can be scheduled to run on their own as per the set timing.
11g (differences from 12c)
filter - drag a column to the outside space.
join - link two tables and select the join option.
lookup - link two tables and select the lookup option; the lookup supports only two options when there are multiple matches:
a) error on multiple match
b) join with all records
derivations - all the derivations need to be written on the target directly; all aggregate calculations are kept on the target.
complex logic - in case any logic is required after the target, we place a temp/dummy target in the first interface (the 11g name for a mapping) and call it in another interface. An interface which contains a temp target is called a yellow interface.