
Informatica PowerMart /

PowerCenter 7 Basics
Education Services

PC6B-20030512

Informatica Corporation, 2003. All rights reserved.

Course Objectives
At the end of this course you will:
Understand how to use all major PowerCenter 7
components
Be able to perform basic Repository administration tasks
Be able to build basic ETL Mappings and Mapplets
Be able to create, run and monitor Workflows
Understand available options for loading target data
Be able to troubleshoot most problems

Extract, Transform, and Load


Operational Systems (RDBMS, Mainframe, Other):
Transaction-level data
Current
Normalized or de-normalized data
Optimized for transaction response time

Decision Support Data Warehouse:
Aggregated data
Historical data
De-normalized data

ETL: Extract, Transform, Load
Transform: Cleanse Data, Consolidate Data, Apply Business Rules, Aggregate Data, De-normalize

PowerCenter Architecture

PowerCenter 7 Architecture
The PowerCenter Server reads from heterogeneous Sources and writes to heterogeneous Targets through native connections
The PowerCenter Client tools (Repository Manager, Designer, Workflow Manager, Workflow Monitor, Repository Server Administrative Console) and the PowerCenter Server communicate with the Repository Server over TCP/IP
The Repository Server uses a Repository Agent, which connects natively to the Repository database

Not Shown: Client ODBC Connections for Source and Target metadata

PowerCenter 7 Components
PowerCenter Repository
PowerCenter Repository Server
PowerCenter Client
Designer
Repository Manager
Repository Server Administration Console
Workflow Manager
Workflow Monitor
PowerCenter Server
External Components
Sources
Targets
6

Repository Topics
By the end of this section you will be familiar with:
The purpose of the Repository Server and Agent
The Repository Server Administration Console GUI
interface
The Repository Manager GUI interface
Repository maintenance operations
Security and privileges
Object sharing, searching and locking
Metadata Extensions
7

Repository Server
Each Repository has an independent architecture for the
management of the physical Repository tables
Components: one Repository Server, and a Repository
Agent for each Repository
Client overhead for Repository management is greatly
reduced by the Repository Server

(Diagram: the Repository Server Administration Console and the Repository Manager connect to the Repository Server, which manages a Repository Agent for each Repository)
8

Repository Server Features


Manages connections to the Repository from client
applications
Can manage multiple Repositories on different
machines on a network
Uses one Repository Agent process to insert, update
and fetch objects from the Repository database tables,
for each Repository it manages
Maintains object consistency by controlling object
locking

The Repository Server runs on the same system as the Repository Agent

Repository Server Administration Console


Use the Repository Server Administration Console to administer Repository
Servers and Repositories through the Repository Server. The following tasks can be performed:
Add, edit and remove Repository Configurations
Export and import Repository Configurations
Create a Repository
Promote a local Repository to a Global Repository
Copy a Repository
Delete a Repository from the database
Back up and restore a Repository
Start, stop, enable and disable a Repository
View Repository connections and locks
Close Repository connections
Upgrade a Repository
10

Repository Server Administration Console

Console elements:
Console Tree with Information Nodes
HTML View, with hypertext links to Repository maintenance tasks
11

Repository Management
Perform all Repository maintenance tasks through the
Repository Server from the Repository Server
Administration Console
Create the Repository Configuration
Select a Repository Configuration and perform
maintenance tasks:
Create, Delete, Backup, Restore, Copy from, Upgrade,
Register, Un-Register, Disable, Notify Users, Propagate,
Export Connection, Make Global
12

Repository Manager
Use the Repository Manager to navigate through multiple folders and
Repositories, and perform the following tasks:
Manage the Repository
Launch the Repository Server Administration Console for this purpose
Implement Repository security
Manage users and user groups
Perform folder functions
Create, edit, copy and delete folders
View metadata
Analyze Source, Target, Mapping and Shortcut dependencies
13

Repository Manager Interface

Navigator
Window

Main Window

Dependency Window
Output Window

14

Users, Groups and Repository Privileges


Steps:
1. Create groups
2. Create users
3. Assign users to groups
4. Assign privileges to groups
5. Assign additional privileges to users (optional)
15

Managing Privileges
Check box assignment of privileges

16

Folder Permissions
Assign one user as the folder owner for first-tier
permissions
Select one of the owner's groups for second-tier
permissions
All users and groups in the Repository will be
assigned the third-tier permissions
17

Object Locking
Object Locks preserve Repository integrity
Use the Edit menu for Viewing Locks and Unlocking
Objects

18

Object Searching

(Menu: Analyze | Search)

Keyword search
Limited to keywords previously defined in the Repository
(via Warehouse Designer)

Search all
Filter and search objects
19

Object Sharing
Reuse existing objects
Enforces consistency
Decreases development time
Share objects by using copies and shortcuts
COPY:
Copy the object to another folder
Changes to the original object are not captured
Duplicates space
Copy from a shared or unshared folder

SHORTCUT:
Link to an object in another folder
Dynamically reflects changes to the original object
Preserves space
Created from a shared folder

Required security settings for sharing objects:
Repository Privilege: Use Designer
Originating Folder Permission: Read
Destination Folder Permissions: Read/Write
20

Adding Metadata Extensions


Allows developers and partners to extend the metadata
stored in the Repository
Accommodates the following metadata types:
Vendor-defined - third-party application vendor-created
metadata lists
For example, applications such as Ariba or PowerConnect for
Siebel can add information such as contacts, version, etc.
User-defined - PowerCenter/PowerMart users can define
and create their own metadata
Requires the Administer Repository or Super User
Repository privilege
21

Sample Metadata Extensions

Sample user-defined metadata, e.g. contact information, business user
Reusable Metadata Extensions can also be created in the Repository Manager
22

Design Process

1. Create Source definition(s)


2. Create Target definition(s)
3. Create a Mapping
4. Create a Session Task
5. Create a Workflow from Task components
6. Run the Workflow
7. Monitor the Workflow and verify the results
23

Source Object Definitions


By the end of this section you will:
Be familiar with the Designer GUI interface
Be familiar with Source Types
Be able to create Source Definitions
Understand Source Definition properties
Be able to use the Data Preview option

24

Source Analyzer

Designer Tools

Analyzer Window
Navigation
Window

25

Methods of Analyzing Sources

Import from Database
Import from File
Import from COBOL File
Import from XML File
Create manually

Relational, Flat file, XML file and COBOL file sources are analyzed
in the Source Analyzer; the definitions are stored in the Repository
26

Analyzing Relational Sources


The Source Analyzer imports a relational source definition (Table,
View or Synonym) via ODBC; the definition is passed over TCP/IP to
the Repository Server, and the Repository Agent stores it natively
in the Repository
27

Analyzing Relational Sources


Editing Source Definition Properties

28

Analyzing Flat File Sources


The Source Analyzer reads the flat file (fixed-width or delimited)
from a mapped drive, NFS mount or local directory; the definition is
passed over TCP/IP to the Repository Server, and the Repository
Agent stores it natively in the Repository
29

Flat File Wizard


Three-step
wizard
Columns can
be renamed
within wizard
Text, Numeric
and Datetime
datatypes are
supported
Wizard
guesses
datatype
30

XML Source Analysis


The Source Analyzer reads the .DTD file from a mapped drive, NFS
mount or local directory; the definition is passed over TCP/IP to
the Repository Server, and the Repository Agent stores it natively
in the Repository

In addition to the DTD file, an XML Schema or XML file
can be used as a Source Definition
31

Analyzing VSAM Sources


The Source Analyzer reads the .CBL file from a mapped drive, NFS
mount or local directory; the definition is passed over TCP/IP to
the Repository Server, and the Repository Agent stores it natively
in the Repository

Supported numeric storage options:
COMP, COMP-3, COMP-6
32

VSAM Source Properties

33

Target Object Definitions


By the end of this section you will:
Be familiar with Target Definition types
Know the supported methods of creating Target
Definitions
Understand individual Target Definition properties

34

Creating Target Definitions


Methods of creating Target Definitions
Import from Database
Import from an XML file
Manual Creation
Automatic Creation

35

Automatic Target Creation


Drag-and-drop a Source Definition into the
Warehouse Designer workspace
36

Import Definition from Database


Can reverse-engineer existing object definitions (Table, View or
Synonym) from a database system catalog or data dictionary
The Warehouse Designer connects to the database via ODBC; the
definition is passed over TCP/IP to the Repository Server, and the
Repository Agent stores it natively in the Repository
37

Manual Target Creation


1. Create empty definition

2. Add desired columns

3. Finished target definition

ALT-F can also be used to create a new column


38

Target Definition Properties

39

Target Definition Properties

40

Creating Physical Tables

LOGICAL: Repository target table definitions
PHYSICAL: target database tables
Execute SQL via the Designer to create the physical tables
from the Repository definitions
41

Creating Physical Tables


Create tables that do not already exist in target database
Connect - connect to the target database
Generate SQL file - create DDL in a script file
Edit SQL file - modify DDL script as needed
Execute SQL file - create physical tables in target database

Use Preview Data to verify the results (right-mouse-click on the object)
42

Transformation Concepts
By the end of this section you will be familiar with:
Transformation types and views
Transformation calculation error treatment
Null data treatment
Informatica data types
Expression transformation
Expression Editor
Informatica Functions
Expression validation
43

Transformation Types
Informatica PowerCenter 7 provides 23 objects for data
transformation
Aggregator: performs aggregate calculations
Application Source Qualifier: reads Application object sources such as ERP
Custom: calls a procedure in a shared library or DLL


Expression: performs row-level calculations
External Procedure (TX): calls compiled code for each row
Filter: drops rows conditionally
Joiner: joins heterogeneous sources
Lookup: looks up values and passes them to other objects
Normalizer: reorganizes records from VSAM, Relational and Flat File
Rank: limits records to the top or bottom of a range
Input: Defines mapplet input rows. Available in Mapplet designer
Output: Defines mapplet output rows. Available in Mapplet designer
44

Transformation Types
Router: splits rows conditionally
Sequence Generator: generates unique ID values
Sorter: sorts data
Source Qualifier: reads data from Flat File and Relational Sources
Stored Procedure: calls a database stored procedure
Transaction Control: Defines Commit and Rollback transactions
Union: Merges data from different databases
Update Strategy: tags rows for insert, update, delete, reject
XML Generator: Reads data from one or more Input ports and
outputs XML through single output port
XML Parser: Reads XML from one or more Input ports and outputs
data through single output port
XML Source Qualifier: reads XML data

45

Transformation Views
A transformation has
three views:
Iconized - shows the
transformation in
relation to the rest of
the mapping
Normal - shows the
flow of data through
the transformation
Edit - shows
transformation ports
and properties; allows
editing
46

Edit Mode
Allows users with folder write permissions to change
or create transformation ports and properties
Define port level handling

Define transformation
level properties

Enter comments
Make reusable

Switch
between
transformations

47

Expression Transformation
Perform calculations using non-aggregate functions
(row level)
Passive Transformation
Connected
Ports
Mixed
Variables allowed
Create expression in an
output or variable port

Click the port's expression field to invoke the
Expression Editor

Usage
Performs the majority of
data manipulation
48

Expression Editor
An expression formula is a calculation or conditional statement
Used in Expression, Aggregator, Rank, Filter, Router, Update Strategy
Performs calculation based on ports, functions, operators, variables,
literals, constants and return values from other transformations

49

Informatica Functions - Samples


ASCII
CHR
CHRCODE
CONCAT
INITCAP
INSTR
LENGTH
LOWER
LPAD
LTRIM
RPAD
RTRIM
SUBSTR
UPPER
REPLACESTR
REPLACECHR

Character Functions
Used to manipulate character data
CHRCODE returns the numeric value
(ASCII or Unicode) of the first character
of the string passed to this function

CONCAT is for backwards compatibility only - use the || operator instead
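As an illustration of how these character functions combine (the port names are hypothetical):

```
-- Trim and re-case name ports, joining them with the || operator
INITCAP(LTRIM(RTRIM(FIRST_NAME))) || ' ' || UPPER(LAST_NAME)
```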

50

Informatica Functions
TO_CHAR (numeric)
TO_DATE
TO_DECIMAL
TO_FLOAT
TO_INTEGER
TO_NUMBER

Conversion Functions

ADD_TO_DATE
DATE_COMPARE
DATE_DIFF
GET_DATE_PART
LAST_DAY
ROUND (date)
SET_DATE_PART
TO_CHAR (date)
TRUNC (date)

Date Functions

Used to convert datatypes

Used to round, truncate or compare dates; extract one
part of a date; or perform arithmetic on a date
To pass a string to a date function, first use the
TO_DATE function to convert it to a date/time datatype
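A sketch of the TO_DATE-first pattern described above (the port name and format string are illustrative):

```
-- Convert a string port to a date/time value, then add 30 days
ADD_TO_DATE(TO_DATE(ORDER_DT_STR, 'MM/DD/YYYY'), 'DD', 30)
```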
51

Informatica Functions
ABS
CEIL
CUME
EXP
FLOOR
LN
LOG
MOD
MOVINGAVG
MOVINGSUM
POWER
ROUND
SIGN
SQRT
TRUNC

Numerical Functions
Used to perform mathematical
operations on numeric data
Scientific Functions
Used to calculate
geometric values
of numeric data

COS
COSH
SIN
SINH
TAN
TANH

52

Informatica Functions
ERROR
ABORT
DECODE
IIF

Special Functions
Used to handle specific conditions within a session;
search for certain values; test conditional
statements
IIF(Condition,True,False)

ISNULL
IS_DATE
IS_NUMBER
IS_SPACES

SOUNDEX
METAPHONE

Test Functions
Used to test if a lookup result is null
Used to validate data

Encoding Functions
Used to encode string values
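For example, the special and test functions can be combined like this (port and value names are illustrative):

```
-- Default NULL amounts to 'LOW', otherwise flag high-value rows
IIF(ISNULL(AMOUNT), 'LOW', IIF(AMOUNT > 1000, 'HIGH', 'LOW'))

-- DECODE as a multi-way search with a default result
DECODE(STATUS, 'A', 'Active', 'I', 'Inactive', 'Unknown')
```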
53

Expression Validation
The Validate or OK button in the Expression Editor will:
Parse the current expression
Perform remote port searching (resolve references to ports
in other transformations)
Parse transformation attributes,
e.g. filter condition, lookup condition, SQL query
Parse default values
Check spelling, the correct number of arguments in
functions, and other syntax errors
54

Variable Ports
Use to simplify complex expressions,
e.g. create and store a depreciation formula to be
referenced more than once
Use in another variable port or an output port expression
Local to the transformation (a variable port cannot also be an
input or output port)
Available in the Expression, Aggregator and Rank
transformations
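A minimal sketch of the depreciation example above (the port names and v_ prefix are illustrative):

```
-- Variable port, evaluated once per row
v_DEPRECIATION = (COST - SALVAGE_VALUE) / USEFUL_LIFE

-- Output ports that both reference the variable port
OUT_ANNUAL_DEPR  = v_DEPRECIATION
OUT_MONTHLY_DEPR = v_DEPRECIATION / 12
```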

55

Informatica Data Types


NATIVE DATATYPES
Specific to the source and target database types
Display in source and target tables within Mapping Designer

TRANSFORMATION DATATYPES
PowerMart / PowerCenter internal datatypes, based on ANSI SQL-92
Display in transformations within Mapping Designer

Transformation datatypes allow mix and match of source and target
database types
When connecting ports, native and transformation datatypes must be
compatible (or must be explicitly converted)
56

Datatype Conversions
Conversions are defined between the Integer, Decimal, Double, Char,
Date and Raw datatypes:
All numeric data can be converted to all other numeric datatypes,
e.g. integer, double and decimal
All numeric data can be converted to string, and vice versa
Date can be converted only to date and string, and vice versa
Raw (binary) can only be linked to raw
Other conversions are not supported
These conversions are implicit; no function is necessary
57

Mappings
By the end of this section you will be familiar with:
Mapping components
Source Qualifier transformation
Mapping validation
Data flow rules
System Variables
Mapping Parameters and Variables

58

Mapping Designer
Transformation Toolbar

Mapping List

Iconized Mapping

59

Pre-SQL and Post-SQL Rules


Can use any command that is valid for the database
type; no nested comments
Can use Mapping Parameters and Variables in SQL
executed against the source
Use a semi-colon (;) to separate multiple statements
Informatica Server ignores semi-colons within single
quotes, double quotes or within /* ...*/
To use a semi-colon outside of quotes or comments,
escape it with a backslash (\)
Workflow Manager does not validate the SQL
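A sketch of the separator rules above (the table and parameter names are hypothetical):

```
-- Two statements separated by ';'. The semi-colon inside the quoted
-- string is ignored as a separator, per the rules above.
DELETE FROM STG_ORDERS WHERE BATCH_ID = $$BatchId;
INSERT INTO AUDIT_LOG (MSG) VALUES ('pre-SQL done; load starting')
```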
60

Data Flow Rules


Each Source Qualifier starts a single data stream
(a dataflow)
Transformations can send rows to more than one
transformation (split one data flow into multiple pipelines)
Two or more data flows can meet together -- if (and only if)
they originate from a common active transformation
Cannot add an active transformation into the mix
(Diagram: ALLOWED and DISALLOWED merge examples, with Active and
Passive transformations)

The example holds true with a Normalizer in lieu of a Source Qualifier.
Exceptions are the Mapplet Input and Joiner transformations
61

Connection Validation
Examples of invalid connections in a Mapping:
Connecting ports with incompatible datatypes
Connecting output ports to a Source
Connecting a Source to anything but a Source
Qualifier or Normalizer transformation
Connecting an output port to another output port, or
an input port to another input port
Connecting more than one active
transformation to another transformation
(invalid dataflow)

62

Mapping Validation
Mappings must:
Be valid for a Session to run
Be end-to-end complete and contain valid expressions
Pass all data flow rules
Mappings are always validated when saved; can be validated
without being saved
Output Window will always display reason for invalidity

63

Workflows
By the end of this section, you will be familiar with:
The Workflow Manager GUI interface
Workflow Schedules
Setting up Server Connections
Relational, FTP and External Loader

Creating and configuring Workflows


Workflow properties
Workflow components
Workflow Tasks
64

Workflow Manager Interface


Task
Tool Bar

Workflow
Designer
Tools

Workspace

Navigator
Window

Output Window
Status Bar
65

Workflow Manager Tools


Workflow Designer
Maps the execution order and dependencies of Sessions,

Tasks and Worklets, for the Informatica Server

Task Developer
Create Session, Shell Command and Email tasks
Tasks created in the Task Developer are reusable

Worklet Designer
Creates objects that represent a set of tasks
Worklet objects are reusable

66

Workflow Structure
A Workflow is set of instructions for the Informatica Server
to perform data transformation and load
Combines the logic of Session Tasks, other types of Tasks
and Worklets
The simplest Workflow is composed of a Start Task, a Link
and one other Task
Start Task -> Link -> Session Task

67

Workflow Scheduler Objects


Setup reusable schedules to
associate with multiple
Workflows

Used in Workflows and Session Tasks

68

Server Connections
Configure Server data access connections
Used in Session Tasks
Configure:
1. Relational
2. MQ Series
3. FTP
4. Custom
5. External Loader

69

Relational Connections (Native)


Create a relational (database) connection
Instructions to the Server to locate relational tables
Used in Session Tasks

70

Relational Connection Properties


Define native
relational (database)
connection
User Name/Password
Database connectivity
information
Rollback Segment
assignment (optional)

Optional Environment SQL


(executed with each use of
database connection)
71

FTP Connection
Create an FTP connection
Instructions to the Server to FTP flat files
Used in Session Tasks

72

External Loader Connection


Create an External Loader connection
Instructions to the Server to invoke database bulk loaders
Used in Session Tasks

73

Task Developer
Create basic reusable building blocks to use in any Workflow
Reusable Tasks:
Session: a set of instructions to execute Mapping logic
Command: specify OS shell / script command(s) to run
during the Workflow
Email: send email at any point in the Workflow
74

Session Task
Server instructions to run the logic of ONE specific Mapping
e.g. source and target data location specifications,
memory allocation, optional Mapping overrides,
scheduling, processing and load instructions
Becomes a component of a Workflow (or Worklet)
If configured in the Task Developer, the Session Task
is reusable (optional)

75

Command Task
Specify one (or more) Unix shell or DOS (NT, Win2000) commands to run at
a specific point in the Workflow
Becomes a component of a Workflow (or Worklet)
If configured in the Task Developer, the Command Task is reusable
(optional)

Commands can also be referenced in a Session through the Session
Components tab, as Pre- or Post-Session commands

76

Command Task

77

Additional Workflow Components


Two additional components are Worklets and Links
Worklets are objects that contain a series of Tasks

Links are required to connect objects in a Workflow

78

Developing Workflows
Create a new Workflow in the Workflow Designer
Customize
Workflow name

Select a
Server

79

Workflow Properties
Customize Workflow
Properties
Workflow log displays

Select a Workflow
Schedule (optional)
May be reusable or
non-reusable
80

Workflow Properties
Create a User-defined Event,
which can later be used
with the Raise Event Task

Define Workflow Variables that can
be used in later Task objects
(example: Decision Task)
81

Building Workflow Components


Add Sessions and other Tasks to the Workflow
Connect all Workflow components with Links
Save the Workflow
Start the Workflow

Save

Sessions in a Workflow can be independently executed

Start Workflow

82

Workflow Designer - Links


Required to connect Workflow Tasks
Can be used to create branches in a Workflow
All links are executed -- unless a link condition is used
which makes a link false
83

Session Tasks
After this section, you will be familiar with:
How to create and configure Session Tasks
Session Task properties
Transformation property overrides
Reusable vs. non-reusable Sessions
Session partitions

84

Session Task
Created to execute the logic of a mapping (one mapping only)
Session Tasks can be created in the Task Developer
(reusable) or the Workflow Designer (Workflow-specific)
Steps to create a Session Task
Select the Session button from the Task Toolbar or
Select menu Tasks | Create

Session Task Bar Icon

85

Session Task - General

86

Session Task - Properties

87

Session Task Config Object

88

Session Task - Sources

89

Session Task - Targets

90

Session Task - Transformations


Allows overrides of
some transformation
properties
Does not change the
properties in the
Mapping

91

Session Task - Partitions

92

Monitor Workflows
By the end of this section you will be familiar with:
The Workflow Monitor GUI interface
Monitoring views
Server monitoring modes
Filtering displayed items
Actions initiated from the Workflow Monitor
Truncating Monitor Logs

93

Monitor Workflows
The Workflow Monitor is the tool for monitoring Workflows
and Tasks
Review details about a Workflow or Task in two views
Gantt Chart view
Task view

Gantt Chart view

Task view
94

Monitoring Workflows
Perform operations in the Workflow Monitor

Restart -- restart a Task, Workflow or Worklet
Stop -- stop a Task, Workflow or Worklet
Abort -- abort a Task, Workflow or Worklet
Resume -- resume a suspended Workflow after a
failed Task is corrected
View Session and Workflow logs

Abort has a 60-second timeout
If the Server has not completed processing and
committing data during the timeout period, the
threads and processes associated with the Session
are killed

Stopping a Session Task means the Server stops reading data
95

Monitoring Workflows
Task View

(Screen elements: Status Bar; Task, Workflow and Worklet names;
Start Time and Completion Time columns)

Start, Stop, Abort and Resume Tasks, Workflows and Worklets
96

Monitor Window Filtering


Task View provides filtering
Monitoring filters can be set using drop-down menus
Minimizes the items displayed in Task View

Right-click on a Session to retrieve the Session Log
(from the Server to the local PC Client)
97

Debugger
By the end of this section you will be familiar with:
Creating a Debug Session
Debugger windows & indicators
Debugger functionality and options
Viewing data with the Debugger
Setting and using Breakpoints
Tips for using the Debugger

98

Debugger Features
Debugger is a Wizard driven tool
View source / target data
View transformation data
Set break points and evaluate expressions
Initialize variables
Manually change variable values

Debugger is
Session Driven
Data can be loaded or discarded
Debug environment can be saved for later use

99

Debugger Interface
Debugger windows & indicators:
Debugger Mode indicator
Solid yellow arrow: Current Transformation indicator
Flashing yellow SQL indicator
Transformation Instance Data window
Target Data window
Debugger Log tab
Session Log tab
100

Filter Transformation
Drops rows conditionally

Active Transformation
Connected
Ports
All input / output
Specify a Filter condition
Usage
Filter rows from
flat file sources
Single pass source(s)
into multiple targets
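A filter condition might look like this (the port names are illustrative); rows for which it evaluates FALSE are dropped:

```
NOT ISNULL(CUST_ID) AND ORDER_AMT > 0
```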
101

Aggregator Transformation
Performs aggregate calculations

Active Transformation
Connected
Ports
Mixed
Variables allowed
Group By allowed
Create expressions in
output or variable ports
Usage
Standard aggregations
102

Informatica Functions
Aggregate Functions
AVG
COUNT
FIRST
LAST
MAX
MEDIAN
MIN
PERCENTILE
STDDEV
SUM
VARIANCE

Return summary values for non-null data in selected ports
Use only in Aggregator transformations
Use in output ports only
Calculate a single value (and row) for all
records in a group
Only one aggregate function can be
nested within an aggregate function
Conditional statements can be used with
these functions
103

Aggregate Expressions
Aggregate functions are supported only in the
Aggregator transformation

Conditional aggregate expressions are supported

Conditional SUM format: SUM(value, condition)
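For instance (the port and value names are illustrative):

```
-- Sum only the western-region sales within each group
SUM(SALES, REGION = 'WEST')
```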


104

Aggregator Properties
Sorted Input property instructs the Aggregator to
expect the data to be sorted

Set Aggregator cache sizes (on the Informatica
Server machine)
105

Sorted Data
The Aggregator can handle sorted or unsorted data
Sorted data can be aggregated more efficiently, decreasing
total processing time

The Server will cache data from each group and
release the cached data upon reaching the first
record of the next group
Data must be sorted according to the order of the
Aggregator Group By ports
Performance gain will depend upon varying factors

106

Incremental Aggregation
Triggered in Session Properties, Performance tab
Example: an MTD (month-to-date) calculation

The cache is saved into $PMCacheDir:
aggregatorname.DAT and aggregatorname.IDX
Upon the next run, the files are overwritten with new cache information
Example: when triggered, the PowerCenter Server will save
new MTD totals. Upon the next run (new totals), the Server will
subtract the old totals; the difference will be passed forward
Best practice is to copy these files in case a rerun of the data is ever required.
Reinitialize when no longer needed, e.g. at the beginning of new-month processing

107

Joiner Transformation
By the end of this section you will be familiar with:
When to use a Joiner Transformation
Homogeneous Joins
Heterogeneous Joins
Joiner properties
Joiner Conditions
Nested joins

108

Homogeneous Joins
Joins that can be performed with a SQL SELECT statement:
The Source Qualifier contains a SQL join
Tables are on the same database server (or are synonyms)
The database server does the join work
Multiple homogeneous tables can be joined
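Such a join can be expressed in the Source Qualifier's SQL (the table and column names are hypothetical):

```
SELECT O.ORDER_ID, C.CUST_NAME
FROM   ORDERS O, CUSTOMERS C
WHERE  O.CUST_ID = C.CUST_ID
```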

109

Heterogeneous Joins
Joins that cannot be done with a SQL statement:
An Oracle table and a Sybase table
Two Informix tables on different database servers
Two flat files
A flat file and a database table
110

Joiner Transformation
Performs heterogeneous joins on records from
different databases or flat file sources
Active Transformation
Connected
Ports
All input or input / output
M denotes port comes
from master source
Specify the Join condition
Usage
Join two flat files
Join two tables from
different databases
Join a flat file with a
relational table

111

Joiner Conditions

Multiple
join
conditions
are supported

112

Joiner Properties
Join types:
Normal (inner)
Master outer
Detail outer
Full outer
Set the Joiner cache sizes

The Joiner can accept sorted data (configure the join condition to
use the sort origin ports)
113

Mid-Mapping Join
The Joiner does not accept input in the following situations:
Both input pipelines begin with the same Source Qualifier
Both input pipelines begin with the same Normalizer
Both input pipelines begin with the same Joiner
Either input pipeline contains an Update Strategy
114

Sorter Transformation
Can sort data from relational tables or flat files
Sort takes place on the Informatica Server machine
Multiple sort keys are supported
The Sorter transformation is often more efficient than
a sort performed on a database with an ORDER BY
clause

115

Lookup Transformation
By the end of this section you will be familiar with:
Lookup principles
Lookup properties
Lookup conditions
Lookup techniques
Caching considerations

116

How a Lookup Transformation Works


For each Mapping row, one or more port values are looked
up in a database table
If a match is found, one or more table values are returned
to the Mapping; if no match is found, NULL is returned

(Diagram: the lookup value(s) go into the Lookup transformation,
which sends the return value(s) back to the Mapping)
117

Lookup Transformation
Looks up values in a database table and provides
data to other components in a Mapping
Passive Transformation
Connected / Unconnected
Ports
Mixed
L denotes Lookup port
R denotes port used as a
return value (unconnected
Lookup only)
Specify the Lookup Condition
Usage
Get related values
Verify if a record exists or if data has changed

118

Lookup Properties
Override Lookup SQL option
Toggle caching
Native Database Connection Object name

119

Additional Lookup Properties


Set cache
directory

Make cache
persistent

Set
Lookup
cache sizes

120

Lookup Conditions
Multiple conditions are supported

121

To Cache or not to Cache?


Caching can significantly impact performance
Cached:
Lookup table data is cached locally on the Server
Mapping rows are looked up against the cache
Only one SQL SELECT is needed

Uncached:
Each Mapping row needs one SQL SELECT

Rule of thumb: cache if the number (and size) of
records in the Lookup table is small relative to the
number of mapping rows requiring lookup
122

Target Options
By the end of this section you will be familiar with:
Row type indicators
Row operations at load time
Constraint-based loading considerations
Rejected row handling options

123

Target Properties
Session Task
Select target
instance
Row loading
operations
Error handling

Properties Tab
124

Constraint-based Loading
Maintains referential integrity in the Targets
(Example targets: Target 1 has pk1; Target 2 has fk1, pk2;
Target 3 has fk2)

Example 1
With only one Active source, rows for Targets 1-3 will be loaded
properly and maintain referential integrity

Example 2
With two Active sources, it is not possible to control whether rows
for Target 3 will be loaded before or after those for Target 2

The following transformations are Active sources: Advanced External Procedure,
Source Qualifier, Normalizer, Aggregator, Sorter, Joiner, Rank, Mapplet (containing
any of the previous transformations)
125

Update Strategy Transformation


By the end of this section you will be familiar with:
Update Strategy functionality
Update Strategy expressions
Refresh strategies
Smart aggregation

126

Update Strategy Transformation


Used to specify how each individual row will be used to
update target tables (insert, update, delete, reject)
Active Transformation
Connected
Ports
All input / output
Specify the Update
Strategy Expression
Usage
Updating Slowly
Changing Dimensions
IIF or DECODE logic
determines how to
handle the record
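As a sketch of such IIF logic, using the DD_ row-type constants (the lookup port name is hypothetical):

```
-- Insert the row when the lookup found no match, otherwise update it
IIF(ISNULL(LKP_CUST_ID), DD_INSERT, DD_UPDATE)
```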
127

Target Refresh Strategies


Single snapshot: the target is truncated and new records
are inserted
Sequential snapshot: new records are inserted
Incremental: only new records are inserted;
records already present in the target are ignored
Incremental with Update: only new records are
inserted; records already present in the target are
updated
128

Router Transformation
Rows sent to multiple filter conditions

Active Transformation
Connected
Ports
All input/output
Specify filter conditions
for each Group
Usage
Link source data in one
pass to multiple filter
conditions

129

Router Transformation in a Mapping

130

Parameters and Variables


By the end of this section you will understand:
System Variables
Creating Parameters and Variables
Features and advantages
Establishing values for Parameters and Variables

131

System Variables
SYSDATE
Provides the current datetime on the
Informatica Server machine
Not a static value

$$$SessStartTime
Returns the system date value as a
string, using the system clock on the machine
hosting the Informatica Server
Format of the string is database type dependent
Used in SQL overrides
Has a constant value

SESSSTARTTIME
Returns the system date value on the
Informatica Server
Used with any function that accepts
transformation date/time datatypes
Not to be used in a SQL override
Has a constant value

132

Mapping Parameters and Variables


Apply to all transformations within one Mapping
Represent declared values
Variables can change in value during run-time
Parameters remain constant during run-time
Provide increased development flexibility
Defined in Mapping menu
Format is $$VariableName or $$ParameterName

133

Mapping Parameters and Variables


Sample declarations
User-defined
names

Set the
appropriate
aggregation
type

Set optional
Initial Value

Declare Variables and Parameters in the Designer Mappings menu


134

Functions to Set Mapping Variables


SetCountVariable -- Counts the number of
evaluated rows and increments or decrements a
mapping variable for each row
SetMaxVariable -- Evaluates the value of a mapping
variable to the higher of two values
SetMinVariable -- Evaluates the value of a mapping
variable to the lower of two values
SetVariable -- Sets the value of a mapping variable
to a specified value
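The semantics of these functions can be illustrated with a small Python sketch (a conceptual emulation, not the actual PowerCenter API; the function names are lower-cased stand-ins):

```python
# Emulate the mapping-variable aggregation behavior described above.
def set_max_variable(current, value):
    return max(current, value)      # SetMaxVariable: keep the higher value

def set_min_variable(current, value):
    return min(current, value)      # SetMinVariable: keep the lower value

def set_count_variable(current, delta=1):
    return current + delta          # SetCountVariable: increment per row

var = 0
for v in [5, 3, 9, 7]:              # evaluated once per row
    var = set_max_variable(var, v)
assert var == 9                     # highest value seen across all rows
```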

135

Unconnected Lookup
Will be physically unconnected from other transformations
There can be NO data flow arrows leading to or from an
unconnected Lookup
Lookup function can be set within any
transformation that supports expressions
Lookup data is
called from the
point in the
Mapping that
needs it

Function in the
Aggregator calls the
unconnected Lookup
136

Conditional Lookup Technique


Two requirements:
Must be Unconnected (or function mode) Lookup
Lookup function used within a conditional statement
Condition

Row keys
(passed to Lookup)

IIF ( ISNULL(customer_id),:lkp.MYLOOKUP(order_no))
Lookup function

Conditional statement is evaluated for each row


Lookup function is called only under the pre-defined
condition

137

Conditional Lookup Advantage


Data lookup is performed only for those rows that
require it, so substantial performance gains are possible
EXAMPLE: A Mapping will process 500,000 rows. For two
percent of those rows (10,000) the item_id value is NULL.
Item_ID can be derived from the SKU_NUMB.

IIF ( ISNULL(item_id), :lkp.MYLOOKUP (sku_numb))

Condition
(true for 2 percent of all
rows)

Lookup
(called only when condition is
true)

Net savings = 490,000 lookups
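The arithmetic behind the savings checks out in a few lines of Python:

```python
rows = 500_000
null_fraction = 0.02                          # two percent of rows have NULL item_id
lookups_needed = int(rows * null_fraction)    # only these rows call the Lookup
savings = rows - lookups_needed               # lookups avoided vs. a connected Lookup
assert lookups_needed == 10_000
assert savings == 490_000
```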


138

Connected vs. Unconnected Lookups


CONNECTED LOOKUP                         UNCONNECTED LOOKUP

Part of the mapping data flow            Separate from the mapping data flow

Returns multiple values (by linking      Returns one value (by checking the
output ports to another                  Return (R) port option for the output
transformation)                          port that provides the return value)

Executed for every record passing        Only executed when the lookup
through the transformation               function is called

More visible, shows where the            Less visible, as the lookup is called
lookup values are used                   from an expression within another
                                         transformation

Default values are used                  Default values are ignored


139

Heterogeneous Targets
By the end of this section you will be familiar with:
Heterogeneous target types
Heterogeneous target limitations
Target conversions

140

Definition: Heterogeneous Targets


Supported target definition types:
Relational database
Flat file
XML
ERP (SAP BW, PeopleSoft, etc.)
A heterogeneous target is where the target types are
different or the target database connections are different
within a single Session Task

141

Step One: Identify Different Target Types

Oracle table

Tables are EITHER in two
different databases, or
require different (schema-specific) connect strings
One target is a flat file load

Oracle table

Flat file

142

Step Two: Different Database Connections

The two database


connections WILL
differ
Flat file requires
separate location
information

143

Target Type Override (Conversion)


Example: Mapping has SQL Server target definitions.
Session Task can be set to load Oracle tables instead,
using an Oracle database connection.

Only the following overrides are supported:


Relational target to flat file target
Relational target to any other relational database type
SAP BW target to a flat file target

CAUTION: If target definition datatypes are not compatible with datatypes in newly
selected database type, modify the target definition

144

Mapplet Designer

Mapplet Designer Tool

Mapplet
Transformation
Icons

Mapplet Output
Transformation
145

Mapplet Advantages
Useful for repetitive tasks / logic
Represents a set of transformations
Mapplets are reusable
Use an instance of a Mapplet in a Mapping
Changes to a Mapplet are inherited by all instances
Server expands the Mapplet at runtime

146

Active and Passive Mapplets


Passive Mapplets contain only passive transformations
Active Mapplets contain one or more active
transformations

CAUTION: changing a passive Mapplet into an active


Mapplet may invalidate Mappings which use that
Mapplet
Do an impact analysis in Repository Manager first

147

Using Active and Passive Mapplets

Passive

Active

Multiple Passive
Mapplets can populate
the same target
instance

Multiple Active Mapplets


or Active and Passive
Mapplets cannot
populate the same
target instance

148

Reusable Transformations
By the end of this section you will be familiar with:
Reusable transformation advantages
Reusable transformation rules
Promoting transformations to reusable
Copying reusable transformations

149

Reusable Transformations
Define once - reuse many times
Reusable Transformations

Can be a copy or a shortcut


Edit Ports only in Transformation Developer
Can edit Properties in the mapping
Instances dynamically inherit changes
Be careful: It is possible to invalidate mappings by
changing reusable transformations

Transformations that cannot be made reusable


Source Qualifier
ERP Source Qualifier
Normalizer used to read a Cobol data source

150

Promoting a Transformation to Reusable

Place a
check in the
Make
reusable box

This action
is not
reversible

151

Sequence Generator Transformation


Generates unique keys for any port on a row

Passive Transformation
Connected
Ports
Two predefined
output ports,
NEXTVAL and
CURRVAL
No input ports allowed
Usage
Generate sequence numbers
Shareable across mappings
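The NEXTVAL behavior can be sketched in Python (a simplified, single-threaded illustration; actual CURRVAL semantics in a Session also depend on which ports are connected and the Increment By setting):

```python
# Minimal sketch of a sequence generator producing surrogate keys.
class SequenceGenerator:
    def __init__(self, start=1, increment=1):
        self.increment = increment
        self.currval = start - increment   # so the first NEXTVAL equals `start`

    def nextval(self):
        self.currval += self.increment
        return self.currval

seq = SequenceGenerator()
assert [seq.nextval() for _ in range(3)] == [1, 2, 3]
assert seq.currval == 3
```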
152

Sequence Generator Properties

Number
of
Cached
Values

153

Dynamic Lookup
By the end of this section you will be familiar with:
Dynamic lookup theory
Dynamic lookup advantages
Dynamic lookup rules

154

Additional Lookup Cache Options

Make cache persistent

Cache File Name Prefix
Reuse cache by name for another
similar business purpose

Recache from Database
Overrides other settings; Lookup
data is refreshed

Dynamic Lookup Cache
Allows a row to know about the
handling of a previous row
155

Persistent Caches
By default, Lookup caches are not persistent
When Session completes, cache is erased
Cache can be made persistent with the Lookup
properties
When Session completes, the persistent cache is
stored on server hard disk files
The next time Session runs, cached data is loaded
fully or partially into RAM and reused
Can improve performance, but stale data may pose
a problem
156

Dynamic Lookup Cache Advantages


When the target table is also the Lookup table,
cache is changed dynamically as the target load
rows are processed in the mapping
New rows to be inserted into the target or for
update to the target will affect the dynamic Lookup
cache as they are processed
Subsequent rows will know the handling of
previous rows
Dynamic Lookup cache and target load rows
remain synchronized throughout the Session run
157

Update Dynamic Lookup Cache


NewLookupRow port values
0 = static lookup, cache is not changed
1 = insert row into Lookup cache
2 = update row in Lookup cache

Does NOT change row type


Use the Update Strategy transformation before or after
Lookup, to flag rows for insert or update to the target
Ignore NULL Property
Per port
Ignore NULL values from the input row and update the
cache using only non-NULL input values
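The NewLookupRow mechanics can be sketched in Python (an assumed simplification of the dynamic cache, with the cache modeled as a plain dict):

```python
# NewLookupRow: 0 = cache unchanged, 1 = row inserted, 2 = row updated
def process_row(cache: dict, key, value) -> int:
    if key not in cache:
        cache[key] = value
        return 1            # new key: insert into the cache
    if cache[key] != value:
        cache[key] = value
        return 2            # existing key, changed value: update the cache
    return 0                # existing key, same value: no change

cache = {}
assert process_row(cache, "A", 10) == 1   # first occurrence: insert
assert process_row(cache, "A", 10) == 0   # duplicate: subsequent row "knows"
assert process_row(cache, "A", 20) == 2   # changed value: update
```

This is why subsequent rows in the same run can react to how earlier rows were handled, keeping the cache and target load rows synchronized.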

158

Example: Dynamic Lookup Configuration


Router Group Filter Condition should be:
NewLookupRow = 1

This allows isolation of insert rows from update rows


159

Concurrent and Sequential Workflows


By the end of this section you will be familiar with:
Concurrent Workflows
Sequential Workflows
Scheduling Workflows
Stopping, aborting, and suspending Tasks and
Workflows

160

Multi-Task Workflows - Sequential


Tasks can be run sequentially:

Tasks shown are all Sessions, but they can also be
other Tasks, such as Commands, Timer or Email Tasks

161

Multi-Task Workflows - Concurrent


Tasks can be run concurrently:

Tasks shown are all Sessions, but they can also be
other Tasks such as Commands, Timer or Email Tasks.
162

Multi-Task Workflows - Combined


Tasks can be run in a combination concurrent and
sequential pattern within one Workflow:

Tasks shown are all Sessions, but they can also be
other Tasks such as Commands, Timer or Email Tasks
163

Additional Transformations
By the end of this section you will be familiar with:
The Rank transformation
The Normalizer transformation
The Stored Procedure transformation
The External Procedure transformation
The Advanced External Procedure transformation

164

Rank Transformation
Filters the top or bottom range of records

Active Transformation
Connected
Ports
Mixed
One pre-defined
output port
RANKINDEX
Variables allowed
Group By allowed
Usage
Select top/bottom
Number of records
165

Normalizer Transformation
Normalizes records from relational or VSAM sources

Active Transformation
Connected
Ports
Input / output or output
Usage
Required for VSAM
Source definitions
Normalize flat file or
relational source
definitions
Generate multiple
records from one record
166

Normalizer Transformation
Turn one row
YEAR,ACCOUNT,MONTH1,MONTH2,MONTH3, ... ,MONTH12
1997,Salaries,21000,21000,22000,19000,23000,26000,29000,29000,34000,34000,40000,45000
1997,Benefits,4200,4200,4400,3800,4600,5200,5800,5800,6800,6800,8000,9000
1997,Expenses,10500,4000,5000,6500,3000,7000,9000,4500,7500,8000,8500,8250

Into multiple rows
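The pivot can be sketched in Python (an illustration of the effect only; the helper and its tuple layout are hypothetical):

```python
# Turn one (year, account, month1..monthN) record into one row per month.
def normalize(row):
    year, account, *months = row
    return [(year, account, m + 1, amount)       # month number becomes a column
            for m, amount in enumerate(months)]

rows = normalize(("1997", "Salaries", 21000, 21000, 22000))  # truncated sample
assert rows[0] == ("1997", "Salaries", 1, 21000)
assert len(rows) == 3                                        # one row per month
```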

167

Stored Procedure Transformation


Calls a database stored procedure

Passive Transformation
Connected/Unconnected
Ports
Mixed
R denotes port will
return a value from the
stored function to the
next transformation
Usage
Perform transformation
logic outside PowerMart /
PowerCenter
168

External Procedure Transformation (TX)


Calls a passive procedure defined in a dynamic linked
library (DLL) or shared library

Passive Transformation
Connected/Unconnected
Ports
Mixed
R designates return
value port of an
unconnected
transformation
Usage
Perform transformation
logic outside PowerMart /
PowerCenter

Option to allow partitioning

169

Advanced TX Transformation
Calls an active procedure defined in a dynamic linked
library (DLL) or shared library

Active Transformation
Connected Mode only
Ports
Mixed
Usage
Perform
transformation logic
outside PowerMart /
PowerCenter
Sorting, Aggregation
Option to allow partitioning

170

Transaction Control Transformation


Allows custom commit types (source- or target-based) and user-defined conditional commits

Passive Transformation
Connected Mode Only
Ports
Input and Output
Properties
Continue
Commit Before
Commit After
Rollback Before
Rollback After
171

Transaction Control Functionality


Commit Types
Target Based Commit
Commit based on approximate number of records
written to target
Source Based Commit
Ensures that a source record is committed in all
targets
User Defined Commit
Uses Transaction Control Transform to specify
commits and rollbacks in the mapping based on
conditions
Set the Commit Type (and other specifications) in the
Transaction Control Condition
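A target-based commit can be sketched in Python (an assumed simplification; the function is hypothetical and ignores buffer-block rounding, which is why real commit points are only approximate):

```python
# Compute where commits would fall for a given row count and commit interval.
def commit_points(total_rows: int, commit_interval: int) -> list:
    # A commit is issued after roughly every `commit_interval` rows written.
    return list(range(commit_interval, total_rows + 1, commit_interval))

assert commit_points(25_000, 10_000) == [10_000, 20_000]
# Remaining 5,000 rows are committed at end of load.
```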
172

Versioning
View Object Version Properties
Track Changes to an Object
Check objects in and out
Delete or Purge Object version
Apply Labels and Run queries
Deployment Groups

173

Informatica Business Analytics Suite


Modular
Plug-&-Play
Approach

Packaged
Analytic Solutions

Custom Built
Analytic Solutions
174

Informatica Warehouses / Marts


Informatica Warehouse

Customer Relationship: Sales, Marketing, Service, Web
Finance: G/L, Receivables, Payables, Profitability
Human Resources: Compensation, Scorecard
Supply Chain: Planning, Sourcing, Inventory, Quality

Common Dimensions
Customer, Product, Organization, Supplier,
Time, Geography, Employee
175

Inside the Informatica Warehouse


Business Intelligence

Informatica Warehouse
Advanced
Calculation
Engine

Analytic
Data Model

Business Adapters (Extract)


Data Source Connectivity with Minimal

Analytic Bus (Transform)


Transaction consolidation and

Warehouse Loader

Transform

standardization
Source independent interface

Warehouse Loader (Load)


Type I, II slowly changing dimensions
History and changed record tracking

Analytic Bus

Extract

Load

SAP

Load
Structural/Functional Knowledge of
Sources

Business Adapters

Analytic Data Model


Industry Best Practice Metrics
Process-centric model & conformed

dimensions

ORCL

i2

SEBL

PSFT

Custom

Advanced Calculation Engine


Pre-aggregations for rapid query response
Complex calculation metrics (e.g.

statistical)

176

PowerConnect Products
Family of enterprise software products that allow
companies to directly source and integrate ERP, CRM,
real-time message queue, mainframe, AS/400, remote
data and metadata with other enterprise data

PowerConnect for MQSeries (real time)


PowerConnect for TIBCO (real time)
PowerConnect for PeopleSoft
PowerConnect for SAP R/3
PowerConnect for SAP BW
PowerConnect for Siebel
PowerConnect for Mainframe
PowerConnect for AS/400
PowerConnect for Remote Data
PowerConnect SDK
177

178
