
Business Objects Data Integrator

Training: BODI Version XI3



Audience

- Application developers
- Consultants
- Database administrators working on data extraction, data warehousing, or data integration


Assumptions

- You understand your source data systems, RDBMS, business intelligence, and e-commerce messaging concepts.
- You are familiar with SQL (Structured Query Language).
- You are familiar with Microsoft Windows or UNIX platforms, so you can use Data Integrator effectively.


Business Objects Data Integration Platform

The Data Integration Platform consists of:

- Data Integrator: data movement and management server
- Rapid Marts: suite of packaged data caches for speedy delivery and integration of data


Business Objects Data Integration Platform


Rapid Mart SAP R/3 Modules

- Accounts Payable -> FI-Finance
- Accounts Receivable -> FI-Finance
- Cost Center -> CO-Controlling
- Human Resources -> HR-Human Resources
- Inventory -> MM-Materials Management
- Plant Maintenance -> PM-Plant Maintenance
- Production Planning -> PP-Production Planning
- Project Systems -> PS-Project Systems
- Purchasing -> MM-Materials Management
- Sales -> SD-Sales and Distribution

Data Integrator
DI is a data movement and integration platform.


Data Integrator Architecture


Data Integrator operating system platforms


DI Designer runs on the following Windows platforms:

- Windows NT
- Windows 2000 Professional
- Windows 2000 Server
- Windows 2000 Advanced Server
- Windows 2000 Datacenter Server
- Windows XP

All other DI components run on the above Windows platforms and on the following UNIX platforms:

- Solaris 2.7 and 2.8 (SunOS releases 5.7 and 5.8)
- HP-UX 11.00 (PA-RISC 2.0) and 11.1
- IBM AIX 4.3.3.75 with maintenance level 4330-10, and AIX 5.1


Data Integrator Components

Standard components are:

- DI Job Server
- DI Engine
- DI Designer
- DI Repository
- DI Access Server
- DI Administrator
- DI Metadata Reports tool
- DI Web Server
- DI Service
- DI SNMP Agent

Data Integrator Component Relationships


Data Integrator Components

DI Job Server

- Starts the data movement engine that integrates data from multiple heterogeneous sources, performs complex data transformations, and manages extractions and transactions.
- Can move data in either batch or real-time mode, and uses distributed query optimization, multithreading, in-memory caching, in-memory data transformations, and parallel pipelining to deliver high data throughput and scalability.


Data Integrator Components

DI Engine

- When DI jobs are executed, the Job Server starts DI engine processes to perform data extraction, transformation, and movement.
- DI engine processes use parallel pipelining and in-memory data transformations to deliver high data throughput and scalability.


Data Integrator Components

DI Designer

- Allows you to define data management applications consisting of data mappings, transformations, and control logic.
- A development tool with a graphical user interface: developers create objects, then drag, drop, and configure them by selecting icons in flow diagrams, table layouts, and nested workspace pages.

Data Integrator Components

DI Repository

- A set of tables that hold user-created and predefined system objects, source and target metadata, and transformation rules.
- Set up on an open client/server platform to facilitate the sharing of metadata with other enterprise tools. Each repository is stored on an existing RDBMS.
- Associated with one or more DI Job Servers.


Data Integrator Components

There are two types of repositories:

- A local repository is used by an application designer to store definitions of DI objects (such as projects, jobs, work flows, and data flows) and source/target metadata.
- A central repository is an optional component that can be used to support multi-user development. The central repository provides a shared object library, allowing developers to check objects in and out of their local repositories.


Data Integrator Components

DI Access Server

- The Access Server is a real-time, request-reply message broker that collects message requests, routes them to a real-time service, and delivers a message reply within a user-specified time frame.
- The Access Server queues messages and sends them to the next available real-time service across any number of computing resources. This approach provides automatic scalability, because the Access Server can start additional real-time services on additional computing resources if traffic for a given real-time service is high.
- Multiple Access Servers can also be configured.


Data Integrator Components

DI Administrator

Browser-based administration of DI resources, including:

- Scheduling, monitoring, and executing batch jobs
- Configuring, starting, and stopping real-time services
- Configuring Job Server, Access Server, and repository usage
- Configuring and managing adapters
- Managing users
- Publishing batch jobs and real-time services via Web services


Data Integrator Components

DI Metadata Reports tool

Provides browser-based reports on DI metadata, which is stored in the repository. Reports are provided for:

- Repository summary
- Job analysis
- Execution statistics
- Impact analysis


Data Integrator Components

DI Web Server

- Supports browser access to the Administrator and the Metadata Reports tool.
- The Windows service name for this server is DI Web Server service; the UNIX equivalent is a daemon. The servlet engine used by the DI Web Server is the Tomcat server.


Data Integrator Components

DI Service

- The DI Service is installed when DI Job Servers and Access Servers are installed.
- It starts Job Servers and Access Servers when you reboot your system.
- The Windows service name is Data Integrator Service; the UNIX equivalent is a daemon named AL_JobService.


Data Integrator Components

DI SNMP Agent

- DI error events can be communicated using SNMP-supported applications for better error monitoring.
- The DI SNMP agent monitors and records information about the Job Servers and the jobs running on the computer where the agent is installed.


Data Integrator Management Tools

- License Server: allows you to centrally control license validation for DI components and licensed extensions.
- Repository Manager: allows you to create, upgrade, and check the versions of local and central repositories.
- Server Manager: allows you to add, delete, or edit the properties of Job Servers and Access Servers. It is automatically installed on each computer on which you install a Job Server or Access Server.


Data Integrator Objects

All entities you create, modify, or work with in DI Designer are called objects. The local object library shows objects such as source and target metadata, system functions, projects, and jobs.

DI has two types of objects:

- Reusable objects
  - Have a single definition.
  - All calls to the object refer to that definition.
  - Changes to the object definition are propagated to all calls to it.
- Single-use objects
  - Defined only within the context of a single job or data flow (e.g., scripts).


Data Integrator Object Relationships


Projects

- A reusable object that allows you to group jobs.
- The highest level of organization offered by DI.
- Used to group jobs whose schedules depend on one another or that you want to monitor together.
- Only one project can be open at a time.
- Projects cannot be shared among multiple users.


Jobs

A job is the only object that is executed. The following objects can be included in a job definition:

- Data flows
- Transforms
- Work flows
- Scripts
- Conditionals
- While loops
- Try/catch blocks


Datastores

- Represent connections between DI and databases or applications, directly or through adapters.
- Allow DI to access metadata from a database or application, and hence to read from or write to it.
- DI datastores can connect to:
  - Databases and mainframe file systems
  - Applications that have pre-packaged or user-written DI adapters
  - SAP R/3, SAP BW, PeopleSoft, J.D. Edwards OneWorld, and J.D. Edwards World

File Formats

DI can use data stored in files as data sources or data targets. File format objects can describe files in:

- Delimited format: characters such as commas or tabs separate each field
- Fixed-width format: the column width is specified by the user
- SAP R/3 format


Data Flows

- Data flows extract, transform, and load data. Reading sources, transforming data, and loading targets all occur inside a data flow.
- A data flow can be added to a job or a work flow.
- From inside a work flow, a data flow can send and receive information to and from other objects through input and output parameters.


Data Flows (diagram)

Input Parameters -> Source(s) -> Data Transformation Operations -> Target(s) -> Output Parameters


Work Flows

A work flow defines the decision-making process for executing data flows. The purpose of a work flow is to prepare for executing data flows and to set the state of the system after the data flows are complete. The following objects can be elements in work flows:

- Work flows
- Data flows
- Scripts
- Conditionals
- While loops
- Try/catch blocks


Work Flows (diagram)

Control Operations -> Data Flow -> Control Operations


Conditionals

Conditionals are single-use objects used to implement if/then/else logic in a work flow. To define a conditional, you specify a condition and two logical branches:

- If: a Boolean expression that evaluates to TRUE or FALSE. You can use functions, variables, and standard operators to construct the expression.
- Then: work flow elements to execute if the If expression evaluates to TRUE.
- Else (optional): work flow elements to execute if the If expression evaluates to FALSE.

Conditionals (diagram)

Work Flow -> Conditional: If Process Successful
- True -> Then: Run Work Flow
- False -> Else: Send E-mail
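The If condition is an ordinary DI expression. A minimal sketch, assuming hypothetical global variables and the DI script comparison and logical operators:

    # TRUE routes execution to the Then branch, FALSE to the Else branch
    $G_LoadStatus = 'SUCCESS' AND $G_RowCount > 0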


While Loops

The while loop is a single-use object that you can use in a work flow. It repeats a sequence of steps as long as a condition is true.


While Loops (diagram)

While Number != 0
- True -> Step 1 -> Step 2, then re-test the condition
- False -> exit the loop


Try / Catch Blocks

A try/catch block is a combination of one try object and one or more catch objects that allow you to specify alternative work flows if errors occur while DI is executing a job. Try/catch blocks:

- Catch classes of exceptions thrown by DI, the DBMS, or the operating system
- Apply solutions that you provide
- Continue execution

Try and catch objects are single-use objects. A catch commonly runs a recovery script, as in the sketch below.
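A minimal sketch of a recovery script inside a catch object. It assumes the error_number(), error_message(), job_name(), and raise_exception() functions are available in this DI version; the global variable is hypothetical:

    # Record the failure, then abort the job with an explicit error
    print('Job [job_name()] failed with error [error_number()]: [error_message()]');
    $G_JobStatus = 'FAILED';
    raise_exception('Load aborted - see the error log for details.');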



Try / Catch Blocks

Categories of available exceptions are:

- Database access errors
- Email errors
- Engine abort errors
- Execution errors
- File access errors
- Microsoft connection errors
- Parser errors
- Predefined transform errors
- Repository access errors
- Resolver errors
- System exception errors
- User transform errors


Scripts

Scripts are single-use objects used to call functions and assign values to variables in a work flow. A script can contain the following statements (see the sketch below):

- Function calls
- If statements
- While statements
- Assignment statements
- Operators
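A minimal DI script sketch covering each statement type; the datastore name, table, and global variables are hypothetical, while sysdate(), sql(), and print() are built-in functions:

    # Assignment statements and function calls
    $G_RunDate = sysdate();
    $G_RowCount = sql('DS_Target', 'select count(*) from SALES_FACT');

    # If statement
    if ($G_RowCount > 0)
    begin
       print('SALES_FACT already holds [$G_RowCount] rows.');
    end
    else
    begin
       print('SALES_FACT is empty - initial load required.');
    end

    # While statement with an operator-based condition
    $G_Retry = 0;
    while ($G_Retry < 3)
    begin
       $G_Retry = $G_Retry + 1;
    end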

Types of Lookup Functions

A lookup function retrieves a value from a table or file based on the values in a different source table or file. DI provides three, sketched below:

1) lookup
2) lookup_ext
3) lookup_seq
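A hedged sketch of a lookup_ext() call as it might appear in a Query column mapping, following the bracketed argument form the Designer generates; the datastore, owner, table, and column names are hypothetical:

    # Return CUST_NAME from the lookup table where CUST_ID matches the input;
    # 'PRE_LOAD_CACHE' caches the table first, 'MAX' picks one row on duplicates,
    # and 'UNKNOWN' is the default when no row matches
    lookup_ext([DS_Source.DBO.CUSTOMER, 'PRE_LOAD_CACHE', 'MAX'],
               [CUST_NAME],
               ['UNKNOWN'],
               [CUST_ID, '=', Query.CUST_ID])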


Variables

Variables are symbolic placeholders for values.

Local Variables

- Local variables are local to the work flow in which they are defined: a local variable defined in a work flow is available for use in any of the single-use objects in that work flow.
- The value of a local variable can be passed as a parameter into another work flow or data flow called from the work flow.


Variables

Global Variables

- Global variables are global within a job. Once a name is used for a global variable in a job, that name becomes reserved for the job.
- Global variables are exclusive within the context of the job in which they are created.
- Setting parameters is not necessary when you use global variables.


Parameters

- Parameters are expressions passed to a work flow or data flow when it is called.
- Parameters can be defined to pass values into and out of work flows, data flows, and custom functions.


Transforms

The following transforms are available from the Transforms tab of the object library:

- Case
- Date_Generation
- Effective_Date
- Hierarchy_Flattening
- History_Preserving
- Key_Generation
- Map_Operation
- Merge
- Pivot (Columns to Rows)
- Query
- Reverse Pivot (Rows to Columns)
- Row_Generation
- SQL
- Table_Comparison


Query Transform
Retrieves a data set that satisfies conditions that you specify. A query transform is similar to a SQL SELECT statement.
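To make the SELECT analogy concrete, here is an illustrative SQL statement; the Query transform's column mappings, Where tab, and Order By tab correspond to the clauses below (table and column names are hypothetical):

    -- Mappings: cust_id passed through, cust_name upper-cased
    SELECT   cust_id,
             upper(cust_name) AS cust_name
    FROM     customer           -- input schema
    WHERE    country = 'US'     -- Where tab
    ORDER BY cust_id;           -- Order By tab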


Query Transform


Query Transform (editor)

The Query transform editor presents:

- Input Schema
- Output Schema
- Options


Case Transform

- Specifies multiple paths in a single transform (different rows are processed in different ways).
- Simplifies branch logic in data flows by consolidating case or decision-making logic in one transform.
- Paths are defined in an expression table.


Case Transform



SQL Transform

Performs the indicated SQL query operation. Use this transform to perform standard SQL operations that cannot be performed using other built-in transforms.
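An illustrative example of the SQL text you might enter in the transform; the tables and columns are hypothetical, and the transform's output schema must match the columns the query returns:

    -- Feed the data flow one row per order with its customer name
    SELECT o.order_id,
           o.order_date,
           c.cust_name
    FROM   orders o
           INNER JOIN customer c ON c.cust_id = o.cust_id
    WHERE  o.order_date >= '2010-01-01';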


SQL Transform



Merge Transform
Combines incoming data sets, producing a single output data set with the same schema as the input data sets.
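In SQL terms, the Merge transform behaves like a UNION ALL: the inputs share one schema and duplicate rows are not removed. An illustrative equivalent with hypothetical tables:

    SELECT cust_id, cust_name FROM customer_east
    UNION ALL
    SELECT cust_id, cust_name FROM customer_west;

If duplicates must be eliminated, a Query transform with the Distinct rows option can follow the Merge.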


Merge Transform



Row_Gen Transform
Produces a data set with a single column. The column values start from zero and increment by one to a specified number of rows.


Row_Gen Transform


Key_Generation Transform
Generates new keys for new rows in a data set. The Key_Generation transform looks up the maximum existing key value from a table and uses it as the starting value to generate new keys.
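DI also exposes this behavior as a function usable in a Query column mapping. A hedged sketch, assuming the key_generation(table, key_column, increment) signature; the datastore, table, and column are hypothetical:

    # Start from max(CUST_KEY) in the target dimension and add 1 per new row
    key_generation('DS_Target.DBO.CUSTOMER_DIM', 'CUST_KEY', 1)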


Key_Generation Transform



Date_Generation Transform
Produces a series of dates incremented as you specify. Use this transform to produce the key values for a time dimension target. From this generated sequence you can populate other fields in the time dimension (such as day_of_week) using functions in a query.
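For example, Query mappings downstream of the transform can derive time-dimension attributes from the generated date column (assumed here to be named DI_GENERATED_DATE) using built-in date functions:

    month(Date_Generation.DI_GENERATED_DATE)        # month number, 1-12
    quarter(Date_Generation.DI_GENERATED_DATE)      # quarter number, 1-4
    day_in_week(Date_Generation.DI_GENERATED_DATE)  # day number within the week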


Date_Generation Transform



Table_Comparison Transform
Compares two data sets and produces the difference between them as a data set with rows flagged as INSERT or UPDATE. The Table_Comparison transform allows you to detect and forward changes that have occurred since the last time a target was updated.
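A rough SQL analogy, illustrative only and over hypothetical tables: rows missing from the comparison table would be flagged INSERT, and rows present but changed would be flagged UPDATE:

    SELECT s.*,
           CASE WHEN t.cust_id IS NULL THEN 'INSERT' ELSE 'UPDATE' END AS op
    FROM   staging_customer s
           LEFT JOIN customer_dim t ON t.cust_id = s.cust_id
    WHERE  t.cust_id IS NULL           -- new row
       OR  t.cust_name <> s.cust_name; -- changed row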


Table_Comparison Transform


Map_Operation Transform

Allows conversions between data manipulation operations. The Map_Operation transform allows you to change operation codes on data sets to produce the desired output. For example, if a row in the input data set was updated by some previous operation in the data flow, you can use this transform to map the UPDATE operation to an INSERT; the result is that UPDATE rows become INSERT rows, preserving the existing row in the target.

Map_Operation Transform


Table_Comparison & Map_Operation Transforms


History_Preserving Transform

The History_Preserving transform allows you to produce a new row in your target rather than updating an existing row. You indicate the columns in which the transform identifies changes to be preserved. If the value of one of those columns changes, the transform creates a new row for each row flagged as UPDATE in the input data set.


Pivot Transform (Columns to Rows)

- Creates a new row for each value in a column that you identify as a pivot column.
- The Pivot transform allows you to change how the relationship between rows is displayed.
- For each value in each pivot column, DI produces a row in the output data set.
- You can create pivot sets to specify more than one pivot column.


Pivot Transform (Columns to Rows)

Input (one row per region):

    Region  Sales-2001  Sales-2002  Sales-2003
    North   200         300         400
    East    300         600         700
    West    350         800         770
    South   800         200         3750

Output (North rows shown):

    Region  Year  Sales
    North   2001  200
    North   2002  300
    North   2003  400
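The same columns-to-rows reshaping, expressed as illustrative SQL over a hypothetical regional_sales table:

    SELECT region, 2001 AS year, sales_2001 AS sales FROM regional_sales
    UNION ALL
    SELECT region, 2002, sales_2002 FROM regional_sales
    UNION ALL
    SELECT region, 2003, sales_2003 FROM regional_sales;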



Reverse Pivot Transform (Rows to Columns)

- Creates one row of data from several existing rows.
- The Reverse Pivot transform allows you to combine data from several rows into one row by creating new columns.
- For each unique value in a pivot axis column and each selected pivot column, DI produces a column in the output data set.


Reverse Pivot Transform (Rows to Columns)

Input:

    Region  Year  Sales
    North   2001  200
    North   2002  300
    North   2003  400

Output:

    Region  2001  2002  2003
    North   200   300   400


Functions

Functions operate on single values, such as values in specific columns in a data set. You can use functions in the following operations:

- Queries
- Scripts
- Conditionals

You can use:

- Built-in functions (DI functions)
- Custom functions (user-defined functions; see the sketch below)
- Database and application functions (functions specific to a DBMS)
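Custom functions are written in the DI script language from the object library. A minimal sketch, assuming custom functions return their result with a Return statement; the function name and parameters are hypothetical:

    # Body of a custom function CF_FullName($P_FirstName, $P_LastName):
    # returns 'First Last' as a single varchar value
    Return rtrim($P_FirstName) || ' ' || rtrim($P_LastName);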


Procedures
DI supports the use of stored procedures for Oracle, Microsoft SQL Server, Sybase, and DB2 databases. You can call stored procedures from the jobs you create and run in DI.
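One way to invoke a procedure from a script is the sql() function, which sends a statement over the named datastore connection; the datastore and procedure names are hypothetical:

    # Run a SQL Server stored procedure against the target database
    sql('DS_Target', 'exec dbo.usp_refresh_aggregates');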


Debugging

- Execute a job in Data Scan mode
- View and analyze the output data in the Data Scan window
- Compare and analyze different data samples


Debugging Data Scan Mode


Debugging: Analyzing the Output

The Data Scan window presents:

- Object list
- Scan date and time
- Schema area
- Data area


Migration and Repositories


The development process you use to create your ETL application involves three distinct phases: design, test, and production. Each phase may require a different computer in a different environment, and different security settings for each. To control the environment differences, each phase may require a different repository.


Migration and Repositories (diagram)

Design Repository -> export -> Test Repository -> export -> Production Repository



Migration and Repositories

When moving objects from one phase to another, export jobs from your source repository to either a file or a database, then import them into your target repository.


Exporting Objects to a Database

You can export objects from the current repository to another repository. However, the other repository must be the same version as the current one. The export process allows you to change environment-specific information defined in datastores and file formats to match the new environment.

Exporting/Importing Objects to/from a File

You can also export objects to a file. If you choose a file as the export destination, DI does not provide options to change environment-specific information. Importing objects or an entire repository from a file overwrites existing objects with the same names in the destination repository. You must restart DI after the import process completes.

Parallel Execution

The maximum number of parallel DI engine processes is set in the Job Server options (Tools > Options > Job Server > Environment). This allows transforms to run in parallel.


Parallel Work Flows / Data Flows

