
BUSINESS INTELLIGENCE/DATA

INTEGRATION/ETL/INTEGRATION
AN INTRODUCTION
Presented by: Gautam Sinha

What is Business Intelligence?


Business Intelligence (BI) encompasses the processes,
tools, and technologies required to transform
enterprise data into information, and information
into knowledge that can be used to enhance decision-making
and to create actionable plans that drive
effective business activity.
BI can be used to acquire:
Tactical insight to optimize business processes by
identifying trends, anomalies, and behaviors that
require management action.
Strategic insight to align multiple business
processes with key business objectives through
integrated performance management and analysis.

What is Business Intelligence?


Business Intelligence (BI) is about getting the
right information, to the right decision makers, at
the right time.
BI is an enterprise-wide platform that supports
reporting, analysis and decision making.
BI leads to:
fact-based decision making
single version of the truth
BI includes reporting and analytics.

BI is not a single computer system, but a framework for leveraging data for tactical and
strategic use.

How BI Works Together


[Diagram: data input from disparate data sources (OLTP systems AIMSPC, RECBASS, and ATRRS; other possible data sources RATSS and RFMSS) passes through Extract, Transform, Load into a single reporting repository (TIMS DW), which serves real-time dashboards, static and ad-hoc reporting, and graphical data analysis.]

Components of BI
Data Integration (Informatica, DataStage)

Data Reporting (Cognos, Business Objects)

Data Integration
Data integration involves combining data residing in
different sources and providing users with a unified
view of these data. This process becomes significant in
a variety of situations, both commercial (when two
similar companies need to merge their databases) and
scientific (combining research results from different
bioinformatics repositories, for example).
Data integration appears with increasing frequency as
the volume of data and the need to share existing data
explode. It has become the focus of extensive
theoretical work, and numerous open problems remain
unsolved. In management circles, people frequently
refer to data integration as "Enterprise Information
Integration" (EII).

How to enable Data Integration

USING ETL PROCESS

ETL (Extract, Transform, Load)


ETL stands for extract, transform and
load, the processes that enable
companies to move data from multiple
sources, reformat and cleanse it, and
load it into another database, a data
mart or a data warehouse for analysis,
or into another operational system to
support a business process.
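
As a rough illustration of the three steps, the following Python sketch extracts rows from a small in-memory CSV, cleanses and reformats them, and loads them into a SQLite table for analysis. The sample data, the `sales` table, and all function names are assumptions made for this example; they do not come from any particular ETL tool.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real source would be a file, database, or application.
SOURCE_CSV = """customer_id,name,country,amount
1, alice ,us,10.50
2,Bob,US,4.25
3,carol,de,
"""


def extract(text):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: cleanse and reformat rows; rows without an amount are dropped."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # assumed cleansing rule: skip rows with a missing amount
        cleaned.append((
            int(row["customer_id"]),
            row["name"].strip().title(),
            row["country"].strip().upper(),
            float(row["amount"]),
        ))
    return cleaned


def load(conn, rows):
    """Load: write the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer_id INTEGER, name TEXT, country TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_etl(text, conn):
    """Run the three steps, then query the target for a simple analysis."""
    load(conn, transform(extract(text)))
    return conn.execute(
        "SELECT country, SUM(amount) FROM sales GROUP BY country"
    ).fetchall()


conn = sqlite3.connect(":memory:")
totals = run_etl(SOURCE_CSV, conn)
```

Keeping extract, transform, and load as separate functions mirrors the definition above and makes each step testable on its own.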

ETL (Extract, Transform, Load)

"A properly designed ETL system extracts data from the source systems, enforces data quality
and consistency standards, conforms data so that separate sources can be used together, and
finally delivers data in a presentation-ready format so that application developers can build
applications and end users can make decisions."
"ETL makes or breaks the data warehouse."
Ralph Kimball
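
One way to picture "conforming" data is the Python sketch below: two sources describe the same customer with different field names and codes, and each is mapped to a shared layout before being combined. The field names, the gender code table, and the quality rule are all assumptions chosen for illustration.

```python
# Assumed standard codes that both sources are conformed to.
GENDER_CODES = {"f": "F", "female": "F", "m": "M", "male": "M"}


def conform_crm(row):
    """Map a row from the hypothetical CRM source to the shared layout."""
    return {
        "customer_id": row["cust_id"].strip().upper(),
        "gender": GENDER_CODES[row["gender"].strip().lower()],
    }


def conform_billing(row):
    """Map a row from the hypothetical billing source to the shared layout."""
    return {
        "customer_id": row["customer"].strip().upper(),
        "gender": GENDER_CODES[row["sex"].strip().lower()],
        "total_billed": float(row["total"]),
    }


def passes_quality(row):
    """Assumed quality rule: every row must carry a non-empty customer_id."""
    return bool(row["customer_id"])


def integrate(crm_rows, billing_rows):
    """Conform both sources, enforce the quality rule, and merge by customer."""
    conformed = [conform_crm(r) for r in crm_rows]
    conformed += [conform_billing(r) for r in billing_rows]
    merged = {}
    for row in conformed:
        if passes_quality(row):
            merged.setdefault(row["customer_id"], {}).update(row)
    return merged


crm = [{"cust_id": "C001", "gender": "F"}]
billing = [
    {"customer": "c001", "sex": "female", "total": "120.00"},
    {"customer": "", "sex": "male", "total": "5.00"},  # fails the quality rule
]
unified = integrate(crm, billing)
```

Only after both sources share the same keys and codes can their rows be used together, which is the point the quotation makes.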

ETL Process Flow

ETL Glossary

Source System
A database, application, file, or other storage facility from which the
data in a data warehouse is derived.
Mapping
The definition of the relationship and data flow between source and
target objects.
Metadata
Data that describes data and other structures, such as objects,
business rules, and processes. For example, the schema design of a
data warehouse is typically stored in a repository as metadata, which
is used to generate scripts used to build and populate the data
warehouse. A repository contains metadata.
Staging Area
A place where data is processed before entering the warehouse.
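
To make these terms concrete, here is a small Python sketch in which a mapping is itself stored as data (metadata) describing how source columns flow to target columns. The column names and the list-based "staging area" and "warehouse" are hypothetical stand-ins, not a real tool's structures.

```python
# Metadata: a hypothetical mapping of target column -> (source column, conversion).
MAPPING = {
    "customer_name": ("CUST_NM", str.strip),
    "order_total": ("ORD_AMT", float),
    "country_code": ("CTRY", str.upper),
}


def apply_mapping(source_row, mapping):
    """Build one target row by following the mapping definition."""
    return {
        target_col: convert(source_row[source_col])
        for target_col, (source_col, convert) in mapping.items()
    }


# Source system: where the data is derived from (here, a list of raw rows).
source_system = [{"CUST_NM": "  Acme Ltd ", "ORD_AMT": "99.90", "CTRY": "in"}]

# Staging area: rows are processed here before entering the warehouse.
staging_area = [apply_mapping(row, MAPPING) for row in source_system]

# Warehouse: the processed rows are moved in from staging.
warehouse = []
warehouse.extend(staging_area)
```

Because the mapping is plain data, it could be stored in a repository and reused, which is roughly the role metadata plays in the definitions above.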

ETL Glossary

Cleansing
The process of resolving inconsistencies and fixing the anomalies in
source data, typically as part of the ETL process.
Transformation
The process of manipulating data. Any manipulation beyond copying
is a transformation. Examples include cleansing, aggregating, and
integrating data from multiple sources.
Transportation
The process of moving copied or transformed data from a source to a
data warehouse.
Target System
A database, application, file, or other storage facility to which the
"transformed source data" is loaded in a data warehouse.

ETL Tools

Informatica 8.6: What It Is and How It Works


What is Informatica 8.6?
Informatica is an ETL tool that delivers an open, scalable data integration solution
addressing the complete life cycle for data warehouse and analytic application development.
Informatica provides an environment that can extract data from multiple sources, transform
the data according to the business logic that is built in the Informatica Client
application, and load the transformed data into files or relational targets.

Informatica 8.6 PowerCenter

PowerCenter provides an environment that allows you to
load data into a centralized location, such as a data
warehouse or operational data store (ODS). You can extract
data from multiple sources, transform the data according
to business logic you build in the client application, and
load the transformed data into file and relational
targets.

Informatica Architecture 8.6

Informatica Architecture 8.6 - Data Flow

Informatica Architecture 8.6 - Components

PowerCenter - Components

Informatica PowerCenter Domain

PowerCenter - Domain

PowerCenter Admin Console

PowerCenter Application Services

Informatica-Power Center Repository Service

Informatica-Power Center Integration Service

PowerCenter Client Components


The Informatica Client is used to manage users, define sources and targets, build
mappings and mapplets with the transformation logic, and create sessions to run the
mapping logic.
The Informatica Client has the following main applications:
Repository Manager
Designer
Workflow Manager
Workflow Monitor

PowerCenter Repository

PowerCenter Client Components


Repository Manager: This is used to create and administer the metadata repository.
The repository users and groups are created through the Repository Manager.
Assigning privileges and permissions, managing folders in the repository, and managing
locks on the mappings are also done through the Repository Manager.

Informatica/Power Center Client Components

Designer: The Designer has five tools that are used to analyze
sources, design target schemas and build the Source to Target
mappings. These are:
1. Source Analyzer: This is used to either import or create the source definitions.
2. Target Designer: This is used to import or create target definitions.
3. Mapping Designer: This is used to create mappings that will be run by the
Informatica Server to extract, transform and load data.
4. Transformation Developer: This is used to develop reusable transformations
that can be used in mappings.
5. Mapplet Designer: This is used to create sets of transformations referred to as
Mapplets which can be used across mappings.

Informatica/Power Center Client Components

What is Workflow Manager?

It is a tool where you define a set of instructions, called a
workflow, to execute mappings you build in the Designer.

What are the Workflow Manager tools?

It consists of three tools to help you develop a workflow:
Task Developer. Use the Task Developer to create tasks
you want to execute in the workflow.
Workflow Designer. Use the Workflow Designer to create a
workflow by connecting tasks with links. You can also
create tasks in the Workflow Designer as you develop the
workflow.
Worklet Designer. Use the Worklet Designer to create a
worklet.
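
The idea of a workflow as tasks connected by links can be sketched in plain Python. This is only a conceptual analogy: Workflow Manager is a graphical tool, and the `run_workflow` function, task names, and link format below are assumptions made for the example. It uses the standard-library `graphlib` module (Python 3.9 or later) to run each task after the tasks linked ahead of it.

```python
from graphlib import TopologicalSorter


def run_workflow(tasks, links):
    """Run tasks so that each one executes after the tasks linked ahead of it.

    tasks: dict of task name -> callable taking no arguments
    links: list of (upstream_name, downstream_name) pairs
    """
    sorter = TopologicalSorter({name: set() for name in tasks})
    for upstream, downstream in links:
        sorter.add(downstream, upstream)  # downstream depends on upstream
    # static_order() yields every task after all of its predecessors.
    return [tasks[name]() for name in sorter.static_order()]


# Hypothetical tasks; in a real workflow a session task would run a mapping.
tasks = {
    "start": lambda: "started",
    "load_customers": lambda: "loaded customers",
    "notify": lambda: "sent notification",
}
links = [("start", "load_customers"), ("load_customers", "notify")]

log = run_workflow(tasks, links)
```

A worklet could be modeled in the same spirit as a task whose callable runs a nested workflow, though that is not shown here.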

Load Design Process


1. Create Source definition(s)
2. Create Target definition(s)
3. Create a Mapping
4. Create a Session Task
5. Create a Workflow from Task components
6. Run the Workflow and verify the results
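
The six steps above can be mirrored in a short Python sketch. In PowerCenter these steps are carried out in the client tools, so the classes below (`Definition`, `Mapping`, `SessionTask`, `Workflow`) are hypothetical stand-ins that only show how the pieces relate to one another.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Definition:
    """A source or target definition: a name, its columns, and its rows."""
    name: str
    columns: list
    rows: list = field(default_factory=list)


@dataclass
class Mapping:
    """Relates a source to a target through row-level logic."""
    source: Definition
    target: Definition
    logic: Callable[[dict], dict]


@dataclass
class SessionTask:
    """Runs one mapping and reports how many rows the target now holds."""
    mapping: Mapping

    def run(self):
        m = self.mapping
        for row in m.source.rows:
            out = m.logic(row)
            m.target.rows.append({col: out[col] for col in m.target.columns})
        return len(m.target.rows)


@dataclass
class Workflow:
    """Runs its tasks in the order given."""
    tasks: list

    def run(self):
        return [task.run() for task in self.tasks]


# 1. Create the source definition (with one sample row).
source = Definition("EMPLOYEES_SRC", ["first", "last", "salary"],
                    rows=[{"first": "Ada", "last": "Lovelace", "salary": "5000"}])
# 2. Create the target definition.
target = Definition("EMPLOYEES_TGT", ["full_name", "salary"])
# 3. Create a mapping from source to target.
mapping = Mapping(source, target, lambda r: {
    "full_name": f"{r['first']} {r['last']}",
    "salary": int(r["salary"]),
})
# 4. Create a session task for the mapping.
session = SessionTask(mapping)
# 5. Create a workflow from the task.
workflow = Workflow([session])
# 6. Run the workflow; the results can then be verified.
row_counts = workflow.run()
```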

Informatica Transformations

In Informatica, transformations help to transform the source data
according to the requirements of the target system and ensure the
quality of the data being loaded into the target.

The following is a list of transformations available in Informatica:


Aggregator Transformation
Expression Transformation
Filter Transformation
Joiner Transformation
Lookup Transformation
Normalizer Transformation
Rank Transformation
Router Transformation
Sequence Generator Transformation
Sorter Transformation
Update Strategy Transformation

Informatica Transformations

Aggregator Transformation
Aggregator transformation is an Active and Connected transformation. This
transformation is useful for performing calculations such as averages and sums.

Expression Transformation
Expression transformation is a Passive and Connected transformation. This can
be used to calculate values in a single row before writing to the target.

Filter Transformation
Filter transformation is an Active and Connected transformation. This can be
used to filter out rows in a mapping that do not meet the condition.

Joiner Transformation
Joiner transformation is an Active and Connected transformation. This can be
used to join two sources coming from two different locations or from the same
location.

Rank Transformation
Rank transformation is an Active and Connected transformation. It is used
to select the top or bottom rank of data.
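
The behavior of these five transformations can be approximated with plain Python functions over lists of rows. This is a conceptual sketch, not Informatica code: the function names and sample data are assumptions. It also hints at the Active/Passive distinction, where an active transformation can change the number of rows passing through and a passive one cannot.

```python
from collections import defaultdict

# Hypothetical sample rows.
orders = [
    {"order_id": 1, "cust": "A", "qty": 2, "price": 5.0},
    {"order_id": 2, "cust": "B", "qty": 1, "price": 20.0},
    {"order_id": 3, "cust": "A", "qty": 4, "price": 2.5},
]
customers = [{"cust": "A", "city": "Pune"}, {"cust": "B", "city": "Delhi"}]


def expression(rows):
    """Passive: one output row per input row, with a value computed per row."""
    return [{**r, "total": r["qty"] * r["price"]} for r in rows]


def filter_rows(rows, condition):
    """Active: keeps only the rows that meet the condition."""
    return [r for r in rows if condition(r)]


def aggregator(rows, key, value):
    """Active: collapses rows into one sum per group."""
    sums = defaultdict(float)
    for r in rows:
        sums[r[key]] += r[value]
    return dict(sums)


def joiner(left, right, key):
    """Active: combines rows from two sources that share a key value."""
    index = {r[key]: r for r in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]


def rank(rows, value, top):
    """Active: selects the top rows by the given value."""
    return sorted(rows, key=lambda r: r[value], reverse=True)[:top]


with_totals = expression(orders)
large_orders = filter_rows(with_totals, lambda r: r["total"] >= 15)
totals_by_customer = aggregator(with_totals, "cust", "total")
orders_with_city = joiner(with_totals, customers, "cust")
best_order = rank(with_totals, "total", 1)
```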

Any Suggestions?
