
Ab Initio

Welcome to
Ab Initio

10/15/08
Dell Confidential

Information Technology

Points on Board
What is Ab Initio
What is the Co>Operating System
What is the EME
How Ab Initio provides the ETL process
What are .dml and .xfr files
Parallelism
Transformations
What is a Sandbox/Project
What is a watcher
What are phases and check points
What is a Subgraph
Questions


What is Ab Initio

Ab Initio is an ETL (extract, transform, load) tool.

Thirteen of the Fortune 50 companies use Ab Initio in areas such as
telecommunications, finance, health care, retail, and shipping.
Whether managing big data or supporting business-critical activities, Ab Initio
solutions are constructed and deployed extremely quickly and deliver excellent
performance and scalability.
Ab Initio has proved to be a best-of-breed ETL tool for virtually every kind of
data processing.


Ab Initio Architecture

[Figure: Ab Initio architecture]


Co>Operating System

1. The Co>Operating System is core software that unites a network of
computing resources (CPUs, storage disks, programs, datasets) into a
production-quality data processing system with scalable performance
and reliability.
2. The Co>Operating System is layered on top of the native operating
systems of a collection of computers. It provides a distributed
model for process execution, file management, process
monitoring, checkpointing, and debugging.
3. The Graphical Development Environment (GDE) provides a
graphical user interface into the services of the Co>Operating
System.

EME

EME: Enterprise Meta>Environment (DataStore)

The EME is a repository of metadata for different projects; it
maintains version control information for the metadata of the
individual projects.
The EME also offers a web browser interface to the Co>Operating System,
in which all production runs and their statistics can be viewed.


How Ab Initio Provides the ETL Process

Ab Initio connects to a database through a dbconfig file specific to that
database, which supplies the connection details such as the database version,
user name, and password.
Database components unload the source table records into Ab Initio, which
performs the transformations and then loads the data into the target. The source
data may also be in flat-file format.
Ab Initio provides features for manipulating data as required, for example
user-defined functions and built-in functions for different data types.
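The unload, transform, and load steps described above can be sketched outside Ab Initio. The following is only an analogy in Python using the standard sqlite3 module, not Ab Initio code; the table names source_orders and target_orders, their columns, and the functions run_etl and demo are assumptions made for illustration.

```python
import sqlite3


def run_etl(conn):
    """Unload rows from a source table, transform them, and load a target table."""
    # Extract: read every record from the (hypothetical) source table.
    rows = conn.execute("SELECT id, name, amount FROM source_orders").fetchall()
    # Transform: clean up the name and convert the amount to integer cents.
    transformed = [
        (rid, name.strip().upper(), round(amount * 100))
        for rid, name, amount in rows
    ]
    # Load: write the transformed records into the (hypothetical) target table.
    conn.executemany(
        "INSERT INTO target_orders (id, name, amount_cents) VALUES (?, ?, ?)",
        transformed,
    )
    conn.commit()
    return len(transformed)


def demo():
    """Build an in-memory database, run the ETL step, and return the target rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE source_orders (id INTEGER, name TEXT, amount REAL)")
    conn.execute(
        "CREATE TABLE target_orders (id INTEGER, name TEXT, amount_cents INTEGER)"
    )
    conn.executemany(
        "INSERT INTO source_orders VALUES (?, ?, ?)",
        [(1, " alice ", 12.5), (2, "bob", 3.0)],
    )
    run_etl(conn)
    return conn.execute(
        "SELECT id, name, amount_cents FROM target_orders ORDER BY id"
    ).fetchall()
```

In this sketch the connection arguments play the role that the dbconfig file plays in Ab Initio: they are the only place where database-specific details appear.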


Parallelism
Pipeline Parallelism
A graph with multiple components running simultaneously on the same data uses
pipeline parallelism.
Each component in the pipeline continuously reads from upstream components,
processes data, and writes to downstream components. Since a downstream
component can process records previously written by an upstream component, both
components can operate in parallel.
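As a rough illustration of pipeline parallelism, the sketch below runs each stage in its own thread and connects the stages with bounded queues, so a downstream stage can start on early records while the upstream stage is still producing later ones. This is a Python analogy, not Ab Initio code; the names stage, run_pipeline, and _DONE are hypothetical.

```python
import queue
import threading

_DONE = object()  # sentinel that marks the end of the record stream


def stage(func, inbox, outbox):
    """One component: read a record from upstream, process it, write downstream."""
    while True:
        record = inbox.get()
        if record is _DONE:
            outbox.put(_DONE)  # propagate end-of-stream to the next stage
            return
        outbox.put(func(record))


def run_pipeline(records, funcs):
    """Run every function in funcs as a concurrently executing pipeline stage."""
    # One queue in front of each stage plus one for the final output.
    queues = [queue.Queue(maxsize=4) for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(funcs)
    ]

    def feed():
        for record in records:
            queues[0].put(record)
        queues[0].put(_DONE)

    # The feeder runs in its own thread so bounded queues cannot block the caller.
    threads.append(threading.Thread(target=feed))
    for thread in threads:
        thread.start()

    results = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        results.append(item)
    for thread in threads:
        thread.join()
    return results
```

Because each stage is a single thread reading from a first-in, first-out queue, records leave the pipeline in the order they entered it.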

Component Parallelism
A graph with multiple processes running simultaneously on separate data uses
component parallelism.
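Component parallelism can be pictured as two independent branches of a graph, each working on its own data at the same time. The sketch below is a Python analogy using a thread pool; the functions sort_customers, total_orders, and run_branches are hypothetical stand-ins for separate components.

```python
from concurrent.futures import ThreadPoolExecutor


def sort_customers(customers):
    """First branch: sort one dataset."""
    return sorted(customers)


def total_orders(orders):
    """Second branch: aggregate a different, unrelated dataset."""
    return sum(orders)


def run_branches(customers, orders):
    """Run both branches at once; neither depends on the other's data."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        sorted_future = pool.submit(sort_customers, customers)
        total_future = pool.submit(total_orders, orders)
        return sorted_future.result(), total_future.result()
```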

Data Parallelism
A graph that deals with data divided into segments and operates on each segment
simultaneously uses data parallelism. Nearly all commercial data processing tasks
can use data parallelism. To support this form of parallelism, Ab Initio software
provides Partition components to segment data, and Departition components to
merge segmented data back together.
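The partition, process, and departition pattern can be sketched as follows. This is a Python analogy rather than Ab Initio code: the function names are hypothetical, the partitioning rule (integer key modulo the number of partitions) is an assumption chosen for simplicity, and threads stand in for the separate processes a real deployment would use.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_by_key(records, key, n):
    """Split records into n segments; records with the same key share a segment."""
    parts = [[] for _ in range(n)]
    for record in records:
        parts[key(record) % n].append(record)
    return parts


def process_partition(part):
    """The work applied to one segment: double the value of every record."""
    return [(k, v * 2) for k, v in part]


def departition(parts):
    """Merge the processed segments back into a single list by concatenation."""
    return [record for part in parts for record in part]


def run_data_parallel(records, n=3):
    """Partition the data, process all segments simultaneously, then merge."""
    parts = partition_by_key(records, key=lambda r: r[0], n=n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        processed = list(pool.map(process_partition, parts))
    return departition(processed)
```

Concatenation does not restore the original record order; where order matters, the merge step would have to sort or interleave the segments.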


DML & XFR

A record format is stored in a file with the extension .dml.

A transform function for such records is stored in a file with the extension .xfr.
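The split between a record format and a transform function can be illustrated by analogy. The sketch below is Python, not DML or XFR syntax: the dataclasses play the role of record formats, and transform plays the role of a transform function. All field names and the comma delimiter are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class CustomerIn:
    """Input record format: what a .dml file would describe for the input."""
    id: int
    first_name: str
    last_name: str


@dataclass
class CustomerOut:
    """Output record format: what a .dml file would describe for the output."""
    id: int
    full_name: str


def parse_record(line, delimiter=","):
    """Read one delimited text line according to the input record format."""
    rid, first, last = line.rstrip("\n").split(delimiter)
    return CustomerIn(int(rid), first, last)


def transform(rec):
    """Map an input record to an output record, as a transform function would."""
    return CustomerOut(id=rec.id, full_name=f"{rec.first_name} {rec.last_name}")
```

Keeping the format and the transform in separate definitions means either can be reused or changed without touching the other.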


[Figure: DML]

[Figure: XFR]

Transform Components
Ab Initio provides task-specific transform components such as:
Sort
Dedup Sorted
Join
Reformat
Rollup
Filter by Expression
Database components
Dataset components
Replicate
Run Program


Transform Components
Partition components
Departition components
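To give a feel for what several of the listed components do to a stream of records, the sketch below implements simplified Python analogues of filter, sort, dedup, rollup, and join over lists of dictionaries. These functions are hypothetical illustrations of the operations, not the Ab Initio components or their parameters.

```python
from itertools import groupby
from operator import itemgetter


def filter_by_expression(records, predicate):
    """Keep only the records for which the expression is true."""
    return [r for r in records if predicate(r)]


def sort_records(records, key):
    """Order records by a key field (stable, so ties keep their input order)."""
    return sorted(records, key=itemgetter(key))


def dedup_sorted(records, key):
    """Keep the first record of each key group; input must be sorted by key."""
    return [next(group) for _, group in groupby(records, key=itemgetter(key))]


def rollup(records, key, field):
    """Sum a field per key group; in this sketch input must be sorted by key."""
    return [
        {key: k, field: sum(r[field] for r in group)}
        for k, group in groupby(records, key=itemgetter(key))
    ]


def join(left, right, key):
    """Inner join: combine each left record with every matching right record."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]
```

A typical chain would sort first and then feed the sorted records to dedup_sorted or rollup, since both rely on equal keys being adjacent.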


SANDBOX

A sandbox is a collection of graphs and related files that are stored
in a single directory tree, and treated as a group for purposes of
version control, navigation, and migration.
A sandbox can be a file system copy of a DataStore project.


[Figure: Sandbox]

Project

A project is a collection of graphs and related files in a DataStore that
are stored in a single directory tree and treated as a group for
purposes of version control, navigation, and migration.
A project has attributes that describe its behavior, including
parameters, an import list, a directory list, and an extension list.
Projects can be linked to other projects to share common files.


WATCHER
A watcher lets you view the data that has passed through a Flow.
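The idea of a watcher can be sketched as a tap on a flow: records pass through unchanged while a copy is kept for inspection. The following is a Python analogy with a hypothetical function watch, not the GDE feature itself.

```python
def watch(flow, seen):
    """Pass every record through unchanged while recording a copy in seen."""
    for record in flow:
        seen.append(record)  # keep the record so it can be viewed later
        yield record         # hand the same record on to the downstream step
```

For example, wrapping the input of a downstream step as watch(records, seen) leaves that step's output unchanged, and afterwards seen holds exactly the records that crossed the flow.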


PHASES & CHECKPOINTS


A phase is a stage of a graph that runs to completion
before the start of the next stage. By dividing a graph into phases,
you can save resources, avoid deadlock, and safeguard against
failures.
A checkpoint is a phase that acts as an intermediate stopping point
in a graph to safeguard against failures. By assigning phases with
checkpoints to a graph, you can recover completed stages of the
graph if failure occurs.
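The recovery behavior described above can be sketched in a few lines. This is a simplified Python analogy, not how the Co>Operating System implements checkpoints: each phase is a function that runs to completion, its output is saved to a JSON file, and a rerun resumes after the last completed phase. The function run_graph and the checkpoint file layout are assumptions for illustration.

```python
import json
import os


def run_graph(phases, data, checkpoint_path):
    """Run phases in order, checkpointing after each one; resume after a failure."""
    start = 0
    if os.path.exists(checkpoint_path):
        # A previous run failed part-way: recover the completed stages.
        with open(checkpoint_path) as f:
            saved = json.load(f)
        start, data = saved["next_phase"], saved["data"]
    for i in range(start, len(phases)):
        # A phase runs to completion before the next phase starts.
        data = phases[i](data)
        # Checkpoint: record the phase output and where to resume.
        with open(checkpoint_path, "w") as f:
            json.dump({"next_phase": i + 1, "data": data}, f)
    # The whole graph succeeded, so the checkpoint is no longer needed.
    os.remove(checkpoint_path)
    return data
```

If a phase raises an exception, the checkpoint written after the previous phase stays on disk, so calling run_graph again with the same path skips the phases that already completed.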


[Figure: Phases and checkpoints]

SUBGRAPHS

A subgraph is a graph fragment. Just like graphs, subgraphs can
contain components and flows. Subgraphs are useful for grouping a
graph into subtasks and reusing them.
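Grouping and reuse can be illustrated by analogy with function composition. In the Python sketch below, compose bundles several component-like functions into one reusable unit, much as a subgraph bundles components and flows. The names compose, drop_blank, normalize, and clean_names are hypothetical.

```python
def compose(*components):
    """Bundle components into one reusable unit that runs them in order."""
    def subgraph(records):
        for component in components:
            records = component(records)
        return records
    return subgraph


def drop_blank(records):
    """Component: discard records that are empty or only whitespace."""
    return [r for r in records if r.strip()]


def normalize(records):
    """Component: trim whitespace and lower-case each record."""
    return [r.strip().lower() for r in records]


# The bundled subtask can now be reused wherever names need cleaning.
clean_names = compose(drop_blank, normalize)
```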


[Figure: Subgraphs]

What Are the Benefits of a Data Warehouse?



Q&A
