

OVERCOMING SCATTERED DATA MART PROBLEMS DURING DATA ACQUISITION THROUGH THE DECISION SUPPORT SYSTEM

T. Gopinath, Damodar Naidu M., Lenin Y., S. Rakesh, Sandeep M.
tgopisv@gmail.com, damodarmaddina@gmail.com, lenin_19912009@yahoo.com, somarakesh.26@gmail.com, sandeep.mylari@gmail.com

Abstract
A data mart is a simple form of a data warehouse that is focused on a single subject, such as sales, finance, or marketing. Given their single-subject focus, data marts generally draw data from only a few sources. The sources can be internal operational systems, a central data warehouse, or external data. A data warehouse, unlike a data mart, deals with multiple subject areas and is typically implemented and controlled by a central organizational unit; it is therefore called a central or enterprise data warehouse. Nothing in these basic definitions limits the size of a data mart or the complexity of its decision-support data. In practice, however, data marts are typically smaller and less complex than data warehouses; hence, they are typically easier to build and maintain. Extract, Transform and Load (ETL) is the fundamental process used to populate data marts with subject-specific information from the data warehouse. Data marts face problems mainly with platform support, ODBC connectivity, data integration, extraction, transformation, and metadata management. These problems can be overcome by an ETL tool that offers native support for the major database platforms. Most ETL tools directly support bulk copy programs (BCP) and various ODBC data formats. ETL tools also support varied application integration; moving data from various source environments into designated targets is a key feature. Business rules are applied as data is extracted, conditional and mathematical transformations are performed, and metadata management, a key problem, is addressed by the better tools to varying degrees.

Keywords
Data mart, Data warehouse, ETL, Database, Data repository, Data integration, Extract, Transform, Load, OLTP, EAI, Informatica, Metadata, Platform support, Metadata management.

1.1 Data Warehouse and Data Mart

The term data warehousing generally refers to the combination of many different databases across an entire enterprise. A data warehouse is the data repository of an enterprise, generally used for research and decision support. The data warehouse consists of integrated data, i.e., preliminary "cleaning" of the data is necessary to ensure rationalization and standardization. It is a database, or collection of databases, designed to help managers make strategic decisions about their business. The characteristics of a data warehouse are that it is subject-oriented, integrated, non-volatile, etc. Development of a data warehouse includes development of systems to extract data from operational systems, plus installation of a warehouse database system that gives managers flexible access to the data. The term data mart refers to a sub-entity of the data warehouse containing the data of the data warehouse for a particular sector of the company (department, division, service, product line, etc.). The data mart is a subset of the data warehouse that is usually oriented to a specific business line or team. Whereas a data warehouse combines databases across an entire enterprise, data marts are usually smaller and focus on a particular subject or department. Some data marts, called dependent data marts, are subsets of larger data warehouses.


Figure 1.1: Data Warehouse and Data Mart

1.2 Data Mart


A data warehouse focuses on enterprise-wide data across many or all subject areas, whereas a data mart is restricted to a single business process or business group; the union of all data marts equals the data warehouse. Nothing in these basic definitions limits the size of a data mart or the complexity of the decision-support data that it contains. Nevertheless, data marts are typically smaller and less complex than data warehouses; hence, they are typically easier to build and maintain. The following table summarizes the basic differences between a data warehouse and a data mart.

Data Warehouse                           Data Mart
Enterprise-wide, many subject areas      Single business process or group
Draws on many internal/external sources  Draws on only a few sources
Larger and more complex                  Smaller and less complex
Harder to build and maintain             Easier to build and maintain

1.2.1 Dependent, Independent, and Hybrid Data Marts

1.2.1.1 Dependent Marts


A dependent data mart allows you to unite your
organization's data in one data warehouse. This gives
you the usual advantages of centralization.
Figure 1.2.1.1: Dependent Marts

1.2.1.2 Independent Marts

An independent data mart is created without the use of a central data warehouse. This can be desirable for smaller groups within an organization.

Figure 1.2.1.2: Independent Marts

1.2.1.3 Hybrid Data Marts

A hybrid data mart allows you to combine input from sources other than a data warehouse. This can be useful in many situations, especially when you need ad-hoc integration, such as after a new group or product is added to the organization.

Figure 1.2.1.3: Hybrid Data Marts
1. A hybrid data mart transforms data to combine input from sources other than a data warehouse.
2. The data is extracted from the hybrid data mart based on the required conditions.
3. After extraction, the data is loaded into departmental data marts, as sketched below.
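As a rough sketch of these three steps in plain SQL (all table and column names below, such as dw_sales and stg_external_sales, are hypothetical, not from the paper), a departmental mart can be populated by combining warehouse data with input from a non-warehouse source, filtered by the required conditions:

-- Hypothetical hybrid load: warehouse data plus an external staging source.
CREATE TABLE mart_dept_sales AS
SELECT w.product, w.quantity, w.price
FROM dw_sales w                    -- input from the data warehouse
WHERE w.region = 'NORTH'           -- the required extraction condition
UNION ALL
SELECT s.product, s.quantity, s.price
FROM stg_external_sales s;         -- input from a source other than the warehouse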

2.1 Introduction to ETL




An ETL tool is a type of software used to populate databases or data warehouses from heterogeneous data sources. ETL stands for:

Extract: extract data from the data sources.
Transform: transform the data in order to correct errors, perform data cleansing, change the data structure, and make it compliant with defined standards.
Load: load the transformed data into a target DBMS, service, or file format.

An ETL tool should manage the insertion of new data and the updating of existing data. It should also be able to perform transformations from one OLTP system to another OLTP system, and from an OLTP system to an analytical data warehouse.
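As a minimal sketch of this insert-or-update behavior in plain SQL (the staging table stg_sales and target table dw_sales are hypothetical names used only for illustration):

-- Upsert rows from a staging area into a warehouse table.
MERGE INTO dw_sales t
USING stg_sales s
ON (t.product = s.product)
WHEN MATCHED THEN
    UPDATE SET t.quantity = s.quantity, t.price = s.price   -- update existing data
WHEN NOT MATCHED THEN
    INSERT (product, quantity, price)
    VALUES (s.product, s.quantity, s.price);                -- insert new data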

2.2 Informatica Process

PowerCenter improves the flexibility of your ETL process with the ability to extract more enterprise data types than any other technology on the market. The Informatica platform delivers successful ETL initiatives with access to virtually any enterprise data type, including:
- Structured, unstructured, and semi-structured data.
- Relational, mainframe, file, and standards-based data.
- Message queue data.
It also automates most ETL processes, for fewer errors and greater productivity. PowerCenter makes ETL jobs easier with cross-functional tools, reusable components, and an enterprise-wide platform that automates many ETL processes. For data warehousing and ETL developers, that means fewer ETL errors and emergency fixes, less risk of rework, faster development time, and greater productivity.

2.3 Running Procedure

Step 1: Install Informatica.
Step 2: Create the services: the repository service and the integration service.
Step 3: After creating the services, go to the client tools and perform the designing, mapping, transformations, and workflow analysis.

2.4 ETL Process Roles & Responsibilities

In order to complete this task of integrating the raw data received from NSE & BSE, KLXY Limited allots responsibilities to data modelers, DBAs, and ETL developers. Many IT professionals may be involved during the entire ETL process, but we highlight the roles of these three only, for easy understanding and better clarity.
- Data modelers analyze the data from the two sources (Record Layout 1 & Record Layout 2), design the data models, and then generate scripts to create the necessary tables and the corresponding records.
- DBAs create the databases and tables based on the scripts generated by the data modelers.
- ETL developers map the extracted data from the source systems and load it into the target systems after applying the required transformations.
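For concreteness, the scripts generated by the data modelers would be ordinary DDL. A minimal hypothetical sketch for one record layout (the table and column names below are illustrative assumptions, not taken from the paper):

-- Hypothetical table for Record Layout 1 (NSE trade data).
CREATE TABLE nse_trades (
    trade_id   NUMBER,
    symbol     VARCHAR2(20),
    quantity   NUMBER,
    price      NUMBER,
    trade_date DATE
);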

2.5 Leverage Your Manufacturing Data to Improve Operations and Meet Customer Demand Faster

Problem in manufacturing: The manufacturing industry continually undergoes changes in customer demand, regulatory requirements, and supplier management. These shifts, along with the need to drive innovation faster, are pressuring manufacturers to maximize the potential of their manufacturing data.

IJCTA | March-April 2014


Available online@www.ijcta.com

413

ISSN:2229-6093

T Gopinath et al, Int.J.Computer Technology & Applications,Vol 5 (2),411-419

Explanation:
Despite the need to do more with less, manufacturers
are expected to be faster, smarter, and more flexible.
When they meet customer demand and boost
operational efficiencies, manufacturers are rewarded
with higher customer satisfaction, lower costs, and
higher profits. But for some, issues with their
manufacturing data and a lack of data governance
impede their best business and technology strategies.


Informatica solutions for manufacturing enhance your brand and boost innovation by effectively managing manufacturing data on customers, products, inventory, raw materials, suppliers, bills of materials (BOM), pricing, and sales. They help you to:
- Maximize account revenue with relevant cross-sell and up-sell offers through the right channels, and deliver a higher level of customer service.
- Streamline supply chain operations, maximize product availability, and accelerate fulfillment by collaborating better with suppliers and customers and employing lean and agile methods.
- Optimize channel operations and eliminate coverage gaps and conflicts.
- Boost performance and increase efficiency by archiving inactive manufacturing data from applications.
- Deliver manufacturing data from new and existing data sources, in the cloud or on premises, to any manufacturing analytics or operational application.
SQL Transformation

The SQL transformation is a connected transformation used to process SQL queries in the midstream of a pipeline. We can insert, update, delete, and retrieve rows from the database at run time using the SQL transformation. The SQL transformation processes external SQL scripts or SQL queries created in the SQL editor. You can also pass the database connection information to the SQL transformation as input data at run time. The following SQL statements can be used in the SQL transformation:
- Data Definition statements (CREATE, ALTER, DROP, TRUNCATE, RENAME)
- Data Manipulation statements (INSERT, UPDATE, DELETE, MERGE)
- Data Retrieval statement (SELECT)
- Data Control Language statements (GRANT, REVOKE)
- Transaction Control statements (COMMIT, ROLLBACK)

The following options can be used to configure an SQL transformation:
Active/Passive: By default, the SQL transformation is an active transformation. You can configure it as a passive transformation.


Mode: The SQL transformation runs in either Script mode or Query mode.
Database Type: The type of database that the SQL transformation connects to.
Connection Type: You can pass the database connection information, or you can use a connection object.
We will see how to create an SQL transformation in script mode, in query mode, and with a dynamic database connection, with examples.

Creating SQL Transformation in Query Mode

Query Mode: The SQL transformation executes a query that is defined in the query editor. You can pass parameters to the query to define dynamic queries. The SQL transformation can output multiple rows when the query has a SELECT statement. In query mode, the SQL transformation acts as an active transformation.
You can create the following types of SQL queries:
Static SQL query: The SQL query statement does not change; however, you can pass parameters to the SQL query. The integration service prepares the query once and runs the same query for all the input rows.
Dynamic SQL query: The SQL query statement and the data can change. The integration service prepares the query for each input row and then runs the query.
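To make the distinction concrete, here are the two forms as they appear in the examples later in this section; ?product? is a value bound from an input port, while ~Table_Name~ substitutes part of the statement itself for each row:

-- Static (parameterized) query: the statement is fixed, only the value varies.
SELECT product, quantity, price FROM sales WHERE product = ?product?

-- Partial dynamic query: the statement itself changes per input row.
DELETE FROM ~Table_Name~ WHERE Product = 'LG'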
SQL Transformation Example Using a Static SQL Query

Q1) Let us say we have the Products and Sales tables with the below data.

Table Name: Products
PRODUCT
-------
SAMSUNG
LG
IPHONE

Table Name: Sales
PRODUCT  QUANTITY  PRICE
-------  --------  -----
SAMSUNG  2         100
LG       3         80
IPHONE   5         200
SAMSUNG  5         500

Create a mapping to join the Products and Sales tables on the product column using the SQL transformation. The output will be:

PRODUCT  QUANTITY  PRICE
-------  --------  -----
SAMSUNG  2         100
LG       3         80
IPHONE   5         200
SAMSUNG  5         500
Solution: Just follow the below steps to create the SQL transformation for this example.
Create a new mapping and drag the Products source definition into the mapping.
Go to the toolbar -> Transformation -> Create -> select the SQL transformation. Enter a name and then click Create.

Select the execution mode as Query mode, the DB type as Oracle, and the connection type as Static, then click OK.

Edit the SQL transformation, go to the "SQL Ports" tab, and add the input and output ports. For all the ports, you have to define a data type (Informatica-specific data types) and a native type (database-specific data types).

In the same "SQL Ports" tab, go to the SQL query and enter the below SQL in the SQL editor:

SELECT product, quantity, price FROM sales WHERE product = ?product?

Here product is the parameter-binding variable, which takes its values from the input port. Now connect the source qualifier transformation ports to the input ports of the SQL transformation, and connect the target input ports to the SQL transformation output ports.


Create the workflow and session, and enter the connections for the source and target. For the SQL transformation, also enter the source connection. After you run the workflow, the integration service generates the following queries for the SQL transformation:

SELECT product, quantity, price FROM sales WHERE product = 'SAMSUNG'
SELECT product, quantity, price FROM sales WHERE product = 'LG'
SELECT product, quantity, price FROM sales WHERE product = 'IPHONE'
Dynamic SQL query: A dynamic SQL query can execute different query statements for each input row. You can pass a full query or a partial query to the SQL transformation input ports to execute dynamic SQL queries.

SQL Transformation Example Using a Full Dynamic Query
Q2) We have the below source table, which contains the below data.

Table Name: Del_Tab
DEL_STATEMENT
-------------
DELETE FROM sales WHERE Product = 'LG'
DELETE FROM product WHERE Product = 'LG'
Solution: Just follow the same steps as in the previous example for creating the SQL transformation.
1. Go to the "SQL Ports" tab of the SQL transformation and create the input port as "Query_Port". Connect this input port to the Source Qualifier transformation.
2. In the "SQL Ports" tab, enter the SQL query as ~Query_Port~. The tilde indicates a variable substitution for the queries.
3. As we do not need any output, just connect the SQL Error port to the target.
4. Now create the workflow and run it.
SQL Transformation Example Using a Partial Dynamic Query

Q3) In example 2, you can see that the delete statements are similar except for the table name. Now we will pass only the table name to the SQL transformation. The source table contains the below data.

Table Name: Del_Tab
TAB_NAMES
---------
Sales
Products

Solution: Create the input port in the SQL transformation as Table_Name and enter the below query in the SQL query window:

DELETE FROM ~Table_Name~ WHERE Product = 'LG'


SQL Transformation in Script Mode

This is a continuation of the discussion of the SQL transformation in query mode. Here we will see how to use the SQL transformation in script mode.
Script Mode: In script mode, you have to create the SQL scripts in a text file. The SQL transformation runs your SQL scripts from these text files. You have to pass each script file name from the source to the SQL transformation's ScriptName port. The script file name should contain the complete path to the script file. The SQL transformation acts as a passive transformation in script mode and returns one row for each input row. The output row contains the results of the query and any database error.
SQL transformation default ports in script mode: In script mode, three ports are created by default in the SQL transformation:
ScriptName (input port): receives the name of the script to execute for the current row.
ScriptResult (output port): returns PASSED if the script execution succeeds for the row; otherwise FAILED.
ScriptError (output port): returns errors that occur when a script fails for a row.
Rules and guidelines for script mode: You have to follow the below rules and guidelines when using the SQL transformation in script mode.
- You can run only static SQL queries; you cannot run dynamic SQL queries in script mode.
- You can include multiple SQL queries in a script. You need to separate each query with a semicolon.
- The integration service ignores the output of SELECT statements in the SQL scripts.
- You cannot use procedural languages such as Oracle PL/SQL or Microsoft/Sybase T-SQL in the script.
- You cannot call a script from another script; avoid using nested scripts.
- The script must be accessible to the integration service.
- You cannot pass arguments to the script.
- You can use mapping variables or parameters in the script file name.
- You can use a static or dynamic database connection in script mode.
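As an illustration of these guidelines, a script file contains plain static statements separated by semicolons. The file name and statements below are hypothetical, in the style of the example that follows:

-- Contents of a hypothetical script file, e.g. $PMSourceFileDir/cleanup.txt:
UPDATE Sales SET Price = 0 WHERE Price IS NULL;
DELETE FROM Sales WHERE Product_Name IS NULL;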


Create SQL Transformation in Script Mode

We will see how to create the SQL transformation in script mode with an example. We will create the following Sales table in an Oracle database and insert records into it using the SQL transformation.

Script name: sales_ddl.txt
CREATE TABLE Sales (Sale_Id NUMBER, Product_Name VARCHAR2(30), Price NUMBER);

Script name: sales_dml.txt
INSERT INTO Sales VALUES (1, 'SAMSUNG', 2000);
INSERT INTO Sales VALUES (2, 'LG', 1000);
INSERT INTO Sales VALUES (3, 'NOKIA', 5000);

We created these two script files in the $PMSourceFileDir directory. The sales_ddl.txt file contains the Sales table creation statement, and the sales_dml.txt file contains the insert statements. These are the script files to be executed by the SQL transformation. We need a source which contains the above script file names, so we created another file in the $PMSourceFileDir directory to store them.

File Name: Script_names.txt
---------------------------
$PMSourceFileDir/sales_ddl.txt
$PMSourceFileDir/sales_dml.txt
Now we will create a mapping to execute the script files using the SQL transformation. Follow the below steps to create the mapping. Go to the mapping designer tool, open the source analyzer, and create the source file definition with the same structure as the $PMSourceFileDir/Script_names.txt file.

Go to the Warehouse designer (Target designer) and create a target flat file with result and error ports.


Go to the mapping designer and create a new mapping. Drag the flat file into the mapping designer. Go to Transformation in the toolbar, click Create, select the SQL transformation, enter a name, and click Create. Now select the SQL transformation options as Script mode and the DB type as Oracle, and click OK. The SQL transformation is created with the default ports.

Now connect the source qualifier transformation ports to the SQL transformation input port. Drag the target flat file into the mapping and connect the SQL transformation output ports to the target. Save the mapping.

Go to the workflow manager and create a new workflow and session. Edit the session: for the source, enter the source file directory and source file name options as $PMSourceFileDir\ and Script_names.txt respectively. For the SQL transformation, enter the Oracle database relational connection. Save the workflow and run it. This creates the Sales table in the Oracle database and inserts the records. Tables can also be imported from various databases; for example, a table can be imported from an Oracle database through an ODBC connection.
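As a quick check (hypothetical, run in any Oracle client), querying the newly created table should return the three inserted rows:

SELECT * FROM Sales;
-- Expected rows: (1, SAMSUNG, 2000), (2, LG, 1000), (3, NOKIA, 5000)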


Conclusion

An ETL tool replaces the process of spooling data out to a flat file, transforming the data using scripts or a programming language, and using a native utility like SQL*Loader to put the data into its respective tables. ETL tools are mainly required when operational data must be cleansed, frequent data massaging is needed during transformation, duplication and/or migration is necessary, tables at the target should be populated or updated with data taken from different tables in different databases, the target database should contain ETL procedures/packages for system integrity, a data repository containing metadata is needed, and data mining is important for OLAP or analytical cubes; all of these operations an ETL tool performs efficiently. The use of ETL tools also raises a real problem: the need to find a standard conceptual model for representing the extraction, transformation, and loading (ETL) processes in a simplified way. Some approaches have been introduced to handle this problem, and we have classified them into three categories: first, modeling based on mapping expressions and guidelines; second, modeling based on conceptual constructs; and the final category, modeling based on the UML environment. We have explained each model in some detail.


AUTHORS PROFILE



First Author: T. GOPINATH is pursuing the Master of Computer Applications at JNTU Anantapur, at Siddharth Institute of Engineering and Technology, Puttur, Andhra Pradesh, India. His fields of interest are Java technologies, data warehousing, and data mining.

Second Author: DAMODAR NAIDU M. is pursuing the MCA at JNTU Anantapur, at Narayana Engineering College, Nellore, Andhra Pradesh, India. His fields of interest are Java technologies and AIX administration. damodarmaddina@gmail.com

Third Author: LENIN Y. is pursuing the MCA at JNTU Anantapur, at Narayana Engineering College, Nellore, Andhra Pradesh, India. His fields of interest are Java technologies and database administration.

Fourth Author: S. RAKESH is pursuing the Master of Computer Applications at JNTU Anantapur, at Siddharth Institute of Engineering and Technology, Puttur, Andhra Pradesh, India. His fields of interest are Java technologies and Oracle.

Fifth Author: SANDEEP MYLARI, MCA, MBA, (MHA), is working as Assistant Professor and In-charge Head of the Department of MCA at Narayana Engineering College, Nellore, Andhra Pradesh, India. Areas of interest: software testing, networking, management by objectives, and human resource management.
