
Informatica Interview Questions

http://www.geekinterview.com/Interview-Questions/Data-Warehouse/Informatica/page6

Mapping Logic
Give any one logic you have implemented in a mapping other than SCD Type 2 (a real-time logic)?
--------------------------------------------------------------------------------
Lookup Override
Explain the suppressing comment in a lookup override?
Answer: A lookup override simply overrides the default SQL that the lookup generates at run
time. The default SQL contains a SELECT and an ORDER BY clause, and the ORDER BY orders
the data by the lookup ports in the order in which they appear. If you write your own ORDER BY
in the override, you append the comment characters "--" at the end so that the ORDER BY the
Integration Service appends is commented out (suppressed).
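A minimal sketch of a lookup SQL override with the suppressing comment; the table and column
names here are illustrative, not from the original question:

SELECT CUSTOMERS.CUST_ID   AS CUST_ID,
       CUSTOMERS.CUST_NAME AS CUST_NAME
FROM   CUSTOMERS
WHERE  CUSTOMERS.STATUS = 'ACTIVE'
ORDER BY CUSTOMERS.CUST_ID --

The trailing "--" comments out the ORDER BY clause that the Integration Service would otherwise
append after the override.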
--------------------------------------------------------------------------------
Data Loading While Session Failed
If there are 10000 records and while loading, if the session fails in between, how will
you load the remaining data?
Answer: From the OPB (repository) tables you can find how many records were loaded into your
target, and then set the session properties accordingly.
Answer: Use the perform recovery option in the session properties.
Recover the session from the last commit point.
--------------------------------------------------------------------------------
Session Performance
How would you improve session performance?
Answer: Session performance can be improved by allocating the cache memory so that all the
cached transformations can execute within that cache size. Mainly the transformations that use
a CACHE are considered bottlenecks for performance.
Suppose we are using three such transformations in our mapping:
1) Aggregator - using 1 MB for the index and 2 MB for the data cache. Assume that the
Aggregator cache calculator derives a value of 6 MB.
Likewise, use the cache calculator to work out the cache size of every transformation that uses
a cache.
2) Joiner - 5 MB
3) Lookup - 7 MB
Combining these, the total cache size is 6 MB + 5 MB + 7 MB = 18 MB.
So a minimum of 18 MB must be allocated to the session cache. If we allocate less memory to
the session, the Integration Service may fail the session.
So, for optimizing session performance, the cache plays an important role.

If you don't find this answer satisfactory, please read the "Session Cache" chapter of the
Informatica guide; it will give you a better picture.
--------------------------------------------------------------------------------
Informatica Tracking Levels
What are the Tracking levels in Informatica transformations? Which one is efficient
and which one faster, and which one is best in Informatica Power Center 8.1/8.5?

Answer: Which level is efficient and faster also depends on your requirement. If you want
minimum information to be written to the session logs, use the Terse tracing level. From the
Informatica Transformation Guide: setting the tracing level to Terse can add a slight performance boost.

Answer: I guess you are asking about the tracing level. When you configure a
transformation, you can set the amount of detail the Integration Service writes in the session
log. PowerCenter 8.x supports 4 types of tracing level:

1.Normal: Integration Service logs initialization and status information, errors encountered,
and skipped rows due to transformation row errors. Summarizes session results, but not at
the level of individual rows.

2.Terse: Integration Service logs initialization information and error messages and notification
of rejected data.

3. Verbose Initialization: In addition to normal tracing, the Integration Service logs additional
initialization details, names of index and data files used, and detailed transformation statistics.

4.Verbose Data: In addition to verbose initialization tracing, Integration Service logs each row
that passes into the mapping. Also notes where the Integration Service truncates string data
to fit the precision of a column and provides detailed transformation statistics.
Allows the Integration Service to write errors to both the session log and error log when you
enable row error logging.
When you configure the tracing level to verbose data, the Integration Service writes row data
for all rows in a block when it processes a transformation.

By default, the tracing level for every transformation is Normal.

-------------------------------------------------------------------------------

Transaction Control
Explain why it is bad practice to place a Transaction Control transformation upstream
from a SQL transformation?
Answer: First, as mentioned in the other posts, it drops all the incoming transaction control
boundaries. Besides, you can use COMMIT and ROLLBACK statements within the SQL
transformation script to control the transactions, which avoids the need for the Transaction
Control transformation in the first place.
--------------------------------------------------------------------------------
Update Strategy Transformation
Why the input pipe lines to the joiner should not contain an update strategy
transformation?
Answer: Update Strategy flags each row for Insert, Update, Delete or Reject. When you use it
before a Joiner, the Joiner drops all the flagging details. It is a curious question, but it is hard
to imagine how one would deal with the scenario anyway: it would be really complicated to join
rows flagged for different database operations (Update, Insert, Delete) and then decide which
operation to perform. To avoid this, Informatica prohibits an Update Strategy transformation
from being used upstream of a Joiner transformation.

--------------------------------------------------------------------------------

Passive Router Transformation
Router is passive transformation, but one may argue that it is passive because in case if
we use default group (only) then there is no change in number of rows. What
explanation will you give?
Answer: Basically Router is not a passive transformation; it is an active transformation. An
active transformation is one that can change the number of rows that pass through it, and the
Router can change the number of rows, since rows may go to several groups or to none.
Answer: I think there is a fair bit of confusion over whether the Router is an active or a passive
transformation. The answer is that the Router is an ACTIVE transformation.
My earlier answer (PASSIVE ROUTER TRANSFORMATION) was based on the context in which it
was asked: a Router without any specified groups has just the default group, hence it would act
like a passive transformation and serve no real purpose. It would not change the number of
records and, furthermore, it would hurt performance.
Answer: A passive transformation is one where the number of input rows equals the number of
output rows, and from a Router transformation no data is rejected - you get every record that
comes into it. So the Router is passive.
Answer: Router is an active transformation; it is used to route rows to multiple groups based on
conditions. It is active because if one row satisfies two or more group conditions, that row
comes out of the Router more than once, so the number of output rows can differ from the
number of input rows.
--------------------------------------------------------------------------------
Update Strategy
In which situation do we use update strategy?
Answer: By default all the rows entering the Integration Service are marked for Insert.
If you want to flag rows for update, delete or insert based on business logic, you have to use an
Update Strategy transformation.
Answer: The Update Strategy transformation is used not only to update existing rows; it can
also be used to insert new rows into a table, delete rows, or reject rows, depending on how each
row is flagged.
Answer: In addition to the four constants DD_INSERT, DD_UPDATE, DD_DELETE and DD_REJECT,
we must also set the session property "Treat Source Rows As" to DATA DRIVEN.
If you do not choose Data Driven when a mapping contains an Update Strategy or Custom
transformation, the Workflow Manager displays a warning. When you run the session, the
Integration Service does not follow the instructions in the Update Strategy to determine how to
flag the rows and hence fails the session.
Answer: We use an Update Strategy when we need to alter the database operation
(insert/update/delete) based on the data passing through and some logic. The Update Strategy
allows us to insert (DD_INSERT), update (DD_UPDATE), delete (DD_DELETE) or reject (DD_REJECT)
rows based on the logic specified in the Update Strategy expression.
For the Update Strategy to take effect we must set the session-level property "Treat Source
Rows As" to Data Driven.
Answer: Update Strategy is used for updating a record if the same record already exists in the
table. For that we need a primary key defined at the Informatica level, even if it is not present
at the database level. If no key is defined, we need to use an update override on the target.
Also, when using this transformation, set the "Data Driven" property in the session.
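A minimal sketch of a typical Update Strategy expression (with "Treat Source Rows As" set to
Data Driven); the port names lkp_CUST_KEY, lkp_STATUS and SRC_STATUS are made up for
illustration:

IIF( ISNULL(lkp_CUST_KEY), DD_INSERT,
     IIF( SRC_STATUS <> lkp_STATUS, DD_UPDATE, DD_REJECT ) )

Rows with no matching lookup key are flagged for insert, changed rows for update, and unchanged
rows are rejected before they reach the target.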
--------------------------------------------------------------------------------
Unconnected Lookup
In which situation do we use unconnected lookup?
Answer: An unconnected lookup is used when the lookup is not needed for every record. The
lookup is called only at the point in the mapping that needs it. It can be used in any transformation
that supports expressions. Use the lookup function within a conditional statement; if the
condition is satisfied, the lookup function is called.
Answer: An unconnected lookup should be used when we need to call the same lookup multiple
times in one mapping. For example, in a parent-child relationship you may need to pass multiple
child ids to get the respective parent ids.
One can argue that this can also be achieved by creating a reusable lookup. That is true, but
reusable components are created when the need is across mappings, not within one mapping. Also,
if we use a connected lookup multiple times in a mapping, each instance builds its own cache
unless the cache is made persistent or shared.
Answer: 1. When only one return value is needed, go for an unconnected lookup.
2. Most importantly, when you want the return value only in some conditions/cases, go for an
unconnected lookup to improve the performance of the job.

For example: you have 1 million records but you need a return value for only half a million of
them, say only when commission < 100. In such cases go for an unconnected lookup.

Answer: The advantage of using an unconnected lookup is that the lookup is executed only when
a certain condition is met, which can improve the performance of your mapping.
If the majority of your records will meet the condition, then an unconnected lookup would not
be an advantage. An example would be if I need to populate or update a value only for
customers living in California: I would execute my lookup only when State = 'CA'.
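A minimal sketch of how such a conditional unconnected lookup call might look in an expression
port; the lookup name lkp_Get_Region and the ports STATE and CUST_ID are assumptions for
illustration:

IIF( STATE = 'CA', :LKP.lkp_Get_Region(CUST_ID), NULL )

The :LKP.lookup_name(ports) call fires only for rows where the condition is true; all other rows
simply get NULL.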

------------------------------------------

Dependency Problems
What are the possible dependency problems while running a session?
Answer: Dependency problems arise when we run a process whose output is the input to another
process. If the first process stops, it causes a problem for, or stops, the other process. One
process depends on the other, so if one process is affected the other process is affected as
well. This is called a dependency problem.

Error handling Logic


How do you handle error logic in Informatica?
What are the transformations that you used while handling errors? How did you reload
those error records in target?
Answer: Bad files contain a row indicator and column indicators.
Row indicator: this generally comes into play when working with an Update Strategy
transformation; the writer/target rejects the rows going to the target.
Column indicators:
D - valid, O - overflow, N - null, T - truncated. When the data contains nulls, overflows or
truncation, the row is rejected instead of being written to the target.
The rejected data is stored in reject files. You can check the data and reload it into
the target using the reject loader utility.

Target Data Loading


How and where can you see the data being loaded into the targets dynamically?
Answer: We can check the target records loaded dynamically by validating a query on the
target records for the load time period.
1) Run the query on the target records for the first load's time period, find the
record count, and check whether there are any duplicate records.
2) Whenever data is processed for the second load, run the same query on the target records for
that time period, find the record count, check for duplicate records, and see whether any
existing records were updated.

Turn Off Version


What happens if you turn off versioning?
Answer: You would not be able to track the changes made to the respective
mappings/sessions/workflows.

Sequence Generator
Why sequence generator is not directly connected to joiner transformation?
Answer: Remember that a Joiner transformation can accept input from only two pipelines. Only
the following two scenarios are possible for having both a Joiner and a Sequence Generator:
1) Two input pipelines are already connected and then we want to connect the Sequence Generator
to the Joiner: in this scenario, the Sequence Generator would be input from a third pipeline, so
Informatica throws an error.

2) Only one input pipeline is connected and we want to connect the Sequence Generator: in this
case Informatica will allow you to connect the Sequence Generator, but then you won't have a
matching join CONDITION, so this scenario is practically useless.

So both of the above scenarios of having a Sequence Generator feeding a Joiner are not
PRACTICALLY possible.

Answer: The Sequence Generator transformation is basically used to generate a unique number for
each record that is updated or inserted in the target table. It is also used when we want to
create a unique key in the table, on the basis of which we identify each record within the table;
this is the same as the concept of a primary key for the table.

Answer: The Sequence Generator transformation is a passive and connected transformation. It is
used to create unique primary key values, cycle through a sequential range of numbers, or
replace missing keys.

It has two output ports to connect to other transformations. By default it has two fields,
CURRVAL and NEXTVAL (you cannot add ports to this transformation). The NEXTVAL port generates
a sequence of numbers when connected to a transformation or target. CURRVAL is NEXTVAL plus
the Increment By value.

Generally Sequence Generators are connected to the targets to minimize the wastage of the
generated numbers.

Joiner Transformation Cache


What are the caches available in Joiner transformation?What is stored in Joiner
cache(data cache) and Index cache?
Answer: For a Joiner transformation with unsorted input,
i) the index cache stores all master rows in the join condition with unique index keys;
ii) the data cache stores all master rows.
For sorted input with different sources,
i) the index cache stores 100 master rows in the join condition with unique index keys;
ii) the data cache stores the master rows that correspond to the rows stored in the index cache.
If the master data contains multiple rows with the same key, the Integration Service
stores more than 100 rows in the data cache.

For sorted input with the same source,
i) the index cache stores all master or detail rows in the join condition with unique keys.
It stores detail rows if the Integration Service processes the detail pipeline faster than the
master pipeline; otherwise, it stores master rows. The number of rows it stores depends on
the processing rates of the master and detail pipelines. If one pipeline processes its rows
faster than the other, the Integration Service caches all rows that have already been
processed and keeps them cached until the other pipeline finishes processing its rows.
ii) the data cache stores data for the rows stored in the index cache. If the index cache
stores keys for the master pipeline, the data cache stores the data for the master pipeline. If
the index cache stores keys for the detail pipeline, the data cache stores data for the detail
pipeline.

Answer: The Joiner transformation uses two caches: an index cache and a data cache. (Persistent,
static, dynamic and shared caches are concepts that apply to Lookup caches, not to the Joiner.)

The Informatica server stores the join condition values in the index cache, and the output
values are stored in the data cache.

DTM Buffer Size Default Buffer Blocksize


What is the DTM buffer size and the default buffer block size? If a performance issue happens
in a session, which one do we have to increase and which one do we have to decrease?
Answer: The DTM buffer size is the memory you allocate to the DTM process (12 MB by default).
The buffer block size is based on the size of the largest source/target row times the number of
rows that can be moved at a time (it should hold a minimum of about 20 rows; the default is 64 KB),
and by default Informatica sizes the buffer memory to accommodate about 83 sources and targets.
So we should increase or decrease the sizes accordingly:
if there are more than about 83 sources and targets we should increase the DTM buffer size, and
if the source or target rows are heavy we should increase the buffer block size.

Answer: The Workflow Manager allocates a default of 12 MB to a session. You can increase this
according to your requirement by specifying it in bytes, or you can set it to Auto and it will be
sized accordingly.

You need to increase the DTM buffer size when you are dealing with a large set of character data
and the session is configured to run in Unicode mode; for example, increase the DTM buffer to 24 MB.

$ & $$ in Mapping or Parameter File


What is the difference between $ & $$ in mapping or parameter file? In which cases
they are generally used?
Answer: $ : for session-level variables/parameters (you can see this when you provide a
connection string name for a session).
$$ : for mapping-level variables/parameters (you can see this when you create a
parameter/variable at the mapping level; it comes prefixed with $$ by default).

Answer:

$ - is used to denote Server level Variables/Parameters.
$$ - is used to denote Mapping level Variable/Parameter.

This is the convention that we generally follow to differentiate between Server Level/Mapping
Level variables or parameters.

Thus, anything related to the server or execution-related activities uses $: whenever we define
a file name or a database connection we use a single $, and whenever a variable or parameter
relates to a mapping or mapplet we use $$.
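A minimal sketch of how the two prefixes might appear side by side in a parameter file; the
folder, workflow, session and parameter names here are hypothetical:

[ABC_Folder.WF:wf_load_sales.ST:s_m_load_sales]
$DBConnection_SRC=Oracle_Dev_Src
$$LastExtractDate=2010-01-01

$DBConnection_SRC is a session-level connection parameter resolved in the session properties,
while $$LastExtractDate must also be declared under Mappings > Parameters and Variables before
the mapping can reference it.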

Data Flow
How the data flows from dimensions to fact table?
Answer: Data in reality does not flow from dimensions to facts or vice versa.
Dimensions contain textual, descriptive information about a business unit of an enterprise.
This kind of data usually appears in the WHERE clauses of your queries; information
such as customer name, address, etc. is dimensional. Facts are additive, quantitative data
that relate to the dimensions. These are stored in a fact table (which has relationships with
the dimensions). For example, a customer may have specific sales data in the form of quantities
of certain items purchased at certain store locations on certain dates. In this typical scenario
you would end up with a sales fact table related to a few dimensions such as customer,
store location, date, product, etc. As you can see, these dimensions are the filters of
your query when you are seeking to get at the facts. It is true that the primary keys of the
dimensions migrate to the fact table to become a composite primary key in the fact table. So
in the ETL process the dimensions are loaded first and then the PKs of the dimensions migrate to
the fact table to establish the relationship between the dimensional attributes and the facts. In
that simplistic sense one could speak of a flow from dimensions to facts, but in reality there is
no flow, only a relationship that facilitates the star schema and ultimately rapid query access
to the facts through your particular BI presentation interface.

Answer:
In a Data Warehouse Consider there are two tables D and F where F is a fact table and D is a
dimension table.

A dimension table is the one which contains the Primary key and a Fact table is the one which
contains corresponding Foreign Key.

So it is obvious that data will first be loaded into the primary key table and then into
the foreign key table; this means data will "flow" from the dimension table to the fact table.

Another explanation: if you look at the definition of a fact table, the source of its key data is
always the dimension tables. If a fact table derives data from, say, three dimension tables, then
the three dimension tables are loaded first and then, as per the definition, the corresponding
foreign key data is loaded into the fact table.

Worklet Variables and Workflow Variables


How can you use worklet variables and workflow variables?
Answer: Regarding usage, there is no difference between workflow and worklet variables;
worklet variables are simply used inside worklets.

Pre-Worklet Variable Assignment : You can assign the value of the workflow variable to the
worklet variable before the worklet starts

Post-Worklet Variable Assignment : You can assign the value of the worklet variable to the
Parent workflow variable once the worklet completes.

Folder Access

The same machine is used by more than one user at different times. How do you identify whether
a particular user used a folder or not?

Answer: You can check this in the Repository Manager. Open the Edit menu and select the option
Show User Connections, where you can find information about the users, their connection ids,
the time they logged in, the host address, etc. If you are on the live machine you should have
admin privileges to view this.

Mapplet Transformations

What are the transformations not used in mapplet and why?

Answer: Normalizer transformations, XML Source Qualifiers, XML sources and targets, and other
targets cannot be used in a mapplet. If you need a Sequence Generator transformation, use a
reusable Sequence Generator; if you need a Stored Procedure transformation, make the stored
procedure type Normal.

Answer: The following should not include in a mapplet.

• Normalizer transformations

• Cobol sources

• XML Source Qualifier transformations

• XML sources

• Target definitions

• Pre- and post- session stored procedures

• Other mapplets

Answer: A mapplet can't be used inside another mapplet. If you try to drag and drop a mapplet
from the mapplet subfolder on the left-hand side into the mapplet designer workspace, it won't
allow you to do so; but if you drag and drop a mapplet into a mapping in the mapping designer,
it is accepted. This means a mapplet can only be used in a mapping and can't be used in another
mapplet. That's why a mapplet is known as the reusable form of a mapping.

Load data from a pdf to a table using informatica.

Can we load data from a PDF into a table using Informatica?

Answer: Yes, you can load data from a PDF into a table using the UDO (Unstructured Data Option)
transformation, available in Informatica from PowerCenter 8.1.1 onwards.

Dimension Table Vs Fact Table ?

What is the main difference in the logic when you create a mapping for a dimension table
with that of a fact table in Informatica.

Answer: You can load the dimension table directly, but you cannot load the fact table directly:
you need to look up the dimension tables, because fact tables contain foreign keys which are the
primary keys of the dimension tables. You can load the dimensions and facts in one mapping
using the target load plan.

Answer: The main difference between the dimension and the fact mapping is that the dimension
preserves historical data (as in SCD Type 2), so we have to use an Update Strategy and other
transformations to make that happen, whereas the fact is a direct load with one or more
lookups on the dimensions. Also, since the fact and dimension have a foreign key relationship,
the dimension has to be loaded before the fact.

Dimension Table features

1. It provides the context/descriptive information for fact table measurements.


2. Provides entry points to data.
3. Structure of a dimension - a surrogate key, one or more fields that compose the natural
key (NK), and a set of attributes.

4. Size of Dimension Table is smaller than Fact Table.

5. In a schema, more dimension tables are present than fact tables.

6. A surrogate key is used to prevent primary key (PK) violations (and to store historical data).

7. Values of fields are in numeric and text representation.

Fact Table features

1. It provides measurement of an enterprise.

2. Measurement is the amount determined by observation.

3. Structure of a fact table - foreign keys (FK), degenerate dimensions and measurements.

4. Size of Fact Table is larger than Dimension Table.

5. In a schema, fewer fact tables are observed compared to dimension tables.

6. A composite of the foreign keys (and any degenerate dimensions) acts as the primary key.

7. Values of the fields always in numeric or integer form.

Server will be disconnected

Everything is fine, but in the Workflow Monitor the server shows as disconnected, even though
both Oracle and Informatica are running.

Answer: First open My Computer > Properties and click the Computer Name tab.

On the Computer Name tab you see the full computer name; that is your server name, and you
should give that name (the server name) in the server configuration in the Workflow Manager.
Then click the Change button on the Computer Name tab and click the More... button in the
Computer Name Changes form; there you see the NetBIOS computer name, which is your host
name/IP address. You must give that name in the Host name/IP Address column in the Workflow
Manager, and then give the path of the Informatica server in the $PMRootDir column, exactly as
C:\Program Files\Informatica PowerCenter 7.1.1\Server.

If you installed it on another drive, change the drive letter accordingly. Then click
Resolve Server; you will find the IP address of your system. Then in the Informatica Server
setup (configure Informatica service) give that full computer name in the Server Name field,
give the NetBIOS computer name in the TCP/IP Host Address column, and also give this name in
the Repository Server Host Name column. After that, start the Informatica server service.

Delete Informatica repository

I am using Informatica 7.1.1, How to delete the existing repository, mappings?

Answer: To delete a repository, delete it from the Informatica Server Administration Console.
To delete a mapping, delete it from the Designer.
To delete a folder, delete it from the Repository Manager.

What is difference between source base and target base commit

Answer: Suppose we have set a commit interval of 10000. With source-based commit,
the server reads 10000 rows and passes them through the transformations (active and passive);
whatever portion of those 10000 rows reaches the target is then committed. With target-based
commit, suppose the writer buffer holds 7500 rows: the server writes those 7500 rows to the
target but does not commit yet; when the writer buffer fills with the next 7500 rows it commits
to the target (15000 in total), and so on.

Answer: These are the 3 types of commits possible in Informatica:

• Target-based commit. The PowerCenter Server commits data based on the number of
target rows and the key constraints on the target table. The commit point also depends
on the buffer block size, the commit interval, and the PowerCenter Server configuration
for writer timeout.

• Source-based commit. The PowerCenter Server commits data based on the number of
source rows. The commit point is the commit interval you configure in the session
properties.

• User-defined commit. The PowerCenter Server commits data based on transactions
defined in the mapping properties. You can also configure some commit and rollback
options in the session properties.

Cached Lookup and an Uncached Lookup

What is the difference between using a cached lookup and an uncached lookup?

Answer: With a cached lookup, a cache of the lookup table is created and the Integration Service
queries the lookup source only once for all the mapping rows; with an uncached lookup no cache
is built and the Integration Service queries the lookup source for each mapping row. So for
performance, go for a cached lookup if the lookup table size is smaller than the number of
mapping rows, and go for an uncached lookup if the lookup table size is larger than the number
of mapping rows.

Answer:

For a cached lookup, all the rows of the lookup table are put into a cache and the incoming
rows are compared against that cache, whereas with an uncached lookup, for every input row
the lookup queries the lookup table and fetches the matching rows.

XML Source and Target

I'm working with XML files, using them as source and target, but I have never worked with XML
before. Could you please tell me how to import them?

Answer: To import an XML file as a source:

1. To define a source, go to Tools ---> Source Analyzer.
2. From the Sources menu select the option to import an XML definition. The XML Wizard window
will appear.
3. In the XML Wizard window, when the unstructured option comes up, click 'Yes'.
4. Next click on 'Entity Relationships' if you are dealing with a relational type of target,
then click OK.
5. Save the repository.
6. The XML source will then be listed on the left-hand side under the Sources subfolder, under a
file source node.

Similarly, you can import an XML file as a target in the Target Designer following the same
steps above.
While executing the workflow, in the session you need to give the path for this source XML file
properly, e.g.:
Source directory: c:\files
Source filename: emp_xml.xml

Union Transformation

Why Union transformation is active while it returns same number of rows as in the
sources?

Answer: An active transformation can change the number of rows that pass through it; a
passive transformation does not change the number of rows that pass through it. The Union
transformation performs a UNION ALL, so it does not remove duplicates, but it is classified as
active because it merges rows from multiple input pipelines into a single output, so the row
count of any one input group does not match the output.

------------------------------------------------------------------------------------------------------------
What is meant by "the source is changing incrementally"?
I guess your question relates to Slowly Changing Dimensions (SCDs).
Take an example: there is an object called X whose price in 2005 was 10, in 2006 was 20, in
2007 was 30, and now it is 40. Along with time the price is changing, and whenever we want to
capture all these changes we go for SCDs.
There are 3 types of SCDs:
1) Current information: we just overwrite the current price (Type 1).
2) Full history: all the previous prices and the current price (Type 2).
3) Partial history: just the previous and the present price, e.g. the 2007 and 2008 prices (Type 3).
--------------------------------------------------------------------------------------------------------

Load the remaining rows


Suppose there are 100,000 rows in the source and 20,000 rows are loaded to
target. Now in between if the session stops after loading 20,000 rows how will
you load the remaining rows?
The Informatica server has 3 methods to recover sessions:
(1) Run the session again if the Informatica server has not issued a commit.
(2) Truncate the target tables and run the session again if the session is not recoverable.
(3) Consider "perform recovery" if the Informatica server has issued at least one commit.

So for your question, use perform recovery to load the records from the point where the session failed.

Failed Workflow
When you run sessions in a sequence, what happens when the first workflow fails and you later
want to start with the second one, followed by the others? Where will you make the changes or
take the necessary steps to run the workflow?
Answer: If you configure a session in a sequential batch to stop on failure, you can run recovery
starting with the failed session; the Informatica server completes that session and then runs the
rest of the batch.
If you do not configure a session in a sequential batch to stop on failure, the remaining sessions
in the batch complete, and you recover the failed session as a standalone session.

---------------------------------------------------------------------------------------------------------------------

Transformation to Load 5 Flat files

What is the method of loading 5 flat files of having same structure to a single
target and which transformations will you use?

This can be handled by using the file list in informatica. If we have 5 files
in different locations on the server and we need to load in to single target
table. In session properties we need to change the file type as Indirect.

(Direct if the source file contains the source data. Choose Indirect if the
source file contains a list of files.

When you select Indirect the PowerCenter Server finds the file list then reads
each listed file when it executes the session.)
Take a notepad file, list the following paths and file names in it, and save it as
emp_source.txt in the directory /ftp_data/webrep/:

/ftp_data/webrep/SrcFiles/abc.txt
/ftp_data/webrep/bcd.txt
/ftp_data/webrep/srcfilesforsessions/xyz.txt
/ftp_data/webrep/SrcFiles/uvw.txt
/ftp_data/webrep/pqr.txt

In the session properties, give /ftp_data/webrep/ as the source file directory, emp_source.txt
as the source file name, and set the source file type to Indirect.

-------------------------------------------------------------------------------------------------------

Fact Table - Grain Facts


Answer: Facts are tables that refer to the dimension tables for details; facts always hold the
foreign keys. The grain is the maximum possible level of detail that can be derived from any
dimension.

Answer: This is about granularity, that is, what level of data detail should be made available in
the dimensional model. A reference to ATOMIC DATA has to be made here.
Atomic data is the most expressive data and should be the foundation of the fact table.
It is the most detailed information collected; such data cannot be subdivided further and is
highly dimensional.
The more detailed and atomic the fact measurement, the more things we know for sure.
It provides maximum analytic flexibility because it can be constrained and rolled up.

Mapping variable - Mapping Parameter


What will happen when Mapping variable and Mapping parameter in Informatica
is not defined or given? Where do you use mapping variable and mapping
parameter?
If mapping parameters and variables are not defined, we cannot reference their values in the
mapping. We define mapping parameters and variables in the Mapping Designer.

Assignment task in informatica


What is Assignment task in informatica? In what situation this task will be
executed? Where this task exits?

The Assignment task allows you to assign a value to a user-defined workflow variable. To use an
Assignment task in the workflow, first create and add the Assignment task to the workflow.
Then configure the Assignment task to assign values or expressions to user-defined variables.
After you assign a value to a variable using the Assignment task, the PowerCenter Server uses
the assigned value for the variable during the remainder of the workflow.

Parameter file
How to create parameter file and how to use it in a mapping explain with
example?
Suppose we have a variable or parameter whose value changes frequently; we change that value
in the parameter file (this is more feasible, as it requires no change to the mapping).

Place your parameter file, with its data, in the server "SrcFiles" directory. In the Mapping
Designer window of PowerCenter Designer, click "Mappings" and then "Parameters and Variables",
and add all the parameters here one by one.

Now the variables, prefixed with "$$", are available in your mapping. Each variable in turn picks
its value from the parameter file. Do not forget to give the "Parameter Filename" in the
"Properties" tab of the task in the Workflow Manager.

Informatica sessions
what is the difference between a session and a task?
Session: a set of instructions to run a mapping.
Task: a session is one type of task. Besides the session, Informatica has several other task
types: Assignment, Command, Control, Decision, Email, Event-Raise, Event-Wait and Timer.

Informatica objects File Format


What is the file extension and format of the files for the Informatica objects like
Mappings, sessions etc in the repository ?
They are stored in the repository database, but when we export these objects from Informatica they are saved in .XML format.

How do you maintain Historical data and how to retrieve the historical data?
You can maintain historical data by designing the mapping using one of the Slowly Changing
Dimension types.

If you need to insert new and update old data, it is best to go for an Update Strategy.

If you need to maintain the history of the data - for example, the cost of a product changes
frequently but you would like to keep all the rate history - go for SCD Type 2.

The design changes as per your requirement.

what is the difference between reusable transformation and mapplets?
A reusable transformation is a single transformation that can be reused, while a mapplet is a
set of transformations that can be reused together.

In what all transformations the mapplets cant be used in informatica??


A mapplet can't use the following transformations:
XML Source Qualifier
Normalizer
non-reusable Sequence Generator (it can use only a reusable Sequence Generator)

What is one disadvantage of using an unconnected (sometimes called function


mode) Lookup transformation?

An unconnected lookup returns only one value, does not support user-defined default values, and
does not support a dynamic cache.

What is the difference between source definition database and source qualifier?
A source definition contains the datatypes that are used in the original database from which
the source is extracted, whereas the Source Qualifier converts the source definition datatypes
into Informatica datatypes, which are easier to work with.

The source definition is simply the table definition, while the Source Qualifier is a
transformation in Informatica in which we can specify the data selection criteria sent to the
database.

For example, if you are using only 3 columns out of the 10 available in a table, along with some
filter conditions, to load data into the target table, we can simply select those 3 columns and
include the filter conditions either in the SQL query or in the filter condition in the Source
Qualifier properties.

Have you implmented Lookup in your mapping, If yes give some example?

Yes, we do. We have to update or insert a row in the target depending on the data from the
sources, so in order to split the rows to either update or insert into the target table we use a
lookup transformation on the target table and compare it with the source data.

When do you use Normal Loading and the Bulk Loading, Tell the difference?

Answer: If we use SQL*Loader connections then it is better to go for bulk loading, and if we use
ODBC connections for the source and target definitions then it is better to go for normal
loading. If we use bulk loading the session performance will be increased, because with bulk
loading the data bypasses the database log; since no database logging is done, the load is
automatically faster.

Answer:
Normal load: it loads the records one by one and the server writes a log entry for each record,
so it takes more time to load the data.

Bulk load: it loads a number of records at a time and does not write database log entries, so it
takes less time.

Answer:
You would use Normal Loading when the target table is indexed and you would use bulk loading
when the target table is not indexed. Running Bulk Load in an indexed table will cause the
session to fail.

what are the main issues while working with flat files as source and as targets ?

1. We can not use SQL override. We have to use transformations for all our requirements
2. Testing the flat files is a very tedious job
3. The file format (source/target definition) should match exactly with the format of the data
file. Most of the time erroneous results come when the data file layout is not in sync with the
actual file:
(i) your data file may be fixed-width but the definition is delimited ---> truncated data;
(ii) your data file as well as the definition are delimited, but you specify a wrong delimiter -
(a) a delimiter other than the one present in the actual file, or (b) a delimiter that also
appears as a character in some field of the file ---> wrong data again;
(iii) not specifying the NULL character properly may result in wrong data;
(iv) there are other settings/attributes while creating the file definition about which one
should be very careful.
4. If you miss the link to any column of the target, then all the data will be placed in the
wrong fields; the missed column won't exist in the target data file.

What is meant by named cache?At what situation we can use it?

By default there is no name for the cache in a lookup transformation, and every time you run
the session the cache is rebuilt. If you give it a name, it becomes a persistent (named) cache:
the first time you run the session the cache is built, and the same cache file is reused for any
number of subsequent runs. This means the cache does not reflect any changes even if the lookup
source changes; you can rebuild it by deleting the cache files.

What does the first column of bad file (rejected rows) indicate? Explain
The first column of the bad file indicates the row indicator and the second column indicates
the column indicator.

Row indicator: the row indicator tells you what the writer was trying to do with the row of
rejected data.

Row indicator   Meaning   Rejected by
-----------------------------------------------------------------------------
0               Insert    target/writer
1               Update    target/writer
2               Delete    target/writer
3               Reject    writer

If the row indicator is 3, the writer rejected the row because the Update Strategy expression
marked it as reject.

Explain pmcmd?

pmcmd is the PowerCenter command-line program, which is used to perform tasks from the
command prompt rather than from the Informatica GUI tools.

It performs the following tasks:
1. start and stop sessions and batches/workflows
2. process session recovery
3. stop the Informatica server
4. check whether the Informatica server is working or not.
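A couple of illustrative pmcmd calls in the 8.x style; the service, domain, user, folder and
workflow names below are made up, and the exact flags can vary between versions:

pmcmd pingservice -sv IS_Dev -d Domain_Dev
pmcmd startworkflow -sv IS_Dev -d Domain_Dev -u admin -p admin_pwd -f ABC_Folder wf_daily_load

The first call checks whether the Integration Service is up; the second starts a workflow in a
given folder.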

Explain Session Recovery Process?

Answer: When the Informatica server starts a recovery session, it reads the
OPB_SRVR_RECOVERY table and notes the row id of the last row committed to the target
database. When it restarts the recovery process it starts from the next row id. For session
recovery to take place, at least one commit must have been executed.
Answer: You have 3 options for session recovery:
- if the Informatica server performed no commit, run the session again;
- if at least one commit was performed, perform recovery;
- if perform recovery is not possible, truncate the target table and run the session again.

How do you fine tune the mapping and what methods we follow for different
types of transformations?
A mapping can be fine-tuned for performance at the following levels:

1. Optimize the target.
2. Optimize the source.
3. Optimize the mapping.
4. Optimize the transformations.
5. Optimize the session.
6. Optimize the grid deployments.
7. Optimize the PowerCenter components.
8. Optimize the system.

For details, please refer to the Informatica Performance Tuning Guide.

what is the difference between source qualifier transformation and filter


transformation?
In the Source Qualifier we can filter records coming from relational source systems.
In a Filter transformation we filter the records that we need to update or process further;
before the Filter transformation the data from the source system may or may not already have
been processed (by an Expression transformation, etc.).

By using Source Qualifier we can filter out records from only relational sources. But by using
Filter Transformation we can filter out records from any sources.

In Filter Transformation we can use any expression to prepare filter condition which evaluates
to TRUE or FALSE. The same cannot be done using Source Qualifier.

Answer: A Source Qualifier transformation is the starting point in the mapping: the incoming
source data is extracted through this transformation after connecting to the source database.

A Filter transformation is a transformation placed in the mapping pipeline in order to pass on
only the records that satisfy some specific condition.

Of course, the same purpose can be served by the Source Qualifier transformation if the data is
being extracted from a relational source, whereas if the data is going to be extracted from a
flat file then we cannot do it using the Source Qualifier.

Can we update target table without using update strategy transformation? why?
Yes, we can update the target table by using the session properties, for example by setting
"Treat Source Rows As" to Update, together with the
Target Update Override property on the target.
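A minimal sketch of a target update override; the target table T_CUSTOMERS and its columns are
hypothetical, while :TU.port_name is the placeholder syntax for the target port values:

UPDATE T_CUSTOMERS
SET    CUST_NAME   = :TU.CUST_NAME,
       CUST_STATUS = :TU.CUST_STATUS
WHERE  CUST_ID     = :TU.CUST_ID

This is entered in the target definition's Update Override property and is what lets the session
update rows even when no key is defined at the database level.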

how do you add and delete header , footer records from flat file during load to
oracle?
We can add header and footer record in two ways.

1) Within the Informatica session we can sequence the data so that the header flows in first
and the footer flows in last. This only holds true when the header and footer have the same
format as the detail record.

2) As soon as the session that generates the detail record file finishes, we can call a Unix
script or Unix command through a Command task which concatenates the header file, detail file
and footer file and generates the required file.

what types of Errors occur when you run a session, can you describe them with real time example

There are several errors you can get. A few of them are:
1. Informatica failed to connect to the database.
2. The source file does not exist in the specified location.
3. Parameters were not initialized from the parameter file.
4. Incompatible datatypes between connected ports, etc.

How to assign a work flow to multiple servers?

The Informatica server uses the Load Manager process to run the workflow; the Load Manager
assigns the workflow processes to the multiple servers.

Source Qualifier in Informatica?

The most important significance of the Source Qualifier is that it converts the datatypes coming
from a source into Informatica-compatible datatypes.
The rest of its properties, such as select distinct, SQL override and filtering records, can also
be achieved with other Informatica transformations.

What is use of event waiter?

Event-Wait task. The Event-Wait task waits for an event to occur. Once the event triggers the
PowerCenter Server continues executing the rest of the workflow

What are the types of loading in Informatica?

In Informatica there are mainly 2 types of loading:

1. Normal
2. Bulk
(some would also add incremental loading as a third).

Normal means it loads record by record and writes log entries for each one, so it takes more time.
Bulk load means it loads a number of records at a time to the target, bypassing the database log
and ignoring the tracing level, so it takes less time to load data to the target.

How do you handle two sessions in Informatica

You can handle 2 sessions by using a link condition (e.g. $s_Session1.Status = SUCCEEDED),
or you can place a Decision task between them. Since only one session depends on the other,
a link condition is enough.

What is the exact difference between joiner and lookup transformation

A Lookup transformation is a passive transformation, whereas a Joiner is an active transformation.
A Lookup can be either connected or unconnected, whereas a Joiner is always connected.
A Joiner is used to join two homogeneous or heterogeneous sources residing at different
locations, whereas a Lookup is used to look up data.
A Lookup is by default a left outer join, whereas with a Joiner you can have a normal join,
master outer, detail outer or full outer join.

What is Dataware house key?

A data warehouse key is a warehouse key or surrogate key, i.e. one generated by a Sequence
Generator. This key is used when loading data into the fact table from the dimensions.

what s an ODS? what s the purpose of ODS?

ODS stands for Operational Data Store. It stores near-real-time data, held temporarily for
on-the-fly report generation or real-time reporting purposes.

ODSs (Operational Data Stores) are frequently updated, somewhat integrated copies of
operational data. Commonly an ODS is implemented to deliver operational reporting, and ODSs are
built especially to support Customer Relationship Management (CRM) applications. An ODS is like
a data warehouse in its first two characteristics, but it is like an OLTP system in its last
three characteristics.

ODS(opertional data store) It is part of the data warehouse architecture. It is the first stop for
the data on its way to the warehouse. It is here where I can collect and integrate the data. I
can ensure its accuracy and completeness. In many implementations all transformations can't
be completed until a full set of data is available. If the data rate is high I can capture it
without constantly changing the data in the warehouse. I can even supply some analysis of near
current data to those impatient business analysts. The ODS has a life in many implementations.

what is session recovery?

Session recovery is used when you want the session to continue loading from the point where it
stopped the last time it ran. For example, if the session failed after loading 10,000 records and
you want record 10,001 onwards to be loaded, you can use session recovery to load the data from
record 10,001.

We can insert or update the rows without using the update strategy. Then what is the necessity of
the update strategy?

I guess if we need to update/insert/delete/reject data based on some condition(s), we go for an
Update Strategy.

It is not possible to insert or update rows in a table with the help of a dynamic lookup
transformation alone; an Update Strategy has to be used in combination with the dynamic lookup
to flag the rows based on the value of the NewLookupRow port.

Yes, we can update and insert records from the session properties, but there are two reasons why
we should use an Update Strategy:
1) the mapping becomes more readable;
2) we can apply conditions for insert and update by using the Update Strategy.

Which one is better performance-wise, Joiner or Lookup?

Are you looking up a flat file or a database table? Generally a sorted Joiner is more effective
on flat files than a Lookup, because the sorted Joiner uses a merge join and caches fewer rows,
while a Lookup always caches the whole file. If the file is not sorted, the two can be
comparable. Lookups into a database table can be effective if the database can return sorted
data quickly and the amount of data is small, because the Lookup can then build the whole cache
in memory. If the database responds slowly or a large amount of data is processed, lookup cache
initialization can be really slow (the Lookup waits for the database and stores the cached data
on disk). In that case it can be better to use a sorted Joiner, which writes data to its output
as it reads it from its input.

what is the diffrence between SCD and INCREMENTAL Aggregation?

SCD means Slowly Changing Dimensions. Since dimension tables maintain master data, the column
values occasionally change; such dimension tables are called SCD tables and the fields in them
are called slowly changing dimensions. In order to maintain those changes we follow three types
of methods:
1. SCD Type 1 - this method maintains only the current data.
2. SCD Type 2 - this method maintains the whole history of the dimension; there are three ways
to identify which record is the current one: (a) a current flag, (b) a version number, and
(c) an effective date range.
3. SCD Type 3 - this method maintains the current data and one level of historical data.

INCREMENTAL AGGREGATION: some requirements (daily, weekly, every 15 days, quarterly, ...)
need to aggregate the values of certain columns. Here you have to do the same job every run
(according to the requirement) and add the new aggregate value to the previous aggregate
value (the previous run's value) of those columns. This process is called incremental
aggregation.

Difference between informatica 7.1 and 8.1

The main differences between Informatica 7.1 and 8.1 are as follows:
1. At the architecture level: in 7.x you get two server components, the Repository Server and
the Informatica Server, but in 8.x the architecture was enhanced to a SOA (Service Oriented
Architecture).
2. At the domain level: 7.x has no concept of a domain, but in 8.x the concept of a domain was
introduced, which is a logical grouping of nodes, services and users with a common set of core
services such as log services and authorization services.
Users and group information are stored at the repository level.
3. Communication protocol: in 7.x all client tools and the server communicate with the
Repository Server over TCP/IP, but in 8.x all client tools communicate with the metadata
repository via the Repository Service.
4. Server grid: in 7.x load balancing is round robin and you assign a workflow explicitly to a
pmserver, but in 8.x load balancing is priority based and takes the current load on the
Integration Service into account.
5. Pushdown optimization: there is no concept of pushdown optimization in 7.x, but in 8.x mapping
transformation logic can be executed directly on the database.
6. Version control: versioning is built into the suite, but version control is not available for
user-defined functions in 7.x; 8.x has enhanced versioned control objects, and explicit checkout
of objects is possible from this version onwards.
7. New transformations were added in 8.x, for example the Java transformation.

Answer:

1. Deployment groups
2. Data Masking and HTTP Transformations
3. Grid Support
4. Partitioning based on number of CPUs
5. INSTR REG_REPLACE string functions

6. LDAP authentication in user management
7. SYSTIMESTAMP

Answer:

The architecture of Power Center 8 has changed a lot; PC8 is service-oriented for modularity
scalability and flexibility.
2) The Repository Service and Integration Service (as replacement for Rep Server and
Informatica Server) can be run on different computers in a network (so called nodes) even
redundantly.
3) Management is centralized that means services can be started and stopped on nodes via a
central web interface.
4) Client Tools access the repository via that centralized machine resources are distributed
dynamically.
5) Running all services on one machine is still possible of course.
6) It has a support for unstructured data which includes spreadsheets email Microsoft Word files
presentations and .PDF documents. It provides high availability seamless fail over eliminating
single points of failure.
7) It has added performance improvements (To bump up systems performance Informatica has
added "push down optimization" which moves data transformation processing to the native
relational database I/O engine whenever its is most appropriate.)
8) Informatica has now added more tightly integrated data profiling cleansing and matching
capabilities.
9) Informatica has added a new web based administrative console.
10) Ability to write a Custom Transformation in C++ or Java.
11) Midstream SQL transformation has been added in 8.1.1 not in 8.1.
12) Dynamic configuration of caches and partitioning
13) Java transformation is introduced.
14) User-defined functions.
15) Power Center 8 release has "Append to Target file"

what is the difff between static cache and dynamic cache in which scenario do we use them .give the
example

Static cache: the lookup cache is built once from the lookup source when the first row is
processed and does not change while the session runs; the incoming rows are compared against it
and the results are passed on to the downstream transformations and targets.

A static cache can be used with both connected and unconnected lookup transformations.

Dynamic cache: the cache is kept in sync with the target table; as rows are inserted or updated
in the target, the cache itself is updated during the run. This is typically used when the lookup
is on the target table itself.

A dynamic cache is not possible with an unconnected lookup.

With a static cache, the cache on the lookup table is built only once - when the first row enters
the lookup for a comparison. During the whole session the cache stays as it was, hence the name
static. In contrast, a dynamic cache keeps changing (if there is any change) during the session
run itself: rows are inserted and updated in the cache (depending on the logic and incoming
data). One then has to synchronize the data between the lookup cache and the target table
(dynamic caches are used where the lookup is on the target itself, e.g. when loading a
dimension). For this synchronization we use the NewLookupRow port value in the subsequent
Update Strategy transformation.
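A minimal sketch of the Update Strategy expression that typically follows a dynamic lookup,
keyed off the NewLookupRow port (0 = no change, 1 = inserted into the cache, 2 = updated in the
cache):

IIF( NewLookupRow = 1, DD_INSERT,
IIF( NewLookupRow = 2, DD_UPDATE,
                       DD_REJECT ))

Unchanged rows are rejected so only genuine inserts and updates reach the target, keeping the
target in step with the dynamic cache.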

1. What is the use of tracing levels in transformations?
Tracing levels control how much information about a transformation the Integration Service
writes to the session log.
4. What is the use of control break statements?
They execute a set of codes within the loop and endloop
6. What is the difference between source qualifier transformation and
application source qualifier transformation?
Source qualifier transformation extracts data from RDBMS or from a single flat file
system. Application source qualifier transformation extracts data from application
sources like ERP
7. How do we create primary keys only on odd numbers?
To create the primary key, we use a Sequence Generator with its Start Value set to 1 and its
'Increment By' property set to 2.
9. What is the use of auxiliary mapping?
Auxiliary mapping reflects change in one table whenever there is a change in the
other table.
8. What is authenticator?
It validates user name and password to access the PowerCenter repository.
10. What is a mapplet?
Mapplet is the set of reusable transformation.
11. Which ETL tool is more preferable Informatica or Data Stage and why?
Preference of an ETL tool depends on affordability and functionality. It is mostly a
tradeoff between the price and feature. While Informatica has been a market leader
since the past many years, DataStage is beginning to pick up momentum.
12. What is worklet?
Worklet is an object that represents a set of tasks.
13. What is workflow?
A workflow is a set of instructions that tells the Informatica server how to execute the
tasks.
14. What is session?
A session is a set of instructions to move data from sources to targets.
15. Why do we need SQL overrides in Lookup transformations?

Informatica FAQ – Collected by Abhik Basak


We go for a SQL override in a Lookup when we need to join more than one table, add a filter,
or otherwise change the rows the lookup caches from the lookup source.
16. What is Target Update override?
It overrides the default update statement in the target properties.
19. In what conditions we cannot use Joiner transformation?
• Both pipelines begin with the same original data source.
• Both input pipelines originate from the same Source Qualifier transformation.
• Both input pipelines originate from the same Normalizer transformation.
• Both input pipelines originate from the same Joiner transformation.
• Any of the input pipelines contains an Update Strategy transformation.
• Any of the input pipelines contains a connected or unconnected Sequence Generator transformation.
20. If there is no PK or FK in the target table, how do we update or insert value
into the table?
We take a dynamic lookup on the target and do a comparison with source in an
expression and flag it.

Accenture Pune Interview Question's

1) Different types of Schemas


2) Difference between Star Schema and Snowflake Schema.
3) What is Pull refresh and Incremental load
4) Which case we use Unconnected.
5) Difference between Connected and Unconnected
6) What are the transformations not used in Mapplet
7) Different types of dimensions
8) Different types of Fact
9) What is Factless Fact
10) Is all dimensions in the Start Schema Connected to Fact table
1) Different types of schemas
--> star schema and snowflake schema

2) Difference between Star Schema and Snowflake Schema.


-->in star schema dimension tables are denormalized and fact tables are normalized.
in snow flake both dim and fact tables are normalized

4) Which case we use Unconnected.


-->when u have only single return value
