
Simple Example of SCD Type 1-UPSERT In Informatica

Many beginners struggle to get SCD Type 1 working manually in Informatica. Let's walk through a simple example, step by step.

Assumption: working with the SCOTT user in Oracle.

SRC: create table emps_us as select empno, ename, sal from emp;
TGT: create table empt_us as select empno, ename, sal from scott.emp where 1=2;
alter table empt_us add constraint eno_pk primary key (empno);

Step 1: Get a simple pass-through working. Transfer the data from source to target using Informatica. Note: if this initial load is not done, the update may not work.

Step 2: Get the update working. Objective: when source rows change, update only those changed rows in the target.

SRC: update emps_us set sal = sal + 1000 where empno in (7900, 7902, 7934);

In the PowerCenter Designer, drag in the source and target and create an Update Strategy transformation. Straight-link SRC - UPDTRANS - TGT. Now we need to look up the target to see which rows to update. We do this with an unconnected Lookup transformation, looking for rows where SAL has changed. Create the Lookup transformation, select the target table, and add two input ports, IN_EMPNO and IN_SAL; these values will be supplied when the lookup is called. Add the conditions EMPNO = IN_EMPNO and SAL != IN_SAL. On the Ports tab, enable R (return) for EMPNO; this means that if the conditions are met, a value is returned. Now edit the update strategy expression as:

IIF( NOT ISNULL( :LKP.LKPTRANS(EMPNO,SAL)), DD_UPDATE, DD_REJECT )

Save and create a workflow. Important:
1) Session -> Properties -> Treat Source Rows As -> change it to Data Driven.
2) The target must have a primary key for the update to work.

Now run the workflow and observe that the modified rows get updated.

Step 3: Now let's get the insert working too. Objective: along with updating existing rows, we need to add new rows. PowerCenter Designer:

In the same mapping, drag in one more instance of the target and create a new Update Strategy transformation. Straight-link SRC - NEW UPDTRANS - TGT. Create a Lookup transformation on the target table, add port IN_EMPNO, add the condition EMPNO = IN_EMPNO, and enable R for EMPNO; the lookup returns a value if the condition is satisfied. Now edit the update strategy expression as below:

IIF( ISNULL(:LKP.LKPTRANS(EMPNO)), DD_INSERT, DD_REJECT )

Save and refresh the workflow mapping. Make the following modifications to the source:

insert into emps_us values (1, 'N1', 100);
update emps_us set sal = sal + 1000 where empno in (7900, 7902, 7934);

Start the workflow to see your SCD Type 1 working. The above example gets the basic flow working. You can keep optimizing, perhaps with a single update strategy and a single target instance. Edit the strategy expression as
IIF( ISNULL(:LKP.LKPTRANS_INSERT(EMPNO)),DD_INSERT,IIF( ISNULL(:LKP.LKPTRANS_UPDATE(EMPNO,SAL)) ,DD_REJECT,DD_UPDATE ) )
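To make the combined decision logic concrete, here is a small Python sketch (not Informatica code; the function and the sample data are invented for illustration). The two hypothetical lookups are folded into one function that returns the same flag the combined update strategy expression would produce:

```python
# Hypothetical stand-ins for the two unconnected lookups described above:
# the "insert" lookup matches on EMPNO alone; the "update" lookup matches
# only when the row exists AND its SAL differs from the incoming SAL.
def decide_row(src_row, target):
    """Return the row-level operation the combined Update Strategy would flag."""
    tgt = target.get(src_row["EMPNO"])
    lkp_insert = tgt["EMPNO"] if tgt else None
    lkp_update = tgt["EMPNO"] if tgt and tgt["SAL"] != src_row["SAL"] else None
    if lkp_insert is None:          # no target row -> DD_INSERT
        return "DD_INSERT"
    if lkp_update is None:          # target row exists, SAL unchanged -> DD_REJECT
        return "DD_REJECT"
    return "DD_UPDATE"              # target row exists, SAL changed -> DD_UPDATE

target = {7900: {"EMPNO": 7900, "SAL": 950}}
print(decide_row({"EMPNO": 7900, "SAL": 1950}, target))  # DD_UPDATE
print(decide_row({"EMPNO": 1, "SAL": 100}, target))      # DD_INSERT
print(decide_row({"EMPNO": 7900, "SAL": 950}, target))   # DD_REJECT
```

The nesting mirrors the expression above: the insert check runs first, and only rows that already exist in the target fall through to the changed/unchanged test.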

SCD Type-1 Implementation in Informatica using dynamic Lookup


SCD Type-1: A Type 1 change overwrites an existing dimensional attribute with new information. In the customer name-change example, the new name overwrites the old name, and the value for the old version is lost. A Type 1 change updates only the attribute, inserts no new records, and affects no keys. It is easy to implement but does not maintain any history of prior attribute values.
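The overwrite semantics can be shown in a few lines of Python (an in-memory sketch only; the dimension, key, and attribute names are made up for illustration):

```python
# Minimal illustration of Type 1 semantics on an in-memory "dimension":
# the new attribute value simply replaces the old one; no history row is kept.
customer_dim = {1001: {"name": "Jane Smith", "city": "Austin"}}

def scd_type1_update(dim, key, new_attrs):
    dim[key].update(new_attrs)   # overwrite in place; the prior value is lost

scd_type1_update(customer_dim, 1001, {"name": "Jane Doe"})
print(customer_dim[1001]["name"])  # Jane Doe -- the old name is gone
```

Note that no new key is generated and no second row appears: that is exactly what distinguishes Type 1 from Type 2.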

Implementation:

Source: Create the CUST source using the following script.

CREATE TABLE CUST (
  CUST_ID NUMBER,
  CUST_NM VARCHAR2(250 BYTE),
  ADDRESS VARCHAR2(250 BYTE),
  CITY VARCHAR2(50 BYTE),
  STATE VARCHAR2(50 BYTE),
  INSERT_DT DATE,
  UPDATE_DT DATE);

Target:

CREATE TABLE STANDALONE.CUST_D (
  PM_PRIMARYKEY INTEGER,
  CUST_ID NUMBER,
  CUST_NM VARCHAR2(250 BYTE),
  ADDRESS VARCHAR2(250 BYTE),
  CITY VARCHAR2(50 BYTE),
  STATE VARCHAR2(50 BYTE),
  INSERT_DT DATE,
  UPDATE_DT DATE);

CREATE UNIQUE INDEX STANDALONE.CUST_D_PK ON STANDALONE.CUST_D(PM_PRIMARYKEY);
ALTER TABLE CUST_D ADD (CONSTRAINT CUST_D_PK PRIMARY KEY (PM_PRIMARYKEY));

Import the source and target into Informatica using the Source Analyzer and Target Designer. Create the mapping m_Use_Dynamic_Cache_To_SCD_Type1 and drag the CUST source from Sources into the Mapping Designer.

Create lookup transformation lkp_CUST_D for CUST_D target table.

Create input ports in_CUST_ID, in_CUST_NM, in_ADDRESS, in_CITY and in_STATE in the lkp_CUST_D transformation. Connect CUST_ID, CUST_NM, ADDRESS, CITY and STATE from the source qualifier to the corresponding in_CUST_ID, in_CUST_NM, in_ADDRESS, in_CITY and in_STATE ports of lkp_CUST_D. In the Conditions tab, create the condition CUST_ID = in_CUST_ID.

Select the Dynamic Cache and Insert Else Update options in the lookup transformation properties.

Assign the lookup port associations as shown in the screenshot below.

Create an expression transformation, drag all the attributes from the lookup transformation into it, and rename the ports with respect to the source or target attributes, so that it is easy to tell which fields come from the source and which from the target.

Create one output port in the expression transformation to pass the load date to the target, and assign SYSDATE to it in the expression editor.

Create a router transformation and drag the attributes from the expression transformation into it, as shown in the screenshot below.

Create two groups in the router transformation, one for INSERT and another for UPDATE. Give the condition NewLookupRow=1 for the insert group and NewLookupRow=2 for the update group.
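The NewLookupRow values the router tests come from the dynamic lookup cache. Here is a hedged Python sketch of that behaviour (an in-memory imitation, not the PowerCenter implementation; the port names are taken from the CUST example above):

```python
# Sketch of how a dynamic lookup cache assigns NewLookupRow:
# 0 = key already in cache and row unchanged, 1 = inserted into cache (new key),
# 2 = updated in cache (existing key, changed attributes).
def dynamic_lookup(cache, row, key="CUST_ID"):
    k = row[key]
    if k not in cache:
        cache[k] = dict(row)
        return 1                     # NewLookupRow = 1 -> routed to the INSERT group
    if cache[k] != row:
        cache[k] = dict(row)
        return 2                     # NewLookupRow = 2 -> routed to the UPDATE group
    return 0                         # unchanged -> dropped by the router

cache = {}
print(dynamic_lookup(cache, {"CUST_ID": 80001, "CITY": "Bangalore"}))  # 1
print(dynamic_lookup(cache, {"CUST_ID": 80001, "CITY": "Pune"}))       # 2
print(dynamic_lookup(cache, {"CUST_ID": 80001, "CITY": "Pune"}))       # 0
```

Because unchanged rows get NewLookupRow = 0, neither router group picks them up, which is what keeps the session from re-writing rows that did not change.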

Connect the insert group from the router to the insert pipeline of the target, and the update group to the update pipeline of the target through an update strategy transformation.

In the update strategy transformation upd_INSERT give the condition DD_INSERT, and give DD_UPDATE in the upd_UPDATE update strategy transformation. Create workflow wkfl_Use_Dynamic_Cache_To_SCD_Type1 with session s_Use_Dynamic_Cache_To_SCD_Type1 for mapping m_Use_Dynamic_Cache_To_SCD_Type1.

With this, the coding for SCD Type 1 using a dynamic lookup transformation is complete.

Execution: Insert records into the source CUST table using the following insert scripts.

SET DEFINE OFF;
Insert into CUST (CUST_ID, CUST_NM, ADDRESS, CITY, STATE, INSERT_DT, UPDATE_DT) Values (80001, 'Marion Atkins', '100 Main St.', 'Bangalore', 'KA', SYSDATE, SYSDATE);
Insert into CUST (CUST_ID, CUST_NM, ADDRESS, CITY, STATE, INSERT_DT, UPDATE_DT) Values (80002, 'Laura Jones', '510 Broadway Ave.', 'Hyderabad', 'AP', SYSDATE, SYSDATE);
Insert into CUST (CUST_ID, CUST_NM, ADDRESS, CITY, STATE, INSERT_DT, UPDATE_DT) Values (80003, 'Jon Freeman', '555 6th Ave.', 'Bangalore', 'KA', SYSDATE, SYSDATE);
COMMIT;

The data in the source will look like below.

Start the workflow after inserting the records into the CUST table. After the workflow completes, all the records will be loaded into the target and the data will look like below.

Now update any record in the source and re-run the workflow; it will update that record in the target. Any records in the source that are not present in the target will be inserted into the target table.

SCD type1 step by step example:


Step 1: Get an EMP source table. This table is available in the SCOTT user:

EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO

Step 2: Get a target table for the same (note this has a newly created surrogate key):

EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO SK

SQL> create table empscd1 as select * from scott.emp; (you might have to grant permissions on scott.emp in case this does not work)
SQL> alter table empscd1 add sk number(15) primary key;

Now, we have the source and the target tables created.

Let's start off with the mapping.

Step 3: In the Informatica Designer, get the source and the target tables we just created above.
Step 4: Now get a lookup and select the target table to look up.
Step 5: Drag and drop all the source ports into the lookup transformation. (Now we have the target table's ports at the top of the lookup and the source ports, suffixed with "1", on the lower side of the lookup transformation.)
Step 6: Get an expression transformation and connect all the ports in the lookup transformation to it. Now you need to add two more columns here:
1. Insert_flg of integer(15) type. Give the expression for this as: IIF(ISNULL(SK) OR ISNULL(EMPNO), 1, 0)
2. Update_flg of integer(15) type. Give the expression for this as: IIF(NOT ISNULL(SK) AND ( (ENAME != ENAME1) OR (JOB != JOB1) OR (MGR != MGR1) OR (HIREDATE != HIREDATE1) OR (SAL != SAL1) OR (COMM != COMM1) OR (DEPTNO != DEPTNO1) ), 1, 0)

Step 7: Get a router transformation and connect the source-side ports (the ones with "1" suffixed) to it. Make sure the Insert_flg and Update_flg columns you created are also connected, and create two groups as follows:
Group 1. Insert_rows. Group filter condition: Insert_flg
Group 2. Update_rows. Group filter condition: Update_flg
Step 8: Now connect the Insert_rows group of the router to the target. Note: here you should not connect the SK column from the router; instead use a Sequence Generator transformation and connect its NEXTVAL port to the SK of the target, so that the sequence advances to the next value whenever a new row is added.
Step 9: Connect the Update_rows group of the router to an Update Strategy transformation with a strategy expression of DD_UPDATE, and then connect it to target instance 2. (Note: target instance 2 is just a second instance of the same target table; it is not a separate table in the target DB.)
Step 10: Now create a workflow and session for the mapping and run it. Make sure you commit the changes made on the source side :-)

The logic is very simple:
1. First the lookup checks its cache of the target table to see whether a given row exists.
2. If the SK does not exist, the row is inserted into the target/dimension table.
3. The sequence generator then supplies the next SK value for the target/dimension table.
4. If the SK exists, the condition falls through to Update_flg and a DD_UPDATE is done on the corresponding row in the target table.
5. The same process continues with the next row onwards.
6. Note: the SK in the target table must be a primary key, without fail.
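The row-by-row flow above can be sketched in Python (an illustrative stand-in only; the function, the in-memory target, and the sample values are invented, and `itertools.count` plays the role of the Sequence Generator):

```python
import itertools

# Hypothetical walk-through of the mapping's row-by-row logic: look up the
# surrogate key (SK), insert with the next sequence value when it is missing,
# otherwise overwrite the changed attributes in place (Type 1).
seq = itertools.count(1)             # stands in for the Sequence Generator

def process_row(src, target_by_empno):
    existing = target_by_empno.get(src["EMPNO"])
    if existing is None:             # Insert_flg = 1
        target_by_empno[src["EMPNO"]] = {"SK": next(seq), **src}
        return "INSERT"
    if any(existing[c] != src[c] for c in src):  # Update_flg = 1
        existing.update(src)         # DD_UPDATE; the SK is preserved
        return "UPDATE"
    return "NO_CHANGE"

tgt = {}
print(process_row({"EMPNO": 7369, "SAL": 800}, tgt))   # INSERT
print(process_row({"EMPNO": 7369, "SAL": 900}, tgt))   # UPDATE
print(tgt[7369]["SK"])                                 # 1
```

Note how the SK is assigned exactly once, on insert, and never changes afterwards; that is why the sequence generator sits only on the insert branch.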

Slowly Changing Dimensions (SCDs) are dimensions whose data changes slowly, rather than on a regular, time-based schedule.

For example, you may have a dimension in your database that tracks the sales records of your company's salespeople. Creating sales reports seems simple enough, until a salesperson is transferred from one regional office to another. How do you record such a change in your sales dimension? You could sum or average the sales by salesperson, but if you use that to compare the performance of salespeople, it might give misleading information. If the salesperson who was transferred used to work in a hot market where sales were easy, and now works in a market where sales are infrequent, her totals will look much stronger than those of the other salespeople in her new region, even if they are just as good. Or you could create a second salesperson record and treat the transferred person as a new salesperson, but that creates problems also. Dealing with these issues involves SCD management methodologies:

Type 1: The Type 1 methodology overwrites old data with new data, and therefore does not track historical data at all. This is most appropriate when correcting certain types of data errors, such as the spelling of a name (assuming you won't ever need to know how it used to be misspelled in the past). Here is an example of a database table that keeps supplier information:

Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Supply Co  CA

In this example, Supplier_Code is the natural key and Supplier_Key is a surrogate key. Technically, the surrogate key is not necessary, since the table will be unique by the natural key (Supplier_Code). However, joins will perform better on an integer than on a character string. Now imagine that this supplier moves its headquarters to Illinois. The updated table would simply overwrite this record:

Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Supply Co  IL

The obvious disadvantage to this method of managing SCDs is that there is no historical record kept in the data warehouse.
You can't tell if your suppliers are tending to move to the Midwest, for example. But an advantage of Type 1 SCDs is that they are very easy to maintain.

Explanation with an example:

Source Table (01-01-11):

Empno  Ename  Sal
101    A      1000
102    B      2000
103    C      3000

Target Table (01-01-11):

Empno  Ename  Sal
101    A      1000
102    B      2000
103    C      3000

The necessity of the lookup transformation is illustrated using the above source and target tables.

Source Table (01-02-11):

Empno  Ename  Sal
101    A      1000
102    B      2500
103    C      3000
104    D      4000

Target Table (01-02-11):

Empno  Ename  Sal
101    A      1000
102    B      2500
103    C      3000
104    D      4000

In the second month, one more employee (Ename D) has been added to the table, and the salary of employee 102 has changed from 2000 to 2500.

Step 1: Import the source table and target tables.


Create a table named emp_source with three columns, as shown above, in Oracle. Import the source from the Source Analyzer. In the same way, create two target tables named emp_target1 and emp_target2. Go to the Targets menu and click Generate and Execute to confirm the creation of the target tables. A snapshot of the connections between the different transformations is shown below.

Step 2: Design the mapping and apply the necessary transformation.

In this mapping we use four kinds of transformations, namely the Lookup, Expression, Filter, and Update Strategy transformations. The necessity and usage of each transformation is discussed in detail below.

Look up Transformation: The purpose of this transformation is to determine whether to insert, Delete, Update or reject the rows in to target table.

The first thing we are going to do is create a lookup transformation and connect the Empno from the source qualifier to it. The snapshot for choosing the target table is shown below.

What the Lookup transformation does in our mapping is look into the target table (emp_target) and compare it with the source qualifier to determine whether to insert, update, delete or reject rows. In the Ports tab we should add a new column and name it Empno1; this is the column we connect from the source qualifier. For the first column the Input port should be unchecked, while the Output and Lookup boxes should be checked. For the newly created column, only the Input and Output boxes should be checked. In the Properties tab: (i) Lookup Table Name -> Emp_Target.

(ii) Lookup Policy on Multiple Match -> Use First Value. (iii) Connection Information -> Oracle.

In the Conditions tab (i) Click on Add a new condition

(ii) The Lookup Table Column should be Empno, the Transformation Port should be Empno1, and the Operator should be =.

Expression Transformation: After we are done with the lookup transformation, we use an expression transformation to check whether we need to insert the records or update them. The steps to create the expression transformation are shown below.

Drag all the columns from both the source and the lookup transformation and drop them onto the expression transformation. Now double-click on the transformation, go to the Ports tab, and create two new columns named insert and update. Both of these columns are output-only, so only the Output checkbox should be checked for them. The snapshot of the Edit Transformation window is shown below.

The conditions we want our two output ports to evaluate are listed below.

insert: ISNULL(EMPNO1)
update: IIF(NOT ISNULL(EMPNO1) AND DECODE(SAL, SAL1, 1, 0) = 0, 1, 0)

We are all done here. Click on Apply and then OK.

Filter Transformation: We are going to have two filter transformations, one to insert and the other to update.

Connect the insert column from the expression transformation to the insert column in the first filter transformation, and in the same way connect the update column in the expression transformation to the update column in the second filter. Then connect Empno, Ename and Sal from the expression transformation to both filter transformations. Filter transformation 1 forwards new rows to update strategy transformation 1, and those rows appear in the target table.

If there is a change in the input data, filter transformation 2 forwards those rows to update strategy transformation 2, which forwards the updated rows to the target table. Go to the Properties tab of the Edit Transformation window:

(i) The value of the filter condition for filter 1 is insert. (ii) The value of the filter condition for filter 2 is update.

The Closer view of the filter Connection is shown below.

Update Strategy Transformation: Determines whether to insert, delete, update or reject the rows.

Drag the respective Empno, Ename and Sal ports from the filter transformations and drop them on the respective update strategy transformations. In the Properties tab, set the update strategy expression to 0 (DD_INSERT) on the first update strategy transformation, and to 1 (DD_UPDATE) on the second. We are all set; finally, connect the outputs of the update strategy transformations to the target tables.

Step 3: Create the task and Run the work flow.

Don't check the Truncate Table option.

Change the target load type from Bulk to Normal. Run the workflow from the task.

Step 4: Preview the Output in the target table.

Create/Design/Implement SCD Type 1 Mapping in Informatica


Q) How do you create, implement, or design a slowly changing dimension (SCD) Type 1 using the Informatica ETL tool?

The SCD Type 1 method is used when there is no need to store historical data in the dimension table; it overwrites the old data with the new data. The process involved in the implementation of SCD Type 1 in Informatica is:

Identifying the new record and inserting it into the dimension table. Identifying the changed record and updating the dimension table.

We will see the implementation of SCD Type 1 using the customer dimension table as an example. The source table looks like:
CREATE TABLE Customers ( Customer_Id Number, Customer_Name Varchar2(30), Location Varchar2(30) )

Now I have to load the data of the source into the customer dimension table using SCD Type 1. The Dimension table structure is shown below.
CREATE TABLE Customers_Dim ( Cust_Key Number, Customer_Id Number, Customer_Name Varchar2(30), Location Varchar2(30) )

Steps to Create SCD Type 1 Mapping Follow the below steps to create SCD Type 1 mapping in informatica

Create the source and dimension tables in the database. Open the mapping designer tool, source analyzer and either create or import the source definition. Go to the Warehouse designer or Target designer and import the target definition. Go to the mapping designer tab and create new mapping. Drag the source into the mapping. Go to the toolbar, Transformation and then Create. Select the lookup Transformation, enter a name and click on create. You will get a window as shown in the below image.

Select the customer dimension table and click on OK.

Edit the lkp transformation, go to the properties tab, and add a new port In_Customer_Id. This new port needs to be connected to the Customer_Id port of source qualifier transformation.

Go to the condition tab of lkp transformation and enter the lookup condition as Customer_Id = IN_Customer_Id. Then click on OK.

Connect the customer_id port of source qualifier transformation to the IN_Customer_Id port of lkp transformation. Create the expression transformation with input ports as Cust_Key, Name, Location, Src_Name, Src_Location and output ports as New_Flag, Changed_Flag For the output ports of expression transformation enter the below expressions and click on ok

New_Flag = IIF(ISNULL(Cust_Key),1,0) Changed_Flag = IIF(NOT ISNULL(Cust_Key) AND (Name != Src_Name OR Location != Src_Location), 1, 0 )
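The two expression-port formulas can be translated directly into Python (a plain-language restatement, not Informatica code; `None` stands in for a NULL lookup result):

```python
# The New_Flag / Changed_Flag formulas above, rewritten as plain Python.
# cust_key is the value returned by the lookup (None when no match was found);
# name/location come from the lookup, src_name/src_location from the source.
def new_flag(cust_key):
    return 1 if cust_key is None else 0          # IIF(ISNULL(Cust_Key), 1, 0)

def changed_flag(cust_key, name, location, src_name, src_location):
    # IIF(NOT ISNULL(Cust_Key) AND (Name != Src_Name OR Location != Src_Location), 1, 0)
    return 1 if cust_key is not None and (name != src_name or location != src_location) else 0

print(new_flag(None))                                # 1 -> new customer, insert branch
print(changed_flag(5, "John", "NY", "John", "LA"))   # 1 -> location changed, update branch
print(changed_flag(5, "John", "NY", "John", "NY"))   # 0 -> unchanged, row dropped
```

The two flags are mutually exclusive by construction: a row with no Cust_Key can never set Changed_Flag, so each source row feeds at most one of the two filter branches described next.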

Now connect the ports of the lkp transformation (Cust_Key, Name, Location) to the expression transformation ports (Cust_Key, Name, Location), and the ports of the source qualifier transformation (Name, Location) to the expression transformation ports (Src_Name, Src_Location), respectively. The mapping diagram created so far is shown in the below image.

Create a filter transformation and drag the ports of the source qualifier transformation into it; also drag the New_Flag port from the expression transformation into it. Edit the filter transformation, go to the Properties tab and enter the filter condition New_Flag=1, then click OK. Now create an update strategy transformation and connect all the ports of the filter transformation (except the New_Flag port) to it. Go to the Properties tab of the update strategy and enter the update strategy expression DD_INSERT. Now drag the target definition into the mapping and connect the appropriate ports from the update strategy to it. Create a sequence generator transformation and connect its NEXTVAL port to the target surrogate key (Cust_Key) port. The part of the mapping diagram for inserting a new row is shown below:

Now create another filter transformation and drag the Cust_Key port from the lkp transformation, the Name and Location ports from the source qualifier transformation, and the Changed_Flag port from the expression transformation into it. Edit the filter transformation, go to the Properties tab and enter the filter condition Changed_Flag=1, then click OK. Now create an update strategy transformation and connect the ports of the filter transformation (Cust_Key, Name, and Location) to it. Go to the Properties tab of the update strategy and enter the update strategy expression DD_UPDATE. Now drag the target definition into the mapping and connect the appropriate ports from the update strategy to it. The complete mapping diagram is shown in the below image.

Update Without Update Strategy for Better Session Performance


You might have come across an ETL scenario where you need to update a huge table with a few changed records and occasional inserts. The straightforward approach of using a Lookup transformation to identify the inserts and updates, plus an Update Strategy to perform them, may not be right for this scenario, mainly because the Lookup transformation's performance degrades as the lookup table size increases.

In this article, let's talk about a design that can take care of the scenario just described.

The Theory
When you configure an Informatica PowerCenter session, you have several options for handling database operations such as insert, update, delete.

Specifying an Operation for All Rows

During session configuration, you can select a single database operation for all rows using the Treat Source Rows As setting on the Properties tab of the session.
1. Insert :- Treat all rows as inserts.
2. Delete :- Treat all rows as deletes.
3. Update :- Treat all rows as updates.
4. Data Driven :- The Integration Service follows the instructions coded into Update Strategy transformations to flag rows for insert, delete, update, or reject.

Specifying Operations for Individual Target Rows


Once you determine how to treat all rows in the session, you can also set options for individual rows, which gives additional control over how each row behaves. Define these options in the Transformations view on the Mapping tab of the session properties.
1. Insert :- Select this option to insert a row into a target table.
2. Delete :- Select this option to delete a row from a table.
3. Update :- You have the following options in this situation:
   Update as Update :- Update each row flagged for update if it exists in the target table.
   Update as Insert :- Insert each row flagged for update.
   Update else Insert :- Update the row if it exists; otherwise, insert it.
4. Truncate Table :- Select this option to truncate the target table before loading data.
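The Update else Insert behaviour the design below relies on is essentially a database upsert. Here is a hedged sketch of the same semantics using SQLite's `INSERT ... ON CONFLICT` (an analogy only, not what PowerCenter emits; the table and column names are invented, and the upsert syntax requires SQLite 3.24+):

```python
import sqlite3

# "Update else Insert" sketched as a SQLite upsert: when the key exists the
# row is updated in place, otherwise it is inserted -- one statement per row,
# which mirrors the per-row decision the session makes for the target.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE cust_d (cust_id INTEGER PRIMARY KEY, city TEXT)")

def load(cust_id, city):
    con.execute(
        "INSERT INTO cust_d (cust_id, city) VALUES (?, ?) "
        "ON CONFLICT(cust_id) DO UPDATE SET city = excluded.city",
        (cust_id, city),
    )

load(80001, "Bangalore")   # key not present -> inserted
load(80001, "Pune")        # key exists -> updated in place
print(con.execute("SELECT city FROM cust_d WHERE cust_id = 80001").fetchone()[0])  # Pune
```

Note the table still ends up with a single row for cust_id 80001; no history is kept, which is exactly the SCD Type 1 behaviour the rest of this document describes.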

Design and Implementation


Now that we understand these properties, we can use them in our design implementation.

We can create the mapping just like an 'INSERT'-only mapping, without a Lookup or Update Strategy transformation. During session configuration, let's set the session properties so that the session can both insert and update.

First, set the Treat Source Rows As property as shown in the image below.

Now let's set the properties for the target table as shown below: choose the Insert and Update else Insert properties.

That's all we need to set up the session for update and insert without an update strategy.

Hope you enjoyed this article. Please leave us a comment below, if you have any difficulties implementing this. We will be more than happy to help you.

Update Strategy transformation


Update Strategy transformation: an active and connected transformation. The Update Strategy transformation is used to insert, update, delete or reject rows coming from the source based on some condition. For example, if the address of a customer changes, we can either update the old address or keep both the old and new addresses, one row for the old and one for the new; this way we maintain the historical data. Update Strategy is often used with a Lookup transformation: in a data warehouse, we create a lookup on the target table to determine whether a row already exists, and then insert, update, delete or reject the source record as per the business need. In PowerCenter, we set the update strategy at two different levels:

1. Within a session 2. Within a Mapping

Update strategy within a session: When we configure a session, we can instruct the Integration Service to either treat all rows in the same way or use instructions coded into the session mapping to flag rows for different database operations.

Session configuration: Edit Session -> Properties -> Treat Source Rows As: (Insert, Update, Delete, Data Driven). Insert is the default.

Specifying operations for individual target tables. You can set the following update strategy options:
1. Insert: Select this option to insert a row into a target table.
2. Delete: Select this option to delete a row from a table.
3. Update: We have the following options in this situation:
   i. Update as Update. Update each row flagged for update if it exists in the target table.
   ii. Update as Insert. Insert each row flagged for update.
   iii. Update else Insert. Update the row if it exists; otherwise, insert it.
4. Truncate: Select this option to truncate the target table before loading data.

Flagging rows within a mapping: Within a mapping, we use the Update Strategy transformation to flag rows for insert, delete, update, or reject.

Operation  Constant   Numeric Value
INSERT     DD_INSERT  0
UPDATE     DD_UPDATE  1
DELETE     DD_DELETE  2
REJECT     DD_REJECT  3

Update strategy expressions: Frequently, the update strategy expression uses the IIF or DECODE function from the transformation language to test each row to see if it meets a particular condition. You can write these expressions on the Properties tab of the Update Strategy transformation.

IIF( ( ENTRY_DATE > APPLY_DATE), DD_REJECT, DD_UPDATE )
or
IIF( ( ENTRY_DATE > APPLY_DATE), 3, 1 )

Note: We can configure the Update Strategy transformation to either pass rejected rows to the next transformation or drop them; see the option on its Properties tab.
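The constant-to-number mapping and the example expression translate to Python as follows (a restatement for clarity; the date values are invented sample data):

```python
# The DD_* constants and their numeric values from the table above,
# plus the example expression rewritten in Python.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def flag_row(entry_date, apply_date):
    # IIF(ENTRY_DATE > APPLY_DATE, DD_REJECT, DD_UPDATE)
    return DD_REJECT if entry_date > apply_date else DD_UPDATE

print(flag_row("2011-08-18", "2011-08-17"))  # 3 -> rejected
print(flag_row("2011-08-16", "2011-08-17"))  # 1 -> updated
```

Using the named constants rather than the raw numbers keeps the expression readable; the numeric form behaves identically.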

Understanding Treat Source Rows property and Target Insert, Update properties
Posted by Ankur Nigam on August 17, 2011

Informatica has a plethora of options to perform IUD (Insert, Update, Delete) operations on tables. One of the most common methods is using the Update Strategy, while the underdog is setting Treat Source Rows to {Insert; Update; Delete} rather than Data Driven. I will be focusing on the latter in this topic.

In simple terms, when you set the Treat Source Rows property, it tells Informatica that each row has to be tagged as Insert, Update or Delete. This property, coupled with the target-level properties allowing Insert, Update and Delete, works wonders even in the absence of an Update Strategy. It also leads to a clear-cut mapping design. I am not opposing the use of Update Strategy, but this approach gives the mapping a certain openness: I don't have to peek into the strategy expression, e.g. IIF(ISNULL(PK)=1, DD_INSERT, DD_UPDATE), to see what action it performs. Let's buckle up our belts and go on a ride to understand the use of these properties. Assume a scenario where I have the following table structure in Stage

Keeping things simple the target table would be something like this

As you can see, the target has UserID as a surrogate key, which I will populate through a sequence. Also note that Username is unique. Now I have a scenario where I have to update the existing records and insert the new ones as supplied in the staging table. Before we begin writing code, let's first understand TSA (Treat Source rows As) and the target properties in more detail. Treat Source Rows accepts 4 settings:
1. Insert :- Informatica will mark all rows read from the source as Insert; the rows will only be inserted.
2. Update :- Informatica will mark all rows read from the source as Update; when the rows arrive at the target, they are to be updated in it.
3. Delete :- The rows will be marked as to be deleted from the target once they have been read from the source.
4. Data Driven :- This tells Informatica that we are using an Update Strategy to indicate what has to be done with the rows, so no marking is done when rows are read from the source; what has to be done with each row arriving at the target is decided immediately before any IUD operation on the target.

However, setting TSA alone will not let you modify rows in the target. Each target must itself be able to accept, or rather allow, IUD operations. So once you have set the TSA property, you also have to set the target-level properties controlling whether rows can be inserted into, updated in, or deleted from the target. This can be done in the following ways:

Insert and Delete are self-explanatory; Update, however, is categorized into 3 sections. Please note that setting any of them will allow updates on your tables:
1. Update as Update :- This simple property says that when a row arrives at the target, it has to be updated in the target. If you check the logs, Informatica will generate an update template something like UPDATE INFA_TARGET_RECORDS SET EMAIL = ? WHERE USERNAME = ?
2. Update as Insert :- This means that when a row flagged for update arrives at the target, the update behaviour is to insert that row instead. In this case Informatica will not generate an update template for the target; the incoming row will be inserted using the template INSERT INTO INFA_TARGET_RECORDS (USERID, USERNAME, EMAIL) VALUES (?, ?, ?)
3. Update else Insert :- The incoming row flagged as update will be either updated or inserted. In a nutshell, if the key column of the incoming row is already present in the target, Informatica will intelligently update that row in the target; if the incoming key column is not present in the target, the row will be inserted.

PS :- The last two properties require you to also set the Insert property of the target, because if it is not checked then Update as Insert and Update else Insert will not work, and the session will fail stating that the target does not allow inserts. Why? Simply because these update clauses have an insert hidden in them. OK, enough theory; let's get our hands dirty. Coming back to our scenario, we have the rows read from the source and want them either inserted into or updated in the target, depending on whether they are already present there. My mapping looks something like this:

Here I have used a lookup table to fetch the user ID for each username coming in from the stage. In the router the following groups have been set:

The output from the router is sent to the respective instance of the target (INFA_TARGET_RECORDS) depending on whether the user exists: INFA_TARGET_RECORDS_NEW for new records and INFA_TARGET_RECORDS_UPD for existing records. Once this is in place I set the Treat Source Rows As property to Update for this session. Also, to enable Informatica to insert into the table, I have to:

1. Set the Insert and Update as Insert properties on the INFA_TARGET_RECORDS_NEW instance.
2. Set the Update as Update property on the INFA_TARGET_RECORDS_UPD instance in the session.

What actually happened is that I treated all rows from the source as flagged for update. Secondly, I modified the update behaviour of one target instance to Update as Insert; thanks to this property, the "update" actually lets me insert rows into the target. When the session runs, it updates the existing rows and inserts the new ones (actually update-as-insert). Try it out and let me know if it works for you. I am not attaching a demo run, because you will understand what is happening behind the scenes even more clearly by doing it yourself.
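To make the Update else Insert behaviour concrete, here is a minimal sketch in Python with sqlite3 (not Informatica itself) of what the engine effectively does: try the UPDATE first, and fall back to an INSERT when no row matched the key. The table and column names follow the article's example; the helper function name is my own.

```python
import sqlite3

# In-memory stand-in for the target table from the article.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE INFA_TARGET_RECORDS "
             "(USERID INTEGER, USERNAME TEXT PRIMARY KEY, EMAIL TEXT)")
conn.execute("INSERT INTO INFA_TARGET_RECORDS VALUES (1, 'alice', 'alice@old.com')")

def update_else_insert(row):
    # Try the update template first (same shape as the article's template).
    cur = conn.execute(
        "UPDATE INFA_TARGET_RECORDS SET EMAIL = ? WHERE USERNAME = ?",
        (row["EMAIL"], row["USERNAME"]))
    if cur.rowcount == 0:  # key not present in target -> insert instead
        conn.execute(
            "INSERT INTO INFA_TARGET_RECORDS (USERID, USERNAME, EMAIL) "
            "VALUES (?, ?, ?)",
            (row["USERID"], row["USERNAME"], row["EMAIL"]))

update_else_insert({"USERID": 1, "USERNAME": "alice", "EMAIL": "alice@new.com"})  # updates
update_else_insert({"USERID": 2, "USERNAME": "bob", "EMAIL": "bob@new.com"})      # inserts

print(sorted(conn.execute("SELECT USERNAME, EMAIL FROM INFA_TARGET_RECORDS")))
```

Note how the insert path only exists because the fallback branch runs; this is exactly why the target's Insert property must also be checked in the session.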

Informatica Mapping Insert Update Delete


There are situations where you need to keep a source and a target in sync. One method is to truncate and reload. However, that method is not efficient for a table with millions of rows of data. You really only want to:

- insert rows from the source that don't exist in the target
- update rows that have changed
- delete rows from the target that no longer exist in the source

Can you do this efficiently in a single Informatica mapping?

Here is a picture of the informatica mapping:

Here is the detailed informatica mapping:


1. Insert the Source and Source Qualifier for the source table.
2. Insert the Source and Source Qualifier for the target table (read as a source).
3. Sort both source and target in their Source Qualifiers by the key fields.
4. Insert a Joiner Transformation using a Full Outer Join and select the Sorted Input option.
5. Insert a Router Transformation with 3 groups:
   1. Insert: ISNULL(Target_PK)
   2. Delete: ISNULL(Source_PK)
   3. Default: used for Update
6. Insert an Update Strategy Transformation coming from the Delete group using DD_DELETE, and connect it to the target.
7. Insert a Filter Transformation coming from the Update group with a condition such as:
   ( DECODE(Source_Field1, Target_Field1, 1, 0) = 0 OR DECODE(Source_Field2, Target_Field2, 1, 0) = 0 )
   Modify as needed to compare all non-key fields.
8. Insert an Update Strategy Transformation coming from the Filter Transformation using DD_UPDATE, and connect it to the target.
9. Connect the Insert group of the Router Transformation to the target.
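The joiner/router logic in the steps above can be sketched in plain Python: full-outer-join source and target on the key, then classify each key as an insert, delete, or update. The field values here are illustrative, not from the article.

```python
# Source and target rows keyed by primary key (empno), mimicking the
# sorted full outer join on the key fields.
source = {101: {"ename": "A", "sal": 1000},
          102: {"ename": "B", "sal": 2500},   # changed in source
          104: {"ename": "D", "sal": 4000}}   # new in source
target = {101: {"ename": "A", "sal": 1000},
          102: {"ename": "B", "sal": 2000},
          103: {"ename": "C", "sal": 3000}}   # no longer in source

# Router group conditions, expressed as set operations on the keys:
inserts = sorted(source.keys() - target.keys())      # ISNULL(Target_PK)
deletes = sorted(target.keys() - source.keys())      # ISNULL(Source_PK)
# Default group + filter: key in both, but a non-key field differs.
updates = sorted(k for k in source.keys() & target.keys()
                 if source[k] != target[k])

print(inserts, deletes, updates)  # [104] [103] [102]
```

Row 101 falls through all three paths (present and unchanged), which is why the filter on the Update group is needed: without it, every matched row would be re-written to the target.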

Please leave a comment if you have questions on this Informatica mapping process.

SCD - Type 1
Slowly Changing Dimensions (SCDs) are dimensions whose data changes slowly, rather than on a time-based, regular schedule.

For example, you may have a dimension in your database that tracks the sales records of your company's salespeople. Creating sales reports seems simple enough, until a salesperson is transferred from one regional office to another. How do you record such a change in your sales dimension? You could sum or average the sales by salesperson, but if you use that to compare the performance of salespeople, it might give misleading information. If the salesperson who was transferred used to work in a hot market where sales were easy, and now works in a market where sales are infrequent, her totals will look much stronger than those of the other salespeople in her new region, even if they are just as good. Or you could create a second salesperson record and treat the transferred person as a new salesperson, but that creates problems too. Dealing with these issues is what SCD management methodologies are for.

Type 1: The Type 1 methodology overwrites old data with new data, and therefore does not track historical data at all. This is most appropriate when correcting certain types of data errors, such as the spelling of a name (assuming you won't ever need to know how it used to be misspelled in the past). Here is an example of a database table that keeps supplier information:

Supplier_Key | Supplier_Code | Supplier_Name  | Supplier_State
123          | ABC           | Acme Supply Co | CA

In this example, Supplier_Code is the natural key and Supplier_Key is a surrogate key. Technically, the surrogate key is not necessary, since the table will be unique by the natural key (Supplier_Code). However, joins perform better on an integer than on a character string. Now imagine that this supplier moves its headquarters to Illinois. The updated table would simply overwrite this record:

Supplier_Key | Supplier_Code | Supplier_Name  | Supplier_State
123          | ABC           | Acme Supply Co | IL

The obvious disadvantage of this method of managing SCDs is that no historical record is kept in the data warehouse. You can't tell, for example, whether your suppliers are tending to move to the Midwest. An advantage of Type 1 SCDs, however, is that they are very easy to maintain.

Explanation with an Example:

Source Table (01-01-11):

Empno | Ename | Sal
101   | A     | 1000
102   | B     | 2000
103   | C     | 3000

Target Table (01-01-11):

Empno | Ename | Sal
101   | A     | 1000
102   | B     | 2000
103   | C     | 3000
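The Type 1 overwrite can be shown in miniature with sqlite3; this is only an illustration of the behaviour, with column names taken from the supplier example above. The change simply updates the row in place, so no history survives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE supplier (
    supplier_key   INTEGER PRIMARY KEY,   -- surrogate key
    supplier_code  TEXT UNIQUE,           -- natural key
    supplier_name  TEXT,
    supplier_state TEXT)""")
conn.execute("INSERT INTO supplier VALUES (123, 'ABC', 'Acme Supply Co', 'CA')")

# Acme moves to Illinois: Type 1 just overwrites the existing row.
conn.execute("UPDATE supplier SET supplier_state = 'IL' "
             "WHERE supplier_code = 'ABC'")

row = conn.execute("SELECT supplier_state FROM supplier "
                   "WHERE supplier_key = 123").fetchone()
print(row)  # the old value 'CA' is gone; only 'IL' remains
```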

The necessity of the lookup transformation is illustrated using the source and target tables above.

Source Table (01-02-11):

Empno | Ename | Sal
101   | A     | 1000
102   | B     | 2500
103   | C     | 3000
104   | D     | 4000

Target Table (01-02-11):

Empno | Ename | Sal
101   | A     | 1000
102   | B     | 2500
103   | C     | 3000
104   | D     | 4000

In the second month one more employee (Ename D) is added to the table, and the salary of employee B is changed from 2000 to 2500. Step 1: Import the source table and the target table.

Create a table named emp_source with the three columns shown above in Oracle. Import the source from the Source Analyzer. In the same way, create two target tables named emp_target1 and emp_target2. Go to the Targets menu and click Generate and Execute to confirm the creation of the target tables. A snapshot of the connections between the different transformations is shown below.

Step 2: Design the mapping and apply the necessary transformations. In this mapping we use four kinds of transformations: Lookup, Expression, Filter, and Update Strategy. The necessity and usage of each transformation is discussed in detail below. Lookup Transformation: the purpose of this transformation is to determine whether to insert, delete, update, or reject the rows going into the target table.

The first thing we do is create a Lookup transformation and connect Empno from the Source Qualifier to it. The snapshot of choosing the target table is shown below.

What the Lookup transformation does in our mapping is look into the target table (emp_target) and compare it with the Source Qualifier to determine whether to insert, update, delete, or reject rows. In the Ports tab we add a new column named EMPNO1; this is the column we connect from the Source Qualifier. For the existing lookup columns the Input port should be unchecked while the Output and Lookup boxes remain checked; for the newly created column only the Input and Output boxes should be checked. In the Properties tab: (i) Lookup table name -> Emp_Target.

(ii) Lookup policy on multiple match -> Use First Value. (iii) Connection Information -> Oracle.

In the Conditions tab (i) Click on Add a new condition

(ii) The Lookup Table Column should be EMPNO, the Transformation Port should be EMPNO1, and the Operator should be =. Expression Transformation: once we are done with the Lookup transformation, we use an Expression transformation to check whether we need to insert the incoming records or update existing ones. The steps to create the Expression transformation are shown below.

Drag all the columns from both the source and the Lookup transformation and drop them onto the Expression transformation. Now double-click the transformation, go to the Ports tab, and create two new columns named insert and update. Both of these columns carry output data, so only the Output check box should be ticked for them. The snapshot of the Edit Transformation window is shown below.

The expressions for the two output ports are listed below:

insert: IsNull(EMPNO1)
update: iif(NOT IsNull(EMPNO1) AND Decode(SAL, SAL1, 1, 0) = 0, 1, 0)
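The lookup plus expression logic can be sketched in plain Python. Here EMPNO1/SAL1 stand for the values the lookup returns from the target (None when the key is absent), mirroring the two port expressions; the dictionary and function are illustrative only.

```python
# Target rows as seen by the lookup: empno -> sal already in the target.
target = {101: 1000, 102: 2000, 103: 3000}

def flags(empno, sal):
    looked_up_sal = target.get(empno)  # the EMPNO1/SAL1 lookup result
    # insert port: IsNull(EMPNO1) -> the key is not in the target yet.
    insert = 1 if looked_up_sal is None else 0
    # update port: key exists AND Decode(SAL, SAL1, 1, 0) = 0, i.e. sal differs.
    update = 1 if looked_up_sal is not None and looked_up_sal != sal else 0
    return insert, update

print(flags(104, 4000))  # new employee     -> (1, 0): insert
print(flags(102, 2500))  # salary changed   -> (0, 1): update
print(flags(101, 1000))  # unchanged row    -> (0, 0): reject
```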

We are all done here. Click Apply and then OK.

Filter Transformation: we will have two Filter transformations, one for inserts and the other for updates.

Connect the insert column from the Expression transformation to the first Filter transformation, and in the same way connect the update column to the second Filter transformation. Then connect Empno, Ename, and Sal from the Expression transformation to both Filter transformations. If a row is new, Filter transformation 1 forwards it to Update Strategy transformation 1 and it appears in the target table. If a row's data has changed, Filter transformation 2 forwards it to Update Strategy transformation 2, which sends the updated row to the target table. Go to the Properties tab of the Edit Transformation window:

(i) The value of the filter condition for the first Filter transformation is insert. (ii) The value of the filter condition for the second Filter transformation is update.

The Closer view of the filter Connection is shown below.

Drag the respective Empno, Ename, and Sal ports from the Filter transformations and drop them on the respective Update Strategy transformations. In the Properties tab, set the update strategy expression to 0 (DD_INSERT) on the first Update Strategy transformation and to 1 (DD_UPDATE) on the second. Finally, connect the outputs of the Update Strategy transformations to the target table.
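The filter and update strategy stage can be sketched as follows: rows carry the insert/update flags computed earlier, each filter passes its rows down one path, and the update strategy tags them with the Informatica constants DD_INSERT (0) or DD_UPDATE (1). The sample rows are illustrative.

```python
# Informatica's update strategy constants.
DD_INSERT, DD_UPDATE = 0, 1

# Rows as they leave the expression transformation, flags included.
rows = [
    {"empno": 104, "ename": "D", "sal": 4000, "insert": 1, "update": 0},
    {"empno": 102, "ename": "B", "sal": 2500, "insert": 0, "update": 1},
    {"empno": 101, "ename": "A", "sal": 1000, "insert": 0, "update": 0},
]

# Filter 1 (condition: insert) -> update strategy expression 0 (DD_INSERT).
insert_path = [(r["empno"], DD_INSERT) for r in rows if r["insert"] == 1]
# Filter 2 (condition: update) -> update strategy expression 1 (DD_UPDATE).
update_path = [(r["empno"], DD_UPDATE) for r in rows if r["update"] == 1]

print(insert_path, update_path)  # row 101 is filtered out of both paths
```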

Step 3: Create the task and Run the work flow.


Don't check the Truncate Table option. Change the target load type from Bulk to Normal. Run the workflow from the task.

Step 4: Preview the Output in the target table.
