Introduction
The best option for ODI Changed Data Capture is to leverage Oracle GoldenGate. To understand how to best leverage the
out-of-the-box integration between ODI and GoldenGate, we will review how ODI handles CDC with an in-depth explanation of
the JKM principles, then expand this explanation to the specifics of the ODI-GoldenGate integration.
Figure 3: Changes processed by the second subscriber followed by a purge of the consumed records.
NOTE: In the above example, if changes had occurred before 8:00pm but after GL_INTEGRATION last processed the
changes (i.e. 7:15pm), then these changes would not be purged until GL_INTEGRATION has processed them all (i.e.
8:15pm).
To process the changes, ODI applies a logical lock on the records that it is about to process. The records are then processed,
and the unlock step determines whether the records have to be purged, based on the other subscribers' consumption of the changes.
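The lock, process, unlock and purge cycle can be sketched conceptually as follows. This is a minimal Python illustration of the rule that a change is only purged once every subscriber has consumed it; it is not the code ODI generates. The `Journal` class, its methods, and the second subscriber name `DWH_LOAD` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """Toy model of a journal shared by several subscribers (hypothetical)."""
    subscribers: set
    # change id -> set of subscribers that have already consumed it
    consumed: dict = field(default_factory=dict)

    def record_change(self, change_id):
        self.consumed[change_id] = set()

    def process(self, subscriber):
        """Lock the pending changes for one subscriber, consume them, then unlock."""
        locked = [c for c, done in self.consumed.items() if subscriber not in done]
        for change_id in locked:
            self.consumed[change_id].add(subscriber)
        self.purge()
        return locked

    def purge(self):
        """Drop only the changes that every subscriber has consumed."""
        fully_consumed = [c for c, done in self.consumed.items()
                          if done >= self.subscribers]
        for change_id in fully_consumed:
            del self.consumed[change_id]


journal = Journal(subscribers={"GL_INTEGRATION", "DWH_LOAD"})
journal.record_change(1)
journal.record_change(2)

first = journal.process("DWH_LOAD")
remaining_after_first = sorted(journal.consumed)    # nothing purged yet
second = journal.process("GL_INTEGRATION")
remaining_after_second = sorted(journal.consumed)   # now everything is purged
```

In this sketch the changes stay in the journal after the first subscriber runs, and disappear only after the last subscriber has processed them.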
Two views are created: the JV$ view and the JV$D view.
The JV$ view is used in the mappings where you select the option Journalized data only. Figure 4 shows where to find this option
in the Physical tab of the mappings:
Figure 5: Selecting the subscriber name in the mapping options to consume changes.
NOTE: The subscriber name does not have to be hard-coded: you can use an ODI variable to store this name and use
the variable in the filter.
The JV$D view is used to show the list of changes available in the J$ table when you select the menu Journal Data from the
CDC menu under the models and datastores. Figure 6 shows how to access this menu:
Because of these limitations though, the most recent and most efficient JKMs provided out of the box with ODI are all Consistent
set JKMs. One important caveat with simple CDC JKMs is that they create one entry per subscriber in the J$ table for every
single changed row. If you have two subscribers, each change generates two records in the J$ table. Having three subscribers
means three entries in the J$ table for each change. You can immediately see that this implementation works for basic cases, but
it is very limited when you want to expand your infrastructure.
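To make the growth concrete, here is a small Python sketch that builds one journal entry per subscriber for every changed row, as described above. The function name, the dictionary keys and the subscriber names other than `GL_INTEGRATION` are assumptions made for the illustration; they are not the actual J$ column names.

```python
from itertools import product


def journal_entries(changed_keys, subscribers):
    """One entry per (changed row, subscriber) pair, mimicking simple CDC."""
    return [{"subscriber": s, "pk": k} for k, s in product(changed_keys, subscribers)]


changed_keys = range(1000)                      # 1000 changed source rows
two_subs = journal_entries(changed_keys, ["GL_INTEGRATION", "DWH_LOAD"])
three_subs = journal_entries(changed_keys, ["GL_INTEGRATION", "DWH_LOAD", "REPORTING"])
```

With 1000 changed rows, two subscribers produce 2000 journal entries and three subscribers produce 3000, so the journal grows linearly with the number of subscribers.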
When using Simple CDC JKMs, the lock, unlock and purge operations are performed in the IKM: each IKM has the necessary
steps for these operations, and these steps are only executed if:
Journalizing is selected in the interface, as described above in figure 4;
The JKM used for journalizing in the model that contains the source table is a Simple CDC JKM.
Figure 7: Parent and children records arriving during the processing of changes
When you define the parameters for consistent set CDC, you have to define the parent-child relationship between the tables. To
do so, you have to edit the Model that contains these tables and select the Journalized Tables tab. You can either use the
Reorganize button to have ODI compute the dependencies for you based on the foreign keys available in the model, or you can
manually set the order. Parent tables should be at the top, children tables (the ones referencing the parents) should be at the
bottom.
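The kind of ordering that the Reorganize button computes can be approximated with a topological sort of the foreign keys. The Python sketch below is only an illustration of the idea, not ODI's algorithm: the `PRODUCTS` table and the foreign keys listed here are assumptions, and `journalizing_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter


def journalizing_order(foreign_keys):
    """Return the tables with parents first and children last.

    foreign_keys maps each table to the set of parent tables it references.
    """
    return list(TopologicalSorter(foreign_keys).static_order())


# Hypothetical foreign keys: child table -> parent tables it references
foreign_keys = {
    "SUPPLIERS": set(),
    "PRODUCTS": {"SUPPLIERS"},
    "PRODUCT_RATINGS": {"PRODUCTS", "SUPPLIERS"},
}

order = journalizing_order(foreign_keys)
```

Here `order` lists SUPPLIERS first and PRODUCT_RATINGS last, which is the top-to-bottom order expected in the Journalized Tables tab.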
In Figure 8 we see a Diagram that was created under the model that hosts the journalized tables to represent the relationships
between the tables. To reproduce this, create a Diagram under your Model, then drag and drop the selected tables in that
diagram: the foreign keys will automatically be represented as arrows by the ODI Studio.
Figure 8: ODI Diagram that represents the parent-child relationship in a set of tables.
In the illustration shown in figure 9 we would have to move PRODUCT_RATINGS down the list because of its reference to the
SUPPLIERS table.
These operations are performed in the packages before processing the interfaces where CDC data is processed as shown in
Figure 10. After the data has been processed, the subscribers must be unlocked and the J$ table can be purged of the
consumed records.
Extend window
Either the window_id column of the J$ table is updated by the detection mechanism (as is the case with GoldenGate JKMs) or it
is not (as is the case with trigger based JKMs). In all cases, the SNP_CDC_SET table is first updated with the new computed
window_id for the CDC Set that is being processed. The window_id is computed from the checkpoint table for GoldenGate JKMs
or is based on an increment of the last used value (found in the SNP_CDC_SET table) for other JKMs.
For non-GoldenGate JKMs, all records of the J$ table that do not yet have a window_id (the value is null) are updated
with this new window_id value so that the records can be processed: these are records that were written to the J$ table after the
last processing of changes and were never assigned a window_id.
Again, GoldenGate writes this window_id as it inserts records into the J$ table.
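For trigger-based JKMs, the Extend Window logic described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the `cdc_set` dictionary stands in for the row of SNP_CDC_SET, the key `cur_window_id` is a hypothetical name, and the increment of 1 is an assumption, since the text only says the value is incremented.

```python
def extend_window(cdc_set, journal_rows):
    """Compute the next window_id and stamp the rows that have none yet."""
    new_window_id = cdc_set["cur_window_id"] + 1   # increment of 1 is an assumption
    cdc_set["cur_window_id"] = new_window_id
    for row in journal_rows:
        if row["window_id"] is None:               # written since the last cycle
            row["window_id"] = new_window_id
    return new_window_id


cdc_set = {"cur_window_id": 7}
journal_rows = [
    {"pk": 1, "window_id": 7},      # already assigned in a previous cycle
    {"pk": 2, "window_id": None},   # new change, not yet assigned
    {"pk": 3, "window_id": None},
]

new_id = extend_window(cdc_set, journal_rows)
```

After the call, the two new rows carry the window_id 8 and the previously assigned row is left untouched.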
Lock subscriber
For all JKMs, the subscribers have to be locked: their processing windows are set to range between the last processed window_id
(which is the minimum window_id) and the newly computed window_id (which is the maximum window_id).
The subscribers table is created in the Work Schema of the Default Schema for the data server. To identify the Default Schema,
look under the data server definition in the physical architecture of the Topology Navigator: the Default Schema is marked with a
checkmark. If you edit the schema, the Default checkbox is selected. As such, there will be a single, shared subscribers table for
all the schemas on that server.
The J$ table and the two views are created for each table that is journalized. These are created in the Work Schema associated
with the Physical Schema where the source table is located.
This shows that ODI does not replicate the transactions; it does an integration of the data as they are at the time the integration
process runs. Oracle GoldenGate replicates the transactions as they occur on the source system.
An additional filter is added in the mappings at design time so that only the records for the selected subscriber are consumed
from the J$ table, as we saw in figure 5.
After the Extend Window step has updated the window_id in the SNP_CDC_SET table for the current CDC set, the Lock Subscriber
step in the packages updates the maximum window_id in the SNP_CDC_SUBS table with the same value for the current subscriber.
Only the changes from the J$ table that have a window_id between the minimum and maximum window_id recorded in the
SNP_CDC_SUBS table are processed. Once these changes have been processed and committed, the maximum window_id is
used to overwrite the minimum window_id (this is done in the Unlock Subscriber step of the package). This guarantees that the
infrastructure is ready for the next integration cycle, starting where we left off.
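The lock, consume and unlock sequence can be sketched as follows. This is a simplified Python model, not the JKM code: the dictionaries stand in for rows of SNP_CDC_SET and SNP_CDC_SUBS, the key names are hypothetical, and treating the lower bound as exclusive and the upper bound as inclusive is an assumption made so that consecutive cycles do not reprocess the same records.

```python
def lock_subscriber(subscriber, cdc_set):
    """Lock: the subscriber's window now ends at the CDC set's current window_id."""
    subscriber["max_window_id"] = cdc_set["cur_window_id"]


def changes_to_process(journal_rows, subscriber):
    """Rows inside the locked window (exclusive lower bound is an assumption)."""
    low, high = subscriber["min_window_id"], subscriber["max_window_id"]
    return [row for row in journal_rows if low < row["window_id"] <= high]


def unlock_subscriber(subscriber):
    """Unlock: the next cycle starts where this one ended."""
    subscriber["min_window_id"] = subscriber["max_window_id"]


cdc_set = {"cur_window_id": 3}          # value left by the Extend Window step
subscriber = {"min_window_id": 1, "max_window_id": 1}
journal_rows = [
    {"pk": 10, "window_id": 1},         # consumed in an earlier cycle
    {"pk": 11, "window_id": 2},
    {"pk": 12, "window_id": 3},
]

lock_subscriber(subscriber, cdc_set)
first_cycle = [row["pk"] for row in changes_to_process(journal_rows, subscriber)]
unlock_subscriber(subscriber)

lock_subscriber(subscriber, cdc_set)    # no new changes since the last cycle
second_cycle = [row["pk"] for row in changes_to_process(journal_rows, subscriber)]
```

The first cycle picks up the two new records; the second cycle finds nothing, because the minimum window_id was moved up to the maximum during the unlock.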
This table lists all the CDC infrastructure components associated to a journalized table.
SNP_CDC_OBJECTS
FULL_TABLE_NAME (PK)
CDC_OBJECT_TYPE (PK)
FULL_OBJECT_NAME
DB_OBJECT_TYPE
This table is leveraged to make sure that ODI does not attempt to recreate an object that has already been created (see section
4.1 Only creating the J$ tables and views if they do not exist).
A filter created in the mappings allows the developers to select the subscriber for which the changes are consumed, as we saw in
figure 5.
The JV$D view uses the same approach to remove duplicate entries, but it shows all entries available to all subscribers, including
the ones that have not been assigned a window_id yet.
includes GoldenGate detection of the changes, replication of the changes, transformations by ODI and commit in the target
system.
Heterogeneous capabilities: both ODI and GoldenGate can operate on many databases available on the market, allowing for
more flexibility in the data integration infrastructure.
The second one makes sure that the J$ table is updated at the same time as the staging table. GoldenGate in this case has two
targets when it replicates the changes.
map <Source_table_name>, target <J$_Table_name>, KEYCOLS (PK1, PK2, ..., PKn, WINDOW_ID), INSERTALLRECORDS, OVERRIDEDUPS,
COLMAP (
PK1 = PK1,
PK2 = PK2,
...
PKn=PKn,
WINDOW_ID = @STRCAT(@GETENV("RECORD", "FILESEQNO"), @STRNUM(@GETENV("RECORD", "FILERBA"), RIGHTZERO, 10))
);
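To see what the WINDOW_ID expression in this map produces, the Python sketch below mimics it under one assumption: that @STRNUM with RIGHTZERO and a length of 10 right-justifies the relative byte address and pads it with leading zeros to 10 characters. The function name and the sample values are hypothetical.

```python
def window_id(file_seqno: int, file_rba: int) -> str:
    """Concatenate the trail file sequence number with the zero-padded RBA."""
    return f"{file_seqno}{file_rba:010d}"


example = window_id(12, 3456)

# Within the same trail file, a later record (larger RBA) gets a larger window_id
earlier = int(window_id(12, 3456))
later = int(window_id(12, 3457))
```

With a file sequence number of 12 and an RBA of 3456, the sketch yields 120000003456, so records written later in the trail receive a higher window_id.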
If you already have GoldenGate in place to replicate data from the source tables into a staging area, you may not be interested in
using the files generated by ODI. You have already configured and fine-tuned your environment, and you do not want to override
your configuration. All you need to do in that case is to add the additional maps for GoldenGate to update the ODI J$ tables.
3.4 Evolution of the GoldenGate JKMs between ODI 11g and ODI 12c
There is a deeper integration between ODI and GoldenGate in the 12c release of ODI than what was available with the 11g
release. One immediate consequence is that the JKMs for GoldenGate have evolved to take advantage of features that now
become available:
In ODI 11g the source table for an initial load was different from the source table used with GoldenGate for CDC: the
GoldenGate replicat table had to be used explicitly as a source table in CDC configurations. With the 12c implementation of the
GoldenGate JKMs, the same original source table is used in the mappings for both initial loads and incremental loads using
GoldenGate. For CDC, the GoldenGate source becomes the source table in the mappings for CDC. The GoldenGate replicat
is considered as a staging table and as such is not represented in the ODI mappings anymore. David Allan has a very good
pictorial representation of the new paradigm available here:
https://blogs.oracle.com/dataintegration/resource/odi_12c/odi_12c_ogg_configuration.jpg.
The new JKMs allow for online or offline use of GoldenGate: in online mode, ODI communicates directly with the GoldenGate
JAgent to distribute the configuration parameters. The offline mode is similar to what was available in ODI 11g.
To illustrate the internal workings of JKMs, we look here at the code of some of the Knowledge Modules delivered with ODI 12.1.2.0.
4.1 Only creating the J$ tables and views if they do not exist
Traditionally in ODI KMs, tables and views can be created with the option to Ignore Errors so that the code does not fail if the
infrastructure is already in place. This approach does not work well in the case of JKMs where we do want to know that the
creation of a J$ table (or view) fails, but we will continuously add tables and views to the environment. What we want is to ignore
the tables that have already been created, and only create the ones that are needed.
If you edit the JKM Oracle to Oracle Consistent (OGG) and look at the task Create J$ Table you can see that there is code in the
Source command section as well as in the Target command section. The target command creates the table, as you would
expect. The source command only returns a result set if the J$ table we are about to create is not referenced in the
SNP_CDC_OBJECTS table. If there is no result set from the source command, the target command is not executed by ODI: the
standard behavior in KM and procedures tasks is that the target command is executed once for each element of the result set
returned from the source command (if there is a source command). Zero elements in the result set mean no execution.
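The source and target command pattern can be reproduced outside of ODI to see why it avoids both errors and duplicate creations. The Python sketch below uses an in-memory SQLite database and is only an analogy: the `journalized_tables` table, the `'JRN_TABLE'` type value, the J$ column list and the helper name are assumptions; only the SNP_CDC_OBJECTS column names come from the description above.

```python
import sqlite3

# "Source command": returns one row per table whose J$ table is not registered yet
SOURCE_COMMAND = """
    SELECT t.table_name
    FROM journalized_tables t
    WHERE NOT EXISTS (
        SELECT 1 FROM snp_cdc_objects o
        WHERE o.full_table_name = t.table_name
          AND o.cdc_object_type = 'JRN_TABLE'
    )
    ORDER BY t.table_name
"""


def create_missing_journal_tables(conn):
    """Run the 'target command' once per row returned by the source command."""
    created = []
    for (table_name,) in conn.execute(SOURCE_COMMAND).fetchall():
        journal_name = f"J${table_name}"
        conn.execute(f'CREATE TABLE "{journal_name}" (window_id INTEGER, pk INTEGER)')
        conn.execute(
            "INSERT INTO snp_cdc_objects VALUES (?, 'JRN_TABLE', ?, 'TABLE')",
            (table_name, journal_name),
        )
        created.append(journal_name)
    return created


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE journalized_tables (table_name TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE snp_cdc_objects (
        full_table_name TEXT, cdc_object_type TEXT,
        full_object_name TEXT, db_object_type TEXT,
        PRIMARY KEY (full_table_name, cdc_object_type))
""")
conn.executemany("INSERT INTO journalized_tables VALUES (?)",
                 [("ORDERS",), ("CUSTOMERS",)])

first_run = create_missing_journal_tables(conn)    # creates both J$ tables
second_run = create_missing_journal_tables(conn)   # empty result set: nothing runs
```

On the second run the source query returns no rows, so no CREATE statement is issued and nothing fails, while a genuine creation error on the first run would still surface.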
Figure 11: Repeating the code for all tables of the CDC set in the appropriate order
Note that since GoldenGate updates the window_ids directly for ODI, the matching step does not exist in the GoldenGate JKMs.
But the same technique of processing tables of the set in the appropriate order is leveraged when creating or dropping the
infrastructure (look at the Create J$ and Drop J$ tasks for instance in the GoldenGate JKMs).
Conclusion
As you can see the ODI CDC infrastructure provides a large amount of flexibility and covers the most complex integration
requirements for CDC. The out-of-the-box integration with Oracle GoldenGate helps developers combine both products very
quickly without the need for experts to intervene. But if you need to alter the way the two products interact with one another,
JKMs are the key to the solution you are dreaming about.
For more ODI best practices, tips, tricks, and guidance that the A-Team members gain from real-world experiences working with
customers and partners, visit Oracle A-Team Chronicles for ODI. For Oracle GoldenGate, visit Oracle A-Team Chronicles
for GoldenGate.
Comments
Li says:
We are using LKM Oracle to Oracle (DBLINK); CDC JKM: JKM Oracle Consistent; IKM: IKM Oracle Incremental Update;
CKM: CKM Oracle Consistent.
One more important thing is that we implemented consistent CDC even though there are no foreign key relationships.
Li says:
Li says:
wingrider says:
Thank you for the information. Can you please tell us how we can deploy a journalizing package to an Execution repository?
You can create a dedicated package with the sole purpose of starting/stopping the journals. The start operation will create
the infrastructure for you (snp_cdc, snp_cdc_set, etc). You can use this to add subscribers as well.
1. Create a new package
2. Drag-and-drop the model that you want to work on in that package
3. Click on the icon that represents the model in the package.
4. In the properties window, set the Type of the step to Journalizing Model
5. You will see options to start/stop the journal, add and remove subscribers. Don't forget to add the subscriber names
further down if you want them created.
You can add multiple instances of the model in your package if you want to control the behavior with parameters (for
instance two separate branches to start or stop the journal, pass the subscriber name as a parameter, etc).
I hope this helps
-Christophe
If transformation mappings are not able to keep up with the rate at which my J$ tables are updated is there any way to tell
ODI mapping to process n records at a time?
Thanks,
Vishal
Christophe Dupupet says:
January 4, 2016 at 8:25 AM
Hello Vishal,
There are several ways you can do this. What is important is to make sure that the JKM does not lose any data as
it goes through the processes of extend window, lock/unlock and purge logs.
If the order in which the data arrives is important, you will have to compute an acceptable value for the increment (instead
of using the last window_id) and use the lowest of these two values for the extend / lock / unlock / purge processes. This
said, you will have to make sure that the transformation process can eventually catch up with the changes, or it will
keep running further and further behind, so I would first invest heavily in optimizing the transformation process.
If the order in which data arrives is not important, then you can run multiple processes in parallel (using a modulus
approach, each process would be in charge of a subset of the records). Here again, you would have to make sure that
the JKM is modified properly so that the lock / extend / unlock / purge mechanisms work properly.
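As a rough illustration of the modulus approach only, the Python sketch below assigns each changed record to one of several parallel processes based on its key. It does not show the required JKM modifications; the function name, the integer keys and the number of workers are assumptions.

```python
def split_changes(changed_keys, workers):
    """Assign each changed key to a worker using key modulo number of workers."""
    buckets = [[] for _ in range(workers)]
    for pk in changed_keys:
        buckets[pk % workers].append(pk)
    return buckets


# Seven changed records spread over three parallel processes
buckets = split_changes(range(1, 8), 3)
```

Each record lands in exactly one bucket, so the parallel processes never consume the same change twice.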
My best
I'd like to write a mapping once (ODI 12.1.3) and reuse it both for the first population and for reading the changes, without
having to check or uncheck Journalized Data only within the mapping itself.
Is there a dynamic way in an ODI Package (for instance) to tell ODI to use the JV$ tables instead of the source?
Christophe Dupupet says:
In ODI 12c, you only have to create a new deployment specification in the physical tab of the mapping. In the original
physical tab, you leave the checkbox unchecked, and select the most appropriate KM combination for an initial load (like
an insert/append). Then in the new tab, you select the checkbox (processing only the changes) and select the most
appropriate KM for that (probably a merge or incremental KM). There is a good description of this feature here:
http://www.rittmanmead.com/2013/10/oracle-data-integrator-12c-release-part-1/
Thanks
Kamran Hussain says:
You can definitely leverage ODI downstream to transform data in Hadoop or load data to/from RDBMS using sqoop; see
this post on ODI 12c capabilities with Hadoop:
https://blogs.oracle.com/dataintegration/entry/new_big_data_features_in
Daniel says:
I am learning a little bit more about GoldenGate, but I am curious about one item as it relates to ODI/GoldenGate. Can you
perform a soft delete on the target (e.g. DW) where the source performed a hard delete?