
Version History:

Version  Date         By   Changes
0.1      28/December  RN   First draft
0.2      29/December  RN   Added UNIX portion (section 5)
0.3      07/Jan       RN   Merged sections 2.2/2.3 from development standards
0.4      12/Oct       MF   Added pre go-live checks, Oracle tuning
0.5      28/Feb/2006  CHA  Added section 4.22.4 to the existing 0.4 version
0.6      07-Mar-2006  NBL  Additions to section 4.22.4; added section 6.5 (both pertaining to performance tuning done by Paragon); TOC updated
0.7

Contributors:
Name Role Location Remarks

Approval:
Name Role Location Remarks
Peter Reinbold
Ming Fung

Reference Documents:
Name Author Version Date
Unix system (OS) tuning Informatica

CONTENTS
1 DOCUMENT DESCRIPTION
2 DOCUMENT ORGANISATION
3 INFORMATICA PC PRIMARY GUIDELINES
3.1 DATABASE UTILISATION
3.2 LOCALISATION
3.3 REMOVAL OF DATABASE DRIVEN SEQUENCE GENERATORS
3.4 SWITCH OFF THE “COLLECT PERFORMANCE STATISTICS”
3.5 SWITCH OFF THE VERBOSE LOGGING
3.6 UTILISE STAGING
3.7 ELIMINATE NON-CACHED LOOKUPS
3.8 TUNE THE DATABASE
3.9 AVAILABILITY OF SWAP & TEMP SPACE ON PMSERVER
3.10 SESSION SETTINGS
3.11 REMOVE ALL OTHER APPLICATIONS ON PMSERVER
3.12 REMOVAL OF EXTERNAL REGISTERED MODULES
4 INFORMATICA PC ADVANCED GUIDELINES
4.1 FILTER EXPRESSIONS
4.2 REMOVE DEFAULTS
4.3 OUTPUT PORT INSTEAD OF VARIABLE PORT
4.4 DATATYPE CONVERSION
4.5 STRING FUNCTIONS
4.6 IIF CONDITIONS CAVEAT
4.7 EXPRESSIONS
4.8 UPDATE EXPRESSIONS FOR SESSION
4.9 MULTIPLE TARGETS / SOURCES ARE TOO SLOW
4.10 AGGREGATOR
4.11 JOINER
4.12 LOOKUPS
4.12.1 Lookups & Aggregators Fight
4.13 MAPLETS FOR COMPLEX LOGIC
4.14 DATABASE IPC SETTINGS & PRIORITIES
4.15 LOADING
4.16 MEMORY SETTINGS
4.17 REDUCE NUMBER OF OBJECTS IN A MAP
4.18 SLOW SOURCES - FLAT FILES
4.19 BREAK THE MAPPINGS OUT
4.19.1 Keep the mappings as simple as possible
4.20 READER/TRANSFORMER/WRITER THREADS AFFECT THE TIMING
4.21 SORTING – PERFORMANCE ISSUES
4.21.1 Sorted Input Conditions
4.21.2 Pre-Sorting Data
4.22 WORKFLOW MANAGER
4.22.1 Monitoring and Running a Session
4.22.2 Informatica suggests that each session takes roughly 1 to 1 1/2 CPUs
4.22.3 Place some good server load monitoring tools on the PM Server in development
4.22.4 Parallel sessions/worklets
4.23 CHANGE DATABASE PRIORITIES FOR THE PMSERVER DATABASE USER
4.24 CHANGE THE UNIX USER PRIORITY
4.25 TRY NOT TO LOAD ACROSS THE NETWORK
4.26 BALANCE BETWEEN INFORMATICA AND THE POWER OF SQL AND THE DATABASE
5 PERFORMANCE TUNING THE UNIX OS
5.1 PROCESS CHECK
5.2 IDENTIFYING AND RESOLVING MEMORY PROBLEMS
5.3 IDENTIFYING AND RESOLVING DISK I/O ISSUES
5.4 IDENTIFYING AND RESOLVING CPU OVERLOAD ISSUES
5.5 IDENTIFYING AND RESOLVING NETWORK ISSUES
6 IDENTIFYING ORACLE PERFORMANCE ISSUES
6.1 CHECKING PROBLEM PROCESSES
6.2 GETTING AN EXPLAIN PLAN
6.3 CHECK STATS FOR AN OBJECT
6.4 PROCEDURE TO BE FOLLOWED PRIOR TO AN INTERFACE GO-LIVE/POST GO-LIVE
6.5 LOAD METHODOLOGY FOR STAGING TABLES WITH INDEXES
6.6 INVESTIGATING PERFORMANCE ISSUES USING THE DATA DICTIONARY VIEWS
6.6.1 V$sysstat and V$waitstat
6.6.2 Buffer Cache Hit Ratio Should be above 85%
6.6.3 Library Cache Hit Ratio
6.6.4 Dictionary Cache Miss Ratio should be less than 15%
6.6.5 Shared pool
6.6.6 Recursive/Total Calls
6.6.7 Short/Total Table Scans
6.6.8 Redo Activity
6.6.9 Table Contention
6.6.10 CPU Parse Overhead
6.6.11 Latches
6.6.12 Rollback Segment Contention

1 Document Description
This document describes the practices the ETL development team can follow in order to get the best out of Informatica PowerCenter (ETL). It mainly concentrates on optimising the performance of the core ETL. For the ETL to achieve optimal performance, it is imperative to strike a good balance between hardware, OS, RDBMS and Informatica PowerCenter 7.1.1. This document can be used as a reference by both the development team and the administration team.
2 Document Organisation
This document is divided into the following parts:
o Primary guidelines - necessary for ETL to perform optimally; the fundamental approach for ETL design with Informatica PC 7.1.1
o Advanced guidelines - guidelines that can be applied on a case-to-case basis, depending on the problem scenario / environment
o Optimising the Unix system - performance tuning the OS (Unix/Linux system)
3 Informatica PC Primary guidelines
3.1 Database utilisation
Utilise the database for significant data handling operations; staging tables can be a real benefit for parallelism in operations and can reduce processing time by a significant amount.

3.2 Localisation
Try to localise the relational objects as far as possible, and try not to use synonyms for remote databases. Using remote links for data processing and loading certainly slows things down.

3.3 Removal of Database driven Sequence Generators


Database-driven sequence generators usually prove to be a costly choice. They require a wrapper function / stored procedure call, which can degrade performance by a factor of 3. The slowness is not easily debugged either; it can typically only be spotted in the Write Throughput column of the session statistics. To measure the cost, copy the map and replace the stored procedure call with an internal sequence generator for a test run - that is how fast the map could run. If a database-generated sequence number must be used, have a shared sequence generator, build a staging table from the flat file, add a SEQ_ID column, and call a POST TARGET LOAD stored procedure to populate that column. Place the post-target-load procedure in the flat-file-to-staging-table load map. A single call into the database, followed by a batch operation to assign sequences, is the fastest method of utilising shared sequence generators. When dealing with GIGs or Terabytes of information, this can save many hours of tuning.
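
As a minimal sketch of this staging pattern in Oracle SQL, assuming a hypothetical staging table STG_TRADES with a SEQ_ID column and a shared sequence SHARED_SEQ (all names are illustrative, not from this document):

-- Shared sequence used by all loads (illustrative name)
CREATE SEQUENCE shared_seq START WITH 1 INCREMENT BY 1 CACHE 1000;

-- Post-target-load procedure: a single call into the database, then one
-- batch UPDATE assigns sequence values to every newly loaded row.
CREATE OR REPLACE PROCEDURE assign_seq_ids IS
BEGIN
  UPDATE stg_trades
     SET seq_id = shared_seq.NEXTVAL
   WHERE seq_id IS NULL;
  COMMIT;
END;
/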
3.4 Switch off the “Collect performance statistics”
This has an impact, though a minimal one; removing this option reduces reliance on flat file operations. However, it may be useful to have this option switched ON during a tuning exercise.
3.5 Switch off the verbose logging
The session log has a tremendous impact on the overall performance of a session. Override the session log to NORMAL logging mode. In Informatica the logging mechanism is not parallel; it is embedded in the operations. Running at NORMAL also prevents the Informatica metadata tables from growing. It is also a good idea to perform some automated housekeeping that truncates the logs from the Informatica metadata at regular intervals.
3.6 Utilise staging
If the source is a flat file, utilise a staging table; this allows the use of SQL*Loader or the bulk-load utility. Keep the basic logic in the source load map and eliminate all lookups from the code. At this juncture, if the reader is slow, check the following:
o If there is an item in the configuration file which sets a value to throttle the reader, it will limit the read throughput.
o Move the flat file to local disk; don’t read from the network or from RAID.
3.7 Eliminate non-cached lookups
Using non-cached lookups hampers performance significantly, especially if the lookup table is a “growing” or “updated” target table. This means the indexes are changing during the operation and the optimizer loses track of the index and its statistics. If possible, use a staging table - this also allows the use of a joiner, which can increase performance to a large extent.

3.8 Tune the database


Estimate small, medium and large source data set sizes, in terms of number of rows / average bytes per row. Also estimate the throughput for each, and the turnaround time for the load. The DBA should be provided with this information, along with the tables that are expected to see heavy read / write activity. The DBA assigning the right table to the right disk space can make a difference.

3.9 Availability of SWAP & TEMP space on PMSERVER


Having too little disk space for SWAP & TEMP can potentially slow down the performance of the entire server. To monitor this, watch the disk space while sessions are running. Without monitoring, it is difficult to assess the reason for slowness, especially if a mapping contains Aggregators, lookups that use disk cache, or a Joiner with heterogeneous sources.

3.10 Session Settings

A major chunk of tuning can be done in the session. Switching on “Collect performance statistics” reveals which parameters need to be set at session level, or at least what has to be changed in the database. Basically, one should try to achieve OPTIMAL READ, OPTIMAL THROUGHPUT and OPTIMAL WRITE; over-tuning one of these pieces can ultimately slow down the session.

The index cache and data cache are allocated dynamically first. As soon as the session is initialised, the memory for the data and index caches is set up; their sizes depend upon the session settings.

The reader DTM is also based on a dynamic allocation algorithm; it uses the available memory in chunks. The size of a chunk is determined by the session setting “Default Buffer block size”.

Read the session throughput, then tune for the reader: see what the settings are, and send the write output to a flat file for less contention. Check the throttle reader setting, and increase the default buffer size by a factor of 64K each shot. If the reader still appears to increase during the session, stabilise, and then try increasing Shared Session Memory from 12 MB to 24 MB. If the reader session to a flat file just doesn't ever "get fast", there is some basic map tuning to do: try to merge expression objects, set the lookups to unconnected (for re-use if possible), and check the index and data cache settings if aggregation or lookups are being performed. Check the writer throughput performance statistics to make sure there is NO writer bottleneck. If the writer is slow, change the map to a single target table at a time to see which target is causing the slowness and tune it: make copies of the original map and break the copies down. Once the slower of the N targets is discovered, talk to the DBA about partitioning the table, updating statistics, removing indexes during the load, etc. - there are many database things that can be done here. Remember that the TIMING is affected by the READER/TRANSFORMER/WRITER threads; see section 4.20.

3.11 Remove all other applications on PMServer


Apart from the staging database, PMServer plays well with an RDBMS and its engine, but it does not play well with application servers - in particular Java Virtual Machines, web servers, security servers, applications and report servers. All of these items should be broken out to other machines; this is critical to improving performance on the PMServer machine.
3.12 Removal of external registered modules

As far as possible, avoid APIs that call external objects, as these have proven to be slow. External modules may exhibit speed problems; instead, try using pre-processing / post-processing with SED, AWK or GREP.

4 Informatica PC Advanced guidelines


4.1 Filter Expressions:
Create the filter condition (TRUE / FALSE) inside a port expression upstream. Complex filter expressions slow down the mapping; the same logic runs faster in an Expression transformation with an output port for the result. Place the expression in an EXPRESSION transformation upstream from the filter, compute a single numerical flag (1 for TRUE, 0 for FALSE) as an output port, and push this data into the filter. This has a positive impact on performance.

Use the Filter transformation early in the mapping.


To maximize session performance, keep the Filter transformation as close as possible to
the sources in the mapping. Rather than passing rows that you plan to discard through the
mapping, you can filter out unwanted data early in the flow of data from sources to
targets.
Use the Source Qualifier to filter
The Source Qualifier transformation provides an alternate way to filter rows. Rather than
filtering rows from within a mapping, the Source Qualifier transformation filters rows
when read from a source. The main difference is that the source qualifier limits the row
set extracted from a source, while the Filter transformation limits the row set sent to a
target. Since a source qualifier reduces the number of rows used throughout the mapping,
it provides better performance.
4.2 Remove Defaults:
Having a default value, including “ERROR”, slows down the session: it causes unnecessary evaluation of values for every data element in the mapping. The best method of allotting a default value is to have a variable in an expression which returns the expected value on the condition; this is faster than assigning a default value.
4.3 Output port instead of Variable port
Variables are good for static and state-driven logic, but they slow down performance because they are allocated each time a row passes through the expression object. Try to use an output port instead of a variable port.
4.4 Datatype conversion:
Avoid implicit datatype conversion, such as connecting an Integer port to a String port or vice versa. Instead use the function that converts the data explicitly; this saves PMServer from having to decide on the datatype conversion at run time.
4.5 String Functions:
String functions such as LTRIM and RTRIM are costly for performance, as they involve allocating and re-allocating memory within the READER thread. When string operations on the data are imperative, consider the following: use varchar/varchar2 datatypes in database sources, and if the source is a file, make it a delimited one. Try to apply LTRIM/RTRIM in the SQL against the source database; this is much faster than performing them in the ETL.
4.6 IIF Conditions caveat
As far as possible, build logic that avoids IIF, as IIF conditions are costly in any language: IIF creates multiple-path logic inside the application and uses the decision to navigate, which has performance implications. Another option is to use Oracle DECODE in the source qualifier.
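
For illustration, a source qualifier SQL override could push the branching into the database with DECODE; a minimal sketch with hypothetical table and column names:

SELECT trade_id,
       -- DECODE collapses the IIF-style branching into the database read
       DECODE(status_cd, 'A', 1, 'C', 0, -1) AS status_flag
  FROM stg_trades;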
4.7 Expressions
Expressions like IS_SPACES and ISNUMBER affect performance, as these are data validation expressions that have to scan the entire string to determine the result. Avoid them unless there is an absolute requirement for their usage.
4.8 Update Expressions for session :
If the session option Update Else Insert is ON, performance will definitely slow down, as Informatica has to perform two operations for each row: an update with respect to the PK, and then, if that returns 0 rows, an insert. As an alternative, an Update Strategy can be used, where rows are marked with DD_UPDATE or DD_INSERT inside the mapping. In this case the session settings can be INSERT and UPDATE AS UPDATE, or UPDATE AS INSERT.
4.9 Multiple targets / sources are too slow :
Mappings with multiple targets can sometimes eat up performance. If the architecture permits, make one map per target. If the sources are flat files at different FTP locations, the ideal choice is to FTP the files to the ETL server and then process them.
4.10 Aggregator
If a mapping contains more than one Aggregator, the session will run slowly unless the cache directory is fast and disk drive access speed is high. Placing the Aggregator towards the end of the mapping is another option; however, this also brings performance down, as all the I/O activity becomes a bottleneck in Informatica.
Maplets are a good way of replicating data logic, but if a maplet contains an Aggregator, the performance of the mapping that contains the maplet will still be affected. Reduce the number of Aggregators in the entire mapping to one if possible; if necessary, split the mapping into several mappings to break down the logic.
Sorted input to the Aggregator increases performance to a large extent; however, if sorted input is enabled and the data passed to the Aggregator is not actually sorted, the session will fail.
Set the cache sizes to the calculated amounts using the formulae below:
Index size = (sum of column sizes in group-by ports + 17) x number of groups
Data size = (sum of column sizes of output ports + 7) x number of groups
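As a hedged worked example (the figures are purely illustrative, not from this document): with group-by ports totalling 23 bytes, output ports totalling 57 bytes, and roughly 100,000 distinct groups, the formulae give Index size = (23 + 17) x 100,000 = 4,000,000 bytes (about 4 MB) and Data size = (57 + 7) x 100,000 = 6,400,000 bytes (about 6.4 MB).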
4.11 Joiner
Perform joins in the database where possible; performing a join in the database is faster than performing the join in the session.
Use one of the following options:
o Create a pre-session stored procedure to join the tables in a database.
o Use the Source Qualifier transformation to perform the join.
Designate the source with the smaller number of rows as the master source. With a smaller master source, the data cache is smaller and the search time shorter. Set the cache sizes to the calculated amounts using the formulae below:
Index size = (sum of master column sizes in the join condition + 16) x number of rows in the master table
Data size = (sum of master column sizes NOT in the join condition but on output ports + 8) x number of rows in the master table
4.12 Lookups
When caching is enabled, the PowerCenter Server caches the lookup table and queries the
lookup cache during the session. When this option is not enabled, the PowerCenter Server
queries the lookup table on a row-by-row basis.
Eliminate excessive lookups. The more lookups there are, the less memory the DTM reader/writer/transform threads are left with to run efficiently. With too many lookups, memory contention is traded for disk contention, and the memory contention can be worse: the OS ends up swapping in and out of TEMP/SWAP disk space, with small block sizes, trying to locate the lookup row, and as the row goes from lookup to lookup the swapping becomes worse.
Both lookups and aggregators require memory space; each requires an index and a data cache, and they share the same heap segments. Hence care should be taken while designing memory-consuming mappings.
In the case where a lookup uses more than one lookup condition, set the conditions with
an equal sign first in order to optimize lookup performance. In the case of a cached
lookup, an ORDER BY condition is issued in the SQL statement used to create the cache.
Columns used in the ORDER BY condition should be indexed. The session log will
contain the ORDER BY statement.
Tips on Caches:
Cache small lookup tables
Improve session performance by caching small lookup tables. The result of the lookup
query and processing is the same, whether or not you cache the lookup table.
Use a persistent lookup cache for static lookup tables:
If the lookup table does not change between sessions, configure the Lookup transformation to use a persistent lookup cache. The Informatica Server then saves and reuses cache files from session to session, eliminating the time required to read the lookup table.
Override the ORDER BY statement for cached lookups:
By default, the Informatica Server generates an ORDER BY statement for a cached lookup that contains all lookup ports. To increase performance, you can suppress the default ORDER BY statement and enter an override ORDER BY with fewer columns.
Place conditions with an equality operator (=) first:
If a Lookup transformation specifies several conditions, you can improve lookup performance by placing all the conditions that use the equality operator first in the list of conditions that appear under the Condition tab.
Consider the following when calculating caches for a lookup:

Attribute            Method
Minimum index cache  200 x [(Σ column size) + 16] over all condition ports
Maximum index cache  (# rows in lookup table) x [(Σ column size) + 16] x 2 over all condition ports
Minimum data cache   (# rows in lookup table) x [(Σ column size) + 8] over all output ports (not condition ports)
Maximum data cache   2 x minimum data cache
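As a similarly hedged worked example with illustrative figures: for a lookup table of 500,000 rows with condition ports totalling 24 bytes and output ports totalling 40 bytes, minimum index cache = 200 x (24 + 16) = 8,000 bytes, maximum index cache = 500,000 x (24 + 16) x 2 = 40,000,000 bytes (about 40 MB), minimum data cache = 500,000 x (40 + 8) = 24,000,000 bytes (about 24 MB), and maximum data cache = 2 x 24,000,000 = 48,000,000 bytes (about 48 MB).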
4.12.1 Lookups & Aggregators Fight.
As discussed above, the lookups and the aggregators fight for memory space. Each requires an index cache and a data cache, and they "share" the same HEAP segments inside the core. Particularly in the 4.6 / 1.6 products and prior, these memory areas become critical, and when dealing with many rows the session is almost certain to cause the server to "thrash" memory in and out of the OS swap space. If possible, separate the maps: perform the lookups in the first section of the maps and position the data in an intermediate target table; then have a second map read the target table and perform the aggregation (this also provides the option of doing the GROUP BY within the database) - another speed improvement.
4.13 Maplets for complex Logic
It is a good idea to break complex logic into maplets. This allows the mapping to be managed in a much better and more efficient way, by breaking down the business logic. Always remember: the shorter the distance between source and target, the better the performance.
With complex mappings, the READER, WRITER and TRANSFORM threads affect the timing: if the reader is slow, the rest of the threads suffer; similarly, if the writer is slow, the effect is the same.
4.14 Database IPC settings & Priorities
If PMServer and the Oracle instance are running on the same server, use an IPC connection instead of a TCP/IP connection: change the protocol in the TNSNames.ORA and Listener.ORA files, and restart the listener on the server. This can only be used locally, but the speed increases by between 2x and 5x. Another option is to prioritise the database login that Informatica uses to execute its tasks; these tasks, when logged in to the database, would then override others. This is particularly helpful in increasing performance when the bulk loader or SQL*Loader is used.
4.15 Loading
Make sure indexes and constraints are removed before loading into relational targets; they can be re-created as soon as the load is complete. This helps boost performance for bulk data loads.
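
A minimal sketch of this pattern as session pre- and post-load SQL in Oracle, assuming a hypothetical target index TGT_TRADES_IX1 (the names are illustrative):

-- Pre-load: mark the index unusable so the bulk load skips index maintenance
ALTER INDEX tgt_trades_ix1 UNUSABLE;
-- The loading session must then be allowed to skip unusable indexes
ALTER SESSION SET skip_unusable_indexes = TRUE;

-- Post-load: rebuild the index once, after all rows are in
ALTER INDEX tgt_trades_ix1 REBUILD;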

The lower the commit interval, the longer the session takes to complete; set an appropriate commit interval - anything above 50K is good. Partitioning the data while loading is another wise option. Informatica provides the following partition types:
o Key Range
o Hash Key
o Round Robin
o Pass Through
When partitioning individual transformations it is advisable to use the following:
o Aggregator cache - use hash auto-key
o Lookup cache - use hash auto-key partition type, with an equality condition
o Sorter cache - use hash auto-key, pass-through or key range
4.16 Memory Settings
Session Shared Memory Size controls the total amount of memory used to buffer rows internally by the reader and writer. Set session shared memory between 12 MB and 25 MB; remember that increasing the shared memory beyond this does not guarantee an increase in performance - it can actually decrease it.
Buffer Block Size controls the size of the blocks that move through the pipeline. Set the shared buffer block size to around 128K; this is used by Informatica for handling blocks of rows.
If the server has over 12 GB of RAM, the shared memory can be increased to between 1 and 2 GB. The shared buffer block size should then be set relative to the shared memory setting.
4.17 Reduce Number of OBJECTS in a map
Frequently, the idea of these tools is to make the "data translation map" as easy as possible. All too often, that means creating one expression object for each throughput/translation (taking it to an extreme, of course). Each object adds computational overhead to the session, and timings may suffer. If performance is an issue or a goal, integrate several expressions into one expression object, reducing the "object" overhead; doing so can speed up the map.
4.18 Slow Sources - Flat Files
If we've got slow sources, and these sources are flat files, we can look at some of the
following possibilities. If the sources reside on a different machine, and we've opened a
named pipe to get them across the network - then we've opened (potentially) a can of
worms. We’ve introduced the network speed as a variable on the speed of the flat file
source. Try to compress the source file, FTP PUT it on the local machine (local to
PMServer), decompress it, and then utilize it as a source. If you're reaching across the
network to a relational table - and the session is pulling many rows (over 10,000) then the
source system itself may be slow. We may be better off using a source system extract
program to dump it to file first, and then follow the above instructions. However, there is
something your SA's and Network Ops folks could do (if necessary) - this is covered in
detail in the advanced section. They could backbone the two servers together with a
dedicated network line (no hubs, routers, or other items in between the two machines).
At the very least, they could put the two machines on the same sub-net. If the file is local to PMServer but still slow, examine the location of the file (which device it is on). If it is not on an INTERNAL DISK, it will be slower than if it were (the C drive for those on NT). Note that a UNIX file LINK existing locally while the file is remote does not count - the actual file must be local.

4.19 Break the mappings out

One per target; if necessary, one per source per target. Why does this work? Eliminating multiple targets in a single mapping can greatly increase speed. Basically it's like this: one session per map/target. Each session establishes its own database connection, and because of the unique database connection the DBMS server can handle the insert/update/delete requests in parallel against multiple targets. It also allows each session to be specified for its intended purpose (no longer mixing a data-driven session with INSERTs only to a single target). Each session can then be placed in a batch marked "CONCURRENT" if preferences allow. Once this is done, the parallelism of mappings and sessions becomes obvious; studies of parallel processing have shown again and again that operations can sometimes be completed in half the time merely by streaming them at the same time. With multiple targets in the same mapping, a single database connection has to handle multiple diverse database statements - sometimes hitting this target, other times hitting that one. In this situation it is extremely difficult for Informatica (or any other tool, for that matter) to build BULK operations, even though "bulk" is specified in the session. Remember that "BULK" is only a preference: the tool reverts to NORMAL load if it cannot perform a BULK operation on a series of consecutive rows. Data-driven sessions likewise force the tool down several other layers of internal code before the data actually reaches the database.

4.19.1 Keep the mappings as simple as possible


Bury complex logic (if you must) in a maplet. If you can avoid complex logic altogether, so much the better. The old common-sense rule of thumb applies: the straighter the path between two points, the shorter the distance. Translated: the shorter the distance between the source qualifier and the target, the faster the data loads.

4.20 READER/TRANSFORMER/WRITER threads affect the TIMING


With complex mappings, don't forget that each ELEMENT (field) must be weighed; in this light a firm understanding of how to read the performance statistics generated by Informatica becomes important. In other words, if the reader is slow, the rest of the threads suffer; if the writer is slow, same effect. A pipe is only as big as its smallest diameter; a chain is only as strong as its weakest link.
4.21 Sorting – performance issues
We can improve Aggregator transformation performance by using the Sorted Input
option. When the Sorted Input option is selected, the Informatica Server assumes all data
is sorted by group. As the Informatica Server reads rows for a group, it performs
aggregate calculations as it reads. When necessary, it stores group information in
memory. To use the Sorted Input option, we must pass sorted data to the Aggregator
transformation. We can gain added performance with sorted ports when we partition the
session.
When Sorted Input is not selected, the Informatica Server performs aggregate
calculations as it reads. However, since data is not sorted, the Informatica Server stores
data for each group until it reads the entire source to ensure all aggregate calculations are
accurate.
For example, one Aggregator has the STORE_ID and ITEM Group By ports, with the
Sorted Input option selected. When you pass the following data through the Aggregator,
the Informatica Server performs an aggregation for the three records in the 101/battery
group as soon as it finds the new group, 201/battery:

STORE_ID ITEM QTY PRICE


101 ‘battery’ 3 2.99
101 ‘battery’ 1 3.19
101 ‘battery’ 2 2.59
201 ‘battery’ 4 1.59
201 ‘battery’ 1 1.99
If you use the Sorted Input option and do not presort data correctly, the session fails.
4.21.1 Sorted Input Conditions
Do not use the Sorted Input option if any of the following conditions are true:
• The aggregate expression uses nested aggregate functions.
• The session uses incremental aggregation.
• Input data is data-driven. You choose to treat source data as data driven in the
session properties, or the Update Strategy transformation appears before the Aggregator
transformation in the mapping.
If the Sorted Input option is used under these circumstances, the Informatica Server reverts to default aggregate behaviour, reading all values before performing aggregate calculations.
4.21.2 Pre-Sorting Data
To use the Sorted input option, you pass sorted data through the Aggregator.
Data must be sorted as follows:
• By the Aggregator group by ports, in the order they appear in the Aggregator
transformation.
• Using the same sort order configured for the session.
If data is not in strict ascending or descending order based on the session sort order, the
Informatica Server fails the session. For example, if you configure a session to use a
French sort order, data passing into the Aggregator transformation must be sorted using
the French sort order.
If the session uses file sources, you can use an external utility to sort file data before
starting the session. If the session uses relational sources, we can use the Number of
Sorted Ports option in the Source Qualifier transformation to sort group by columns in the
source database. Group By columns must be in the exact same order in both the
Aggregator and Source Qualifier transformations.
4.22 Workflow Manager
The Informatica Server moves data from sources to targets based on workflow and
mapping metadata stored in a repository. A workflow is a set of instructions that describes
how and when to run tasks related to extracting, transforming, and loading data. The
Informatica Server runs workflow tasks according to the conditional links connecting the
tasks.

• Monitor, add, edit, and delete Informatica Server information in the repository
• Stop the Informatica Server
• Configure database, external loader, and FTP connections
• Manage sessions and batches - create, edit, delete, copy/move within a folder, start/stop, abort sessions, view session logs, details, and session performance details.

[Diagram: Source -> Server -> Target. Source data flows into the Server, transformed data flows on to the Target, and the Server takes its instructions from metadata held in the Repository.]
4.22.1 Monitoring and Running a Session:
When the Informatica Server runs a Session task, the Workflow Monitor creates session
details that provide load statistics for each target in the mapping. We can view session
details when the session runs or after the session completes.
Create a worklet to reuse a set of workflow logic in several workflows. Use the Worklet
Designer to create and edit worklets.
4.22.2 Informatica suggests that each session takes roughly 1 to 1 1/2 CPU's.
In keeping with this, Informatica plays well with RDBMS engines on the same machine, but does NOT get along (performance-wise) with ANY other engine (reporting engine, Java engine, OLAP engine, Java virtual machine, etc.).
4.22.3 Place some good server load monitoring tools on the PM Server in development
Watch it closely to understand how the resources are being utilised and where the hot spots are. Try to follow the recommendations; it may mean upgrading the hardware to achieve throughput. Look into EMC's disk storage arrays - while expensive, they appear to be extremely fast and may improve performance in some cases by up to 50%.
4.22.4 Parallel sessions/worklets.
In the workflow, depending on the business logic, configure the session/worklet tasks to run in parallel instead of sequentially. This can reduce the workflow processing time to a large extent; however, creating too many parallel tasks may decrease performance.

For example, this logic was implemented for the Acs_London workflow. Previously the balances, postings and contracts worklets were arranged sequentially. The postings worklet has no dependency on any other worklet, so it was re-arranged to run in parallel with the balances worklet. The balances worklet had three sequential staging sessions, and these three tasks were re-arranged to run in parallel. The execution time for the balances staging sessions came down to 40% of the original.

Earlier, the ICON worklet was designed with sequential tasks creating 6 output files. Since there were no dependencies between the files, this worklet was re-arranged to load in parallel, bringing the time down to 30%. The funding inter-company tasks were moved to another worklet.

Paragon workflow parallelism - of the 17 sessions loading Paragon data into staging tables, all of which were scheduled to run sequentially, 2 sessions (corporate load and cp mapping load) were kept sequential and the remaining 15 sessions were changed to run in parallel. Due to the parallelism and other changes at the database level (described in detail under section 6.5), performance improved by 100%.

There is no hard limit on the number of parallel tasks, but creating too many may decrease performance; the benefit depends upon the available Informatica server resources.

4.23 Change Database Priorities for the PMServer Database User

Prioritising the database login that any of the connections use (set up in Server Manager) can assist by changing the priority given to Informatica's executing tasks. These tasks, when logged in to the database, can then over-ride others. Memory for these tasks (shared global areas and server settings) must be sized if priorities are to be changed. If BCP, SQL*Loader or some other bulk-load facility is utilised, these priorities must also be set. This can greatly improve performance, but it is only suggested as a last-resort method and does not substitute for tuning the database or the mapping processes. It should only be utilised when all other methods have been exhausted (tuned), and it should be restricted to production machines, and only in instances where the load cycle that Informatica is utilising is NOT impeding other users.

4.24 Change the Unix User Priority


In order to gain speed, the Informatica UNIX User must be given a higher priority. The
UNIX SA should understand what it takes to rank the UNIX logins, and grant priorities to
particular tasks. Or - simply have the pmserver executed under a super user (SU)
command; this will take care of reprioritizing Informatica's core process. This should
only be used as a last resort - once all other tuning avenues have been exhausted, or if we
have a dedicated UNIX machine on which Informatica is running.
4.25 Try not to load across the network
If at all possible, try to co-locate PMServer executable with a local database. Not having
the database local means: 1) the repository is across the network (slow), 2) the sources /
targets are across the network, also potentially slow. If we have to load across the
network, at least try to localize the repository on a database instance on the same machine
as the server. The other thing is: try to co-locate the two machines (pmserver and Target
database server) on the same sub-net, even the same hub if possible. This eliminates
unnecessary routing of packets all over the network. Having a localized database also
allows us to setup a target table locally - which you can then "dump" following a load, ftp
to the target server, and bulk-load in to the target table. This works extremely well for
situations where append or complete refresh is taking place.

4.26 Balance between Informatica and the power of SQL and the database
Try to utilize the DBMS for what it was built for:
reading/writing/sorting/grouping/filtering data en-masse. Use Informatica for the more
complex logic, outside joins, data integration, multiple source feeds, etc... The balancing
act is difficult without DBA knowledge. In order to achieve a balance, we must be able
to recognize what operations are best in the database, and which ones are best in
Informatica. This does not detract from the use of the ETL tool; rather it enhances it - it's a MUST if you are performance tuning for high-volume throughput.

5 Performance tuning the UNIX OS


5.1 Process check
ps -axu
Run this to check for the following items:
o Are there any processes waiting for disk access or for paging? If so, check the I/O and memory subsystems.
o Which processes are using most of the CPU, and which are using most of the memory? This may help you distribute the workload better.
5.2 Identifying and resolving memory problems
Run vmstat -S 5 to confirm memory problems, and check for the following:
o Are page-outs occurring consistently? If so, you are short of memory.
o Are there a high number of address translation faults? (System V only.) This suggests a memory shortage.
o Are swap-outs occurring consistently? If so, you are extremely short of memory.
Occasional swap-outs are normal; BSD systems swap out inactive jobs. Long bursts of swap-outs mean that active jobs are probably falling victim, and indicate an extreme memory shortage. If you don’t have vmstat -S, look at the w and de fields of vmstat; these should ALWAYS be zero.
If memory seems to be the bottleneck of the system, try the following remedial steps:
o Reduce the size of the buffer cache, if your system has one, by decreasing BUFPAGES. (The buffer cache is not used in System V.4 and SunOS 4.X systems.) Making the buffer cache smaller will hurt disk I/O performance.
o If you have statically allocated STREAMS buffers, reduce the number of large (2048- and 4096-byte) buffers. This may reduce network performance, but netstat -m should give you an idea of how many buffers you really need.
o Reduce the size of your kernel's tables. This may limit the system's capacity (number of files, number of processes, etc.).
o Try running jobs requiring a lot of memory at night. This may not help the memory problems, but you may not care about them as much.
o Try running jobs requiring a lot of memory in a batch queue. If only one memory-intensive job is running at a time, your system may perform satisfactorily.
o Try to limit the time spent running sendmail, which is a memory hog.
o If you don’t see any significant improvement, add more memory.
5.3 Identifying and Resolving Disk I/O Issues
Use iostat to check the I/O load and utilisation, as well as the CPU load. iostat can be used to monitor the I/O load on the disks on the UNIX server, and permits monitoring the load on specific disks. Take note of how fairly disk activity is distributed among the system disks; if it is not, are the most active disks also the fastest disks?
The following may help rectify problems due to I/O:
o Reorganise your file systems and disks to distribute I/O activity as evenly as possible.
o Use symbolic links to keep the directory structure the same while still moving the data files that are causing I/O contention.
o Use your fastest disk drive and controller for your root file system; this will almost certainly have the heaviest activity. Alternatively, if single-file throughput is important, put performance-critical files into one file system and use the fastest drive for that file system.
o Put performance-critical files on a file system with a large block size: 16KB or 32KB (BSD).
o Increase the size of the buffer cache by increasing BUFPAGES (BSD). This may hurt your system's memory performance.
o Rebuild your file systems periodically to eliminate fragmentation (backup, build a new file system, and restore).
o If you are using NFS and remote files, look at your network situation; you don’t have local disk I/O problems.
o Check the memory statistics again by running vmstat 5 (or sar -rwpg). If your system is paging or swapping consistently, you have memory problems; fix the memory problem first, as swapping makes performance worse.

If the system has a disk capacity problem and is constantly running out of disk space, try the following actions:

o Write a find script that detects old core dumps, editor backup and auto-save files, and other trash, and deletes them automatically. Run the script through cron.
o Use a smaller block size on file systems that are mostly small files (e.g., source code files, object modules, and small data files).
5.4 Identifying and Resolving CPU Overload Issues
Use sar -u to check CPU loading. This provides %usr (user), %sys (system), %wio (waiting on I/O), and %idle (percentage of idle time). A target goal should be %usr + %sys = 80 and %wio = 10, leaving %idle at 10. If %wio is higher, the disk and I/O contention should be investigated to eliminate the I/O bottleneck on the UNIX server. If the system shows a heavy %sys load together with a high %idle, this is indicative of memory contention and swapping/paging problems; in this case it is necessary to make memory changes to reduce the load on the server.
When you run iostat 5 as above, also observe the CPU idle time. Is the idle time always 0, without letup? It is good for the CPU to be busy, but if it is always busy 100 percent of the time, work must be piling up somewhere. This points to CPU overload.
o Eliminate unnecessary daemon processes. rwhod and routed are particularly likely to be performance problems, but any savings will help.
o Get users to run jobs at night with any queuing system that's available. You may not care if the CPU (or the memory or I/O system) is overloaded at night, provided the work is done in the morning.
o Using nice to lower the priority of CPU-bound jobs will improve interactive performance. Likewise, using nice to raise the priority of CPU-bound jobs will expedite them but will hurt interactive performance. In general, though, using nice is only a temporary solution; if your workload grows it will soon become insufficient. Consider upgrading your system, replacing it, or buying another system to share the load.
5.5 Identifying and Resolving Network Issues
One can suspect problems with network capacity or with data integrity if users experience slow performance when they are using rlogin or when they are accessing files via NFS.
Look at netstat -i. If the number of collisions is large, suspect an overloaded network. If the number of input or output errors is large, suspect hardware problems. A large number of input errors indicates problems somewhere on the network; a large number of output errors suggests problems with your system and its interface to the network.
If collisions and network hardware are not a problem, figure out which system appears to be slow. Use spray to send a large burst of packets to the slow system. If the number of dropped packets is large, the remote system most likely cannot respond to incoming data fast enough. Look to see if there are CPU, memory or disk I/O problems on the remote system. If not, the system may just not be able to tolerate heavy network workloads; try to reorganise the network so that this system isn't a file server.
A large number of dropped packets may also indicate data corruption. Run netstat -s on the remote system, then spray the remote system from the local system and run netstat -s again. If the increase in UDP socket full drops (as indicated by netstat) is equal to or greater than the number of dropped packets that spray reports, the remote system is a slow network server. If the increase in socket full drops is less than the number of dropped packets, look for network errors.
Run nfsstat and look at the client RPC data. If the retrans field is more than 5 percent of calls, the network or an NFS server is overloaded. If timeout is high, at least one NFS server is overloaded, the network may be faulty, or one or more servers may have crashed. If badxid is roughly equal to timeout, at least one NFS server is overloaded. If timeout and retrans are high but badxid is low, some part of the network between the NFS client and server is overloaded and dropping packets. Try to prevent users from running I/O-intensive programs across the network (the grep utility is a good example of an I/O-intensive program); instead, have users log into the remote system to do their work. Reorganise the computers and disks on your network so that as many users as possible can do as much work as possible on a local system. Use systems with good network performance as file servers. If you are short of STREAMS data buffers and are running SunOS 4.0 or System V.3 (or earlier), reconfigure the kernel with more buffers.

6 Identifying Oracle performance Issues


When reviewing Oracle performance issues on the box, be aware of the following:
• If there is a difference in performance between UAT and PRD, do not assume the two environments are configured the same. That said, since UAT is the DR environment for PRD, the configuration is NOT likely to be different - verify rather than assume.
• Always start by extracting the slow SQL being executed and getting an explain plan out in both UAT and PRD. It's likely the explain plans won't be the same, in which case drill down and ask why. Start simple, with Oracle 101 analysis.
• Always consider what else is running in the environment; it may not be the slow SQL that is the problem - another process may be holding onto resources. This is especially true in cases where you get inconsistent performance from a process.
• Concurrency/parallelism is NOT always the solution to all your problems; in fact, if you don't have idle CPU it causes more problems. Use concurrency/parallelism sensibly and with caution.
• Keep in mind there is no magical switch when tuning slow Oracle jobs; the most important weapon we have is visibility and information on what is being executed at a point in time.
• 70% of Oracle performance issues are related to the SQL being executed.
• Inconsistency between environments can be attributed to STATS; this should be one of the first areas to be reviewed.
• Consult the DBA, but only after you've followed the necessary steps above. The DBA team can set up the STATSPACK report and drill down into further detail if it is established that it is not an application problem, or where we don't have enough visibility/information.

6.1 Checking Problem Processes.


There are a number of ways to monitor and check for problem Oracle processes:

• Use the monitoring in tools like TOAD to review Oracle jobs in real time.
• Request that the DBA set up / switch on STATSPACK and give you a point-in-time report. The STATSPACK report can give you the worst-performing queries running at a point in time.

Whilst a problem process is running, consider using TOAD (or a similar tool) to review what the process is doing. This can be done via the Kill/Trace Session tool in TOAD's DBA menu.

Using this tool we can review exactly what the process is executing, and also review statistics based on physical reads, connect time, user, block gets, block changes and consistent gets. Consider taking the SQL and getting a plan.

We can also do a very simple analysis of the process by taking the SPID from the Kill/Trace Session tool and linking it back to the PID in "top" output on the server's unix command line.

For example, an Oracle process with SPID 22301 in TOAD can be linked to PID 22301 in top, in this case using 12.1% of a CPU.
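
If TOAD is not to hand, the same SPID-to-session link can be made directly against the data dictionary; a minimal sketch using the SPID from the example above:

SELECT s.sid, s.serial#, s.username, s.sql_address
  FROM v$session s, v$process p
 WHERE s.paddr = p.addr   -- join each session to its server process
   AND p.spid = '22301';  -- the OS process id seen in top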

If we see a process approaching 100% of a CPU, that process clearly needs to be tuned.

Be careful when using "top" on the unix command line, as it can be expensive; quit out of it when it is no longer needed.
6.2 Getting an explain plan

The first thing to do when reviewing any problem process is to get the explain plan; this will give you a clue as to what the problem may or may not be.

• Compare the explain plan between an environment where the process runs quickly and an environment where it does not. If the plans differ between environments, we could be looking at a stats problem, missing indexes, or different data volumes.
• For expensive plans, use Oracle 101 techniques: check we have stats that reflect the contents of each object, and that we have indexes that reflect the WHERE clause.
• Keep in mind a "Full Table Scan" is NOT always bad, especially if the query returns around 50% of the rows, where using an index would be slower. In the case of nested-loop joins the smaller table is always fully scanned, and in the case of hash joins both tables are usually fully scanned.

You can use FEED_USER to get an explain plan; ensure you use the table PLAN_TABLE owned by SYS. A tool like SQL*Navigator can be used for this.
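
As a minimal sketch of getting a plan by hand (assuming a PLAN_TABLE is available to the session; the query itself is illustrative):

EXPLAIN PLAN FOR
  SELECT * FROM stg_trades WHERE trade_dt = TRUNC(SYSDATE);

-- DBMS_XPLAN.DISPLAY is available from Oracle 9.2 onwards
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);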

6.3 Check Stats for an object

Stale or out-of-date statistics can cause havoc in an Oracle database. Here are some tell-tale signs:

• Indexes exist but are never used.
• Performance differs between environments.
• Performance in the same environment degrades or is inconsistent.

A very simple sanity check can be done by querying for the objects associated with the problem process. Look at the NUM_ROWS value initially.

SELECT * FROM DBA_TABLES WHERE TABLE_NAME = '<table>';

SELECT * FROM DBA_TAB_PARTITIONS WHERE TABLE_NAME = '<table>';

Keep in mind that the statistics in the Oracle data dictionary must be fairly accurate; they do not have to be exact, BUT the differences should NEVER be an order of magnitude, as Oracle decides whether or not to use an index using information such as the data volumes in an object.

The following can be done to ensure good statistics:

• Gather Stats scheduled to capture an accurate snapshot of volumes in the database.
• Dynamic sampling in the process, preferable for objects that are not accessed by multiple jobs at the same time.
• Export/import of statistics, used to capture and reuse good statistics from a point in time, with Gather Stats potentially not used for these objects.
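
A minimal sketch of the gather and export/import options using DBMS_STATS (the schema, table and stats-table names are illustrative):

-- Gather an accurate snapshot of a staging table's volumes
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => 'ETL', tabname => 'STG_TRADES', cascade => TRUE);

-- Capture known-good stats into a stats table so they can be reused
EXEC DBMS_STATS.CREATE_STAT_TABLE(ownname => 'ETL', stattab => 'GOOD_STATS');
EXEC DBMS_STATS.EXPORT_TABLE_STATS(ownname => 'ETL', tabname => 'STG_TRADES', stattab => 'GOOD_STATS');

-- Later, or in another environment, re-apply the captured stats
EXEC DBMS_STATS.IMPORT_TABLE_STATS(ownname => 'ETL', tabname => 'STG_TRADES', stattab => 'GOOD_STATS');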

6.4 Procedure to be followed prior to an interface go-live/post go-live

These steps should be followed prior to an interface go-live, using the information/tools (or similar tools) above:

• Explain plans for the processes to be reviewed.
• CPU consumption reviewed using top or the tools in section 5.
• I/O reviewed using the tools in section 5.
• Time taken for major queries / functions / procedures.
• If dynamic sampling / runtime Gather Stats is not used:
o Ensure the scheduled gather stats job gives accurate stats after population of the tables.
o Consider exporting/importing the good stats from UAT to PRD, and keeping them static.

These steps should be followed post interface go-live for the first few days of processing, using the information/tools (or similar tools) above:

• Explain plans for the processes to be reviewed.
• CPU consumption reviewed using top or the tools in section 5.
• I/O reviewed using the tools in section 5.
• Check statistics for the objects used (see 6.3).

6.5 Load methodology for staging tables with indexes


Follow the guidelines below when loading any staging table:
• Indexes on the staging table - check whether the staging table to be loaded has any indexes.
• Usage of the staging table by other processes - ascertain whether any concurrent processes query the staging table at the time it is being loaded.
If the staging table has indexes and no concurrent processes query it, a performance improvement can be achieved by dropping all the indexes on the staging table before loading the data and re-creating them after the data load completes. A sketch of this pattern follows below.

Paragon case study - in Paragon, 17 staging tables which had indexes (but were not queried by any concurrent processes) were being loaded with the indexes in place. After changing the workflow to drop all the indexes on the staging tables, load the data, and re-create the indexes, the performance gain was of the order of 80%.
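
As an alternative to the ALTER INDEX UNUSABLE / REBUILD pattern sketched in section 4.15, the workflow's pre- and post-load SQL can drop and re-create the index outright; a minimal sketch with illustrative names:

-- Pre-load: remove the index so rows are inserted without index maintenance
DROP INDEX stg_trades_ix1;

-- ... the bulk load into STG_TRADES runs here ...

-- Post-load: re-create the index in one pass over the loaded data
CREATE INDEX stg_trades_ix1 ON stg_trades (trade_id);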

6.6 Investigating Performance Issues Using the Data Dictionary Views.

Oracle maintains good statistics on the state of the database, and these can be a good indicator of what problem queries exist, what waits exist, and the current setup of the Oracle database.

6.6.1 V$sysstat and V$waitstat

V$SYSSTAT stores instance-wide statistics on resource usage, cumulative since the instance was started.

Below is a list of useful stats that can be taken from v$sysstat:

SELECT * FROM v$sysstat
 WHERE name IN ('parse count (hard)',
                'db block changes',
                'execute count',
                'CPU used by this session',
                'logons current',
                'logons cumulative',
                'parse count (total)',
                'parse time cpu',
                'parse time elapsed',
                'physical reads',
                'physical writes',
                'redo log space requests',
                'redo size',
                'session logical reads',
                'sorts (memory)',
                'sorts (disk)',
                'sorts (rows)',
                'table fetch by rowid',
                'table scan rows gotten',
                'table scan blocks gotten',
                'user commits',
                'user rollbacks');

V$WAITSTAT stores a summary of all buffer waits since instance startup. It is useful for breaking down the waits by class if you see a large number of buffer busy waits on the system.
The following are possible reasons for waits:
• Undo segment header: not enough rollback segments
• Data segment header/freelist: freelist contention
• Data block:
o large number of CR clones for the buffer
o range scans on indexes with a large number of deletions
o full table scans on tables with a large number of deleted rows
o blocks with high concurrency
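
A simple way to see the breakdown by class (the view's columns are CLASS, COUNT and TIME):

SELECT class, count, time
  FROM v$waitstat
 ORDER BY time DESC;  -- classes with the most accumulated wait time first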

6.6.2 Buffer Cache Hit Ratio Should be above 85%

Example

SELECT name, value
  FROM v$sysstat
 WHERE name IN ('session logical reads', 'physical reads',
                'physical reads direct', 'physical reads direct (lob)',
                'db block gets', 'consistent gets');

Hit Ratio = 1 - ((physical reads - physical reads direct - physical reads direct (lob)) / (db block gets + consistent gets - physical reads direct - physical reads direct (lob)))

For example, with the values above:

SELECT 1 - (40436054 - 2384700 - 0) / (786683547 + 5145590004 - 40443416 - 2384700)
  FROM DUAL;
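
The same ratio can be computed in a single statement, avoiding the manual arithmetic; a sketch using DECODE over v$sysstat:

SELECT 1 - ((SUM(DECODE(name, 'physical reads', value, 0))
           - SUM(DECODE(name, 'physical reads direct', value, 0))
           - SUM(DECODE(name, 'physical reads direct (lob)', value, 0)))
          / (SUM(DECODE(name, 'db block gets', value, 0))
           + SUM(DECODE(name, 'consistent gets', value, 0))
           - SUM(DECODE(name, 'physical reads direct', value, 0))
           - SUM(DECODE(name, 'physical reads direct (lob)', value, 0)))) AS hit_ratio
  FROM v$sysstat;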

Interpreting and Using the Buffer Cache Advisory Statistics


There are many factors to examine before considering whether to increase or decrease the
buffer cache size. For example, you should examine V$DB_CACHE_ADVICE data and
the buffer cache hit ratio.

A low cache hit ratio does not imply that increasing the size of the cache would be
beneficial for performance. A good cache hit ratio could wrongly indicate that the cache
is adequately sized for the workload.

To interpret the buffer cache hit ratio, you should consider the following:

• Repeated scanning of the same large table or index can artificially inflate a poor
cache hit ratio. Examine frequently executed SQL statements with a large number of
buffer gets, to ensure that the execution plan for such SQL statements is optimal. If
possible, avoid repeated scanning of frequently accessed data by performing all of the
processing in a single pass or by optimising the SQL statement.
• If possible, avoid requerying the same data, by caching frequently accessed data in
the client program or middle tier.
• Blocks encountered during a long full table scan are not put at the head of the least
recently used (LRU) list. Therefore, these blocks are aged out faster than blocks read
when performing indexed lookups or small table scans. Thus, poor hit ratios while valid
large full table scans are occurring should also be considered when interpreting the
buffer cache data.

6.6.3 Library Cache Hit Ratio

SELECT sum(pinhits) / sum(pins) "Hit Ratio",
       sum(reloads) / sum(pins) "Reload percent"
FROM v$librarycache
WHERE namespace IN ('SQL AREA', 'TABLE/PROCEDURE', 'BODY', 'TRIGGER');

The hit ratio should be at least 85% (i.e. 0.85).
The reload percent should be very low, 2% (i.e. 0.02) or less.
If this is not the case, increase the initialisation parameter SHARED_POOL_SIZE.
Although less likely, the init.ora parameter OPEN_CURSORS may also need to be
increased. Both changes are sketched below.
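
For example (the values are illustrative only, and SCOPE=SPFILE assumes an spfile is
in use):

ALTER SYSTEM SET shared_pool_size = 400M SCOPE=SPFILE; -- takes effect after a restart
ALTER SYSTEM SET open_cursors = 1000 SCOPE=BOTH;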

6.6.4 Dictionary Cache Miss Ratio should be less than 15%

Hit ratio (should be above 85%):
select sum(gets - getmisses) * 100 / sum(gets) from v$rowcache

Miss ratio (should be less than 15%):
select (sum(getmisses) / sum(gets)) * 100 from v$rowcache

The dictionary cache hit ratio is a measure of the proportion of requests for information
from the data dictionary, the collection of database tables and views containing reference
information about the database, its structures, and its users. On instance startup, the data
dictionary cache contains no data, so any SQL statement issued is likely to result in cache
misses. As more data is read into the cache, the likelihood of cache misses should
decrease. Eventually the database should reach a "steady state" in which the most
frequently used dictionary data is in the cache.

The dictionary cache resides within the Shared Pool, part of the SGA, so increasing the
shared pool size should improve the dictionary cache hit ratio.
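
To see which dictionary areas account for most of the misses, the individual rows of
V$ROWCACHE can be examined (PARAMETER, GETS and GETMISSES are its
standard columns):

SELECT parameter, gets, getmisses
FROM v$rowcache
ORDER BY getmisses DESC;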

6.6.5 Shared pool

select * from V$SHARED_POOL_ADVICE

V$SHARED_POOL_ADVICE displays information about estimated parse time savings
in the shared pool for different sizes. The sizes range from 50% to 200% of the current
shared pool size, in equal intervals. The value of the interval depends on the current size
of the shared pool.

V$SHARED_POOL_ADVICE view (all columns are of datatype NUMBER):

SHARED_POOL_SIZE_FOR_ESTIMATE  Shared pool size for the estimate (in megabytes)
SHARED_POOL_SIZE_FACTOR        Size factor with respect to the current shared pool size
ESTD_LC_SIZE                   Estimated memory in use by the library cache (in megabytes)
ESTD_LC_MEMORY_OBJECTS         Estimated number of library cache memory objects in the
                               shared pool of the specified size
ESTD_LC_TIME_SAVED             Estimated elapsed parse time saved (in seconds), owing to
                               library cache memory objects being found in a shared pool
                               of the specified size
ESTD_LC_TIME_SAVED_FACTOR      Estimated parse time saved factor with respect to the
                               current shared pool size
ESTD_LC_MEMORY_OBJECT_HITS     Estimated number of times a library cache memory object
                               was found in a shared pool of the specified size
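
A compact way to read the advisory is to project just the size factor and the estimated
time saved, using the columns described above:

SELECT shared_pool_size_for_estimate,
       shared_pool_size_factor,
       estd_lc_time_saved
FROM v$shared_pool_advice
ORDER BY shared_pool_size_factor;

If ESTD_LC_TIME_SAVED stops growing beyond a given size factor, additional shared
pool memory is unlikely to help.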

6.6.5.1 Shared Pool Free

select round((sum(decode(name,'free memory',bytes,0))/sum(bytes))*100,2)
from v$sgastat
where pool = 'shared pool'

This gives the percentage of the shared pool not currently in use. If a large proportion of
the shared pool is always free, it is likely that the size of the shared pool can be reduced.
Low free values are not a cause for concern unless other factors also indicate problems,
e.g. a poor dictionary cache hit ratio or a large proportion of reloads occurring.

6.6.5.2 Shared Pool Reload

select round(sum(reloads)/sum(pins)*100,2)
from v$librarycache
where namespace in ('SQL AREA','TABLE/PROCEDURE','BODY','TRIGGER')

This is similar to a Library Cache Miss Ratio, but is specific to SQL and PL/SQL blocks.
Shared pool reloads occur when Oracle has to implicitly re-parse SQL or PL/SQL at the
point when it attempts to execute it. A larger shared pool will reduce the number of times
that code needs to be reloaded. Also, ensuring that similar pieces of SQL are written
identically will increase sharing of code, as the sketch below illustrates.
To take advantage of additional memory available for shared SQL areas, you may also
need to increase the number of cursors permitted for a session. You can do this by
increasing the value of the initialisation parameter OPEN_CURSORS.
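
A hypothetical illustration (the ORDERS table and the :order_id bind variable are
illustrative names): statements that differ only in a literal each occupy their own shared
SQL area, whereas a bind variable lets every execution share one cursor.

-- Each literal value parses as a distinct statement:
SELECT status FROM orders WHERE order_id = 1001;
SELECT status FROM orders WHERE order_id = 1002;

-- One shareable statement, reused for every value:
SELECT status FROM orders WHERE order_id = :order_id;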

6.6.6 Recursive/Total Calls.

select round((rcv.value/(rcv.value+usr.value))*100,2)
from v$sysstat rcv, v$sysstat usr
where rcv.name = 'recursive calls'
and usr.name = 'user calls'

A high ratio may be caused by:

• Dynamic extension of tables due to poor sizing
• Growing and shrinking of rollback segments due to unsuitable OPTIMAL settings
• Large amounts of sorting to disk, resulting in creation and deletion of temporary segments
• Data dictionary misses
• Complex triggers, integrity constraints, procedures, functions and/or packages

6.6.7 Short/Total Table Scans

select round((shrt.value/(shrt.value+lng.value))*100,2)
from v$sysstat shrt, v$sysstat lng
where shrt.name = 'table scans (short tables)'
and lng.name = 'table scans (long tables)'

This is the proportion of full table scans which occur on short tables.
Short tables may be scanned by Oracle when this is quicker than using an index.
Full table scans of long tables are generally bad for overall performance.
Low figures may indicate a lack of indexes on large tables, or poorly written SQL which
fails to use existing indexes or returns a large percentage of the table. One common way
to find candidate statements is shown below.
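
The statements with the most disk reads can be listed from V$SQLAREA (SQL_TEXT,
DISK_READS and EXECUTIONS are its standard columns):

SELECT sql_text, disk_reads, executions
FROM (SELECT sql_text, disk_reads, executions
      FROM v$sqlarea
      ORDER BY disk_reads DESC)
WHERE rownum <= 10;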

6.6.8 Redo Activity

6.6.8.1 Redo Space Wait Ratio


select round((req.value/wrt.value)*100,2)
from v$sysstat req, v$sysstat wrt
where req.name = 'redo log space requests'
and wrt.name = 'redo writes'

A redo space wait occurs when there is insufficient space in the redo buffer for a
transaction to write redo information. It is an indication that the redo buffer is too small,
given the rate at which transactions occur relative to the rate at which the log writer
writes data to the redo logs. The current buffer size can be checked, and increased, as
sketched below.
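
The buffer size is controlled by the LOG_BUFFER parameter. For example (the value is
illustrative; LOG_BUFFER is static, so SCOPE=SPFILE assumes an spfile and the change
takes effect at the next restart):

SELECT value FROM v$parameter WHERE name = 'log_buffer';

ALTER SYSTEM SET log_buffer = 10485760 SCOPE=SPFILE;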

6.6.8.2 Redo Log Allocation Contention

There are two types of latch involved: the redo allocation latch and the redo copy
latches. A combined check of both is sketched at the end of this section.

6.6.8.2.1 The Redo Allocation Latch


select round(greatest(
         (sum(decode(ln.name,'redo allocation',misses,0)) /
          greatest(sum(decode(ln.name,'redo allocation',gets,0)),1)),
         (sum(decode(ln.name,'redo allocation',immediate_misses,0)) /
          greatest(sum(decode(ln.name,'redo allocation',immediate_gets,0))
                   + sum(decode(ln.name,'redo allocation',immediate_misses,0)),1))
       )*100,2)
from v$latch l, v$latchname ln
where l.latch# = ln.latch#

The redo allocation latch controls the allocation of space for redo entries in the redo log
buffer. To allocate space in the buffer, an Oracle user process must obtain the redo
allocation latch. Since there is only one redo allocation latch, only one user process can
allocate space in the buffer at a time. The single redo allocation latch enforces the
sequential nature of the entries in the buffer.
After allocating space for a redo entry, the user process may copy the entry into the
buffer. This is called "copying on the redo allocation latch". A process may only copy on
the redo allocation latch if the redo entry is smaller than a threshold size.

The maximum size of a redo entry that can be copied on the redo allocation latch is
specified by the initialization parameter LOG_SMALL_ENTRY_MAX_SIZE.
6.6.8.2.2 Redo Copy Latches

select round(greatest(
         (sum(decode(ln.name,'redo copy',misses,0)) /
          greatest(sum(decode(ln.name,'redo copy',gets,0)),1)),
         (sum(decode(ln.name,'redo copy',immediate_misses,0)) /
          greatest(sum(decode(ln.name,'redo copy',immediate_gets,0))
                   + sum(decode(ln.name,'redo copy',immediate_misses,0)),1))
       )*100,2)
from v$latch l, v$latchname ln
where l.latch# = ln.latch#

If the redo entry is too large to copy on the redo allocation latch, the user process must
obtain a redo copy latch before copying the entry into the buffer. The process first
obtains the copy latch; it then obtains the allocation latch, performs the allocation, and
releases the allocation latch. Next, while holding the redo copy latch, the process copies
the redo entry into its allocated space in the buffer, and then releases the copy latch. The
allocation latch is thus held for only a very short period of time, as the user process does
not try to obtain the copy latch while holding the allocation latch.

With multiple CPUs the redo log buffer can have multiple redo copy latches. These allow
multiple processes to copy entries to the redo log buffer concurrently. The number of redo
copy latches is determined by the parameter LOG_SIMULTANEOUS_COPIES.
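
The raw figures for both latches can be viewed together, joining V$LATCH to
V$LATCHNAME as in the queries above:

select ln.name, l.gets, l.misses, l.immediate_gets, l.immediate_misses
from v$latch l, v$latchname ln
where l.latch# = ln.latch#
and ln.name in ('redo allocation', 'redo copy')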

6.6.9 Table Contention

There are two figures which give an indication of how well table storage is working.
The figures are averaged across all tables in use, which means one table may be
seriously at fault or many tables may have low-level problems.

6.6.9.1 Chained Fetch Ratio

select round((cont.value/(scn.value+rid.value))*100,2)
from v$sysstat cont, v$sysstat scn, v$sysstat rid
where cont.name= 'table fetch continued row'
and scn.name= 'table scan rows gotten'
and rid.name= 'table fetch by rowid'

This is the proportion of all fetched rows which required a chained row continuation,
meaning that the data for the row is spread across more than one block. This can occur
in either of two ways:

Row Migration
This occurs when an update to a row no longer fits within its current block. In this case,
the data for the row is migrated to a new block, leaving a pointer to the new location in
the original block.

Row Chaining
This occurs when a row cannot fit into a single data block, e.g. due to having large or
many fields. In this case, the row is spread over two or more blocks.
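
Chained and migrated rows can be located with ANALYZE, assuming the
CHAINED_ROWS table has been created with the standard utlchain.sql script
(ORDERS is a hypothetical table name):

ANALYZE TABLE orders LIST CHAINED ROWS INTO chained_rows;

SELECT count(*) FROM chained_rows WHERE table_name = 'ORDERS';

Migrated rows can usually be repaired by rebuilding the table (e.g. ALTER TABLE ...
MOVE, then rebuilding its indexes); genuine chaining may require a larger block size or
a restructured row.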

6.6.9.2 Free List Contention

select round(
  ((select sum(count) from v$waitstat where class = 'free list') /
   (select sum(value) from v$sysstat
    where name in ('db block gets','consistent gets'))) * 100, 2)
from dual

Free list contention occurs when more than one process attempts to insert data into a
given table. The table header structure maintains one or more lists of blocks which have
free space for insertion.
If more processes attempt to insert than there are free lists, some will have to wait for
access to a free list. The number of free lists can be raised as sketched below.
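
Where contention is traced to a specific segment, the number of free lists can be
increased. For example (ORDERS is a hypothetical table; this applies to manually
managed segments, since tablespaces using automatic segment space management
replace free lists entirely):

ALTER TABLE orders STORAGE (FREELISTS 4);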

6.6.10 CPU Parse Overhead


select round((prs.value/(prs.value+exe.value))*100,2)
from v$sysstat prs, v$sysstat exe
where prs.name = 'parse count (hard)'
and exe.name = 'execute count'

The CPU parse overhead is the proportion of database CPU time spent parsing SQL and
PL/SQL code.
High values indicate either that a large amount of once-only code is being used by the
database, or that the shared SQL area is too small.

6.6.11 Latches

Latches are simple, low-level serialisation mechanisms that protect shared data
structures in the SGA. When attempting to get a latch, a process may or may not be
willing to wait, hence the two figures below. See also the redo log allocation latches
above.

6.6.11.1 Willing to Wait Latch Gets

select round(((sum(gets) - sum(misses)) / sum(gets))*100,2)
from v$latch

A process attempting to obtain a latch in willing-to-wait mode will sleep and retry until
it obtains the latch. Optimum = High.

6.6.11.2 Immediate Latch Gets

select round(((sum(immediate_gets) - sum(immediate_misses)) /
       sum(immediate_gets))*100,2)
from v$latch

An attempt to obtain a latch in no-wait (immediate) mode will time out rather than wait.
Optimum = High.

6.6.12 Rollback Segment Contention


select sum(waits)/sum(gets)*100 from v$rollstat

This figure indicates whether processes have had to wait for access to a rollback
segment. To improve the figure, increase the number of rollback segments available, as
sketched below.
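
For example, under manual undo management (the segment name and storage values are
illustrative; with automatic undo management the undo tablespace is sized instead):

CREATE ROLLBACK SEGMENT rbs05
TABLESPACE rbs
STORAGE (INITIAL 1M NEXT 1M OPTIMAL 10M);

ALTER ROLLBACK SEGMENT rbs05 ONLINE;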
