
IBM AIX
Using Perfpmr, AIX VMM Page Replacement and System p Performance / AIX v6.1 (Part 1)
© 2007 IBM Corporation
Agenda
This deep-dive course covers the following topics:

Part 1: Deep Dive perfpmr tool
Part 2: VMM page replacement
Part 3: System performance / IBM AIX 6.1
► New tuning parameters
This course is for students who want to learn about PerfPMR: how to
generate reports and which parameters apply.
It also helps with determining when there is a performance issue, what
to review, and what the possible causes of the issue can be.
This course assumes students are familiar with IBM AIX® performance
tools.
Performance Data Collection
PERFPMR consists of a set of utilities that collect the information
needed to analyze performance issues
PERFPMR is downloadable from a public ftp site:
► Not distributed on AIX media
► ftp ftp.software.ibm.com using anonymous ftp
► cd /aix/tools/perftools/perfpmr/perfXX (where XX is the AIX release)
► Get the compressed .tar file in that directory and install it using the
directions in the provided README file
► PERFPMR is updated periodically, so it’s advisable to check the FTP
site for the most recent version
The bos.perf packages (from AIX install media) must also be installed.
Running PERFPMR
Once PERFPMR has been installed, you can run it in any directory
► To determine the amount of space needed, estimate at least 20MB per
logical CPU plus an extra 50MB of space
► Run “perfpmr.sh <# of seconds>” at the time when the performance
problem is occurring
► A pair of 5-second traces are collected first
► Then various monitoring tools are run for the duration of time specified as
a parameter to perfpmr.sh
► After this, tprof, filemon, iptrace, tcpdump data is collected
► Finally, system configuration data is collected
► Data can be tar’d up and sent to testcase.software.ibm.com with the
filename having the pmr# in it
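The space guideline above (at least 20MB per logical CPU plus an extra 50MB) can be sketched as a small calculation. The function name and the sample CPU counts are illustrative assumptions; the result is a lower bound from the rule of thumb, not a guarantee.

```python
def estimate_perfpmr_space_mb(logical_cpus: int,
                              per_cpu_mb: int = 20,
                              extra_mb: int = 50) -> int:
    """Lower-bound free space (MB) for a perfpmr run, per the slide's rule of thumb."""
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return logical_cpus * per_cpu_mb + extra_mb


if __name__ == "__main__":
    # Hypothetical logical CPU counts, for illustration only.
    for cpus in (2, 8, 64):
        print(f"{cpus} logical CPUs: at least {estimate_perfpmr_space_mb(cpus)} MB")
```

Compare the result with the free space in the directory where perfpmr.sh will be run before starting the collection.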
Syntax
perfpmr.sh [-PQDIgfnpsc][-F file][-x file][-d sec] monitor_seconds
  -P        preview only - show scripts to run and disk space needed
  -Q        don't run lsattr, lslv, lspv commands in order to save time
  -D        run perfpmr the original way without a perfpmr cfg file
  -I        get lock instrumented trace also
  -g        do not collect gennames output
  -f        if gennames is run, specify gennames -f
  -n        used if no netstat or nfsstat desired
  -p        used if no pprof collection desired while monitor.sh running
  -s        used if no svmon desired
  -c        used if no configuration information is desired
  -F file   use file as the perfpmr cfg file - default is perfpmr.cfg
  -x file   only execute file found in perfpmr installation directory
  -d sec    sec is time to wait before starting collection period (default is delay_seconds 0)
  monitor_seconds is for the monitor collection period in seconds
PERFPMR shell scripts

aiostat.sh         Collects AIO information into a report called aiostat.int.
config.sh          Collects configuration information into a report called config.sum.
emstat.sh time     Builds a report called emstat.int on emulated instructions. The time
                   parameter must be greater than or equal to 60.
filemon.sh time    Builds a report called filemon.sum on file I/O. The time parameter does
                   not have any restrictions.
iostat.sh time     Builds two reports on I/O statistics: a summary report called iostat.sum
                   and an interval report called iostat.int. The time parameter must be
                   greater than or equal to 60.
iptrace.sh time    Builds a raw Internet Protocol (IP) trace report on network I/O called
                   iptrace.raw. You can convert the iptrace.raw file to a readable ipreport
                   file called iptrace.int using the iptrace.sh -r command. The time
                   parameter does not have any restrictions.
lparstat.sh        Builds a report on logical partitioning information. Two files are
                   created: lparstat.int and lparstat.sum.
monitor.sh time    Invokes system performance monitors and collects interval and summary
                   reports.
PERFPMR shell scripts (Cont.)

mpstat.sh             Builds a report on logical processor information called mpstat.int.
netstat.sh [-r] time  Builds a report on network configuration and use called netstat.int containing
                      entstat -d of the Ethernet interfaces, netstat -in, netstat -m, netstat -rn,
                      netstat -rs, netstat -s, netstat -D, and netstat -an before and after monitor.sh
                      was run.
nfsstat.sh time       Builds a report on NFS configuration and use called nfsstat.int containing
                      nfsstat -m and nfsstat -csnr before and after nfsstat.sh was run. The time
                      parameter must be greater than or equal to 60.
pprof.sh time         Builds a file called pprof.trace.raw that can be formatted with the pprof.sh -r
                      command. The time parameter does not have any restrictions.
ps.sh time            Builds reports on process status (ps). ps.sh creates the following files:
                      psa.elfk  A ps -elfk listing after ps.sh was run.
                      psb.elfk  A ps -elfk listing before ps.sh was run.
                      ps.int    Active processes before and after ps.sh was run.
                      ps.sum    A summary report of the changes between when ps.sh started and
                                finished. This is useful for determining what processes are consuming
                                resources.
                      The time parameter must be greater than or equal to 60.
PERFPMR shell scripts (Cont.)

sar.sh time           Builds reports on sar. sar.sh creates the following files:
                      sar.int  Output of commands sadc 10 7 and sar -A
                      sar.sum  A sar summary over the period sar.sh was run.
                      The time parameter must be greater than or equal to 60.
svmon.sh              Builds a report on svmon data into two files: svmon.out and svmon.out.S.
tcpdump.sh int. time  The int. parameter is the name of the interface; for example, tr0 is token-ring.
                      Creates a raw trace file of a TCP/IP dump called tcpdump.raw. To produce a
                      readable tcpdump.int file, use the tcpdump.sh -r command.
tprof.sh time         Creates a tprof summary report called tprof.sum. Used for analyzing processor
                      use of processes and threads. You can also specify a program to profile by
                      specifying the tprof.sh -p program 60 command, which profiles the executable
                      called program for 60 seconds.
trace.sh time         Creates the raw trace files (trace*) from which an ASCII trace report can be
                      generated using the trcrpt command.
vmstat.sh time        Builds reports on vmstat: a vmstat interval report called vmstat_v and a
                      vmstat_s summary report. The time parameter must be greater than or
                      equal to 60.
PERFPMR configuration file for perfpmr scripts

perfpmr.cfg This is the perfpmr configuration file which includes the following scripts:
♦ perfpmr_tool = trace.sh
♦ perfpmr_tool = monitor.sh
♦ perfpmr_tool = iptrace.sh
♦ perfpmr_tool = tcpdump.sh
♦ perfpmr_tool = filemon.sh
♦ perfpmr_tool = tprof.sh
♦ perfpmr_tool = netpmon.sh
♦ perfpmr_tool = config.sh
Note: Bigger trace buffers may not be a viable option for systems that are tight on memory, because
trace buffers are pinned memory.
Example
root@nkeung /home/nam/perfpmr/test: > ../perfpmr.sh 300

…
10:40:30-08/01/07 : PERFPMR: executing perfpmr_trace -k 10e,254,116,117 -L 20000000 -I 5
…
TRACE.SH: Starting trace for 5 seconds
/bin/trace -k 492,10e,254,116,117 -f -n -C all -d -L 20000000 -T 20000000 -a
TRACE.SH: Data collection started
TRACE.SH: Data collection stopped
TRACE.SH: Binary trace data is in file trace.raw
…
TRACE.SH: Enabling locktrace
lock tracing enabled for all classes
TRACE.SH: Starting trace for 5 seconds
/bin/trace -j 106,10C,10E,112,113,134,139,465,46D,606,607,608,609 -f -n -C all -d -L 2000000 -T 20000000 -ao trace.raw.lock
TRACE.SH: Data collection started
TRACE.SH: Data collection stopped
TRACE.SH: Trace stopped
TRACE.SH: Binary trace data is in file trace.raw.lock
Example (Cont.)
10:42:48-08/01/07 : PERFPMR: executing perfpmr_monitor -h -I 0 -N 0 -S 0 3
MONITOR: Capturing final lsps, svmon, and vmstat data
MONITOR: Generating reports....
MONITOR: Network reports are in netstat.int and nfsstat.int
MONITOR: Monitor reports are in monitor.int and monitor.sum
10:49:47-08/01/07 : PERFPMR: executing perfpmr_filemon -T 60000000 60
FILEMON: Starting filesystem monitor for 60 seconds....

10:50:55-08/01/07 : filemon completed
10:50:55-08/01/07 : PERFPMR: executing perfpmr_tprof 60
TPROF: Tprof report is in tprof.sum

10:52:00-08/01/07 : config.sh begin
CONFIG.SH: Generating SW/HW configuration

10:53:38-08/01/07 : config.sh completed
PERFPMR: Data collection complete.
Example (Cont.)
root@nkeung /home/nam/perfpmr/test: > ls
aiostat.int gennames.out mpstat.int tprof.csyms trace.syms
config.sum getevars.out netstat.int tprof.ctrc tunables_lastboot
crontab_l instfix.out nfsstat.int tprof.out tunables_lastboot.log
devtree.out iostat.Dl objrepos tprof.sum tunables_nextboot
errlog iptrace.raw perfpmr.int trace.crash.inode unix.what
errpt_a lparstat.int pile.out trace.fmt vfs.kdb
errtmplt lparstat.l pprof.trace.raw trace.inode vmstat_s.out
etc_filesystems lparstat.sum psa.elfk trace.j2.inode vmstat_s.p.after
etc_inittab lslpp.Lc psb.elfk trace.maj_min2lv vmstat_s.p.before
etc_rc lsps.after psemo.after trace.mount vmstat_v.after
etc_security_limits lsps.before psemo.before trace.nm vmstat_v.before
fastt.out lsrset.out sar.bin trace.raw vmstati.after
fcstat.after mem_details_dir svmon.after trace.raw-0 vmstati.before
fcstat.before mempools.out svmon.after.S trace.raw-1 vnode.kdb
filemon.sum mempools.save svmon.before trace.raw.lock w.int
genkex.out monitor.int svmon.before.S trace.raw.lock-0 xmwlm.070731
genkld.out monitor.sum tcpdump.raw trace.raw.lock-1 xmwlm.070801
Postprocessing raw data
TRACE report (monitors statistics of user and kernel subsystems in detail)

►The trcrpt command reads the trace log and formats the trace entries, and writes a report to
standard output.
CPU Usage Reporting Tool
►The CPU Usage Reporting Tool (curt) takes an AIX trace file as input and produces a
number of statistics related to CPU utilization and process/thread activity
SPLAT report (Simple Performance Lock Analysis Tool)
►splat is a software tool which post-processes AIX trace files to produce kernel simple and
complex lock usage reports.
tprof report
►The tprof command reports processor usage for individual programs and the system as a
whole.
I/O performance report (monitor.int and monitor.sum)
System call report
Case study – Ethernet Transmit Lock
curt.out
Hypervisor Calls Summary
------------------------
   Count  Total Time   % sys  Avg Time  Min Time  Max Time   Tot ETime  Avg ETime  Min ETime  Max ETime  HCALL (Cal
              (msec)    time    (msec)    (msec)    (msec)      (msec)     (msec)     (msec)     (msec)
========  ==========  ======  ========  ========  ========  ==========  =========  =========  =========  ==========
     625    169.1873   6.86%    0.2707    0.0014    2.8567    311.8368     0.4989     0.0077     3.7711  H_CONFER((unknown) 210178)
     117     38.2630   1.55%    0.3270    0.0005    2.7299     57.9804     0.4956     0.0005     2.7299  H_CONFER((unknown) 2f2230)
hconfer is a hypervisor call that is used in a shared partition to confer certain
processor cycles of one virtual processor to another specific virtual processor in the
same partition; it recognizes that the second virtual processor is in need of the
excess cycles that the first virtual processor has. For example, assume one virtual
processor is holding a lock and does not have enough cycles to release that lock. If
another virtual processor needs that lock and has excess cycles, it confers those to
the first processor through this call.

Case study – Ethernet Transmit Lock (Continue)
SPLAT.out
10 max entries, Summary sorted by Percent CPU spin hold time:

                                    T   Acqui-            Wait
                                    y  sitions              or                      Locks or         Percent Holdtime
                                    p       or          Trans-                        Passes      Real        Real      Comb
Lock Name, Class, or Address        e   Passes   Spins    form    %Miss    %Total     / CSec       CPU      Elapse      Spin
**********************************  *  *******  ******  ******  *******  ********  *********  ********  **********  ********
F10001005D00A680                    D     1934    1933       0  49.9871    0.1373    364.302    0.2149      4.7678    0.0425
[AIX SIMPLE Lock] ADDRESS: F10001005D00A680 KEX: unknown

======================================================================================
| Trans- | | Percent Held ( 0.241491s )
Type: | Miss Spin form Busy | Secs Held | Real Real Comb Real
Disabled | Rate Count Count Count |CPU Elapsed | CPU Elapsed Spin Wait
| 49.987 1933 0 0 |0.011407 0.011514 | 0.21 4.77 0.04 0.00
--------------------------------------------------------------------------------------
Total Acquisitions: 1934 |SpinQ Min Max Avg |Krlocks SpinQ Min Max Avg
Acq. holding krlock: 0 |Depth 0 2 0 |Depth 0 0 0
--------------------------------------------------------------------------------------
Case study – Ethernet Transmit Lock (Continue)
Lock Activity (mSecs) - Interrupts Disabled
SIMPLE Count Minimum Maximum Average Total
+++++++ ++++++ ++++++++++++++ ++++++++++++++ ++++++++++++++ ++++++++++++++
LOCK 1934 0.003430 0.182701 0.005955 11.406502
SPIN 1933 0.000568 0.164352 0.001174 2.256266
                        Acqui-    Miss    Spin  Transf.    Busy    Percent Held of Total Time
Function Name          sitions    Rate   Count    Count   Count     CPU  Elapse    Spin  Transf.  Return Address    Start Address     Offset
^^^^^^^^^^^^^^^^^^^^  ^^^^^^^^  ^^^^^^  ^^^^^^  ^^^^^^^  ^^^^^^  ^^^^^^  ^^^^^^  ^^^^^^  ^^^^^^^  ^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^  ^^^^^^^^
.goent_output             1933   49.99    1932        0       0    0.21    4.69    0.04     0.00  00000000040B6E10  00000000040AA200  0000CC10

            Acqui-    Miss    Spin  Transf.    Busy    Percent Held of Total Time
ThreadID   sitions    Rate   Count    Count   Count     CPU  Elapse    Spin  Transf.  ProcessID  Process Name
~~~~~~~~  ~~~~~~~~  ~~~~~~  ~~~~~~  ~~~~~~~  ~~~~~~  ~~~~~~  ~~~~~~  ~~~~~~  ~~~~~~~  ~~~~~~~~~  ~~~~~~~~~~~~~
 1753227        18   50.00      18        0       0    0.48    0.05    0.77     0.00    2093076  oraclehrvst1
 3121321        32   50.00      32        0       0    0.63    0.08    0.39     0.00    2105404  oraclehrvst1
 4186123         1   50.00       1        0       0    0.98    0.00    0.23     0.00    2666654  tee
 2093311         1   50.00       1        0       0   22.47    0.08    0.16     0.00     528490  hats_nim

Case study – Ethernet Transmit Lock (continue)
tprof.sum
Total Ticks For All Processes (KERNEL) = 702
Subroutine Ticks % Source Address Bytes

========== ===== ====== ====== ======= =====
h_cede_end_point 361 45.87 hcalls.s 2d4ea8 8
.unlock_enable_mem 264 33.55 64/low.s 930c 1f4
.waitproc 26 3.30 ../../../../../src/bos/kernel/proc/dispatch.c 42e28 57c
.trchook64 9 1.14 trchka64.s 1d418 220
pcs_glue 6 0.76 vmvcs.s 2e3c74 c4
h_confer_end_point 2 0.25 hcalls.s 2d4ed0 8
.enable 2 0.25 misc.s ec398 10
Case study – Mutex Lock

Profile: /usr/lib/libc.a[shr.o]
Total Ticks For All Processes (/usr/lib/libc.a[shr.o]) = 7143

========== ===== ====== ====== ======= =====
.free_y 1599 1.67 ../../../../../../../src/bos/usr/ccs/lib/libc/malloc_y.c 7a2fc 6dc
.leftmost 1127 1.17 ../../../../../../../src/bos/usr/ccs/lib/libc/malloc_y.c 79d44 124
.splay 1019 1.06 ../../../../../../../src/bos/usr/ccs/lib/libc/malloc_y.c 79868 3b4
.fetch_and_addlp 658 0.69 atomic_op.s 86d78 1
.malloc_y 598 0.62 ../../../../../../../src/bos/usr/ccs/lib/libc/malloc_y.c 7b688 604
._doprnt 255 0.27 ../../../../../../../src/bos/usr/ccs/lib/libc/doprnt.c f0cc 6748
Profile: /usr/lib/libpthreads.a[shr.o]
Total Ticks For All Processes (/usr/lib/libpthreads.a[shr.o]) = 5448

========== ===== ====== ====== ======= =====
.global_unlock_ppc_mp 2374 2.47 pth_locks_ppc_mp.s 22a3c c4
.global_lock_ppc_mp 1194 1.24 pth_locks_ppc_mp.s 2293c c4
._mutex_lock 905 0.94 ../../../../../../../../src/bos/usr/ccs/lib/libpthreads/pth_mutex.c 18bc 348
.pthread_mutex_unlock 425 0.44 ../../../../../../../../src/bos/usr/ccs/lib/libpthreads/pth_mutex.c 1fc8 17c
Case study – Mutex Lock (continue)
Modify the trcfmt file: add the following to the end of the trace.fmt file (requires the debug
libpthreads library)
020 1.0 "PTHREADS" \
"mutex_lock " $D1 $D2 $D3 $D4 $D5
"mutex " lockaddr=$D4 $D1 $D2 $D3
"mutex_unlock " $D1 $D2 $D3 $D4 $D5
The 020 and 040 trace hooks together record a pthread mutex lock, whereas the 030 and 040 hooks
record an unlock. The five parameters in the 020 and 030 hooks are the top five routines on the
stack. The first parameter in the 040 hook is the lock address, and its last three parameters are
the 6th, 7th, and 8th stack addresses.
040 scp_server 27 410070 6755063 0.582794 PTHREADS mutex lockaddr=20444B50 1009E44C 1009C8D8 10038FAC
020 scp_server 32 410070 7184789 0.582797 PTHREADS mutex_lock D0300EFC D02FF57C D01C5DB0 100173F0 100174E0
040 scp_server 32 410070 7184789 0.582797 PTHREADS mutex lockaddr=20444B50 10017454 1009E6D8 10038BF8
030 scp_server 46 410070 6594563 0.582799 PTHREADS mutex_unlock D03027A8 D02FFCDC D01C82BC 1009E7E4 10038F0C
040 scp_server 46 410070 6594563 0.582800 PTHREADS mutex lockaddr=20444B50 1002F690 1002928C D1BD7288
020 scp_server 27 410070 6755063 0.582804 PTHREADS mutex_lock D0300EFC D02FF57C D01C5DB0 D01D0F14 100173C8
040 scp_server 27 410070 6755063 0.582805 PTHREADS mutex lockaddr=20444B50 100174E0 1009E460 1009C8D8
030 scp_server 4 410070 6598669 0.582805 PTHREADS mutex_unlock D03027A8 D02FFCDC D01C82BC 1009E5E4 10038F0C
040 scp_server 4 410070 6598669 0.582806 PTHREADS mutex lockaddr=20444B50 1002F690 1002928C D1BD7288
020 scp_server 38 410070 7131397 0.582807 PTHREADS mutex_lock D0300EFC D02FF57C D01C5DB0 D01D0F14 100173C8
040 scp_server 38 410070 7131397 0.582807 PTHREADS mutex lockaddr=20444B50 100174E0 1009E44C 10038DDC
030 scp_server 2 410070 6750965 0.582811 PTHREADS mutex_unlock D03014F4 D02FF57C D01C5DB0 D01D0F14 D1CBA968
040 scp_server 2 410070 6750965 0.582812 PTHREADS mutex lockaddr=20444B50 D1CB6EAC D1CC72C4 D1CC1C94
020 scp_server 46 410070 6594563 0.582813 PTHREADS mutex_lock D03022EC D02FFCDC D01C82BC 1009D0AC 1009E808
020 scp_server 32 410070 7184789 0.582813 PTHREADS mutex_lock D0300EFC D02FF57C D01C5DB0 100173F0 100174E0
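Formatted hook records like those above can be tallied to see which lock address dominates. The following is a minimal sketch, assuming each record has been joined onto a single line and that the eighth whitespace-separated field is the event name; the function name `tally_hooks` is hypothetical and the sample is three records taken from the listing.

```python
from collections import Counter

# Three records from the formatted trace, one event per line (assumed layout).
SAMPLE = """\
040 scp_server 27 410070 6755063 0.582794 PTHREADS mutex lockaddr=20444B50 1009E44C 1009C8D8 10038FAC
020 scp_server 32 410070 7184789 0.582797 PTHREADS mutex_lock D0300EFC D02FF57C D01C5DB0 100173F0 100174E0
030 scp_server 46 410070 6594563 0.582799 PTHREADS mutex_unlock D03027A8 D02FFCDC D01C82BC 1009E7E4 10038F0C
"""


def tally_hooks(report: str):
    """Count PTHREADS events by name and 040 records by lock address."""
    events = Counter()
    lock_addresses = Counter()
    for line in report.splitlines():
        fields = line.split()
        # Skip anything that is not a PTHREADS hook record.
        if len(fields) < 8 or fields[6] != "PTHREADS":
            continue
        events[fields[7]] += 1
        for field in fields[8:]:
            if field.startswith("lockaddr="):
                lock_addresses[field.split("=", 1)[1]] += 1
    return events, lock_addresses


if __name__ == "__main__":
    events, lock_addresses = tally_hooks(SAMPLE)
    print(dict(events))
    print(lock_addresses.most_common(1))
```

If one lock address accounts for nearly all 040 records, as in the listing above, the contention is on a single mutex rather than spread across many.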
Use sym.sh to translate the addresses in the 020 stack traces into symbol names:
./sym.sh D0300EFC D02FF57C D01C5DB0 100173F0 100174E0

D0300EFC </usr/lib/libc.a[shr.o] .free_y>
D02FF57C </usr/lib/libc.a[shr.o] .free_common>
D01C5DB0 </usr/lib/libC.a[ansicore_32.o] .operator>
100173F0
100174E0
./sym.sh D03022EC D02FFCDC D01C82BC 1009D0AC 1009E808

D03022EC </usr/lib/libc.a[shr.o] .malloc_y>
D02FFCDC </usr/lib/libc.a[shr.o] .malloc_common_53_36>
D01C82BC </usr/lib/libC.a[ansicore_32.o] .operator>
1009D0AC
1009E808
Case study – shm lock

Report from splat.out
10 max entries, Summary sorted by Percent CPU spin hold time:

                                    T   Acqui-            Wait
                                    y  sitions              or                      Locks or         Percent Holdtime
                                    p       or          Trans-                        Passes      Real        Real      Comb
Lock Name, Class, or Address        e   Passes   Spins    form    %Miss    %Total     / CSec       CPU      Elapse      Spin
**********************************  *  *******  ******  ******  *******  ********  *********  ********  **********  ********
F00000002FF48C98                    C    74582     255       0   0.3407    2.2872   2284.374  184.9558  90932.0141   33.7647
lock_shm                            S    25870   25864    1246  49.9942    0.7933    792.373    2.7038     86.6720    9.2270
0000000002A65750                    D     3500    1178       3  25.1817    0.1073    107.202    0.0445      1.4113    0.1217
0000000002A65A68                    D     2534    1177       1  31.7165    0.0777     77.614    0.0480      1.5246    0.0724
0000000002A65A28                    D      887     451       1  33.7070    0.0272     27.168    0.0224      0.7103    0.0614
0000000002A659D0                    D     3773     726       0  16.1369    0.1157    115.563    0.0476      1.5147    0.0232
F100010052ECF6D8                    C    13084     997     132   7.0805    0.4012    400.750    0.1719      5.5278    0.0142
00000000010A5438                    D    51737    4690       0   8.3116    1.5866   1584.654    0.3101     11.6915    0.0094
lock_sem_undo                       S    17169      37       0   0.2150    0.5265    525.870    0.0353      1.1238    0.0075
ipintrq_qarray_lock                 D    66924    2058       0   2.9834    2.0523   2049.817    0.0933      2.9916    0.0053
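The %Miss column in the summary above appears consistent with spins divided by acquisitions plus spins. This relationship is inferred from the rows shown, not taken from the splat documentation, so treat the sketch below as an assumption that happens to reproduce the listed values.

```python
def miss_rate_percent(acquisitions: int, spins: int) -> float:
    """Percentage of lock attempts that had to spin (assumed definition)."""
    attempts = acquisitions + spins
    return 100.0 * spins / attempts if attempts else 0.0


if __name__ == "__main__":
    # (lock, acquisitions, spins) taken from the splat summary rows.
    rows = [
        ("F00000002FF48C98", 74582, 255),
        ("lock_shm", 25870, 25864),
        ("0000000002A65750", 3500, 1178),
    ]
    for name, acquisitions, spins in rows:
        print(f"{name:<20} {miss_rate_percent(acquisitions, spins):8.4f}")
```

A value near 50% for lock_shm means almost every acquisition had to spin first, which is what makes that lock stand out in this case study.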
Case study – shm lock (continue)
Report from curt.out

System Calls Summary
--------------------
   Count  Total Time   % sys  Avg Time  Min Time  Max Time   Tot ETime  Avg ETime  Min ETime  Max ETime  SVC (Address)
              (msec)    time    (msec)    (msec)    (msec)      (msec)     (msec)     (msec)     (msec)
========  ==========  ======  ========  ========  ========  ==========  =========  =========  =========  ================
   14595   2619.7518   6.99%    0.1795    0.0524    1.1668  31358.9807     2.1486     0.0524   331.9492  shmdt(14a0070)
   14590   1835.2794   4.90%    0.1258    0.0067    1.0218  38748.6944     2.6558     0.0067   332.0727  shmat(14a0088)
   28595    368.4303   0.98%    0.0129    0.0029    1.0036    916.5396     0.0321     0.0029    56.1218  kwrite(149e138)
   42642    332.9538   0.89%    0.0078    0.0019    0.0327  74828.0660     1.7548     0.0019   879.2453  kread(149e180)
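One way to read the System Calls Summary is to compare elapsed time (Tot ETime) with CPU time (Total Time) for each call. The sketch below uses the values from the report above. Interpreting a large ratio as time spent waiting (for example on lock_shm) is an assumption to confirm against the splat data, and calls such as kread may legitimately wait on I/O.

```python
def elapsed_to_cpu_ratio(total_cpu_ms: float, total_elapsed_ms: float) -> float:
    """How many times longer a system call took in elapsed time than in CPU time."""
    if total_cpu_ms <= 0:
        raise ValueError("total_cpu_ms must be positive")
    return total_elapsed_ms / total_cpu_ms


if __name__ == "__main__":
    # (svc, count, Total Time msec, Tot ETime msec) from the curt report.
    rows = [
        ("shmdt", 14595, 2619.7518, 31358.9807),
        ("shmat", 14590, 1835.2794, 38748.6944),
        ("kwrite", 28595, 368.4303, 916.5396),
        ("kread", 42642, 332.9538, 74828.0660),
    ]
    for svc, count, cpu_ms, elapsed_ms in rows:
        ratio = elapsed_to_cpu_ratio(cpu_ms, elapsed_ms)
        print(f"{svc:<8} avg etime {elapsed_ms / count:8.4f} ms  elapsed/cpu {ratio:7.1f}x")
```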
Case study – application mutex lock
Millennium threads contention summary in POWER5+ 40way

There is one point of contention limiting the scalability of the application. The
many threads of the sec_server, tdb_server, srv_drvr, scp_server, and
cpm_srvcachema processes call an application routine, IPC_ReadMessage, which
calls _IPC_ReadMessage, which calls IPC_GetMessage, which calls cmb_hiber(), a
routine in /cerner/w_standard/rev008_wh/aixrs6000/libcmb.a.
cmb_hiber calls pthread_cond_wait to wait on a condition ...
The application contention on the pthread condition in turn causes kernel
contention on the event list (resulting in long times for the thread_unlock
system call).
It would be useful to know whether all these various processes need to wait on
the same condition variable (in IPC_GetMessage).
TPROF example
Configuration information
System: AIX 5.3 Node: rvn1 Machine: 00CC12CE4C00
Tprof command was:
tprof -E -skeulz -x sleep 30
Trace command was:
/usr/bin/trace -ad -M -L 1719908352 -T 500000 -j
000,00A,001,002,003,38F,005,006,134,139,5A2,5A5,465,2FF, -o -
Total Samples = 84268
Traced Time = 30.01s (out of a total execution time of 30.01s)
Performance Monitor based reports:
Processor name: POWER6
Sampling interval: 10ms
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Process FREQ Total Kernel User Shared Other
======= ==== ===== ====== ==== ====== =====
/cerner/…/srv_drvr 976 28105 3892 0 24210 3
/cerner/…/cpm_srvscript 212 17642 3039 43 14549 11
wait 32 14991 14991 0 0 0
oraclehrvst1 320 14751 3080 11442 210 19
/usr/bin/amqzlaa0 1504 3053 1694 65 1281 13
TPROF example (Cont.)

Total Ticks For All Processes (KERNEL) = 15310

========== ===== ====== ====== ======= =====
.dispatch 558 0.66 ../../../../../src/bos/kernel/proc/dispatch.c 45694 195c
.disable_lock 464 0.55 64/low.s 9004 2fc
._check_lock 452 0.54 64/low.s 3420 40
.unlock_enable_mem 424 0.50 64/low.s 930c 1f4
.fetch_and_add 391 0.46 64/low.s 9b00 80
.simple_unlock 385 0.46 64/low.s 9900 18
Total Ticks For All Processes (SH-LIBs) = 43188
Shared Object Ticks % Address Bytes

============= ===== ====== ======= =====
/usr/lib/libc.a[shr.o] 10558 12.53 d0331800 213698
/cerner/…/libcclora.a[shobjcclora.o] 9312 11.05 d2549260 14a66f
/usr/lib/libpthreads.a[shr.o] 7791 9.25 d0656180 254b4
/cerner/…/libsrvdata.a[libsrvdata.o] 4882 5.79 d1b8e240 95150
/oracle/product/9.2.0.5/../libclntsh.a[shr.o] 2242 2.66 d273c2a0 883d72
I/O Tuning – iostat -aD

# iostat -a -D
System configuration: lcpu=2 drives=3 paths=1 vdisks=1

Adapter:
scsi0         xfer:      bps      tps    bread    bwrtn
                         0.0      0.0      0.0      0.0

Paths/Disk: (paths)
hdisk0_path0  xfer:  %tm_act      bps      tps    bread    bwrtn
                         0.0      0.0      0.0      0.0      0.0
              read:      rps  avgserv  minserv  maxserv timeouts    fails
                         0.0      0.0      0.0      0.0        0        0
              write:     wps  avgserv  minserv  maxserv timeouts    fails
                         0.0      0.0      0.0      0.0        0        0
              queue: avgtime  mintime  maxtime  avgwqsz  avgsqsz   sqfull
                         0.0      0.0      0.0      0.0      0.0        0

Vadapter: (virtual adapter)
vscsi0        xfer:      tps    bread    bwrtn  partition-id
                         0.0      0.0      0.0  ####
              read:  avgserv  minserv  maxserv
                         0.0      0.0      0.0
              write: avgserv  minserv  maxserv
                         0.0      0.0      0.0
              queue: avgtime  mintime  maxtime  avgsqsz    qfull
                         0.0      0.0      0.0      0.0        0

Disk: (PV)
hdisk10       xfer:  %tm_act      bps      tps    bread    bwrtn
                         0.0      0.0      0.0      0.0      0.0
              read:      0.0      0.0      0.0      0.0        0        0
              write:     0.0      0.0      0.0      0.0        0        0
              queue:     0.0      0.0      0.0      0.0      0.0        0

rps/wps: read/write IOPS
Use the -l option for a wide-column, one-device-per-line format
I/O Tuning – iostat -D
Service times you could only get from filemon before

hdisk1        xfer:  %tm_act      bps      tps    bread    bwrtn
                        87.7    62.5M    272.3    62.5M    823.7
              read:    271.8      9.0      0.2    168.6        0        0
              write:     0.5      4.0      1.9     10.4        0        0
              queue:     1.1      0.0     14.1      0.2      1.2     2374

All -D outputs are rates, except sqfull, which is an interval delta

Virtual adapter's extended throughput report (-D)
Metrics related to transfers (xfer:)
tps           Indicates the number of transfers per second issued to the adapter.
recv          The total number of responses received from the hosting server to this adapter.
sent          The total number of requests sent from this adapter to the hosting server.
partition id  The partition ID of the hosting server, which serves the requests sent by this adapter.
Adapter Read/Write Service Metrics (read:, write:)
avgserv       Indicates the average time. Default is in milliseconds.
minserv       Indicates the minimum time. Default is in milliseconds.
maxserv       Indicates the maximum time. Default is in milliseconds.
Adapter Wait Queue Metrics (queue:)
avgtime       Indicates the average time spent in wait queue. Default is in milliseconds.
mintime       Indicates the minimum time spent in wait queue. Default is in milliseconds.
maxtime       Indicates the maximum time spent in wait queue. Default is in milliseconds.
avgwqsz       Indicates the average wait queue size.
avgsqsz       Indicates the average service queue size – Waiting to be sent to the disk.
              Can't exceed queue_depth for the disk
sqfull        Indicates the number of times the service queue becomes full.
              If this is often > 0, then increase queue_depth
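A minimal sketch of checking the queue metrics from an `iostat -D` disk block. It assumes the six values of the queue row appear in the order avgtime, mintime, maxtime, avgwqsz, avgsqsz, sqfull, and that the slide's "increase queue_depth" hint applies to sqfull; the function names are hypothetical, and the sample values are the last row of the hdisk1 output.

```python
QUEUE_FIELDS = ("avgtime", "mintime", "maxtime", "avgwqsz", "avgsqsz", "sqfull")


def parse_queue_values(values: str) -> dict:
    """Map the numbers of an iostat -D queue row onto their (assumed) field names."""
    parts = values.split()
    if len(parts) != len(QUEUE_FIELDS):
        raise ValueError(f"expected {len(QUEUE_FIELDS)} values, got {len(parts)}")
    return dict(zip(QUEUE_FIELDS, (float(part) for part in parts)))


def suggests_larger_queue_depth(queue: dict) -> bool:
    """True when the service queue filled during the interval (sqfull > 0)."""
    return queue["sqfull"] > 0


if __name__ == "__main__":
    hdisk1_queue = parse_queue_values("1.1 0.0 14.1 0.2 1.2 2374")
    print(hdisk1_queue)
    print("consider raising queue_depth:", suggests_larger_queue_depth(hdisk1_queue))
```

A single nonzero sample is not conclusive; the slide's guidance is about sqfull being nonzero often, so compare several intervals before changing queue_depth.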
Application IO characteristics
Random IO
Typically small (4-32 KB)
Measure and size with IOPS
Usually disk actuator limited
Sequential IO
Typically large (32KB and up)
Measure and size with MB/s
Usually limited on the interconnect to the disk actuators
To determine application IO characteristics

Use filemon
# filemon -o /tmp/filemon.out -O lv,pv -T 500000; sleep 90; trcstop
Check for trace buffer wraparounds, which may invalidate the data. If they occur,
rerun filemon with a larger -T value or a shorter sleep
Use lvmstat to get LV IO statistics
Use iostat to get PV IO statistics
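The two sizing units above are related through the IO size: throughput is IOPS multiplied by the size of each IO. A minimal sketch follows, with hypothetical workload numbers chosen only to illustrate why small random IO is sized in IOPS and large sequential IO in MB/s.

```python
def throughput_mb_per_s(iops: float, io_size_kb: float) -> float:
    """Throughput in MB/s produced by a given IOPS rate at a given IO size."""
    return iops * io_size_kb / 1024.0


def iops_for_throughput(mb_per_s: float, io_size_kb: float) -> float:
    """IOPS needed to sustain a given MB/s at a given IO size."""
    if io_size_kb <= 0:
        raise ValueError("io_size_kb must be positive")
    return mb_per_s * 1024.0 / io_size_kb


if __name__ == "__main__":
    # Hypothetical workloads, for illustration only.
    print("random, 1000 IOPS at 8 KB:", throughput_mb_per_s(1000, 8), "MB/s")
    print("sequential, 100 MB/s at 256 KB:", iops_for_throughput(100, 256), "IOPS")
```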
Filemon
To find out which files, logical volumes, and disks are most active, run the following
command as root:
# filemon -u -O all -o /tmp/fmon.out; sleep 30; trcstop
In 30 seconds, a report is created in /tmp/fmon.out.
Check for most active segments, logical volumes, and physical volumes in this report.
Check for reads and writes to paging space to determine if the disk activity is true
application I/O or is due to paging activity.
Check for files and logical volumes that are particularly active. If these are on a busy
physical volume, moving some data to a less busy disk may improve performance.
The Most Active Segments report lists the most active files by file system and inode. The
mount point of the file system and inode of the file can be used with the ncheck command
to identify unknown files:
# ncheck -i <inode> <mount point>
This report is useful in determining if the activity is to a file system (segtype =
persistent), the JFS log (segtype = log), or to paging space (segtype = working).
# find /var/mqm -inum 30762
/var/mqm/qmgrs/CERN!RVN1!HRVSTA/queues/CERN!SSREQ!PM!REG/q
By examining the reads and read sequences counts, you can determine if the access is
sequential or random. As the read sequences count approaches the reads count, file
access is more random. The same applies to the writes and write sequences.
Using filemon To Determine Bottleneck

# filemon -u -O lf,pv -o fmon.out
# dd if=/test/100m bs=1024k of=/dev/null
# trcstop
# more fmon.out
Thu Apr 17 09:11:53 2003

System: AIX nkeung Node: 5 Machine: 000D80CD4C00
Cpu utilization: 6.9%
Most Active Files

------------------------------------------------------------------------
  #MBs  #opns   #rds   #wrs  file                  volume:inode
------------------------------------------------------------------------
 101.0      1    101      0  100m                  /dev/jfslv:23
 101.0      1      0    100  null
   3.0      0    385      0  pid=270570_fd=20960
   0.2      1     62      0  unix                  /dev/hd3:10
   0.0      1      2      0  ksh.cat               /dev/hd2:41634
   0.0      1      2      0  cmdtrace.cat          /dev/hd2:41498
filemon summary reports

Summary reports at PV and LV layers
Most Active Logical Volumes
------------------------------------------------------------------------
util #rblk #wblk KB/s volume description
------------------------------------------------------------------------
1.00 10551264 5600 17600.8 /dev/rms09_lv /RMS/bormspr0/oradata07
1.00 128544 3315168 5741.5 /dev/rms04_lv /RMS/bormspr0/oracletemp
0.98 6237648 128 10399.9 /dev/oraloblv01 /RMS/bormspr0/oralob01
0.96 0 3120 5.2 /dev/hd8 jfslog
0.55 38056 104448 237.6 /dev/rms041_lv /RMS/bormspr0/oraredo
Most Active Physical Volumes

------------------------------------------------------------------------
util #rblk #wblk KB/s volume description
------------------------------------------------------------------------
1.00 3313059 4520 5531.2 /dev/hdisk66 SAN Volume Controller Device
1.00 11669 6478 30.3 /dev/hdisk0 N/A
filemon detailed reports

Detailed reports at PV and LV layers (only for one LV shown)
Similar reports for each PV
Average IO sizes: Blks are 512 bytes in AIX, so 439 x 512 = ~219KB average size
VOLUME: /dev/rms09_lv description: /RMS/bormspr0/oradata07
reads: 23999 (0 errs)
read sizes (blks): avg 439.7 min 16 max 2048 sdev 814.8
read times (msec): avg 85.609 min 0.139 max 1113.574 sdev 140.417
read sequences: 19478
read seq. lengths: avg 541.7 min 16 max 12288 sdev 1111.6
writes: 350 (0 errs)
write sizes (blks): avg 16.0 min 16 max 16 sdev 0.0
write times (msec): avg 42.959 min 0.340 max 289.907 sdev 60.348
write sequences: 348
write seq. lengths: avg 16.1 min 16 max 32 sdev 1.2
seeks: 19826 (81.4%)
seek dist (blks): init 18262432, avg 24974715.3 min 16 max 157270944 sdev 44289553.4
time to next req(msec): avg 12.316 min 0.000 max 537.792 sdev 31.794
throughput: 17600.8 KB/sec
utilization: 1.00
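The figures in the detailed report can be turned into the two numbers that matter for sizing: the average IO size in KB (blocks are 512 bytes) and how close the sequence counts are to the operation counts, which the filemon guidance treats as a sign of random access. A minimal sketch using the /dev/rms09_lv values; the function names are illustrative assumptions.

```python
BLOCK_BYTES = 512  # filemon reports sizes in 512-byte blocks on AIX


def avg_io_size_kb(avg_blocks: float) -> float:
    """Convert an average size in 512-byte blocks to KB."""
    return avg_blocks * BLOCK_BYTES / 1024.0


def sequence_ratio(sequences: int, operations: int) -> float:
    """Fraction of operations that started a new sequence (closer to 1.0 = more random)."""
    if operations <= 0:
        raise ValueError("operations must be positive")
    return sequences / operations


if __name__ == "__main__":
    # Values from the /dev/rms09_lv detailed report.
    reads, read_sequences = 23999, 19478
    writes, write_sequences = 350, 348
    print(f"average read size: {avg_io_size_kb(439.7):.2f} KB")
    print(f"read sequence ratio: {sequence_ratio(read_sequences, reads):.3f}")
    combined = sequence_ratio(read_sequences + write_sequences, reads + writes)
    print(f"combined ratio: {combined:.1%}")
```

The combined ratio works out to the same 81.4% that the report prints next to the seeks count, which suggests mostly random access to this logical volume despite the large average read size.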
Using filemon
Look at PV summary report
Look for balanced IO across the disks
Lack of balance may be a data layout problem
Depends upon PV to physical disk mapping
LVM mirroring scheduling policy also affects balance for reads
IO service times in the detailed report are more definitive on data layout
issues
Dissimilar IO service times across PVs indicate IOs are not
balanced across physical disks
Look at most active LVs report

Look for busy file system logs
Look for file system logs serving more than one file system
topas - new LPAR screen

► Split screen accessible from -L or the "L" command
► upper section shows a subset of lparstat metrics
► lower section shows sorted list of logical processor with mpstat columns
Interval: 2 Logical Partition: aix Sat Mar 13 09:44:48 2004

Poolsize: 3.0 Shared SMT ON Online Memory: 8192.0
Entitlement: 2.5 Mode: Capped Online Logical CPUs: 4
Online Virtual CPUs: 2
%user %sys %wait %idle physc %entc %lbusy app vcsw phint %hypv hcalls
47.5 32.5 7.0 13.0 2.0 80.0 100.0 1.0 240 150 5.0 1500
==============================================================================
logcpu minpf majpf intr csw icsw runq lpa scalls usr sys wt idl pc lcsw
cpu0 1135 145 134 78 60 2 95 12345 10 65 15 10 0.6 120
cpu1 998 120 104 92 45 1 89 4561 8 67 25 0 0.4 120
cpu2 2246 219 167 128 72 3 92 76300 20 50 20 10 0.5 120
cpu3 2167 198 127 62 43 2 94 1238 18 45 15 22 0.5 120
► Notable metrics
   ♦ %hypv and hcalls: percentage of time in hypervisor and number of calls made
   ♦ pc: fraction of physical processor consumed by a logical processor
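The sample screen above is internally consistent in two ways that are worth checking on a live system: the per-logical-CPU pc values add up to physc, and %entc is physc expressed as a percentage of the entitlement. A minimal sketch with the values from the screen; the function name is an illustrative assumption.

```python
def percent_entitlement_consumed(physc: float, entitlement: float) -> float:
    """%entc: physical processors consumed as a percentage of entitled capacity."""
    if entitlement <= 0:
        raise ValueError("entitlement must be positive")
    return 100.0 * physc / entitlement


if __name__ == "__main__":
    # Values from the sample topas LPAR screen.
    pc_by_logical_cpu = {"cpu0": 0.6, "cpu1": 0.4, "cpu2": 0.5, "cpu3": 0.5}
    physc = sum(pc_by_logical_cpu.values())
    print(f"physc from pc column: {physc:.1f}")
    print(f"%entc at entitlement 2.5: {percent_entitlement_consumed(2.0, 2.5):.1f}")
```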
New lparstat command

► Shows partition level
► Three modes
   ♦ information (-i)
      ► shows static configuration information
   ♦ detailed hypervisor (-H)
      ► shows breakdown of hypervisor time by hcall type
   ♦ monitoring mode (default)
► Monitoring mode metrics
   ♦ CPU utilization (%user, %sys, %idle, %wait)
   ♦ percentage spent in hypervisor (%hypv) and number of hcalls (hcalls) [both optional]
   ♦ additional shared mode only metrics
      ► Physical Processor Consumed (physc)
      ► Percentage of Entitlement Consumed (%entc)
      ► Logical CPU Utilization (%lbusy)
      ► Available Pool Processors (app)
      ► number of virtual context switches (vcsw): virtual processor hardware preemptions
      ► number of phantom interrupts (phint): interrupts received for other partitions
New lparstat info example

# lparstat -i
Node Name : sq07
Partition Name : sq07_aix53lpar
Partition Number : 2
Type : Shared-SMT
Mode : Uncapped
Entitled Capacity : 100
Partition Group-ID : 32770
Shared Pool ID : 0
Online Virtual CPUs : 1
Maximum Virtual CPUs : 40
Minimum Virtual CPUs : 1
Online Memory : 1536 MB
Maximum Memory : 2048 MB
Minimum Memory : 1024 MB
Variable Capacity Weight : 128
Minimum Capacity : 10
Maximum Capacity : 400
Capacity Increment : 1
Maximum Dispatch Latency : 0
Maximum Physical CPUs in system : 4
Active Physical CPUs in system : 4
Active CPUs in Pool : -
Unallocated Capacity : 0
Physical CPU Percentage : 100.00%
Unallocated Weight : 127
Minimum Virtual Processor Required Capacity: 10
New mpstat command

► Shows detailed logical processor information
► default mode shows
   ♦ utilization metrics (%user, %sys, %idle, %wait)
   ♦ major and minor page faults (with and without disk I/O)
   ♦ number of syscalls and interrupts
   ♦ dispatcher metrics
      ► number of migrations
      ► voluntary and involuntary context switches
      ► logical processor affinity (percentage of redispatches inside MCM)
      ► run queue size
   ♦ fraction of processor consumed [SMT or shared mode only]
   ♦ percentage of entitlement consumed [shared mode only]
   ♦ number of logical context switches [shared mode only]
      ► hardware preemptions
► -d shows detailed software and hardware dispatchers metrics
► -i shows detailed interrupt metrics
► -s option shows SMT utilization
New mpstat command example SMT mode

mpstat -s (example: shows SMT utilization)
# mpstat -s
    Proc0           Proc2           Proc4           Proc6
     80%             78%             75%             82%       [shared mode only]
 cpu0    cpu1    cpu2    cpu3    cpu4    cpu5    cpu6    cpu7
  40%     40%     68%     10%     35%     40%     41%     41%
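In this sample, each processor's figure equals the sum of its two SMT threads. The sketch below reproduces that arithmetic with the values copied from the example; the data layout and variable names are illustrative only.

```python
# Thread utilization (%) per processor, copied from the sample above.
smt_threads = {
    "Proc0": {"cpu0": 40, "cpu1": 40},
    "Proc2": {"cpu2": 68, "cpu3": 10},
    "Proc4": {"cpu4": 35, "cpu5": 40},
    "Proc6": {"cpu6": 41, "cpu7": 41},
}

# A processor's utilization is the sum of its threads' utilization.
proc_totals = {proc: sum(threads.values()) for proc, threads in smt_threads.items()}

for proc, total in proc_totals.items():
    print(f"{proc}: {total}%")
```

Proc2 is worth a second look: one thread does most of the work (68%) while its sibling is nearly idle (10%).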
# smtctl
This system is SMT capable
SMT is currently enabled
SMT boot mode is not set
Processor 0 has 2 SMT threads
SMT thread 0 is bound with processor 0
SMT thread 1 is bound with processor 1
…
# lsattr -El proc0
frequency   1904000000     Processor Speed       False
smt_enabled true           Processor SMT enabled False
smt_threads 2              Processor SMT threads False
state       enable         Processor state       False
type        PowerPC_POWER5 Processor type        False
topas - main screen update

New CPU section metrics on physical processing resources consumed
► Physc: amount consumed in fractional number of processors
► %Entc: amount consumed in percentage of entitlement
Topas Monitor for host:    specweb8             EVENTS/QUEUES    FILE/TTY
Sat Mar 13 09:47:18 2004   Interval:  2         Cswitch      50  Readch        0
                                                Syscall      47  Writech      34
Kernel    0.0   |                            |  Reads         0  Rawin         0
User      0.0   |                            |  Writes        0  Ttyout       34
Wait      0.0   |                            |  Forks         0  Igets         0
Idle    100.0   |############################|  Execs         0  Namei         1
Physc =  0.01            %Entc=   1.2           Runqueue    0.0  Dirblk        0
                                                Waitqueue   0.0
Network  KBPS   I-Pack  O-Pack   KB-In  KB-Out
en0       0.1      1.0     1.0     0.0     0.1  PAGING           MEMORY
lo0       0.0      0.0     0.0     0.0     0.0  Faults        0  Real,MB    8191
                                                Steals        0  % Comp      5.4
Disk    Busy%     KBPS     TPS KB-Read KB-Writ  PgspIn        0  % Noncomp   1.6
hdisk0    0.0      0.0     0.0     0.0     0.0  PgspOut       0  % Client    1.6
hdisk2    0.0      0.0     0.0     0.0     0.0  PageIn        0
hdisk3    0.0      0.0     0.0     0.0     0.0  PageOut       0  PAGING SPACE
hdisk1    0.0      0.0     0.0     0.0     0.0  Sios          0  Size,MB     512
                                                                 % Used      0.6
Name            PID  CPU%  PgSp Owner           NFS (calls/sec)  % Free     99.3
IBM.CSMAg     13180   0.0   1.6 root            ServerV2      0
syncd          9366   0.0   0.5 root            ClientV2      0  Press:
prngd         22452   0.0   0.3 root            ServerV3      0  "h" for help
psgc           2322   0.0   0.0 root            ClientV3      0  "q" to quit
pilegc         2580   0.0   0.0 root
New metrics are added automatically when running in shared mode
CPU utilization metrics are calculated using new PURR-based data and formula when running in SMT or shared mode
topas - CEC monitoring screen

Split screen accessible from -C or the "C" command
► upper section shows CEC-level metrics
► lower section shows sorted lists of shared and dedicated partitions
CEC configuration info retrieved from HMC or specified on command line
Automatic Performance Metric recording

Local metrics recordings
► Uses xmwlm daemon
●automatically started from inittab
●15-second sampling frequency
●5-minute recording frequency
► Keeps 7 days' worth of data
► Recordings include most of topas local data
●except process and WLM data
► Disk space required
●system with a low number of disks: 2 MB/day
●10 MB/day for every 100 disks
●Stored in /etc/perf/daily
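The sizing figures above can be turned into a rough estimate of the space needed under /etc/perf/daily. This is a back-of-the-envelope sketch: the linear scaling with disk count, the 2 MB/day floor for small systems, and the function name are assumptions made for the example, not documented behavior of xmwlm.

```python
RETENTION_DAYS = 7               # from the slide: 7 days of data are kept
MB_PER_DAY_FLOOR = 2.0           # from the slide: system with few disks
MB_PER_DAY_PER_100_DISKS = 10.0  # from the slide: larger configurations


def estimate_recording_space_mb(num_disks, days=RETENTION_DAYS):
    """Estimate the space used by local recordings, in MB.

    Assumption: daily volume grows linearly with the number of disks
    and never drops below the small-system figure.
    """
    per_day = max(MB_PER_DAY_FLOOR, MB_PER_DAY_PER_100_DISKS * num_disks / 100)
    return per_day * days


small_system = estimate_recording_space_mb(4)
large_system = estimate_recording_space_mb(250)
print(small_system, large_system)
```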
Automatic Performance Metric recording

CEC-wide metrics recording
► Implemented as topas -R option
●records all topas -C metrics
●works independently of, and in parallel with, topas
► 10-second sampling frequency, 60-second recording frequency
► Installed by default in AIX 5.3 TL6
●before then, must be "installed" in inittab manually in one of the partitions in the CEC
# /usr/lpp/perfagent/config_topas.sh add
Topas recordings – topasout

Exports data and generates reports
Input source for WLE (Workload Estimator)
► free alternative to System p PM services (aka PTX provider)
► Provides peak weekly utilization information in XML file
●peak 2 hours of week (cpu, memory, filesystem, disk I/O totals)
Formats
► Spreadsheet and csv formats
► nmon_analyser format
● http://www-941.ibm.com/collaboration/wiki/display/WikiPtype/nmonanalyser
Example: topasout - detailed local report in text
New Configure Topas Option setup in SMIT

smitty → Performance & Resource Scheduling →
Move cursor to desired item and press Enter.
Resource Status & Monitors
Analysis Tools
Resource Controls
Schedule Jobs
Workload Manager
Enterprise Workload Management
Resource Set Management
Tuning Kernel & Network Parameters
Simultaneous Multi-Threading Processor Mode
Configure Topas Options

Fastpath: smitty topas
topas recordings - 5.3 TL6 update

SMIT panels
► set up access to partitions not on local subnet
► turn on/off CEC and local recordings
► display recording status
► generate reports
●to file
●to printer
●to stdout
► eliminate the need to know
●file locations and names
●topasout syntax
Using "topas -R" to Record Cross Partition Performance

Must be running AIX 5.3 TL5 Service Pack 4 with APAR IY87993 (or newer)
This option records the performance of all partitions on a physical server
The command is run in just one partition of that server
Start Recording
► # /usr/lpp/perfagent/config_topas.sh add
► The performance data is logged to: /etc/perf/topas_cec.YYMMDD
Stop Recording
► # /usr/lpp/perfagent/config_topas.sh delete
► Rename the /etc/perf/topas_cec.YYMMDD logfile so that a restart does not corrupt it
Summary Report
► # topasout -R summary /etc/perf/topas_cec.070805
#Report: CEC Summary --- hostname: localhost version:1.1
Start:08/05/07 17:24:43 Stop:08/05/07 17:37:43 Int: 5 Min Range: 13 Min
Partition Mon: 2 UnM: 0 Shr: 2 Ded: 0 Cap: 0 UnC: 2
-CEC----------- -Processors-------------------- -Memory (GB)-------
Time  ShrB DedB Mon UnM Avl UnA Shr Ded PSz APP Mon UnM Avl UnA InU
17:29 0.01 0.00 0.3 0.0 0.0   0 0.3   0 2.0 2.0 3.0 0.0 0.0 0.0 1.4
17:34 0.01 0.00 0.3 0.0 0.0   0 0.3   0 2.0 2.0 3.0 0.0 0.0 0.0 0.9
17:37 0.01 0.00 0.3 0.0 0.0   0 0.3   0 2.0 2.0 3.0 0.0 0.0 0.0 0.9
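The recording file name follows the topas_cec.YYMMDD pattern shown above, so the log for a given day can be located programmatically. The sketch below builds that path; the helper name is illustrative, and the directory default is taken from the slide.

```python
from datetime import date


def cec_recording_path(day, directory="/etc/perf"):
    """Build the CEC recording path for a given date (topas_cec.YYMMDD)."""
    return f"{directory}/topas_cec.{day:%y%m%d}"


# The recording used in the summary report example started on 08/05/07.
path = cec_recording_path(date(2007, 8, 5))
print(path)
```

The result matches the file passed to topasout in the example above.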
Using "topas -R" to Record Cross Partition Performance (Cont.)

Detailed Report
► # topasout -R detailed /etc/perf/topas_cec.070805
Time: 17:29:42 ------------------------------------------------------------
Partition Info    Memory (GB)         Processors
Monitored  : 2    Monitored  : 3.0    Monitored  : 0.3    Shr Physcl Busy: 0.01
UnMonitored: 0    UnMonitored: 0.0    UnMonitored: 0.0    Ded Physcl Busy: 0.00
Shared     : 2    Available  : 0.0    Available  : 0.0
Dedicated  : 0    UnAllocated: 0.0    Unallocated: 0.0    Hypervisor
Capped     : 0    Consumed   : 1.4    Shared     : 0.3    Virt Cntxt Swtch: 808
UnCapped   : 2                        Dedicated  : 0.0    Phantom Intrpt  : 0
                                      Pool Size  : 2.0
                                      Avail Pool : 2.0
Host      OS  M Mem InU Lp Us Sy Wa Id PhysB Ent %EntC Vcsw PhI
-------------------------------------shared-------------------------------------
gilmore   A53 U 2.0 0.9  4  1  1  0 97  0.01 0.2  4.46  482   0
localhost A53 U 1.0 0.4  2  0  2  0 97  0.00 0.1  4.48  325   0
------------------------------------dedicated----------------------------------
For more information, see the AIX /usr/lpp/perfagent/README.perfagent.tools
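The per-partition rows of the detailed report can be cross-checked against its summary block. The sketch below parses the two shared-partition rows from the example and totals the Mem and Ent columns; in this sample the totals agree with the "Monitored" memory (3.0 GB) and processor (0.3) figures. The parsing approach and names are illustrative and assume whitespace-separated fields.

```python
# Column names and shared-partition rows copied from the report above.
HEADER = "Host OS M Mem InU Lp Us Sy Wa Id PhysB Ent %EntC Vcsw PhI".split()
ROWS = [
    "gilmore A53 U 2.0 0.9 4 1 1 0 97 0.01 0.2 4.46 482 0",
    "localhost A53 U 1.0 0.4 2 0 2 0 97 0.00 0.1 4.48 325 0",
]


def parse_row(line):
    """Pair each whitespace-separated field with its column name."""
    return dict(zip(HEADER, line.split()))


partitions = [parse_row(row) for row in ROWS]

# Totals across partitions, rounded to the report's one-decimal precision.
total_mem_gb = round(sum(float(p["Mem"]) for p in partitions), 1)
total_ent = round(sum(float(p["Ent"]) for p in partitions), 1)
print(total_mem_gb, total_ent)
```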
SVMON command
An analysis tool for virtual memory
Purpose:
Captures a snapshot of the current state of memory.
The display of information can be analyzed using four different reports:
global [-G]
process [-P]
segment [-S]
detailed segment [-D]
Example
# svmon -G
                size      inuse       free        pin    virtual
memory      32636928   12958574   19678354    1453011    5842282
pg space     2097152      22267
                work       pers       clnt
pin          1452772        239          0
in use       5842282     834564    6281728
PageSize    PoolSize      inuse       pgsp        pin    virtual
s   4 KB           -   12679102      22267    1253811    5562810
m  64 KB           -      17467          0      12450      17467
Correlating vmstat and svmon output
# vmstat 1 4
System configuration: lcpu=32 mem=127488MB
kthr  memory           page                    faults        cpu
----- ---------------- ----------------------- ------------- -----------
 r  b     avm      fre  re  pi  po  fr  sr  cy  in   sy   cs us sy id wa
 0  0 5842256 19678557   0   0   0   0   0   0   6 1223 1251  0  0 99  0
 0  0 5842255 19678558   0   0   0   0   0   0   5 1026 1200  0  0 99  0
 0  0 5842253 19678560   0   0   0   0   0   0   4  853 1130  0  0 99  0
 2  0 5842253 19678560   0   0   0   0   0   0   5 1046 1218  0  0 99  0
# svmon -G
                size      inuse       free        pin    virtual
memory      32636928   12958401   19678527    1453001    5842265
pg space     2097152      22267
                work       pers       clnt
pin          1452762        239          0
in use       5842265     834408    6281728
PageSize    PoolSize      inuse       pgsp        pin    virtual
s   4 KB           -   12678929      22267    1253801    5562793
m  64 KB           -      17467          0      12450      17467
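The two outputs above line up once everything is expressed in 4 KB pages. The sketch below checks the relationships using the numbers from this example: memory size against the vmstat mem= value, inuse plus free against size, the per-page-size rows against the global totals (one 64 KB page counts as sixteen 4 KB pages), and vmstat's avm and fre against svmon's virtual and free. The variable names are illustrative, and the small differences between the two tools are assumed to come from the commands being run a moment apart.

```python
PAGES_PER_MB = 256        # 4 KB pages per megabyte
SMALL_PER_MEDIUM = 16     # 4 KB pages per 64 KB page

# svmon -G values copied from the example above (units: 4 KB pages,
# except the "medium" row, which counts 64 KB pages).
memory = {"size": 32636928, "inuse": 12958401, "free": 19678527,
          "pin": 1453001, "virtual": 5842265}
small = {"inuse": 12678929, "pin": 1253801, "virtual": 5562793}
medium = {"inuse": 17467, "pin": 12450, "virtual": 17467}

# Last vmstat sample from the example above.
vmstat = {"avm": 5842253, "fre": 19678560, "mem_mb": 127488}

# svmon size in MB should equal the mem= value reported by vmstat.
size_mb = memory["size"] // PAGES_PER_MB

# Global totals should equal the 4 KB row plus 16 times the 64 KB row.
combined = {k: small[k] + SMALL_PER_MEDIUM * medium[k] for k in small}

# avm tracks "virtual" and fre tracks "free"; expect a small drift only.
avm_drift = abs(memory["virtual"] - vmstat["avm"])
fre_drift = abs(memory["free"] - vmstat["fre"])

print(size_mb, combined, avm_drift, fre_drift)
```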
Correlating ps and svmon output