
Be a Hero with your DBA: Database Performance Tuning for System Admins and IT Architects
Randal Sagrillo, Session #544

Program Agenda
- Scope and Method
- Tools
- Examples
- Next Steps

NOTE: I assume the SQL, schema, and instance are already as well tuned as they can get!



The Life of a Systems Architect Isn't Easy


And Not Much Better for DBAs and SysAdmins
- More Users
- More Data
- More Transactions
- More Complexity
- More Hardware
- More Software
- More Data Centers
...all adding up to Lower Performance


It is All About I/O: Logical I/O


Faster CPUs Usually Mean Faster Memory and More Memory
(Diagram: database size vs. working set size vs. DB memory size - when the query/DML working set fits in database memory, the I/O stays logical; time splits roughly 80% CPU, 20% I/O.)


It is All About I/O: Physical I/O


Faster CPUs do not help Physical-I/O bound Databases
(Diagram: when the query/DML working set exceeds database memory, reads spill to storage as physical I/O; time splits roughly 20% CPU, 80% I/O.)


Enterprise Application Issues

- Batch job duration too long
- Reporting/ad hoc query time too long
- OLTP transaction times too long (business value)
- Or OLTP rate not high enough (operational value)


Typical Storage Bottlenecks


I/O Supply vs. Demand (from initiator to target)
- Maximum IOPS delivered: talked about the most, but least important for enterprise apps; it really measures concurrency
- Maximum data rate (MB/sec) delivered: really measures channel and disk bandwidth
- Shortest service time (milliseconds) delivered: usually the most important for databases; see the query sketch below
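From the database side, a minimal sketch of watching per-file demand and service time through the standard v$filestat and v$datafile views (READTIM is in centiseconds; OS tools such as iostat give the corresponding view from below the database):

   -- Per-datafile read counts and average read time, a database-side view of
   -- IOPS demand and storage service time. Values accumulate since startup.
   SELECT f.name,
          fs.phyrds,
          fs.phywrts,
          ROUND(fs.readtim * 10 / NULLIF(fs.phyrds, 0), 1) AS avg_read_ms
   FROM   v$filestat fs, v$datafile f
   WHERE  fs.file# = f.file#
   ORDER  BY fs.phyrds DESC;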

Performance Methodology
A continuous cycle:
- Symptoms: performance below expectation, variance, degradation over time, etc.
- Identify SLAs
- Minimize scope
- Systemic analysis
- Diagnosis
- Optimize: tuning tips, document and apply best practices


Tools to Identify Database Performance Issues


The database performance view gives more insight than just the OS view
- OS tools: mpstat, iostat, strace, truss, DTrace, SWAT
  - Very powerful, expert tools, but it is hard to estimate their impact on, and relevance to, database performance
- Free DB utilities: SQL tracing/tkprof (see the sketch below); Statspack (PL/SQL code, available since Oracle 8i as a download)
- Licensed tools:
  - Oracle Tuning Pack: SQL tuning
  - Oracle Diagnostic Pack: Automatic Database Diagnostic Monitor (ADDM), Active Session History (ASH), Automatic Workload Repository (AWR)
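For the free route, a minimal sketch of session-level SQL tracing with the standard DBMS_MONITOR package; the SID and SERIAL# values are placeholders for whichever session you are investigating:

   -- Turn on extended SQL trace (including wait events) for one session.
   EXEC DBMS_MONITOR.SESSION_TRACE_ENABLE(session_id => 123, serial_num => 4567, waits => TRUE, binds => FALSE);
   -- ... let the workload run for a representative interval ...
   EXEC DBMS_MONITOR.SESSION_TRACE_DISABLE(session_id => 123, serial_num => 4567);
   -- Then summarize the resulting trace file from the OS shell with tkprof, e.g.:
   --   tkprof <tracefile>.trc trace_report.txt sort=exeela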


AWR & Statspack: First Things


Top of the report: what is the environment? How long is the snapshot interval?
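To produce the report in the first place (assuming the Diagnostic Pack is licensed for AWR; Statspack needs no license), the standard scripts shipped with the database can be run from SQL*Plus; a minimal sketch:

   -- AWR: prompts for report type, number of days, and begin/end snapshot ids.
   @?/rdbms/admin/awrrpt.sql
   -- Statspack equivalent:
   -- @?/rdbms/admin/spreport.sql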


AWR & Statspack: Most Important Things


How much would faster CPU execution help here?
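The report's top timed events answer this, but the same question can be put to the instance directly; a minimal sketch against the standard time-model view:

   -- If 'DB CPU' is only a small fraction of 'DB time', the database is mostly
   -- waiting (often on I/O) and faster CPUs alone will not help much.
   SELECT stat_name, ROUND(value / 1e6) AS seconds
   FROM   v$sys_time_model
   WHERE  stat_name IN ('DB time', 'DB CPU');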


Database I/O Bottlenecks: Wait Events


Note: These Are OFF-CPU Events!
Typical I/O wait types, foreground:
- db file sequential read: single-block read from disk into the database buffer cache
- log file sync: waiting for the background write of log data to complete (commit)
- db file scattered read: multi-block read into the buffer cache
- read by other session: another session is waiting on the block above
- direct path read: read that bypasses the buffer cache directly into the PGA
Typical I/O wait types, background:
- log file parallel write: write of log data (typically to NVRAM) from LGWR
- db file parallel write: asynchronous datafile writes from DBWR(s)
- log file sequential read: reads to build an archive log, Data Guard log archive I/O, RMAN, etc.
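To see which of these dominate on a given instance without pulling a full report, the cumulative system wait statistics can be queried; a minimal sketch (values accumulate since instance startup, so two samples should be differenced for a per-interval view):

   -- Top I/O- and commit-related wait events by total time waited.
   SELECT *
   FROM  (SELECT event,
                 total_waits,
                 ROUND(time_waited_micro / 1e6) AS seconds_waited,
                 ROUND(time_waited_micro / 1000 / NULLIF(total_waits, 0), 1) AS avg_wait_ms
          FROM   v$system_event
          WHERE  wait_class IN ('User I/O', 'System I/O', 'Commit')
          ORDER  BY time_waited_micro DESC)
   WHERE ROWNUM <= 10;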


Example #1: Online Payment Processing


db file sequential read: the poster child of off-CPU wait events
- Operational objective: reduce the I/O burden on a large, multi-million-dollar storage system
- Deployment platform and topology: SPARC T-Series, Oracle Solaris, Oracle Real Application Clusters (RAC)


db file sequential read Before Optimization


Much More I/O Wait Time Than Real Time
Top 5 Timed Foreground Events
Event                      Waits      Time (s)  Avg wait (ms)  % Total Call Time  Wait Class
db file sequential read    3,189,229  34,272    11             67.8               User I/O
CPU time                              11,332                   22.4
log file sync              2,247,374  4,612     2              9.1                Commit
gc cr grant 2-way          1,365,247  793       1              1.6                Cluster
enq: TX index contention   140,257    720       5              1.5                Concurrency

Analysis:
- db file sequential read: ~3,500 IOPS at a ~10-11 ms average
- A 15-minute snapshot under load, yet roughly 9.5 hours of disk wait time accumulated across sessions (3,189,229 waits x ~11 ms ≈ 34,272 s)
- 4,612 s ≈ 77 minutes of commit (log file sync) time

11gR2 Database Smart Flash Cache


If I Cannot Add DRAM or Increase the SGA
- Acts as a level-2 buffer cache (the SGA holds pointers to it)
- It is a clean cache: it almost turns physical read I/O into logical I/O
- Rule-of-thumb sizing: 2x to 10x the SGA size; see the Buffer Pool Advisory to narrow the estimate (query sketch below)
- Best at accelerating read-intensive workloads
(Diagram: buffer cache backed by the Database Smart Flash Cache - many I/Os are served from flash, few reach storage.)
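A minimal sketch of reading the advisory directly (the same data feeds the AWR Buffer Pool Advisory section; the 8K block size in the predicate is an assumption matching the common default):

   -- Estimated physical reads at candidate buffer cache sizes.
   -- The size at which estd_physical_reads stops dropping meaningfully is a
   -- rough guide for how much extra DRAM or flash cache would actually help.
   SELECT size_for_estimate AS cache_mb,
          size_factor,
          estd_physical_read_factor,
          estd_physical_reads
   FROM   v$db_cache_advice
   WHERE  name = 'DEFAULT'
   AND    block_size = 8192
   ORDER  BY size_for_estimate;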

Database Smart Flash Cache Setup


Two Simple Steps
1. Aggregate the flash LUNs into ONE file
   - ASM preferred; concatenate only. No mirroring - it is a cache!
2. Set two init.ora parameters (example below)
   - db_flash_cache_file = <+flashdg/FlashCacheFile>  (path to the flash file/raw aggregation/metadevice)
   - db_flash_cache_size = <flash file size>  (level-2 buffer cache size: amount of the flash file to use)
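As a concrete sketch (the +FLASHDG disk group, file name, and 64G size are placeholders, not values from this example):

   -- Point the instance at the flash file and size the level-2 cache;
   -- db_flash_cache_file is not dynamic, so a restart is needed for it to take effect.
   ALTER SYSTEM SET db_flash_cache_file = '+FLASHDG/flash_cache_file' SCOPE = SPFILE;
   ALTER SYSTEM SET db_flash_cache_size = 64G SCOPE = SPFILE;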


RAC Considerations for Smart Flash Cache


The Flash Cache is Not Shared!
- RAC scaling generally holds
- The global buffer cache (LMS) eliminates physical I/O if the block is in any node's buffer cache
- But each node only checks its own local flash cache file
(Diagram: two-node RAC example - each node has its own buffer cache and Database Smart Flash Cache file in front of shared storage.)



Example #1 After Optimizing with Flash Cache


Top 5 Timed Foreground Events
Event                          Waits      Time (s)  Avg wait (ms)  % Total Call Time  Wait Class
CPU time                                  11,353                   57.6
log file sync                  1,434,247  6,587     3              33.4               Commit
flash cache single block read  4,221,599  2,284     1              21.3               User I/O
buffer busy waits              723,807    1,502     329            3.3                Concurrency
db file sequential read        22,727     182       8              .9                 User I/O

Results:
- 140x reduction in db file sequential reads with ~190 GB of flash!
- Average flash cache read time of 540 us vs. 10.75 ms: 20x quicker
- Transaction and commit rates also went up over 40%!


Example #2: Bank Processing


It Is Not Always About a Killer IOPS Rate!
- Bank objective: reduce proccode response time
- Deployment topology: Solaris capped Containers (Zones), SPARC T-Series, 2 GB buffer cache


Example #2: Top Wait Events Before


Adding IOPS Supply Will Not Help Much Here

- 295 IOPS of foreground reads: 1,062,961 waits (I/Os) / 3,607.8 seconds (60.13 minutes) ≈ 295 IOPS
- Proposal: add ~90 GB of flash



Example #2: Top Wait Events After Optimization

20 Times Shorter I/O Wait Times with Flash
- Average 507 us wait from the flash cache: 114 seconds / 224,808 waits ≈ 0.000507 sec/wait response time
- Proccode response time cut by better than half!
- Fewer index reads needed


db file sequential read Summary


How to Make This Go Away
- About 2/3 of OLTP databases have this as their majority wait event - even some data warehouses!
- Use the Buffer Pool Advisory to determine how much more cache is needed
- If you can add or reallocate memory to the DB servers: GREAT!
- A 20x reduction in storage response time is common with flash vs. HDDs in arrays
- A 2x improvement in SLA is typical when I/O bound; 5x and higher improvements have been seen


Example #3: Batch Processing (SAP)


Top 5 Timed Foreground Events
Event                    Waits        Time (s)  Avg wait (ms)  % Total Call Time  Wait Class
db file sequential read  109,123,471  593,577   5              40.0               User I/O
log file sync            1,818,523    559,444   308            37.7               Commit
CPU time                              344,454                  23.2
db file parallel write   1,444,242    35,970    25             2.4                System I/O
log file parallel write  775,249      17,371    22             1.2                System I/O

Analysis:
- 473-minute (~8 hour) snapshot
- ~290 minutes (~5 hours) of log writing time, but over 155 hours of commit time! Nearly 1/3 second per batch commit
- 1.8M commits (~64/second)


Example #3 Analysis
Only One Log Writer Process per Instance

- While each log write is slow (~25 ms), total log writing time is only about 3% of the commit time!
- Log write time is only 1.2% of the entire AWR report, but it is STILL a top-5 event
- More important: ~5 hours of single-process (LGWR) I/O within 8 hours of real time!
- Solution: scheduling (LGWR priority!) plus processor binding improved commit times 4x! (see the sketch below)
- Follow-on: work to improve storage subsystem write response times
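A hedged sketch of the scheduling side: the SQL below only locates the LGWR OS process; the Solaris commands in the comments are illustrative and their flags should be verified on the target release.

   -- Find the OS process id (SPID) of the log writer.
   SELECT p.spid
   FROM   v$process p, v$bgprocess b
   WHERE  b.paddr = p.addr
   AND    b.name  = 'LGWR';
   -- With that pid, LGWR might then be given scheduling help at the OS level, e.g.:
   --   priocntl -s -c FX -m 60 -p 60 -i pid <spid>   (move LGWR to the fixed-priority class)
   --   pbind -b <cpu_id> <spid>                      (bind LGWR to a dedicated CPU)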

Example #4: Telecommunications


More Logging
Communications Objective: Reduce commit time.


Example #4 Analysis

- Archiving each redo log write, after every commit, IN A SQL*Loader ENVIRONMENT!
- No direct (path) load was used
- This was a stress test


log file sync/log file parallel write Summary


Managing Commit Time for Batch and OLTP Applications
- ~20% of OLTP databases have this as their majority wait event; it is also a common bottleneck in batch environments
- While improving log write I/O wait time will help, it is usually easier to improve scheduling: processor binding or the Critical Threads feature
- ALWAYS compare the total log write time of LGWR to the real (snapshot) duration! See the sketch below.
- As serialized single-process (LGWR) time approaches all of the available real time, there is no more room to schedule: THEN you need to speed up log write response times
  - Separate log devices, HBAs, channel paths, LUNs, LUN cache, etc.
  - Log-optimized storage: NVRAM, SPARC SuperCluster, Exadata
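A minimal sketch of that check (cumulative since instance startup; in practice, difference two AWR or Statspack snapshots):

   -- Total time LGWR has spent writing redo, to compare against elapsed real time.
   -- If it approaches the snapshot duration, LGWR is saturated and only faster
   -- log write response times will help.
   SELECT event,
          ROUND(time_waited_micro / 1e6) AS total_write_seconds,
          ROUND(time_waited_micro / 1000 / NULLIF(total_waits, 0), 1) AS avg_write_ms
   FROM   v$system_event
   WHERE  event = 'log file parallel write';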

Additional Resources: Performance/AWR

- Oracle Database Concepts 11g R2 (esp. Chapter 14)
  http://docs.oracle.com/cd/E11882_01/server.112/e25789/toc.htm
- Oracle Database Performance Tuning Guide 11g R2 (esp. Chapter 10)
  http://docs.oracle.com/cd/E11882_01/server.112/e16638/toc.htm
- Oracle Database Licensing Information 11g R2 (esp. the chapter on the Diagnostic Pack)
  http://docs.oracle.com/cd/E11882_01/license.112/e10594/toc.htm


Additional Resources: Statspack

- Statspack Overview: http://www.orafaq.com/wiki/Statspack
- Statspack Installation (last documented in the 9i manuals): http://docs.oracle.com/cd/B10501_01/server.920/a96533/statspac.htm#27255
- Using Statspack with Oracle 11g: http://myoracleworld.hobby-electronics.net/DB-statspack.html


Learn More About Oracle Optimized Solutions


Access to webcasts, videos, whitepapers, blogs, and more:
- http://oracle.com/optimizedsolutions
- Check out the Oracle Optimized Solution for Oracle Database


