
Performance Tuning and Databases

Tom Hamilton, Americas Channel Database CSE

Agenda
Tuning Methodology
Performance Analysis
- What causes poor performance
- How to identify the causes
- How to fix the causes
Performance Analysis Tools
Oracle Best Practices
SQL Server Best Practices
NetApp Controller Best Practices
Examples
Protocol Comparisons

Introduction

Performance tuning is not a science!

It is an art form! It is about finding the bottleneck(s)!

Tuning Methodology
Check for Known Hardware and Software Problems
Consider the Whole System
Measure and Reconfigure by Levels
Change One Thing at a Time
Put Tracking and Fallback Procedures in Place Before You Start
Do Not Tune Just for the Sake of Tuning
Remember the Law of Diminishing Returns

HP-UX NFS Performance with Oracle Database 10g Using NetApp Storage TR3557

Top 5 Performance Bottlenecks


Disk bottleneck
FC-AL loop bottleneck
CPU/domain bottleneck
Networking and target card limits
Client-side thread parallelism

Top 5 Performance Bottlenecks


Server Threads
Network (FC/Ethernet)
Storage CPU/Domain
Shelf Loops
Disk Bottleneck (Drive Type and IOPS)

Diagnosing Disk Throughput Bottlenecks


Bottleneck: Disks have data throughput and IOPS limitations
Approx. IOPS per drive:
SAS/FC disks  20-260
SATA disks    20-80

Symptoms: High latency or inability to add more load on a particular volume
Diagnosis: Use statit to monitor disk utilization (a collection sketch follows the sample output below)
Disk bottleneck: disk utilization is >70% *and* data transfer rates or transfers per disk are high
disk              ut%  xfers   ureads--chain-usecs   writes--chain-usecs
/vol0/plex0/rg0:
0b.17             99   297.22  297.22  1.00  20034   0.00  ...
0b.18             99   292.55  292.55  1.00  19960   0.00  ...
0b.19             99   294.75  294.75  1.00  20180   0.00  ...
0b.20             99   294.15  294.15  1.00  19792   0.00  ...
0b.21             99   294.76  294.76  1.00  19632   0.00  ...
0b.22             99   293.70  293.70  1.00  20341   0.00  ...
...
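A report like the one above is typically collected on a 7-Mode controller by bracketing a representative interval with statit at the advanced privilege level; the prompt names below are placeholders:

filer> priv set advanced
filer*> statit -b        # begin collecting statistics
  ... let the workload run for 30-60 seconds ...
filer*> statit -e        # end collection and print the report (disk, loop, and CPU counters)
filer*> priv set admin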

Diagnosing Disk Throughput Bottlenecks

Solutions:
Use flexible volumes with large aggregates
Add more drives to the aggregate (see the sketch below)
Redistribute load onto lightly loaded disks
Flash Cache
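A minimal sketch of the "add more drives" option on a 7-Mode system; the aggregate name and disk count are illustrative:

filer> aggr status -s          # check that enough spares of the right type are available
filer> aggr add aggr_db 8      # add 8 spare disks to the aggregate behind the busy volume
filer> aggr status -v aggr_db  # confirm the new RAID group layout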


Diagnosing Disk Loop Saturation

Bottleneck: Disk loops have throughput limitations
Each loop can support up to ~180 MB/s at 2 Gb or ~360 MB/s at 4 Gb
Symptoms:
High response times to requests on one or more volumes
Inability to add more load to one or more volumes
RAID reconstruction / scrub activity


Diagnosing Disk Loop Saturation


Diagnosis: High disk utilization can indicate loop limits
Use statit to monitor disk utilization
Loop saturation: disk utilization is >70% but per-disk throughput is low (67 xfers * 31.5 blocks/chain * 4 KB = ~8 MB/s per disk) *and* data transfer rates on the loop are high
disk              ut%  xfers  ureads--chain-usecs   writes--chain-usecs
/vol0/plex0/rg0:
0b.16             0    0.03   0.02   10.00  5250    0.01  10.00  0
0b.17             97   66.91  66.90  31.54  1774    0.01  10.00  0
0b.18             96   67.06  67.05  31.49  1552    0.01  10.00  0
0b.19             95   66.99  66.98  31.54  1472    0.01  10.00  0
0b.20             94   67.05  67.04  31.48  1453    0.01  10.00  0

Diagnosing Disk Loop Saturation

Solutions:
Add more disk loops
Add dual-path on a single controller, or Multipath HA on CFO (requires Data ONTAP 7.1.1 or greater); see the path check below
Redistribute disks in a volume/RAID group across loops
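A quick way to confirm that each disk really has two paths after the cabling change is something like the following (7-Mode commands; output not shown):

filer> storage show disk -p   # lists the primary and, if present, secondary path for every disk;
                              # a disk with no secondary entry is single-pathed
filer> fcadmin config         # shows which onboard FC ports are configured as initiators vs. targets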


Diagnosing CPU/Domain Bottleneck


Bottleneck: CPU(s) have more work than they can handle
Symptoms:
High latency or client-side sluggishness
sysstat reports CPU utilization > 95% (see the sysstat example below)
Diagnosis: Use statit to check CPU statistics
Look for CPU utilization > 95% on 1P, > 190% on 2P, > 350% on 4P

CPU Statistics
   109.921938 time (seconds)          100 %
   216.140218 system time             197 %
     2.582292 rupt time                 2 %  (445061 rupts x 6 usec/rupt)
   213.55792  non-rupt system time    194 %
     3.703658 idle time                 3 %
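Overall CPU (together with per-protocol throughput, CP type, and disk utilization) can be watched live during the problem window; the interval and sample count here are arbitrary:

filer> sysstat -c 30 -x 1     # extended output, one-second samples, 30 samples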


Diagnosing CPU/Domain Bottleneck


Diagnosis: If CPU utilization is not that high, a domain bottleneck may be the cause
Controller processing is split into domains such as RAID, Networking, Storage, etc.
Work associated with a given domain can only run on one processor at any given time
Improper management of the filer can result in all of the outstanding work landing in a single domain
Not a common problem


Diagnosing CPU/Domain Bottleneck


Diagnosis: Use the sysstat -m option or statit to check domain utilization (see the example below)
Look for total domain time > 900,000 usec (90%)
        idle      kahuna     network    storage   exempt     raid       target
cpu0    16952.55  456550.84  293882.11  30605.83  121252.02  70162.66   214.24
cpu1    16740.97  465228.22  282281.51  30353.09  121671.95  70080.22   204.22
total   33693.52  921779.17  576163.53  60958.92  242923.97  140242.89  418.46

Kahuna domain: ~921,779 usec of domain time per second, or ~92% utilization
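The per-domain breakdown above can be captured with the multiprocessor option of sysstat (on some Data ONTAP releases this option is only visible at advanced privilege):

filer> sysstat -m 1           # per-processor / per-domain CPU breakdown, one-second samples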


Diagnosing CPU Bottleneck


Solutions:
Use FlexShare to prioritize workloads (see the sketch below)
Stagger workload to non-peak times
Reschedule transient activities such as RAID reconstructs and scrubs
Load balance by migrating work to other filers
If network traffic is high, look for:
- Misbehaving clients
- Bad mounts
- Virus scanning activity
Upgrade to a higher-performing filer
Consider Flash Cache
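A minimal FlexShare sketch for a 7-Mode system; the volume name and priority level are illustrative:

filer> priority on                              # enable FlexShare
filer> priority set volume oradata level=high   # raise the relative priority of the database volume
filer> priority show volume oradata             # verify the setting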


Diagnosing Network and Target Adapter Limits


Bottleneck: Network and Target adapters have throughput and IOPS limitations per port
Gigabit Ethernet   80 MB/s
10 Gb Ethernet     800 MB/s
FC 2 Gb target     180 MB/s
FC 4 Gb target     360 MB/s

Symptoms: Poor responsiveness to clients on a given network and inability to add more load to the filer
Diagnosis: Use statit or ifstat to monitor traffic on each port (an ifstat example follows the output below)
Look for the port limits, e.g., on a Gigabit Ethernet interface
Network Interface Statistics (per second)
iface  side  bytes         packets   multicasts  errors  collisions  pkt drops
e0     recv  595.95        7.65      1.67        0.00    0.00
       xmit  679.24        6.46      0.00        0.00    0.00
e9a    recv  473754.48     3536.66   2.24        0.00    0.00
       xmit  121823577.22  40645.67  0.00        0.00    0.00
e9b    recv  471987.25     3523.48   2.24        0.00    0.00
       xmit  60596317.35   40493.83  0.00        0.00    0.00
e11a   recv  477358.49     3563.57   2.24        0.00    0.00
       xmit  61286194.79   40954.82  0.00        0.00    0.00
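Per-port counters like these can also be pulled directly with ifstat; the interface name is illustrative:

filer> ifstat -a          # cumulative statistics for every interface
filer> ifstat -z e9a      # reset the counters on one interface so a fresh interval can be measured
filer> ifstat e9a         # re-read after the interval and compare bytes/second against the port's line rate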

Diagnosing Network/Target Adapter Bottleneck


Solutions:
Use link aggregation (trunking) to increase network bandwidth (see the sketch below)
Add more adapters or a multi-ported adapter
Route traffic through underutilized interfaces
Upgrade to a 10 Gb network interface or a 4 Gb FC target interface
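A sketch of link aggregation on Data ONTAP 8 7-Mode (older releases use vif create instead of ifgrp create); interface names and the address are illustrative:

filer> ifgrp create lacp ifgrp1 -b ip e0a e0b                  # LACP ifgrp, load-balanced by IP address
filer> ifconfig ifgrp1 192.168.10.50 netmask 255.255.255.0 up
# add the same lines to /etc/rc so the configuration persists across reboots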


Diagnosing Client Side Thread Parallelism


Problem: Applications see poor throughput but the controller does not show any bottleneck
Symptoms:
Storage system not fully utilized
High I/O wait times at the client


Diagnosing Client Side Thread Parallelism


Diagnosis: Use sysstat to check CPU, disk, and network utilization
CPU  NFS CIFS HTTP Total  Net kB/s   Disk kB/s     Tape kB/s  Cache Cache  CP   CP  Disk  DAFS  FCP  iSCSI  FCP kB/s
                           in   out  read   write  read write  age   hit  time  ty  util                     in    out
48%    0    0    0  1141    1    1   24871  36846    0    0     2   100%   61% 23f  30%     0  1141     0  50221  19861
44%    0    0    0  1112    1    1   24937  34109    0    0     2   100%   63% 22f  28%     0  1112     0  44818  20359
51%    0    0    0  1192    1    1   23924  42640    0    0     2   100%   67% 26f  31%     0  1192     0  53965  25354
29%    0    0    0   761    1    1   16554  21744    0    0     2   100%   39% 14   20%     0   761     0  27403  12942
11%    0    0    0   415    1    1    6134   4650    0    0     2    99%   11%  5    7%     0   415     0   7077   4890

Typically a single thread of execution from the client (e.g., cp or dd)
Solutions:
Application tuning
- Use more threads (see the sketch below)
- Increase the transfer size
Tune client or target throttles in the case of FCP
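To illustrate the single-stream limitation, compare one dd stream with several concurrent streams against the same mount point (Linux client syntax; the path and sizes are made up):

# one stream - roughly what cp or a single dd delivers
dd if=/dev/zero of=/mnt/oradata/t1 bs=1M count=4096 oflag=direct

# four concurrent streams against the same volume usually drive the storage much harder
for i in 1 2 3 4; do
  dd if=/dev/zero of=/mnt/oradata/t$i bs=1M count=1024 oflag=direct &
done
wait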


What is not a Bottleneck?

NVRAM
Provides data integrity in case of controller failure
Only logs transactions before committing them to disk
NVRAM is not a write cache


Key Takeaways
Performance Monitoring
Adhere to performance monitoring methodology
Focus on latency monitoring
Recognize performance trends

Performance Analysis
Identify common performance bottlenecks
Disk bottleneck
FC-AL loop bottleneck
CPU/domain bottleneck
Networking and target card limits
Client-side thread parallelism

Understand possible solutions



Performance Tuning Analysis Tools


sysstat
statit (na_statit.1a.htm, statit_explained.pdf)
Perfstat / Latx
SIO
stats
https://now.netapp.com/eservice/toolchest
Quest Spotlight
OnCommand Insight Balance
DBTuna


Sizing Tools
Database Sizer
Statspack / AWR
SQL Performance Analyzer
Perfstat

Unified Sizer


NetApp Controller Best Practices


Aggregates (TR-3437)
- Disks (number, size, type)
- 15K disks
Disk loops
IMT (Interoperability Matrix Tool)
Network Adapters
FC Adapters
WAFL (WAFL Overview)
Data ONTAP version


Before PAM-II
[Chart: Disk I/O for a 30-disk SATA aggregate]


After PAM-II Installation TR3832

[Chart: Disk I/O for the same 30-disk SATA aggregate, 20 minutes after install, with the cache 90% populated]

Within the first 30 minutes of operation, PAM-II is delivering I/O equivalent to 1 shelf of 15K FC or 5 shelves of SATA


Deciding Between Flash Cache & SSD

Flash Cache (Intelligent Read Caching) - Good Fit When:
- Random read intensive workload
- Improving average response time is adequate
- Active data is unpredictable or unknown
- An administration-free approach is desired
- Minimizing system price is important
- Accelerating an existing HDD configuration is desired

SSDs in DS4243 (Persistent Storage) - Good Fit When:
- Random read intensive workload
- Every read must be fast
- Active data is known and fits into SSD capacity
- Active data is known, is dynamic, and ongoing administration is okay
- Upside of write acceleration is desired
- Performance must be consistent across failover events


Product Information TR3938

DS4243 SSD Option
24 x 100GB SSD per shelf
Requires Data ONTAP 8.0.1+
- Both 7-Mode and Cluster-Mode supported
Shelf setup is exactly the same as with SATA or SAS
Available in full shelf increments only (24 drives):
- Add-on: DS4243-SL02-24A-QS-R5
- Configured: DS4243-SL02-24A-R5-C
Individual drives may be ordered as parts (X441A-R5, formerly X442A-R5)
Platforms:
- Supported: FAS/V3160, 3170, 6000 series, 3240, 3270, and 6200 series
- Not supported: FAS/V2020, 2040, 2050, 3140, 3210, and all 3000 series




Performance - Sequential

[Chart: Sequential I/O throughput per drive (MB/sec), 15K rpm FC drive vs. SSD, for large sequential reads and large sequential writes]

Storage Performance Guidelines TR-3437


Adequate spindle counts
- Aggregates
- Traditional volumes
- SATA considerations
SnapMirror / SnapVault considerations
- Stagger schedules
- Schedule workloads at off-peak times
- Throttle bandwidth (see the snapmirror.conf sketch below)
Multipath HA
LUN Alignment
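One way to throttle SnapMirror bandwidth on a 7-Mode system is the kbs option in /etc/snapmirror.conf on the destination; this entry is purely illustrative (systems, volumes, rate, and schedule are placeholders):

# source        destination           options    minute hour  day-of-month day-of-week
fas1:oradata    fas2:oradata_mirror   kbs=5120   0      1,13  *            *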



HP-UX NFS Performance with Oracle Database 10g Using NetApp Storage TR3557


HP-UX NFS Performance with Oracle Database 10g Using NetApp Storage TR3557

HP-UX Server Tuning
http://docs.hp.com/en/5992-4222ENW/5992-4222ENW.pdf
Tune-N-Tools
http://software.hp.com/portal/swdepot/displayProductInfo.do?productNumber=Tune-N-Tools
HP-UX TCP/IP Performance White Paper
http://docs.hp.com/en/11890/perf-whitepaper-tcpip-v1_1.pdf


HP-UX NFS Performance with Oracle Database 10g Using NetApp Storage TR3557

HP-UX Kernel Parameters
Oracle Initialization Parameters
NFS Mount Options (an illustrative example follows below)


http://now.netapp.com/Knowledgebase/solutionarea.asp?id=kb7518

HP-UX NFS Kernel Parameters
HP-UX Patch Bundles
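For orientation only, an NFS mount for Oracle datafiles on HP-UX often looks roughly like the line below; the exact option set depends on the HP-UX and Data ONTAP releases, so treat TR3557 and the KB article above as authoritative:

# illustrative HP-UX NFSv3 mount for an Oracle datafile file system
mount -F nfs -o rw,bg,hard,intr,vers=3,proto=tcp,rsize=32768,wsize=32768,timeo=600,forcedirectio \
      filer:/vol/oradata /u02/oradata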


AIX DB2 Performance Protocol Comparison on NetApp, TR-3581


AIX Oracle Performance Protocol Comparison on NetApp, TR-3871


Oracle 10g Performance Protocol Comparison on SUN Solaris 10 TR3496




Oracle 11g R2 Performance Protocol Comparison on RHEL 5 TR3932


Summary
NetApp Controller
Host
Network
Database Settings

FIND THE BOTTLENECK!!

