© 2011 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
Agenda
- I/O evolution on Integrity servers: what to expect from the hardware
- What do you know about multipathing?
- IOPERFORM and Fast I/O
- NUMA/RAD impact on I/O
- OpenVMS EVA best practices
- Q&A
I/O Device Evolution
I/O device architecture has evolved drastically on Integrity; performance and scalability have doubled with each new hardware release.
Stripe data within and across multiple P410 RAID controllers (OpenVMS shadowing).
Striping within a controller provides high performance; striping across controllers provides no-SPOF storage.
[Chart: IOPS vs. load (16-256) on the P410i, with and without controller cache]
Use the P410i cache battery kit for faster response. Stripe across multiple disks to maximize utilization and throughput.
Customer Concerns..
How is I/O performance on Integrity servers? How does it compare against my existing high-end Alpha servers? What should I expect after migrating to the new platform? Why is i2 server I/O a market differentiator?
Multipathing 1(4)
- Multipathing (MP) is a technique to manage multiple paths to a storage device through failover and failback mechanisms
- It lets the user load balance across multiple paths to storage
- Multipathing is enabled by default on OpenVMS
- OpenVMS MP supports ALUA (Asymmetric Logical Unit Access) [V8.3 and later]
- At boot time, devices are spread evenly across all available paths (see the sketch below)
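A minimal sketch of how to confirm the defaults on a running system; $1$DGA100 is a placeholder device name, and MPDEV_ENABLE/MPDEV_POLLER are the relevant SYSGEN parameters:

$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> SHOW MPDEV_ENABLE        ! 1 (default) means multipath sets are formed
SYSGEN> SHOW MPDEV_POLLER        ! 1 (default) means the path poller runs
SYSGEN> EXIT
$ SHOW DEVICE/FULL $1$DGA100:    ! lists every I/O path and marks the current one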
[Diagram: Alpha and IA64 hosts connected through FC switches to HSV controllers, in single-path, two-path, and four-path configurations]
Multipathing 2(4)
- Device discovery initiates path discovery and forms an MP set for each device
- MC SYSMAN IO AUTOCONFIGURE triggers discovery; SDA> SHOW DEVICE DGAxx shows the MP set (illustrated below)
- The first path discovered is considered the primary path; the active path is called the current path
- Path selection algorithms are optimized to support Active/Active arrays
- Active optimized (AO) paths are always picked for I/O; only if there is no alternative is an active non-optimized (ANO) path picked [how to fix this is discussed under EVA best practices]
- With the latest storage firmware, it is very rare to end up connected to an ANO path
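A quick way to exercise this, again with $1$DGA100 as a placeholder device:

$ MCR SYSMAN
SYSMAN> IO AUTOCONFIGURE         ! discovers devices and their paths
SYSMAN> EXIT
$ ANALYZE/SYSTEM
SDA> SHOW DEVICE $1$DGA100       ! displays the multipath set and the current path
SDA> EXIT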
Multipathing 3(4)
VMS switches its path to a LUN when:
- A device is MOUNTed while its current path is offline
- A manual path switch is requested via SET DEVICE/SWITCH/PATH= (example below)
- A local path becomes available while the current path is a served MSCP path
The switch from MSCP back to a local path is triggered by the path poller [not on a manual switch].
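A sketch of a manual path switch; the device and path names are placeholders, so take real path names from SHOW DEVICE/FULL output first:

$ SHOW DEVICE/FULL $1$DGA100:                    ! note the available path names
$ SET DEVICE $1$DGA100: /SWITCH /PATH=PGA0.5000-1FE1-0015-8A58
$ SHOW DEVICE/FULL $1$DGA100:                    ! confirm the new current path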
Multipathing 4(4)
- SET DEVICE device/POLL (or /NOPOLL) controls path polling per device (see the sketch below)
- Devices can initiate and complete a mount verification (MV) on a path switch; each shadow set member operates independently
- Operator logs record the details; SHOW DEVICE device/FULL shows path-switch details (time, etc.)
- SDA logs a lot of diagnostic information in the MPDEV structure
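A sketch of the per-device controls and where to look afterwards ($1$DGA100 is a placeholder):

$ SET DEVICE $1$DGA100: /POLL                  ! enable path polling for this device
$ SET DEVICE $1$DGA100: /NOPOLL                ! ...or disable it
$ SHOW DEVICE/FULL $1$DGA100:                  ! shows path-switch details such as the time
$ SEARCH SYS$MANAGER:OPERATOR.LOG "DGA100"     ! path-switch messages land in the operator log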
SCSI Error Poller
- Customers have reported huge Unit Attention (UA) traffic in the SAN, resulting in cluster hangs, slow disk operations, high mount verification counts, etc.
- These UAs are initiated by changes in the SAN, such as firmware upgrades, bus resets, etc.
- SCSI_ERROR_POLL is the poller responsible for clearing latched errors (such as SCSI UAs) on all fibre and SCSI devices, which could otherwise cause confusion in the SAN
- The poller is enabled by default; check it as shown below
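Checking the poller takes a single SYSGEN command:

$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> SHOW SCSI_ERROR_POLL     ! 1 (default) = latched errors such as UAs are cleared
SYSGEN> EXIT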
Customer/Field Concerns..
OpenVMS Multipathing
- After upgrading my SAN components, I see a large number of mount verifications; does that indicate a problem?
- Does multipathing load balance? Are there policies?
- I see too many mount verification messages in the operator log; will they impact volume performance (especially latency)?
- How do I know whether my paths are well balanced?
- How do I know whether my current path is active optimized?
- Does multipathing support Active/Active arrays, ALUA, third-party storage, SAS devices, SSD devices?
Did you know QIO is one of the most heavily used interfaces in OpenVMS? We want to put it on a diet. What should we do? 1. Optimize QIO 2. Replace QIO 3. Provide an alternative
IOPERFORM/FastIO
- Fast I/O is a performance-enhanced alternative to performing QIOs
- It substantially reduces the setup time for an I/O request
- Fast I/O uses buffer objects (locked, doubly mapped memory) to eliminate the per-request overhead of manipulating I/O buffers
- I/O is performed using buffer objects and the following system services: sys$io_setup, sys$io_perform, sys$io_cleanup, and sys$create_bufobj_64 (sys$create_bufobj is a jacket)
- See SYS$EXAMPLES:IO_PERFORM.C for a worked example (build steps below)
- Creating buffer objects once and reusing them for the lifetime of the application is fastest
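A minimal way to try Fast I/O is to build the bundled example; this assumes a C compiler is installed, and note that creating buffer objects requires the VMS$BUFFER_OBJECT_USER rights identifier:

$ COPY SYS$EXAMPLES:IO_PERFORM.C []
$ CC IO_PERFORM
$ LINK IO_PERFORM
$ RUN IO_PERFORM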
Impact of IOPERFORM/FASTIO
- Larger-size random workloads double the throughput as load increases; larger-size sequential workloads perform the same
[Chart: throughput (IOPS) vs. thread count (8-64) for 8K reads via Fast I/O (8K_READ) and via QIO (8K_READ_QIO)]
NUMA/RAD Impact
What you should know
[Diagram: a process (P1) in one RAD accessing device DGA100 attached to another RAD]
NUMA/RAD Impact
In a RAD-based system, each RAD is made up of CPUs, memory, and I/O devices. Accessing I/O devices that are remote to a process's RAD incurs remote memory accesses and remote interrupt latency.
[Chart: Impact of RAD placement on I/O rate (ops/sec) across RADs 0-4; remote RADs incur a 10-15% overhead versus the optimized local RAD]
- Keep I/O devices close to the process that accesses them heavily; make efficient use of FASTPATH
- Make sure to FASTPATH fibre devices close to the process initiating the I/O
- The overhead involved in handling remote I/O can impact throughput [see chart]
- FASTPATH algorithms assign CPUs on a round-robin basis; statically load balance devices across multiple RADs
- Use SET PROCESS/AFFINITY to bind processes with high I/O, and SET DEVICE device/PREFERRED_CPUS to steer fast path CPUs (sketch below)
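A sketch of both commands; the CPU IDs and the process name HEAVY_IO are placeholders, and CPUs 0 and 1 are assumed to sit in the RAD that owns the device:

$ SET DEVICE $1$DGA100: /PREFERRED_CPUS=(0,1)   ! steer fast path work for this device to these CPUs
$ SET PROCESS/AFFINITY/SET=(0,1) HEAVY_IO       ! bind the I/O-heavy process to the same CPUs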
EVA Differences
Speeds and Feeds
Model                EVA4400        EVA6400     EVA8400     P6300                P6500
Controller           HSV300         HSV400      HSV450      HSV340               HSV360
Memory/ctrl pair     4 GB           8 GB        14/22 GB    4 GB                 8 GB
Host ports           4 FC           8 FC        8 FC        8FC,0GbE / 4FC,      8FC,0GbE / 4FC,
                     (20 w/switch)                          8x1GbE / 4FC,        8x1GbE / 4FC,
                                                            4x10GbE              4x10GbE
Host port speed      4 Gb/s FC      4 Gb/s FC   -           8Gb/s FC; 1Gb/s      8Gb/s FC; 1Gb/s
                                                            iSCSI; 10Gb/s        iSCSI; 10Gb/s
                                                            iSCSI/FCoE           iSCSI/FCoE
Device ports         -              -           -           8                    16
Device port speed    -              -           -           6 Gb/s               6 Gb/s
Max 3.5" drives      -              -           -           120                  240
Max 2.5" drives      -              -           -           250                  500
Max Vdisks           -              -           -           1024                 2048
Read bandwidth       -              -           -           1,700 MB/s           1,700 MB/s
Write bandwidth      -              -           -           600 MB/s             780 MB/s
Random read IOPS     -              -           -           45,000               55,000
Customer/Field Concerns..
- After upgrading the OS or applying a patch, I/O response has become slower
- After moving to a new blade in the same SAN environment, our CRTL fsync() runs slowly
- After upgrading, we see additional CPU ticks for copy, delete, and rename
- Our database is suddenly responding slowly
- Some nodes in the cluster see high I/O latency after midnight
Most storage performance issues reported are due to misconfiguration of SAN components.
Best Practices..1(6)
- In mixed-load environments, it is OK to separate random vs. sequential applications into different disk groups
- Vraid1 gives the best performance over the widest range of workloads; however, Vraid5 is better for some sequential-write workloads
- Vraid0 provides the best random-write performance but no protection; use it only for non-critical storage needs
Best Practices..2(6)
- Price-performance: for the equivalent cost of 15k rpm disks, consider using more 10k rpm disks
- Do not combine disks with different performance characteristics in the same disk group
Best Practices..4(6)
- The EVA stripes LUN capacity across all the disks in a disk group; larger disks receive more LUN capacity, concentrating demand on them. Use disks of equal capacity in a disk group
- Read cache management influences performance; always ENABLE it
- Is a high LUN count good or bad? It is good to have a few LUNs per controller, but it depends on host requests and queue depths; monitor the OpenVMS queue depth (see below)
- Sequential workloads: the OpenVMS maximum transfer size is 128K for disks and 64K for tapes (DEVICE_MAX_IO_SIZE)
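One way to watch OpenVMS queue depth is MONITOR's disk class; a sustained average queue length above about 1 flags a hot LUN:

$ MONITOR DISK/ITEM=QUEUE_LENGTH/INTERVAL=5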
Best Practices..3(6)
- Workloads like transaction processing, data mining, and databases such as Oracle are ideal for SSDs
- Distribute SSDs evenly across all available back-end loops; SSDs and HDDs may be mixed in the same drive enclosure
- Know your application and your EVA: you can assign SSD or HDD LUNs to individual controllers, or enabling write-through mode for SSDs can help [experiment!!]
- Customers use SSD drives to hold critical-path data where response time cannot be compromised
OpenVMS V8.4
[Chart: IOPS and response time vs. load (0-300) for SSD-carved vs. FC-carved LUNs, showing roughly 10x faster response for SSD]
- Mixed load, 8-disk SSD/FC disk groups on an EVA4400
- Smaller I/Os (4K/8K) showed a sustained 9-10x increase in IOPS and MB/s with increasing load for SSD-carved LUNs compared to FC
- With 10 times faster response time, the SSD-carved LUN delivered 10 times more performance and bandwidth for small I/O sizes
Best Practices..5(6)
- Active/Active: LUNs are presented simultaneously through both controllers, but ownership sits with only one controller at a time
- Load balance LUN ownership across both controllers and check port utilization (use EVAPerf), either through Command View EVA or with the OpenVMS SET DEVICE/SWITCH/PATH='PATH_NAME' 'DEV_NAME' command
- Preferred path: during the initial boot of the EVA, the preferred-path parameter is read and determines the managing controller [see the figure below for options]
- Verify that LUN ownership is reassigned after a failed controller has been repaired
- Balance the workload as evenly as possible across all the host ports
[Diagram: DGA99 answers Inquiry on all HSVxxx ports but performs I/O only on its owning controller's ports]
Customer Scenario
Controller load imbalance and unequal port load distribution
Best Practices..6(6)
- Ensure there are no hardware issues: especially cache battery failure (which forces a change to write-through mode, so write performance suffers), device loop failures, drives reporting timeouts, etc.
- Deploy the array only in supported configurations
- Stay current on EVA firmware!!
- BC and CA have different best practices that are beyond the scope of this discussion
Large latencies may be quite natural in some contexts, such as an array processing large I/O requests. Array processor utilization tends to be high under intense small-block, transaction-oriented workloads but low under intense large-block workloads.
OpenVMS IO Data
P6500, 36G RAID 5 volume, 4Gb FC infrastructure
[Chart: sequential read bandwidth (MB/s) and response time (ms) vs. load, climbing from 138 to 412 MB/s]
- Higher bandwidth is obtained with larger blocks, which drain the interconnects faster through large data transfers: 128K I/Os push 4Gb FC line speed!
- Higher throughput is obtained with smaller blocks, though smaller blocks usually need a lot of processing power: 8K workloads push close to the EVA's maximum throughput!
Tools
- TLViz (Disk and FCP classes), VEVAMON (older EVAs)
- SDA> FC extension [for fibre devices], PKR/PKC [for SAS devices]
- SYS$ETC: SCSI_INFO.EXE, SCSI_MODE.EXE, FIBRE_SCAN.EXE, and many more (example below)
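For example, FIBRE_SCAN can be run directly to list the fibre devices visible through the host adapters:

$ MCR SYS$ETC:FIBRE_SCAN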
Sizing EVA
HP StorageWorks Sizing Tool
References
- HP StorageWorks Enterprise Virtual Array: a tactical approach to performance problem diagnosis (HP Document Library)
Questions/Comments
THANK YOU
EVA controller models

Model     Controller            Firmware
EVA8000   HSV210 or HSV210-A    09XXXXXX or 10000000
EVA8100   HSV210-B              095XXXXX or 10000000
EVA4400   HSV300                10000090
EVA6400   HSV400                -
EVA8400   HSV450                -
P6300     HSV340                -
P6500     HSV360                -