
Advantages of using VMware VAAI on Hitachi Unified Storage 100 Family

Lab Validation Report


By Erico Cardelli, Chris Didato

August 23, 2012

Feedback
Hitachi Data Systems welcomes your feedback. Please share your thoughts by sending an email message to SolutionLab@hds.com. To assist the routing of this message, use the paper number in the subject and the title of this white paper in the text.

Table of Contents
Product Features 3
    Hitachi Unified Storage 150 3
    Hitachi Dynamic Provisioning 4
    Hitachi Compute Blade 2000 4
    VMware vSphere 5 5
    VMware vStorage APIs for Array Integration (VAAI) 5
Test Environment 6
Test Methodology 10
    Cloning with Full Copy 11
    Block Zeroing Disk Utilization with Hitachi Dynamic Provisioning 12
    Block Zeroing Performance Warm up 14
    Block Zeroing with Zero Page Reclaim 15
    Large Scale Boot Storm with Hardware-assisted Locking 17
    Large Scale Simultaneous vMotion with Hardware-assisted Locking 18
    Thin Provisioning Stun 19
Analysis 20
    Cloning with Full Copy 20
    Block Zeroing Disk Utilization with Hitachi Dynamic Provisioning 20
    Large Scale Boot Storm with Hardware-assisted Locking 21
    Large Scale Simultaneous vMotion with Hardware-assisted Locking 21
    Thin Provisioning Stun 22
Complete Test Results 23
    Cloning with Full Copy 23
    Block Zeroing Disk Utilization with Hitachi Dynamic Provisioning 27
    Block Zeroing Performance Warm up 28
    Block Zeroing with Zero Page Reclaim 30
    Large Scale Boot Storm with Hardware-assisted Locking 31
    Large Scale Simultaneous vMotion with Hardware-assisted Locking 33
    Thin Provisioning Stun 33
Additional Information about VMware vStorage APIs for Array Integration 35
    Full Copy 35
    Block Zeroing 36
    Hardware-assisted Locking 37
    Thin Provisioning Stun 38


This lab validation report describes the benefits of pairing Hitachi Unified Storage with VMware vStorage APIs for Array Integration (VAAI). The testing supports the following recommendations.

When requiring maximum initial performance, implement these settings:

- Use the eagerzeroedthick virtual disk format to eliminate performance anomalies.
- Configure Hitachi Unified Storage and Hitachi Dynamic Provisioning to natively support thin provisioning without double overhead caused by thin provision processing on the virtualization layer and the storage system layer.
- Enable VAAI on all hosts to take advantage of Hitachi Unified Storage support of the vStorage APIs.

When requiring maximum cost savings and low administrative overhead, implement these settings:

- Use thin or lazyzeroedthick virtual disks.
- Configure Hitachi Dynamic Provisioning volumes.
- Enable VAAI on all hosts to take advantage of Hitachi Unified Storage support of the vStorage APIs (a configuration sketch for enabling the primitives follows these recommendations).
- Enable the hardware-assisted locking features.
- Run the zero page reclaim utility against Hitachi Dynamic Provisioning volumes periodically.
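The VAAI primitives are enabled per ESX host through three advanced settings (DataMover.HardwareAcceleratedMove, DataMover.HardwareAcceleratedInit, and VMFS3.HardwareAcceleratedLocking), which can be changed in the vSphere Client or through the vSphere API. The following is a minimal sketch using the pyVmomi Python bindings; the vCenter address and credentials are placeholders, and the snippet is illustrative rather than the exact procedure used in this testing.

    # Sketch: enable the three host-side VAAI primitives on every ESXi host.
    # The vCenter hostname and credentials below are placeholders.
    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    VAAI_OPTIONS = {
        'DataMover.HardwareAcceleratedMove': 1,   # Full Copy
        'DataMover.HardwareAcceleratedInit': 1,   # Block Zeroing
        'VMFS3.HardwareAcceleratedLocking': 1,    # Hardware-assisted Locking
    }

    ctx = ssl._create_unverified_context()
    si = SmartConnect(host='vcenter.example.com', user='administrator',
                      pwd='password', sslContext=ctx)
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)

    for host in view.view:
        # advancedOption is the host's OptionManager; the value type may need
        # to be a long integer on some pyVmomi versions.
        changes = [vim.option.OptionValue(key=key, value=value)
                   for key, value in VAAI_OPTIONS.items()]
        host.configManager.advancedOption.UpdateOptions(changedValue=changes)
        print(host.name, 'VAAI primitives enabled')

    view.Destroy()
    Disconnect(si)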

This paper is intended for you if you are a storage administrator, vSphere administrator, application administrator or virtualization infrastructure architect charged with managing large, dynamic VMware-based virtual environments. You need familiarity with SAN-based storage systems, VMware vSphere, and general IT storage practices.


Note: Testing was done in a lab environment. Many factors affect production environments that cannot be predicted or duplicated in a lab. Follow recommended practice by conducting proof-of-concept testing to confirm acceptable results before implementing this solution in your production environment. This means testing applications in a non-production, isolated test environment that otherwise matches your production environment.


Product Features
These are the features of some of the products used in testing.

Hitachi Unified Storage 150


Hitachi Unified Storage is a midrange storage platform for all data. It helps businesses meet their service level agreements for availability, performance, and data protection. The performance provided by Hitachi Unified Storage is reliable, scalable, and available for block and file data. Unified Storage is simple to manage, optimized for critical business applications, and efficient.

Using Unified Storage requires a smaller capital investment. Deploy this storage, which grows to meet expanding requirements and service level agreements, for critical business applications. Simplify your operations with integrated set-up and management for a quicker time to value. Unified Storage enables extensive cost savings through file and block consolidation. Build a cloud infrastructure at your own pace to deliver your services.

Hitachi Unified Storage 150 provides reliable, flexible, scalable, and cost-effective modular storage. Its symmetric active-active controllers provide input-output load balancing that is integrated, automated, and hardware-based. Both controllers in Unified Storage 150 dynamically and automatically assign the access paths from the controller to a logical unit (LU). All LUs are accessible, regardless of the physical port or the server that requests access.

Due to tight integration, Hitachi Unified Storage works seamlessly with VMware vSphere 5.0 implementations of the VAAI APIs. Native support enhances the following:

- Storage vMotion
- Cloning
- Virtual machine provisioning
- SCSI reservation locking, which highly optimizes SAN performance for virtualized environments

Hitachi Unified Storage provides native thin provisioning and high-performance data management. This creates the option of using thick storage at the virtualization layer while the storage layer uses thin provisioning, without any additional overhead.


Hitachi Dynamic Provisioning


On Hitachi storage systems, Hitachi Dynamic Provisioning provides wide striping and thin provisioning functionalities. Using Dynamic Provisioning is like using a host-based logical volume manager (LVM), but without incurring host processing overhead. It provides one or more wide-striping pools across many RAID groups. Each pool has one or more dynamic provisioning virtual volumes (DP-VOLs) of a logical size you specify of up to 60 TB created against it without allocating any physical space initially.

Deploying Dynamic Provisioning avoids the routine issue of hot spots that occur on logical devices (LDEVs). These occur within individual RAID groups when the host workload exceeds the IOPS or throughput capacity of that RAID group. Dynamic Provisioning distributes the host workload across many RAID groups, which provides a smoothing effect that dramatically reduces hot spots.

When used with Hitachi Unified Storage, Hitachi Dynamic Provisioning has the benefit of thin provisioning. Physical space assignment from the pool to the dynamic provisioning volume happens as needed using 1 GB chunks, up to the logical size specified for each dynamic provisioning volume. There can be a dynamic expansion or reduction of pool capacity without disruption or downtime. You can rebalance an expanded pool across the current and newly added RAID groups for an even striping of the data and the workload.

Hitachi Compute Blade 2000


Hitachi Compute Blade 2000 is an enterprise-class blade server platform. It features the following:

- A balanced system architecture that eliminates bottlenecks in performance and throughput
- Configuration flexibility
- Eco-friendly power-saving capabilities
- Fast server failure recovery using an N+1 cold standby design that allows replacing failed servers within minutes


VMware vSphere 5
VMware vSphere 5 is a virtualization platform that provides a datacenter infrastructure. It features vSphere Distributed Resource Scheduler (DRS), high availability, and fault tolerance. VMware vSphere 5 has the following components:

- ESXi 5.0: This is a hypervisor that loads directly on a physical server. It partitions one physical machine into many virtual machines that share hardware resources.
- vCenter Server: This allows management of the vSphere environment through a single user interface. With vCenter, there are features available such as vMotion, Storage vMotion, Storage Distributed Resource Scheduler, High Availability, and Fault Tolerance.

VMware vStorage APIs for Array Integration (VAAI)


VMware vStorage APIs for Array Integration (VAAI), whose individual functions are known as primitives, allow vSphere environments to use advanced features of Hitachi Unified Storage. Using the vStorage APIs provides a way to use the advanced storage capabilities of Hitachi Unified Storage from within the VMware interface, with processing performed directly on the storage infrastructure. These performance enhancements move the I/O load from the dependent vCenter host platform into the storage controller. Instead of consuming host processing power, memory, and bandwidth, offloading these operations speeds processing, shifts potential resource bottlenecks away from the hosts, and frees virtualization management for more critical tasks. When used with vSphere 5.x, Hitachi Unified Storage supports the following API primitives:

- Full copy: This primitive enables the storage system to make full copies of data within the storage system without having the ESX host read and write the data.
- Block zeroing: This primitive enables storage systems to zero out a large number of blocks to speed provisioning of virtual machines.
- Hardware-assisted locking: This primitive provides an alternative means to protect the metadata for VMFS cluster file systems, thereby improving the scalability of large ESX host farms sharing a datastore.
- Thin provisioning stun: This primitive enables the storage system to notify the ESX host when thin provisioned volumes reach a certain capacity utilization threshold. When enabled, this allows the ESX host to take preventive measures to maintain virtual machine integrity.

A sketch for verifying device-level VAAI support follows this list.
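Whether a given LUN actually reports hardware acceleration support can be checked per device through the vSphere API. The following is a minimal illustrative sketch using the pyVmomi Python bindings with a placeholder vCenter address; it was not part of the lab procedure.

    # Sketch: report the VAAI (hardware acceleration) support state of each
    # SCSI device seen by each host. Hostname and credentials are placeholders.
    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    ctx = ssl._create_unverified_context()
    si = SmartConnect(host='vcenter.example.com', user='administrator',
                      pwd='password', sslContext=ctx)
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)

    for host in view.view:
        device_info = host.configManager.storageSystem.storageDeviceInfo
        for lun in device_info.scsiLun:
            # vStorageSupport reports vStorageSupported, vStorageUnsupported,
            # or vStorageUnknown for each device.
            print(host.name, lun.canonicalName,
                  getattr(lun, 'vStorageSupport', 'unknown'))

    view.Destroy()
    Disconnect(si)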

See Additional Information about VMware vStorage APIs for Array Integration on page 35 to find out more.


Test Environment
This describes the test environment used in the Hitachi Data Systems lab. Figure 1 shows the physical layout of the test environment.

Figure 1

Figure 2 shows the virtual machines used with the VMware vSphere 5 cluster hosted on Hitachi Compute Blade 2000.

Figure 2

Table 1 has Hitachi Compute Blade 2000 and storage components used in this lab validation report.
Table 1. Hitachi Compute Blade 2000 and Storage Configuration

Hitachi Compute Blade 2000 chassis (version A0195-C-6443):
- 8-blade chassis
- 4 1/10 Gb/sec network switch modules
- 2 management modules
- 8 cooling fan modules
- 4 power supply modules
- 8 8 Gb/sec Fibre Channel PCI-e HBA ports

Hitachi E55A2 server blade (version 03-57):
- 2 2-core Intel Xeon E5503 processors @ 2 GHz
- 144 GB RAM per blade
- 4 blades used in chassis

Hitachi Unified Storage 150 (version 0915/B-H):
- 32 GB cache
- 4 8 Gb/sec Fibre Channel ports
- Multiple configurations of 600 GB 10k SAS disks

Table 2 lists the VMware vSphere 5 software used in this lab validation report.
Table 2. VMware vSphere 5 Software

VMware vCenter Server: 5.0.0 Build 455964
VMware vSphere Client: 5.0.0 Build 455964
VMware ESXi: 5.0.0 Build 441354

All virtual machines used in this lab validation report used Microsoft Windows Server 2008 R2 Enterprise Edition 64-bit for the operating system.

Table 3 has the VMware vSphere 5 virtual disk formats tested in this lab validation report.
Table 3. VMware vSphere 5 Virtual Disk Formats

Thin: The virtual disk is allocated only the storage capacity required by the guest operating system. As write operations occur, additional space is allocated and zeroed. The virtual disk grows to the maximum allotted size.

Lazyzeroedthick: The virtual disk storage capacity is pre-allocated at creation. The virtual disk does not grow in size. However, the allocated space is not pre-zeroed. As the guest operating system writes to the virtual disk, the space is zeroed as needed.

Eagerzeroedthick: The virtual disk storage capacity is pre-allocated at creation. The virtual disk does not grow in size. The virtual disk is pre-zeroed, so as the guest operating system writes to the virtual disk, the space does not need to be zeroed.
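For reference, these three formats correspond to the thinProvisioned and eagerlyScrub flags on a virtual disk's backing in the vSphere API. The following pyVmomi sketch is illustrative only; the controller key, unit number, and capacity are placeholders, and this is not the provisioning method used in the testing.

    # Sketch: map the Table 3 disk formats to vSphere API disk backing flags.
    from pyVmomi import vim

    def disk_backing(fmt):
        backing = vim.vm.device.VirtualDisk.FlatVer2BackingInfo()
        backing.diskMode = 'persistent'
        if fmt == 'thin':
            backing.thinProvisioned = True
        elif fmt == 'lazyzeroedthick':
            backing.thinProvisioned = False
            backing.eagerlyScrub = False
        elif fmt == 'eagerzeroedthick':
            backing.thinProvisioned = False
            backing.eagerlyScrub = True       # zero the full capacity at creation
        return backing

    def add_disk_spec(fmt, capacity_gb=30, controller_key=1000, unit_number=1):
        disk = vim.vm.device.VirtualDisk()
        disk.backing = disk_backing(fmt)
        disk.capacityInKB = capacity_gb * 1024 * 1024
        disk.controllerKey = controller_key   # placeholder SCSI controller key
        disk.unitNumber = unit_number
        spec = vim.vm.device.VirtualDeviceSpec()
        spec.operation = vim.vm.device.VirtualDeviceSpec.Operation.add
        spec.fileOperation = vim.vm.device.VirtualDeviceSpec.FileOperation.create
        spec.device = disk
        return spec

    # Usage against an existing (hypothetical) vm object:
    # vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(
    #     deviceChange=[add_disk_spec('eagerzeroedthick')]))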


Test Methodology
Each test case gathered timing and performance metrics from multiple sources.

All Hitachi Unified Storage management functions were performed using Hitachi Storage Navigator Modular 2 software. All VMware vSphere 5 operations were performed using vCenter Server and the vSphere Client.

To synchronize results, captures were taken from the following locations:

- Hitachi Unified Storage: Hitachi Storage Navigator Modular 2, using both the graphical user interface and the command line interface
- VMware ESX hosts: ESXTOP gathered details from the virtual hosts
- vCenter Server: Connected via the vSphere Client
- Virtual machines: VDBench generated load and logs directly from the virtual machines

These are descriptions of the test cases:

- Cloning with Full Copy on page 11
- Block Zeroing Disk Utilization with Hitachi Dynamic Provisioning on page 12
- Block Zeroing Performance Warm up on page 14
- Block Zeroing with Zero Page Reclaim on page 15
- Large Scale Boot Storm with Hardware-assisted Locking on page 17
- Large Scale Simultaneous vMotion with Hardware-assisted Locking on page 18
- Thin Provisioning Stun on page 19


Cloning with Full Copy


The goal of this testing was to compare system performance when the cloning process of virtual machines is offloaded to Hitachi Unified Storage with the Full Copy primitive enabled against performance with the primitive disabled. The test procedure involved cloning the following from a source datastore (RAID-6-source) residing on a dynamic provisioning pool to a destination datastore on a separate dynamic provisioning pool (RAID-6-target):

- A 30 GB lazyzeroedthick VMDK
- A 30 GB thin VMDK
- A 30 GB eagerzeroedthick VMDK

Table 4 has the test cases.


Table 4. Cloning with Full Copy Test Cases

Case 1: Thin format 30 GB virtual disk
Case 2: Lazyzeroedthick format 30 GB virtual disk
Case 3: Eagerzeroedthick format 30 GB virtual disk

These tests used the following process for all three test cases (a cloning sketch follows these steps):

1. Create a dynamic provisioning pool with a single RAID-6 (6D+2P) group using 600 GB SAS 10k RPM disks to create a VMFS datastore called RAID-6-source (HDP-0).
2. Create a virtual machine running Microsoft Windows Server 2008 R2 with a thin provisioned virtual disk format on RAID-6-source (HDP-0).
3. Convert the Windows virtual machine to a template.
4. Create a second dynamic provisioning pool with a RAID-6 (6D+2P) group as a VMFS datastore called RAID-6-target (HDP-1) using 600 GB SAS 10k RPM disks.
5. For each test case, provision a virtual machine with the corresponding virtual disk format from the template on datastore RAID-6-source (HDP-0) to the target datastore RAID-6-target (HDP-1).
6. Provision all virtual machines once with the Full Copy primitive disabled and once with it enabled.
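As context for how such a clone is requested programmatically, the following pyVmomi sketch issues a clone from the template to the target datastore. It is an illustration only (the template, clone, and datastore names are placeholders); whether the array services the copy with the Full Copy primitive depends on the host-side VAAI setting.

    # Sketch: clone a VM from a template onto the RAID-6-target datastore.
    from pyVmomi import vim

    def clone_to_datastore(si, template_name, clone_name, datastore_name):
        content = si.RetrieveContent()

        def find(vimtype, name):
            view = content.viewManager.CreateContainerView(
                content.rootFolder, [vimtype], True)
            try:
                return next(obj for obj in view.view if obj.name == name)
            finally:
                view.Destroy()

        template = find(vim.VirtualMachine, template_name)
        datastore = find(vim.Datastore, datastore_name)

        relocate = vim.vm.RelocateSpec(datastore=datastore)
        spec = vim.vm.CloneSpec(location=relocate, powerOn=False, template=False)
        # The clone runs as a vCenter task; task.info records start and end times.
        return template.CloneVM_Task(folder=template.parent, name=clone_name,
                                     spec=spec)

    # Usage: clone_to_datastore(si, 'w2k8r2-template', 'clone-01', 'RAID-6-target')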

Figure 3 shows the storage layout.

Figure 3

Block Zeroing Disk Utilization with Hitachi Dynamic Provisioning


The goal of this testing was to demonstrate the effect of using the Block Zeroing primitive. The test procedure involved provisioning virtual disks configured on an LU on Hitachi Dynamic Provisioning pools with VAAI on and off within a fresh datastore and a dirty datastore.

A fresh datastore is a newly formatted datastore that has not been used. A dirty datastore is a datastore that has had VMDK files created and deleted from it without enabling the block zeroing primitive or Zero Page Reclaim options.

Figure 4 shows the location and setting of the block zeroing primitive within the advanced settings of the ESX Host.

Figure 4


Provision a Fresh Dynamically Provisioned Volume


Using a 30 GB eagerzeroedthick VMDK file created on a fresh dynamically provisioned volume, time and measure the actual disk space utilization on the storage system with VAAI disabled and enabled. The test starts with a fresh dynamically provisioned pool.

1. Set the desired value for the VAAI primitive.
2. Run ESXTOP to capture all data.
3. Provision a virtual machine with a 30 GB eagerzeroedthick VMDK.
4. Measure disk space utilization on the Hitachi Dynamic Provisioning pool.

Provision a 100% Dirty Dynamically Provisioned Volume


Using a 30 GB eagerzeroedthick VMDK file created on a dynamically provisioned volume that is filled with data, delete the VMDK. This creates a dirty dynamically provisioned volume where 100% of the pages within the volume remain allocated. Then use the dirty volume to create a new VMDK. Time and measure actual disk space utilization on the storage system with VAAI enabled on a dynamically provisioned volume. The test starts with a 100% dirty dynamically provisioned volume.

1. Set the VAAI primitive to enabled.
2. Provision a 30 GB eagerzeroedthick VMDK.
3. Measure disk space utilization on the dynamic provisioning pool.
4. Establish baseline storage utilization.
5. Ensure VAAI is enabled for Zero Page Reclaim.
6. Run Zero Page Reclaim against the LU.
7. Measure disk space utilization on the dynamic provisioning pool.


Block Zeroing Performance Warm up


The goal of this testing was to examine the use of the Block Zeroing primitive when creating a new VMDK disk on existing SAN or block presentations. Once the VMDK is created, the way it can be accessed depends on its format. There are performance inconsistencies until the disk has been completely zeroed. To warm up a disk means to completely zero it. This testing examined the warm-up effects of the various formats: thin, lazyzeroedthick, and eagerzeroedthick. Tests were conducted with the VAAI primitive for block zeroing enabled and disabled on Hitachi Dynamic Provisioning pools. The evaluation process created a VMDK file on a new (clean) dynamic provisioning pool and then applied a 100% write load of 1000 IOPS from a virtual machine using that VMDK file. Table 5 has the different test cases used for warm up testing.
Table 5. Warm Up Test Cases

Case 1: VAAI primitive enabled, 100 GB eagerzeroedthick
Case 2: VAAI primitive disabled, 100 GB eagerzeroedthick
Case 3: VAAI primitive enabled, 100 GB thin
Case 4: VAAI primitive disabled, 100 GB thin
Case 5: VAAI primitive enabled, 100 GB lazyzeroedthick
Case 6: VAAI primitive disabled, 100 GB lazyzeroedthick

The test starts with a fresh dynamically provisioned volume for each of the test cases.

1. Set the VAAI primitive.
2. Provision the VMDK.
3. Load the virtual machine with I/O from VDBench.
4. Establish a baseline for storage warm up.

Figure 5 shows the storage configuration used.

Figure 5

Block Zeroing with Zero Page Reclaim


The goal of this test was to demonstrate optimization of a dynamically provisioned volume by running Zero Page Reclaim and the Block Zeroing primitive. Zero Page Reclaim leverages the virtualization of Hitachi Dynamic Provisioning pools to move thick data around to enable the removal of any pages that are empty (or full of zeroes). This shrinks the actual allocation of storage to create thin storage allocations on the LU. This testing started with VAAI disabled on a lazyzeroedthick-provisioned virtual machine, a 100 GB disk created, and 8 GB of data consumed. Consumed space is measured on the LU from the storage system. Then the VMDK is converted to an eagerzeroedthick VMDK to verify that the space is zeroed out. After provisioning, Zero Page Reclaim was run on the LU to see the space consumption. Table 6 has these test cases.
Table 6. Zero Page Reclaim Test Cases

Case 1: VAAI on
Case 2: VAAI off

The testing involved the following steps.

1. Run Zero Page Reclaim against the LU.
2. Deploy a 100 GB lazyzeroedthick VMDK, then convert it to an eagerzeroedthick VMDK.
3. Measure storage consumption and allocation from the SAN and vCenter.
4. Run Zero Page Reclaim against the LU.
5. Measure the storage consumption.

Figure 6 details the storage used during these tests.

Figure 6


Large Scale Boot Storm with Hardware-assisted Locking


The goal of this test was to create a large scale boot storm by powering on 400 virtual machines concurrently, to show the performance difference when the Hardware-assisted Locking primitive integrated into Hitachi Unified Storage is enabled and when it is not. In this test procedure, 400 linked clone virtual machines evenly distributed across four ESX 5.0 hosts on a single shared datastore were powered on at the same time. This test used the following:

- A 24-spindle dynamic provisioning pool (HDP-24D)
- A smaller 8-spindle dynamic provisioning pool (HDP-8D) to increase the likelihood of SCSI locking conflicts

Testing was conducted as follows for both spindle configurations:

- Hardware-assisted locking enabled, using the VMFS 5 file system
- Hardware-assisted locking disabled, using the VMFS 3 file system

The following data was collected (a power-on sketch follows this list):

- ESXTOP captured the SCSI reservation conflicts per second from each host to determine the number of SCSI locking conflicts.
- The vSphere Client captured the elapsed time to complete the power-on operations.
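For context, a boot storm of this kind can be triggered through the vSphere API by submitting all of the power-on operations as one batch. The following pyVmomi sketch is illustrative only; the datacenter name and virtual machine naming prefix are placeholders, not the names used in the lab.

    # Sketch: power on all linked clones in one batch to create the boot storm.
    from pyVmomi import vim

    def boot_storm(si, datacenter_name, vm_prefix='clone-'):
        content = si.RetrieveContent()
        datacenter = next(entity for entity in content.rootFolder.childEntity
                          if isinstance(entity, vim.Datacenter)
                          and entity.name == datacenter_name)
        view = content.viewManager.CreateContainerView(
            datacenter, [vim.VirtualMachine], True)
        clones = [vm for vm in view.view
                  if vm.name.startswith(vm_prefix)
                  and vm.runtime.powerState == 'poweredOff']
        view.Destroy()
        # PowerOnMultiVM_Task submits the power-on requests as a single batch,
        # so the virtual machines start concurrently across the hosts.
        return datacenter.PowerOnMultiVM_Task(vm=clones)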

Figure 7 shows the storage layout.

Figure 7


Large Scale Simultaneous vMotion with Hardware-assisted Locking


The goal of this test was to simulate a rolling upgrade and capture the performance profile of Hitachi Unified Storage when using the hardware-assisted locking primitive.

In this test procedure, 100 virtual machines of varying types (linked clones, thin provisioned virtual machines, and virtual machines with snapshots) were evenly deployed across four ESX 5.0 hosts. To simulate a rolling upgrade or planned downtime, a single host was placed into maintenance mode. This forces VMware vMotion to move the virtual machines on that host to the remaining three hosts in the cluster. After all virtual machines were moved from the host, the host was brought back online. This operation was repeated on all four hosts with hardware-assisted locking enabled and disabled.

The vSphere Client collected the time required for vMotion to move all the virtual machines from each host. ESXTOP collected the conflicts per second from each host to determine the number of SCSI reservation locking conflicts. The vSphere Client also captured the elapsed time to complete the power-on operations.

Figure 8 shows the storage layout.

Figure 8
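For reference, the rolling evacuation can be driven through the vSphere API by cycling each host through maintenance mode. The sketch below uses pyVmomi and assumes a DRS-enabled cluster that evacuates running virtual machines automatically; the timeout value is a placeholder, and this is not the exact tooling used in the lab.

    # Sketch: cycle every host through maintenance mode to force vMotion
    # evacuations, mirroring the rolling-upgrade simulation.
    from pyVim.task import WaitForTask
    from pyVmomi import vim

    def rolling_maintenance(si, timeout_s=3600):
        content = si.RetrieveContent()
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vim.HostSystem], True)
        hosts = list(view.view)
        view.Destroy()
        for host in hosts:
            # With DRS in fully automated mode, entering maintenance mode
            # triggers vMotion of the running VMs to the remaining hosts.
            WaitForTask(host.EnterMaintenanceMode_Task(timeout=timeout_s))
            WaitForTask(host.ExitMaintenanceMode_Task(timeout=timeout_s))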


Thin Provisioning Stun


The goal of this test was to observe the behavior of the Thin Provisioning Stun primitive as utilization of the Hitachi Dynamic Provisioning pool capacity increases. In this test procedure, an over-provisioned Hitachi Dynamic Provisioning pool was created. The VDBench utility was installed inside a virtual machine to write to the virtual disk beyond the limits of the provisioned VMDK storage. The following settings were set on Hitachi Unified Storage:

- Host group option: Set DP Depletion Detail Reply Mode to Yes.
- Dynamic Provisioning Pool Consumed Capacity Alert: Set Early Alert to 50% and Depletion Alert to 70%.

The following was the procedure used in this test (an illustrative write-load sketch follows these steps).

1. Create storage with a 100 GB LUN.
2. Create the VMDK.
3. Attach the VMDK to the virtual machine.
4. Run VDBench to perform 100% writes to the disk.
5. Evaluate error messages.
6. Extend the LUN storage to allow writes.
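VDBench generated the write load in the actual test. Purely as an illustrative stand-in, a minimal guest-side script that keeps writing non-zero data to the thin provisioned disk until space runs out could look like the following; the drive path is a placeholder.

    # Illustrative stand-in for the VDBench 100% write load: keep writing
    # non-zero data to the disk until space is exhausted. Path is a placeholder.
    import os

    CHUNK = b'\xa5' * (1024 * 1024)     # 1 MB of non-zero data per write

    def fill_disk(path='E:\\filler.bin'):
        written_mb = 0
        with open(path, 'wb') as f:
            try:
                while True:
                    f.write(CHUNK)
                    f.flush()
                    os.fsync(f.fileno())    # push the writes down to the VMDK
                    written_mb += 1
            except OSError:
                # With Thin Provisioning Stun enabled, depletion of the DP pool
                # pauses the virtual machine instead of failing in-flight writes.
                print('Write load stopped after about', written_mb, 'MB')

    if __name__ == '__main__':
        fill_disk()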


Analysis
This is the analysis of the test results.

Cloning with Full Copy


Testing for cloning with full copy showed the following:

- Use of the full copy primitive improves efficiency by decreasing the deployment time required to provision virtual machines. Significant improvements can be seen with a thin format VMDK.
- Full copy commands have significantly less I/O impact on the ESX host. This frees up HBA resources, which the host can use for other operations.
- Use of the full copy primitive reduced the number of IOPS consumed by the host HBA.
- The full copy primitive reduces host-side I/O during common tasks, such as the following:

- Moving virtual machines with Storage vMotion
- Deploying a new virtual machine from a template by instructing the storage system to copy data within the storage system, rather than sending the traffic back and forth to the ESX hosts

Block Zeroing Disk Utilization with Hitachi Dynamic Provisioning


Testing for block zeroing disk showed the following:

- Provisioning with VAAI on is faster in all cases where the Hitachi Dynamic Provisioning pool does not contain dirty pages.
- Using the Zero Page Reclaim utility significantly optimizes provisioning by disassociating unused pages from the pool, so the reclaimed storage can be allocated again quickly and flexibly.
- A VMDK formatted as eagerzeroedthick performs without any performance penalty from either ESX or Hitachi Unified Storage as soon as it is provisioned.
- If the goal is to reduce the amount of writes to a storage device without modifications to block allocation, lazyzeroedthick provides the best immediate option because zeroing is done at initial write time.


- The block zeroing primitive speeds virtual machine deployment by offloading the repetitive zeroing of large numbers of blocks to the storage system, which frees ESX host resources for other tasks.
- Hardware-accelerated thin provisioning is achieved when using VAAI with Hitachi Dynamic Provisioning, as shown in Table 7.

Table 7. Thin Provisioned and Primitive Conditions on Hitachi Dynamic Provisioning

VAAI on: Thin VMDK is thin provisioned; lazyzeroedthick VMDK is thin provisioned; eagerzeroedthick VMDK is thin provisioned.
VAAI off: Thin VMDK is thin provisioned; lazyzeroedthick VMDK is thin provisioned; eagerzeroedthick VMDK is thick provisioned.

Large Scale Boot Storm with Hardware-assisted Locking


The results of this test show that a datastore with a reduced number of spindles is prone to increased locking. This increased locking can harm performance and limit the number of virtual machines that can run on a single datastore. The use of the hardware-assisted locking primitive greatly improves the scalability of vSphere by allowing more virtual machines per datastore to run concurrently. Running more virtual machines concurrently in a datastore allows consideration of larger VMFS volumes. Enabling the hardware-assisted locking primitive greatly reduces the likelihood of SCSI reservation locking conflicts during everyday tasks, such as Storage vMotion, creating or deleting VMDK files, or powering virtual machines on or off.

Large Scale Simultaneous vMotion with Hardware-assisted Locking


Hardware-assisted locking reduces the time required for large-scale vMotion operations by off-loading VMFS metadata protection to Hitachi Unified Storage. With shorter times for vMotion, maintenance windows can be shorter.


Thin Provisioning Stun


Testing with Thin Provisioning Stun showed the following:

- The thin provisioning stun primitive enables resumption of activity even if storage is full.
- Enabling Thin Provisioning Stun is recommended in all cases where data integrity is critical, to avoid potential data loss.
- Setting the storage depletion alert appropriately gives you the flexibility to recover from large scale storage growth issues.
- Configure early alerting to ensure that potential issues are captured as early as possible.


Complete Test Results


These are the results from validation testing.

Cloning with Full Copy


Figure 9 shows the time it took to complete the cloning with VAAI enabled and disabled.

Figure 9

- With VAAI disabled, the 30 GB lazyzeroedthick virtual machine cloning task finished in 42 seconds. With VAAI enabled, the task finished in 36 seconds, a 14.29 percent decrease.
- With VAAI disabled, the 30 GB eagerzeroedthick virtual machine cloning task finished in 94 seconds. With VAAI enabled, the task finished in 94 seconds, a 0 percent decrease.
- With VAAI disabled, the 30 GB thin virtual machine cloning task finished in 90 seconds. With VAAI enabled, the task finished in 65 seconds, a 27.78 percent decrease.
- Lazyzeroedthick is the fastest to provision.

Figure 10 shows the number of IOPS consumed during the full copy tests, as captured on the ESX host HBA.

Figure 10

Figure 10 shows that the full copy primitive reduced the number of IOPS consumed by the host HBA by more than 90 percent. With VAAI enabled, nearly all of the IOPS from the cloning process were offloaded from the ESX hosts to the storage system. With VAAI disabled, testing showed a prolonged spike in the number of IOPS on the ESX host.

Figure 11 shows the total IOPS consumed amongst all the tests.

Figure 11

Figure 12 shows that the cloning process consumed 92 percent less I/O with the full copy primitive enabled.

Figure 12


Block Zeroing Disk Utilization with Hitachi Dynamic Provisioning


These are the results of the block zeroing disk utilization testing.

Provision a Fresh Dynamically Provisioned Volume


Because the storage is provisioned on the LUN by the controller, and eagerzeroedthick is selected on the ESX host, the only way the storage system can provision this space as thin is if the host uses VAAI to inform Hitachi Unified Storage that the VMDK itself is empty. Figure 13 and Figure 14 show this.

Figure 13

Figure 14


Provision a 100% Dirty Dynamically Provisioned Volume


This non-optimized dirty storage performs as if it were thick provisioned, regardless of VAAI settings, and extends the provisioning time by 10 seconds. Figure 15 shows the existing eagerzeroedthick VMDK file on a dynamic provisioning volume before and after running a Zero Page Reclaim operation.

Figure 15

Block Zeroing Performance Warm up


The tests show there is a significant difference between storage provisioning with VAAI on and off. The most notable result was that eagerzeroedthick provisioning, with VAAI on or off, performed the best due to its immediate availability and lack of appreciable latency, as seen in Figure 16.

Figure 16

Figure 16 shows that, if eagerzeroedthick is used in conjunction with VAAI on and Hitachi Unified Storage, the following is true:

- Automatic thin provisioning from the SAN is kept to a minimum
- IOPS are 100% immediately available with no loss in performance

Both thin and lazyzeroedthick virtual disks have warm-up overhead at initial write, which creates the overhead and latency seen in Figure 17 and Figure 18. Thin virtual disks have higher warm-up overhead compared to lazyzeroedthick virtual disks. However, thin virtual disks are more than 2.5 times faster to complete zeroing of the blocks when compared to lazyzeroedthick virtual disks. The zeroing of blocks is off-loaded to Hitachi Unified Storage through the integration of the VAAI block zeroing primitive. Once the blocks for a virtual disk are zeroed, there is no performance penalty. Over time, thin, lazyzeroedthick, and eagerzeroedthick virtual disks have equal performance characteristics. With VAAI on or off, thin came in second: its IOPS stabilized in less than 23 minutes, and its latency was less than 1 millisecond within 9 minutes of testing, as seen in Figure 17.

Figure 17

Due to the overhead and latency created by the double write, lazyzeroedthick warm up testing had to go on for a much longer duration than any of the other formatting options. Figure 18 and Figure 19 show this. Compare this to the results in Figure 16 and Figure 17.

Figure 18

Figure 19

Block Zeroing with Zero Page Reclaim


After the zero page reclaim operation ran, 92 GB of the dynamic provisioning pool space was freed. As a result, after the zero page reclaim operation, the virtual machine was still allocated the original 100 GB but only 8 GB of space was being consumed on the storage system by the virtual machine.

Provisioning time was captured using the vSphere client. Figure 20 shows that with VAAI enabled, the time required to provision a 30 GB eagerzeroedthick VMDK file is reduced by 85 percent.

Figure 20

Large Scale Boot Storm with Hardware-assisted Locking


Figure 21 shows that, with VAAI enabled, power-on time decreased by 3.8 percent when using the datastore on the 24-spindle dynamic provisioning pool, and by 10 percent when using the datastore on the 8-spindle dynamic provisioning pool.

Figure 21

Figure 22 shows that, with VAAI enabled, the number of SCSI locking conflicts decreased by 100 percent for the datastore on the 24-spindle dynamic provisioning pool, and by 99.27 percent for the datastore on the 8-spindle dynamic provisioning pool.

Figure 22


Large Scale Simultaneous vMotion with Hardware-assisted Locking


Figure 23 shows that the time required to move the virtual machines from each host was reduced by up to 34 percent with VAAI enabled.

Figure 23

The use of the hardware-assisted locking primitive reduces migration times and minimizes maintenance cycles by accelerating the vMotion process. In turn, this limits downtime and expedites IT operations.

Thin Provisioning Stun


With VAAI on and storage becoming depleted on the SAN, the alerts seen in Figure 24 can be sent to the administrator using email or console logging.

Figure 24

When conducting the thin provisioning stun API primitive testing, an over-provisioned Hitachi Dynamic Provisioning pool filled to capacity first sends an alert to the administrative console. Then, when the consumed capacity exceeds the alert threshold, the virtual machine displays the dialog box in Figure 25 suggesting that you free disk space and retry.

Figure 25

After the space condition is rectified, clicking the Retry option resumes all I/O writes and reads within the given virtual machine.


Additional Information about VMware vStorage APIs for Array Integration


This is additional information about the VMware vStorage APIs.

Full Copy
Hitachi Unified Storage uses VAAI to offload the cloning process of a virtual machine (or thousands of virtual machines in virtual desktop environments) from the ESX host when provisioning virtual machines or migrating VMDK files between datastores within a storage system using Storage vMotion. The following operations are some examples of when the full copy primitive is used:

- Virtual machine provisioning: The source and destination locations are within the same volume. Hitachi integrates with the full copy API to clone virtual machines or datastores from a golden image. This process dramatically reduces I/O between the ESX nodes and Hitachi storage.
- Storage vMotion: The source and destination locations are on different volumes within the same storage system. This feature enables VMDK files to be relocated between datastores within a storage system. Virtual machines can be migrated to facilitate load-balancing or planned maintenance while maintaining availability. By integrating with full copy, host I/O offload for Storage vMotion operations accelerates virtual machine migration times considerably (a Storage vMotion sketch follows this list).
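For reference, a Storage vMotion of this kind can be requested through the vSphere API by relocating the virtual machine's disks to another datastore. The following pyVmomi sketch is illustrative; the virtual machine and datastore names are placeholders, and whether the array handles the data movement depends on the Full Copy primitive being enabled.

    # Sketch: request a Storage vMotion (relocate the VM's disks to another
    # datastore). Names are placeholders.
    from pyVmomi import vim

    def storage_vmotion(si, vm_name, target_datastore_name):
        content = si.RetrieveContent()
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vim.VirtualMachine, vim.Datastore], True)
        vm = next(obj for obj in view.view
                  if isinstance(obj, vim.VirtualMachine) and obj.name == vm_name)
        datastore = next(obj for obj in view.view
                         if isinstance(obj, vim.Datastore)
                         and obj.name == target_datastore_name)
        view.Destroy()
        # Only the datastore changes, so this is a storage-only migration.
        spec = vim.vm.RelocateSpec(datastore=datastore)
        return vm.RelocateVM_Task(spec=spec)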

Figure 26 shows a comparison of copy functions with and without VAAI.

Figure 26

The full copy primitive enables an ESX host to initiate the copying of a VMDK between a source datastore location and a destination datastore location within the storage system rather than between the host and the storage system. This reduces the number of read and write operations from the ESX host, and reduces host-side I/O bandwidth when copying virtual machines.

Block Zeroing
A common operation when creating new virtual machines or virtual disks is to allocate space when creating the virtual machine or virtual disk. When using the lazyzeroedthick format, the virtual disk size is pre-allocated but not all of the space is pre-zeroed. As the guest operating system writes to the virtual disk, the space is zeroed as needed. When using the eagerzeroedthick format, the virtual disk size is pre-allocated and the space is pre-zeroed. This means that it can take much longer to provision eagerzeroedthick virtual disks. With the block zeroing primitive, these zeroing operations take place within the storage system without the host having to issue multiple commands. Figure 27 shows a comparison of block zeroing with and without VAAI.

Figure 27

Integration with the block zeroing primitive allows the quick provisioning of eagerzeroedthick or lazyzeroedthick VMDKs by writing zeros across hundreds or thousands of blocks on the VMFS datastores. This off-loads much of the process from the ESX hosts to the storage system. This is particularly useful when provisioning eagerzeroedthick VMDKs for VMware fault tolerant virtual machines due to the large number of blocks that need to be zeroed.


Note: Block zeroing defaults to 1 MB blocks on vSphere 5 with VMFS version 5. Any touched blocks on the LUN are written in 1 MB increments.

Hardware-assisted Locking
Integration with the hardware-assisted locking primitive provides more granular LUN locking. It allows a logical block address on a virtual disk to be modified atomically without the use of SCSI reservations or the need to lock the LUN from other hosts. Figure 28 shows a comparison of hardware-assisted locking with and without VAAI.

Figure 28

ESX 5.0 environments rely on locking mechanisms to protect VMFS metadata, particularly in clustered environments where multiple ESX hosts access the same LUN. ESX uses SCSI reservations to prevent hosts from activating or sharing virtual disk content on multiple hosts at the same time. However, these SCSI locking algorithms lock an entire LUN and do not provide the granularity to lock a particular block on the LUN. This algorithm requires four separate commands (simplified as reserve, read, write and release) to acquire a lock. Locking the entire LUN also can introduce SCSI reservation conflicts and affect scalability. Transferring the LUN locking process to Hitachi Unified Storage reduces the number of commands required to access a lock and allows more granular locking. This leads to better overall performance and increases the number of virtual machines per datastore and the number of hosts accessing the datastore.

With the release of ESXi 5.0, the hardware-assisted locking primitive cannot be disabled for newly created VMFS 5 file systems on Hitachi Unified Storage. Hardware-assisted locking can be disabled with VMFS 3 file systems on Hitachi Unified Storage. For more information, see the vStorage APIs for Array Integration FAQ from VMware. Following are example use cases for hardware-assisted locking (a VMFS version check sketch follows this list):

- Migrating a virtual machine with vMotion
- Creating a new virtual machine or template
- Deploying a virtual machine from a template
- Powering a virtual machine on or off
- Creating, deleting, or growing a file
- Creating, deleting, or growing a snapshot
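Because the ability to disable hardware-assisted locking depends on the VMFS version, it can be useful to confirm which version each datastore uses. The following pyVmomi sketch lists VMFS datastores with their major version; it is an illustrative aid, not part of the lab procedure.

    # Sketch: list each VMFS datastore with its VMFS major version.
    from pyVmomi import vim

    def list_vmfs_versions(si):
        content = si.RetrieveContent()
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vim.Datastore], True)
        for datastore in view.view:
            info = datastore.info
            if isinstance(info, vim.host.VmfsDatastoreInfo):
                # Hardware-assisted locking is always on for VMFS 5 and can
                # only be disabled on VMFS 3.
                print(datastore.name, 'VMFS', info.vmfs.majorVersion)
        view.Destroy()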

Thin Provisioning Stun


Thin provisioned volumes allow on-demand storage provisioning, reducing the capacity requirements over the life of the storage system. However, storage systems have a finite number of physical storage resources supporting a thin provisioned volume. This requires monitoring and early detection of depletion of these physical storage resources. Hitachi Dynamic Provisioning has a built-in mechanism to alert storage administrators at user specified thresholds. Integration with the thin provisioning stun API primitive pushes the alerts from Hitachi Dynamic Provisioning to the ESX host.

When reaching a Hitachi Dynamic Provisioning depletion alert threshold, ESX shows a space utilization warning message. When Hitachi Dynamic Provisioning is out of capacity, ESX halts affected virtual machines and captures an event log.

The combined effect alerts the storage administrator and the vSphere administrator so that they can maintain virtual machine integrity.
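A complementary host-side check is to monitor datastore utilization against the same 50% and 70% thresholds used for the Hitachi Dynamic Provisioning pool alerts in this testing. The following pyVmomi sketch reports on VMFS datastore capacity only (it does not read the DP pool itself) and is an illustrative aid rather than part of the validated procedure.

    # Sketch: report datastore utilization against early (50%) and depletion
    # (70%) thresholds, mirroring the pool alert levels used in this testing.
    from pyVmomi import vim

    EARLY_ALERT = 0.50
    DEPLETION_ALERT = 0.70

    def check_datastores(si):
        content = si.RetrieveContent()
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vim.Datastore], True)
        for datastore in view.view:
            summary = datastore.summary
            used = 1.0 - float(summary.freeSpace) / float(summary.capacity)
            if used >= DEPLETION_ALERT:
                print('Depletion alert:', datastore.name,
                      '%.0f%% used' % (used * 100))
            elif used >= EARLY_ALERT:
                print('Early alert:', datastore.name,
                      '%.0f%% used' % (used * 100))
        view.Destroy()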

For More Information


Hitachi Data Systems Global Services offers experienced storage consultants, proven methodologies and a comprehensive services portfolio to assist you in implementing Hitachi products and solutions in your environment. For more information, see the Hitachi Data Systems Global Services website.

Live and recorded product demonstrations are available for many Hitachi products. To schedule a live demonstration, contact a sales representative. To view a recorded demonstration, see the Hitachi Data Systems Corporate Resources website. Click the Product Demos tab for a list of available recorded demonstrations.

Hitachi Data Systems Academy provides best-in-class training on Hitachi products, technology, solutions and certifications. Hitachi Data Systems Academy delivers on-demand web-based training (WBT), classroom-based instructor-led training (ILT) and virtual instructor-led training (vILT) courses. For more information, see the Hitachi Data Systems Services Education website.

For more information about Hitachi products and services, contact your sales representative or channel partner or visit the Hitachi Data Systems website.

Corporate Headquarters 750 Central Expressway, Santa Clara, California 95050-2627 USA www.HDS.com Regional Contact Information Americas: +1 408 970 1000 or info@HDS.com Europe, Middle East and Africa: +44 (0) 1753 618000 or info.emea@HDS.com Asia-Pacific: +852 3189 7900 or hds.marketing.apac@HDS.com
Hitachi is a registered trademark of Hitachi, Ltd., in the United States and other countries. Hitachi Data Systems is a registered trademark and service mark of Hitachi, Ltd., in the United States and other countries. All other trademarks, service marks, and company names in this document or website are properties of their respective owners.

Notice: This document is for informational purposes only, and does not set forth any warranty, expressed or implied, concerning any equipment or service offered or to be offered by Hitachi Data Systems Corporation. © Hitachi Data Systems Corporation 2012. All Rights Reserved. AS-159-00, August 2012
