This document has been archived and will no longer be maintained or updated. For more
information go to the Storage Solutions Technical Documents page on Dell TechCenter
or contact support.
Compellent Technologies
7625 Smetana Lane
Eden Prairie, Minnesota 55344
www.compellent.com
Compellent Storage Center Compellent Best Practices with VMware ESX 4.x
Contents
Disclaimers
    General Syntax
    Conventions
    Where to Get Help
        Customer Support
    Document Revision
Overview
    Prerequisites
    Intended audience
    Introduction
Fibre Channel Switch Zoning
    Single Initiator Multiple Target Zoning
    Port Zoning
    WWN Zoning
    Virtual Ports
Host Bus Adapter Settings
    QLogic Fibre Channel Card BIOS Settings
    Emulex Fibre Channel Card BIOS Settings
    QLogic iSCSI HBAs
Modifying Queue Depth in an ESX Environment
    Overview
    Host Bus Adapter Queue Depth
    Modifying ESX Storage Driver Queue Depth and Timeouts
    Modifying the VMFS Queue Depth for Virtual Machines
    Modifying the Guest OS Queue Depth
Setting Operating System Disk Timeouts
Guest Virtual SCSI Adapters
Mapping Volumes to an ESX Server
    Basic Volume Mapping Concepts
        Basic Volume Mapping in Storage Center 4.x and earlier
        Basic Volume Mappings in Storage Center 5.x and later
    Multi-Pathed Volume Concepts
        Multi-Pathed Volumes in Storage Center 4.x and earlier
        Multi-Pathed Volumes in Storage Center 5.x and later
    Configuring the VMware iSCSI software initiator for a single path
    Configuring the VMware iSCSI software initiator for multipathing
    VMware Multi-Pathing Policies
        Fixed Policy
        Round Robin
        Most Recently Used (MRU)
    Multi-Pathing using a Fixed path selection policy
    Multi-Pathing using a Round Robin path selection policy
    Asymmetric Logical Unit Access (ALUA)
    Additional Multi-pathing resources
Boot from SAN
    Configuring boot from SAN
Disclaimers
General Syntax
Table 1: Document syntax

Item                                                Convention
------------------------------------------------    ----------------
Menu items, dialog box titles, field names, keys    Bold
Mouse click required                                Click:
User Input                                          Monospace Font
Conventions
Note
Notes are used to convey special information or instructions.
Timesaver
Timesavers are tips specifically designed to save time or reduce the number of steps.
Caution
Caution indicates the potential for risk including system or data damage.
Warning
Warning indicates that failure to follow directions could result in bodily harm.
Customer Support
support@compellent.com
Document Revision
Overview
Prerequisites
This document assumes the reader has had formal training or has advanced working
knowledge of the following:
• Installation and configuration of VMware vSphere 4.x
• Configuration and operation of the Compellent Storage Center
• Operating systems such as Windows or Linux
Intended audience
This document is highly technical and intended for storage and server administrators,
as well as other information technology professionals interested in learning more
about how VMware ESX 4.0 integrates with the Compellent Storage Center.
Introduction
This document will provide configuration examples, tips, recommended settings, and
other storage guidelines a user can follow while integrating VMware ESX Server with
the Compellent Storage Center. This document has been written to answer many
frequently asked questions with regard to how VMware interacts with the Compellent
Storage Center's various features such as Dynamic Capacity, Data Progression, and
Remote Instant Replay.
Compellent advises customers to read the Fibre Channel or iSCSI SAN configuration
guides, which are publicly available on the VMware ESX documentation pages and
provide additional important information about configuring your ESX servers to use
the SAN.
Please note that the information contained within this document is intended
only to be general recommendations and may not be applicable to all
configurations. There are certain circumstances and environments where the
configuration may vary based upon your individual or business needs.
Fibre Channel Switch Zoning
Zoning your Fibre Channel switch for an ESX server is done much the same way as
you would for any other server connected to the Compellent Storage Center. Here
are the fundamental points:
Port Zoning
If the Storage Center front-end ports are plugged into switch ports 0, 1, 2, & 3, and
the first ESX HBA port is plugged into switch port 10, the resulting zone should
contain switch ports 0, 1, 2, 3, & 10.
Repeat this for each of the HBAs in the ESX server. If you have disjoint fabrics, the
second HBA port in the host should have its zone created in the second fabric.
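As an illustration only, on a Brocade fabric the port zone from the example above
might be created as shown below; the zone name, configuration name, and domain ID
of 1 are assumptions, and the syntax differs on other switch vendors.

    zonecreate "ESX1_HBA1", "1,0; 1,1; 1,2; 1,3; 1,10"
    cfgadd "Fabric1_cfg", "ESX1_HBA1"
    cfgsave
    cfgenable "Fabric1_cfg"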
WWN Zoning
When zoning by WWN, the zone only needs to contain the host HBA port and the
Storage Center front-end “primary” ports. In most cases, it is not necessary to
include the Storage Center front-end “reserve” ports because they are not used for
volume mappings. For example, if the host has two HBAs connected to two disjoint
fabrics, the Fibre Channel zones would look something like this:
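As a hedged illustration (the actual WWNs depend on your environment), each fabric
would contain one zone per host HBA port:

    Fabric 1 zone: ESX1 HBA1 WWN, plus the primary front-end WWNs of both Storage Center controllers on fabric 1
    Fabric 2 zone: ESX1 HBA2 WWN, plus the primary front-end WWNs of both Storage Center controllers on fabric 2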
Virtual Ports
If the Storage Center is configured to use Fibre Channel virtual ports, all of the
front-end ports within each fault domain should be included in the zone with the
appropriate ESX HBA.
Host Bus Adapter Settings
Make sure that you configure the HBA BIOS settings in your ESX server according to
the latest "Storage Center System Manager User Guide" found on Knowledge
Center. At the time of this writing, here are the current Compellent
recommendations:
Modifying Queue Depth in an ESX Environment
Overview
Queue depth is defined as the number of disk transactions that are allowed to be "in
flight" between an initiator and a target, where the initiator is typically a server port
and the target is typically the Storage Center front-end port.
Since any given target can have multiple initiators sending it data, the initiator queue
depth is generally used to throttle the number of transactions being sent to a target to
keep it from becoming “flooded”. When this happens, the transactions start to pile up
causing higher latencies and degraded performance. That being said, while
increasing the queue depth can sometimes increase performance, if it is set too high,
you run an increased risk of overdriving the SAN.
As data travels between the application and the storage array, there are several
places that the queue depth can be set to throttle the number of concurrent disk
transactions.
The following sections explain how the queue depth is set in each of the layers in the
event you need to change it.
Caution
The appropriate queue depth for a server may vary due to a number of factors, so it
is recommended that you increase or decrease the queue depth only if necessary.
See Appendix A for more info on determining the proper queue depth.
Modifying ESX Storage Driver Queue Depth and Timeouts
In addition to setting the queue depth in the driver module, the disk timeouts must
also be set within the same command. These timeouts need to be set in order for the
ESX host to properly survive a Storage Center controller failover.
Please refer to the latest documentation on VMware's web site for instructions on
how to configure these settings.
For each of these adapters, the method to set the driver queue depth and timeouts
uses the following general steps:
1) First, find the appropriate driver name for the module that is loaded:
   a. vmkload_mod -l | grep "qla\|lpf"
      i. Depending on the HBA model, it could be similar to:
         1. QLogic: qla2xxx
         2. Emulex: lpfcdd_7xx
2) Next, set the driver queue depth and timeouts using the esxcfg-module command:
   a. esxcfg-module -s "param=value param2=value..." <driver_name>
   b. Where:
      i. QLogic parameters: "ql2xmaxqdepth=255 ql2xloginretrycount=60 qlport_down_retry=60"
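Putting these steps together, a minimal sketch for a QLogic HBA might look like the
following, assuming step 1 showed the qla2xxx module loaded; a host reboot is
required for the new settings to take effect.

    # 1. Verify which HBA driver module is loaded
    vmkload_mod -l | grep "qla\|lpf"
    # 2. Set the queue depth and timeout parameters for the QLogic driver
    esxcfg-module -s "ql2xmaxqdepth=255 ql2xloginretrycount=60 qlport_down_retry=60" qla2xxx
    # 3. Confirm the options were stored, then reboot the host
    esxcfg-module -g qla2xxx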
Modifying the VMFS Queue Depth for Virtual Machines
Disk.SchedNumReqOutstanding (Default=32)
This value can be increased or decreased depending on how many virtual machines
are to be placed on each datastore. Keep in mind, this queue depth limit is only
enforced when more than one virtual machine is active on that datastore.
For example, if left at the default, the first virtual machine active on a datastore will
have its queue depth limited only by the queue depth of the storage adapter. When a
second, third, or fourth virtual machine is added to the datastore, the limit will be
enforced at the maximum queue depth of 32, or at whatever value the
Disk.SchedNumReqOutstanding variable is set to.
We recommend keeping this variable set at the default value of 32 unless your virtual
machines have higher than normal performance requirements. (See Appendix A for
more information about determining the appropriate queue depth.)
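If you do decide to change it, the value can be viewed and set from the ESX service
console with esxcfg-advcfg (it is also exposed in the vSphere Client under Advanced
Settings > Disk); the value of 64 below is only an example.

    # view the current value
    esxcfg-advcfg -g /Disk/SchedNumReqOutstanding
    # set a new value (example only; see Appendix A before changing it)
    esxcfg-advcfg -s 64 /Disk/SchedNumReqOutstanding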
Modifying the Guest OS Queue Depth
The method to adjust the queue depth varies between operating systems, but here
are two examples.
The default LSI Logic driver (SYMMPI) is an older LSI driver that must be updated to
get the queue depth higher than 32.
1) First, download the following driver from the LSI Logic download page:
a. Adapter: LSI20320-R
b. Driver: Windows Server 2003 (32-bit)
c. Version: WHQL 1.20.18 (Dated: 13-JUN-05)
d. Filename: LSI_U320_W2003_IT_MID1011438.zip
2) Update the current “LSI Logic PCI-X Ultra320 SCSI HBA” driver to the newer
WHQL driver version 1.20.18.
3) Using regedit, add the following keys: (Back up your registry first)
[HKLM\SYSTEM\CurrentControlSet\Services\symmpi\Parameters\Device]
"DriverParameter"="MaximumTargetQueueDepth=128;" (semicolon required)
"MaximumTargetQueueDepth"=dword:00000080 (80 hex = 128 decimal)
Since the default LSI Logic driver is already at an acceptable version, all you need to
do is add the following registry keys.
1) Using regedit, add the following keys: (Back up your registry first)
[HKLM\SYSTEM\CurrentControlSet\Services\LSI_SCSI\Parameters\Device]
"DriverParameter"="MaximumTargetQueueDepth=128;" (semicolon required)
"MaximumTargetQueueDepth"=dword:00000080 (80 hex = 128 decimal)
[HKLM\SYSTEM\CurrentControlSet\Services\LSI_SAS\Parameters\Device]
"DriverParameter"="MaximumTargetQueueDepth=128;" (semicolon required)
"MaximumTargetQueueDepth"=dword:00000080 (80 hex = 128 decimal)
Note
Please visit VMware's Knowledge Base for the most current information about setting
the queue depth with different vSCSI controllers or operating systems.
Setting Operating System Disk Timeouts
For each operating system running within a virtual machine, the disk timeouts must
also be set so the operating system can handle storage controller failovers properly.
Examples of how to set the operating system timeouts can be found in the following
VMware documents:
Here are the general steps for setting the disk timeout within Windows and Linux:
Windows
1) Using the registry editor, modify the following key: (Back up your registry first)
[HKLM\SYSTEM\CurrentControlSet\Services\Disk]
"TimeOutValue"=dword:0000003c (3c hex = 60 seconds in decimal)
Linux
For more information about setting disk timeouts in Linux, please refer to VMware's
Knowledge Base.
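As a minimal sketch, on most distributions the timeout for a SCSI disk can be set
through sysfs; /dev/sda is only an example device, the exact method varies by
distribution, and the change is not persistent across reboots.

    # set the disk timeout to 60 seconds for /dev/sda
    echo 60 > /sys/block/sda/device/timeout
    # verify the new value
    cat /sys/block/sda/device/timeout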
Guest Virtual SCSI Adapters
When creating a new virtual machine, there are four types of virtual SCSI controllers
you can select from, depending on the guest operating system selection.
BusLogic Parallel
This vSCSI controller is used for certain older operating systems. Due to this
controller's queue depth limitations, it is not recommended unless it is the only option
available for your operating system. This is because certain versions of Windows
issue only enough I/O to fill a queue depth of one when using this controller.
VMware Paravirtual
This vSCSI controller is a high-performance adapter that can result in greater
throughput and lower CPU utilization. Due to feature limitations when using this
adapter, we recommend against using it unless the virtual machine has very specific
performance needs. More information about the limitations of this adapter can be
found in the “vSphere Basic System Administration” guide, in a section titled, “About
Paravirtualized SCSI Adapters”.
Mapping Volumes to an ESX Server
Basic Volume Mapping Concepts
When sharing VMFS datastores between multiple ESX hosts, each volume must be
mapped to every host using the same LUN number.
For example:
You have three ESX servers named ESX1, ESX2, and ESX3.
You create a new volume named "LUN10-vm-storage".
This volume must be mapped to each of the ESX servers as the same LUN:
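For instance, the resulting mappings might look like this:

    Volume: "LUN10-vm-storage" Mapped to ESX1 -as- LUN 10
    Volume: "LUN10-vm-storage" Mapped to ESX2 -as- LUN 10
    Volume: "LUN10-vm-storage" Mapped to ESX3 -as- LUN 10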
As an added benefit, when a new ESX host is placed into the server cluster, all of the
existing volume mappings assigned to the server cluster will be applied to the new
host. This means that if the cluster has 100 volumes mapped to it, presenting all of
them to a newly created ESX host is as simple as adding it to the cluster object.
Similarly, if you remove a host from the server cluster, the cluster mappings will also
be removed, so it is important that those volumes are not being used by the host
when you remove it.
Only volumes that are mapped to an individual host, such as the boot volume, will
remain once a host is removed from the server cluster.
Also, in Storage Center versions 5.x and higher, you can let the system automatically
select the LUN number, or you can manually specify a preferred LUN number from the
advanced settings screen in the mapping wizard.
This advanced option allows administrators who already have a LUN numbering
scheme to continue using it; if a LUN is not manually specified, the system will
automatically select a LUN for each volume, incrementally starting at LUN 1.
Timesaver
When naming volumes from within the Compellent GUI, it may be helpful to specify
the LUN number as part of the volume name. This will help you quickly identify which
volume is mapped using each LUN.
Note
If the LUN number does not remain consistent between multiple hosts or multiple
HBAs, VMFS datastores may not be visible to all nodes, preventing use of VMotion,
HA, DRS, or FT.
Multi-Pathed Volume Concepts
Keep in mind that when a volume uses multiple paths, the first ESX initiator in each
server will need to be mapped to one front-end port, while the second ESX initiator
will be mapped to the other front-end port in that same controller. For example:
This means that when configuring multi-pathing in ESX, you cannot map a single
volume to both controllers at the same time, because a volume can only be active on
one controller at a time.
Note
Before beginning, with certain versions of Storage Center, you may need to enable
multi-pathing from within the Storage Center GUI. From within the system properties,
under the "mapping" section, check the box labeled "Allow volumes to be mapped to
multiple fault domains", then click OK.
Below is an example of how two volumes mapped from two separate controllers to a
single ESX host should look when finished.
Multipathing to an ESX host is automatic if the server object has multiple HBAs or
iSCSI initiator ports assigned to it. In other words, you will have to use the advanced
options if you do not want to multipath a volume.
From the advanced mapping page, here are some of the options when mapping a
volume to an ESX 4.x host:
Select LUN: This option is used to manually specify the LUN. If you do not check this
box, the system will automatically pick a LUN for you.
Restrict Mapping paths: This option is used when you need to map a volume only to
a specific HBA in the ESX host.
Map to Controller: By default, the system automatically selects which controller the
volume should be mapped to. If you would like to force a particular controller to
handle the I/O, use this option to do so.
Configure Multipathing: This option designates how many of the Storage Center
front-end ports you will allow the volume to be mapped through. For example, if each
controller has 4 front-end ports, selecting unlimited will map the volume through all 4,
whereas selecting 2 will use only 2 of the 4 front-end ports. The system automatically
selects the 2 front-end ports that already have the fewest mappings.
Configuring the VMware iSCSI software initiator for multipathing
After following the instructions on how to configure the software iSCSI initiator to use
both NICs for multipathing, you can then add the ESX host to the Storage Center.
Since the ESX software iSCSI adapter appears differently to the Storage Center than
other software iSCSI initiators, there is an additional step to make sure both paths are
added correctly.
When adding the iSCSI initiators to the server object in the Compellent GUI, it is
recommended that you uncheck "Use iSCSI Names" so that each initiator can be
added and configured independently.
If the ESX software initiator is added with its iSCSI name, it will still multipath the
volumes mapped through it; however, you will lose the ability to map a volume to a
single path if desired.
VMware Multi-Pathing Policies
Fixed Policy
The fixed policy gives you the greatest control over the flow of storage traffic.
However, you must be careful to evenly distribute the load across all host HBAs,
front-end ports, and Storage Center controllers.
When using the fixed policy, if a path fails, all of the LUNs using it as their preferred
path will fail over to the secondary path. When service resumes, the LUNs will
resume I/O on their preferred path.
Round Robin
The round robin path selection policy uses automatic path selection and load
balancing to rotate I/O through all available paths. It is important to note that round
robin load balancing does not aggregate the storage link bandwidth; it merely
distributes the load across adapters.
Using round robin will reduce the management headaches of manually balancing the
storage load across all storage paths as you would with a fixed policy; however there
are certain situations where using round robin does not make sense.
For instance, it is generally not considered good practice to enable round robin
between an iSCSI path and a Fibre Channel path, nor to use it to balance the load
between a 2 Gb FC path and a 4 Gb FC path.
If you choose to enable round robin for one or more datastores/LUNs, be careful to
ensure that all the paths included are identical in type and speed and have the same
queue depth setting.
Here is an example of what happens during a path failure using round robin.
Note
The round robin path selection policy (PSP) can be set to the default with the
following command. After setting round robin as the default and rebooting, any new
volumes mapped will acquire this policy; however, mappings that already existed
beforehand will have to be set manually.
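For reference, on ESX 4.x this is typically done with esxcli. The SATP name below
assumes the Storage Center volumes are claimed by the default active/active SATP;
verify the actual SATP with "esxcli nmp device list" before changing the default.

    # make Round Robin the default PSP for the SATP claiming the Storage Center volumes
    esxcli nmp satp setdefaultpsp --satp VMW_SATP_DEFAULT_AA --psp VMW_PSP_RR
    # existing volumes can be changed individually, for example:
    esxcli nmp device setpolicy --device <naa.id> --psp VMW_PSP_RR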
Caution
The round robin path selection policy should not be used for volumes belonging to
guests running Microsoft Clustering Services.
Multi-Pathing using a Fixed path selection policy
Example 1: (Bad)
Volume: "LUN10-vm-storage" Mapped to ESX1/HBA1 -as- LUN 10 (Active/Preferred)
Volume: "LUN10-vm-storage" Mapped to ESX1/HBA2 -as- LUN 10 (Standby)
Volume: "LUN20-vm-storage" Mapped to ESX1/HBA1 -as- LUN 20 (Active/Preferred)
Volume: "LUN20-vm-storage" Mapped to ESX1/HBA2 -as- LUN 20 (Standby)
This example would cause all I/O for both volumes to be transferred over HBA1.
Example 2: (Good)
Volume: "LUN10-vm-storage" Mapped to ESX1/HBA1 -as- LUN 10 (Active/Preferred)
Volume: "LUN10-vm-storage" Mapped to ESX1/HBA2 -as- LUN 10 (Standby)
Volume: "LUN20-vm-storage" Mapped to ESX1/HBA1 -as- LUN 20 (Standby)
Volume: "LUN20-vm-storage" Mapped to ESX1/HBA2 -as- LUN 20 (Active/Preferred)
This example sets the preferred paths to more evenly distribute the load between both HBAs.
Although the fixed multi-pathing policy gives greater control over which path transfers
the data for each datastore, you must manually validate that all paths have
proportional amounts of traffic on each ESX host.
Multi-Pathing using a Round Robin path selection policy
Example 1:
Volume: "LUN10-vm-storage" Mapped to ESX1/HBA1 -as- LUN 10 (Active)
Volume: "LUN10-vm-storage" Mapped to ESX1/HBA2 -as- LUN 10 (Active)
Boot from SAN
The benefits of booting from SAN are obvious: it alleviates the need for internal drives
and allows you to take replays of the boot volume.
However, there are also benefits to booting from local disks and keeping the virtual
machines on SAN resources. Since it only takes about 15-30 minutes to freshly load
and patch an ESX server, booting from local disks gives hosts the advantage of
staying online if, for some reason, you need to perform maintenance on your Fibre
Channel switches, Ethernet switches, or the controllers themselves. The other clear
advantage of booting from local disks is being able to use the VMware iSCSI software
initiator instead of iSCSI HBAs or Fibre Channel cards.
In previous versions of ESX, you could not use RDMs if you booted from SAN;
however, since 3.x this behavior has changed. If you decide to boot from SAN with
ESX 3.x or 4.x, you can also utilize RDMs.
Since the decision to boot from SAN depends on many business related factors
including cost, recoverability, and configuration needs, we have no specific
recommendation.
Configuring boot from SAN
When mapping the boot volume to the ESX server for the initial install, the boot
volume should only be mapped down a single path to a single HBA. Once ESX has
been loaded and is operating correctly, you can then add the second path for the boot
volume.
In Storage Center 4.x and earlier, when initially mapping the boot volume to the ESX
host, the mapping wizard will allow you to select the individual paths, making sure to
specify LUN 0. You then use the same procedure to add the second path once the
host is up and running, being sure to rescan the host HBAs once the path has been
added.
In Storage Center 5.x and later, you will need to enter the advanced mapping screen
to select a few options to force mapping down a single path.
• Check: Map volume using LUN 0
• Check: Only map using specified server ports
o Select the HBA that is selected to boot from within the HBA BIOS
• Maximum number of paths allowed: Single-path
Once the ESX host is up and running correctly, you can then add the second path to
the boot volume by modifying the mapping. To do this, right-click on the mapping and
select "Modify Mapping".
Once the second path has been added, you can then rescan the HBAs on the ESX host.
The reasoning behind keeping a limited number of virtual machines and/or VMDK
files per datastore is the potential for I/O contention, queue depth contention, or SCSI
reservation errors that may degrade system performance. That is also the reasoning
behind creating 500 GB - 750 GB datastores, because this helps limit the number of
virtual machines you place on each.
The art of virtual machine placement revolves largely around analyzing the typical disk
I/O patterns for each of the virtual machines and placing them accordingly. In other
words, the “sweet spot” of how many virtual machines you put on each datastore is
greatly influenced by the disk load of each. For example, in some cases the
appropriate number for high I/O load virtual machines may be less than 5, while the
number of virtual machines with low I/O disk requirements may be up to 20.
Since the appropriate number of virtual machines you can put onto each datastore is
subjective and dependent on your environment, a good recommendation is to start
with 10 virtual machines, and increase/decrease the number of virtual machines on
each datastore as needed.
The most common indicator that a datastore has too many virtual machines placed
on it would be the frequent occurrence of “SCSI Reservation Errors” in the
vmkwarning log file. That said, it is normal to see a few of these entries in the log
from time to time, but when you notice them happening very frequently, it may be
time to move some of the virtual machines to a new datastore of their own. Moving
virtual machines between datastores can even be done non-disruptively if you are
licensed to use VMware’s Storage vMotion.
The second most common indicator that the datastore has too many virtual machines
placed on it is if the queue depth of the datastore is regularly exceeding the limit set
in the driver module. Remember that if the driver module is set to a 256 queue depth,
the maximum queue depth of each datastore is also 256. This means that if you have
16 virtual machines on a datastore all heavily driving a 32 queue depth (16 * 32 =
512), they are essentially overdriving the disk queues by double, and the resulting
high latency will most likely degrade performance. (See Appendix A for more
information on determining if the queue depth of a datastore is being correctly
utilized.)
Due to how Dynamic Block Architecture virtualizes the blocks, manual partition
alignment is generally not necessary. This is because the Storage Center
automatically aligns its 512K, 2M, or 4M pages to the physical sector boundaries of
the drives. Since the largest percentage of performance gains are seen from aligning
the Storage Center pages to the physical disk, the remaining areas that can be
aligned and tuned have a minor effect on performance.
Based on internal lab testing, we have found that any performance gains achieved by
manually aligning partitions are usually not substantial enough (±1%) to justify the
extra effort. However, before deciding whether or not to align VMFS partitions, it is
recommended that you perform testing to determine the impact that an aligned
partition may have on your particular application because all workloads are different.
To manually align the VMFS block boundaries to the Storage Center page
boundaries for your own performance testing, the recommended offset when creating
a new datastore is 8192 (or 4 MB).
This is an example of a fully aligned partition in the Storage Center, where one guest
I/O will only access necessary physical disk sectors:
You should choose your VMFS block size based on the largest virtual disk you plan to
put on the datastore, because the block size determines the maximum size of any
single file (such as a virtual disk) on that datastore.
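For reference, VMFS-3 supports the following block sizes and corresponding maximum
virtual disk (file) sizes:

    Block size    Maximum virtual disk size
    1 MB          256 GB
    2 MB          512 GB
    4 MB          1 TB
    8 MB          2 TB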
The default block size is 1 MB, so if you need your virtual disks to be sized greater
than 256 GB, you will need to increase this value. For example, if the largest virtual
disk you need to place on a datastore is 200 GB, then a 1 MB block size should be
sufficient, and similarly, if you have a virtual machine that will require a 400 GB virtual
disk, then the 2 MB block size should be sufficient.
You should also consider future growth of the virtual machine disks when choosing
the block size. If a virtual machine resides on a datastore formatted with a 1 MB
block size, and in the future it needs one of its virtual disks extended beyond 256 GB,
the virtual machine would have to be relocated to a different datastore with a larger
block size. This is because a datastore must be re-formatted to change the block
size.
Note
Since certain VAAI offload operations require that the source and destination
datastores have the same VMFS block size, it is worth considering a standard block
size for all of your datastores. Please consult the vStorage APIs for Array Integration
FAQ for more information.
As discussed earlier, when deciding how to lay out your VMFS volumes and virtual
disks, the layout should reflect the performance needs as well as the application and
backup needs of the guest operating systems.
Regardless of how you decide to lay out your virtual machines, here are some basic
concepts you should consider:
There are two main reasons for separating operating system pagefiles onto their own
volume/datastore.
• Since pagefiles can generate a lot of disk activity when the memory in the
virtual machine or ESX host runs low, keeping them on a separate volume can
keep the replays of the other volumes smaller.
• If you are replicating those volumes, it will conserve bandwidth by not
replicating the operating system pagefile data.
If you decide that separating pagefiles will make an impact in reducing replay sizes,
the general recommendation is to create "pairs" of volumes for each datastore
containing virtual machines. If you create a volume that will contain 10 virtual
machines, you should create a second volume to store the operating system pagefiles
for those 10 machines.
For example:
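One possible pairing, using hypothetical volume names, might look like this:

    Volume: "LUN10-vm-storage"    (datastore containing 10 virtual machines)
    Volume: "LUN11-vm-pagefiles"  (datastore containing the pagefile VMDKs for those 10 virtual machines)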
Often the question is asked whether or not it is a good idea to place all of the
operating system pagefiles on a single datastore. Generally speaking, this is not a
very good practice for a couple of reasons.
First, the pagefile datastore can also experience contention from queue depth
utilization or disk I/O, so having too many pagefile VMDKs on it during a sudden
memory swapping event could decrease performance even further. For example, if a
node in the ESX HA cluster fails and the affected virtual machines are consolidated on
the remaining hosts, the sudden reduction in overall memory could cause a surge in
paging activity that overloads the datastore, causing a performance decrease.
Second, it becomes a matter of how many eggs you want in each basket. Operating
systems are usually not tolerant of disk drives being unexpectedly removed. By
keeping the pagefiles on multiple smaller volumes, if an administrator were to
accidentally unmap one of the pagefile volumes, the impact would be isolated to a
handful of virtual machines instead of all of the virtual machines.
Timesaver
To help organize the LUN layout for your ESX clusters, some administrators prefer to
store their layout in a spreadsheet. Not only does this help to design your LUN layout
in advance, but it also helps you keep things straight as the clusters grow larger.
Note
There are many factors that may influence architecting storage with respect to the
placement of virtual machines. The method shown above is merely a suggestion, as
your business needs may dictate different alternatives.
Advantages
• Granularity in replication
o Since the Storage Center replicates at the volume level, if you have
one virtual machine per volume, you can pick and choose which
virtual machine to replicate.
• There is no I/O contention as a single LUN is dedicated to a virtual machine.
• Flexibility with volume mappings.
o Since a path can be individually assigned to each LUN, this could
allow a virtual machine a specific path to a controller.
• Statistical Reporting
o You will be able to monitor storage usage and performance for an
individual virtual machine.
• Backup/Restore of an entire virtual machine is simplified
o If a VM needs to be restored, you can just unmap/remap a replay in
its place.
Disadvantages
• You will have a maximum of 256 virtual machines in your ESX cluster.
o The HBA has a maximum limit of 256 LUNs that can be mapped to
the ESX server, and since we can only use each LUN number once
when mapping across multiple ESX servers, it would essentially
have a 256 virtual machine limit.
• Increased administrative overhead
o Managing a LUN for each virtual machine and all the corresponding
mappings may get challenging.
Raw Device Mappings (RDMs) are used to map a particular LUN directly to a virtual
machine. When an RDM set to physical compatibility mode is mapped to a virtual
machine, the operating system writes directly to the volume, bypassing the VMFS file
system.
There are several distinct advantages and disadvantages to using RDMs, but in most
cases, using VMFS datastores will meet most virtual machines' needs.
Advantages of RDMs:
• Ability to create a clustered resource (i.e. Microsoft Cluster Services)
o Virtual Machine to Virtual Machine
o Virtual Machine to Physical Machine
• The volume can be remapped to another physical server in the event of a
disaster or recovery.
• Ability to convert physical machines to virtual machines more easily
o Physical machine volume can be mapped as an RDM.
• Can be used when a VM has special disk performance needs
o There may be a slight disk performance increase when using an
RDM versus a VMFS virtual disk due to the lack of contention, no
VMFS write penalties, and better queue depth utilization.
• The ability to use certain types of SAN software
o For example, the Storage Center's Space Recovery feature or
Replay Manager.
More information about these features can be found in the
Compellent Knowledge Center.
• The ability to assign a different data progression profile to each volume.
o For example, if you have a database VM where the database and
logs are separated onto different volumes, each can have a separate
data progression profile.
• The ability to assign a different replay profile to each volume.
o For example, a database and its transaction logs may have different
replay intervals and retention periods for expiration.
Disadvantages of RDMs:
• Added administrative overhead due to the number of mappings
• There are a limited number of LUNs that can be mapped to an ESX server
o If every virtual machine used RDMs for its drives, you would have a
maximum of 255 drives across the cluster.
• Physical mode RDMs cannot use ESX snapshots
o While ESX snapshots are not available for physical mode RDMs,
Compellent Replays can still be used to recover data.
Just like a physical server attached to the Storage Center, Data Progression will
migrate inactive data to the lower tier inexpensive storage while keeping the most
active data on the highest tier fast storage. This works to the advantage of VMware
because multiple virtual machines are usually kept on a single volume.
However, if you do encounter the business case where particular virtual machines
would require different RAID types, some decisions on how you configure Data
Progression on volumes must be made.
As mentioned at the beginning of this section, unless you have a specific business
need that requires a particular virtual machine or application to have a specific RAID
type, our recommendation is to keep the configuration simple. In most cases, you can
use the Data Progression "Recommended" setting and let it sort out the virtual
machine data automatically by usage.
Note
A note about Data Progression Best Practices: You should create a replay schedule
for each volume that (at a minimum) takes one daily replay that doesn't expire for 25
hours or more. This will have a dramatic effect on Data Progression behavior, which
will increase the overall system performance.
Introduction
Compellent's thin provisioning feature, named "Dynamic Capacity", allows less storage
to be consumed for virtual machines, thus saving storage costs. The following section
describes the relationship that this feature has with virtual machine storage.
Thick (a.k.a. "zeroedthick") [Default]
Only a small amount of disk space is used within the Storage Center at virtual disk
creation time, and new blocks are only allocated on the Storage Center during write
operations. However, before any new data is written to the virtual disk, ESX will first
zero out the block, to ensure the integrity of the write. This zeroing of the block before
the write induces extra I/O and an additional amount of write latency which could
potentially affect applications that are sensitive to disk latency or performance.
Thin Provisioned
This virtual disk format is used when you select the option labeled "Allocate and
commit space on demand". The logical space required for the virtual disk is not
allocated during creation, but is allocated on demand during the first write issued to
the block. Just like thick disks, this format will also zero out the block before writing
data, inducing extra I/O and an additional amount of write latency.
Eagerzeroedthick
This virtual disk format is used when you select the option labeled “Support clustering
features such as Fault Tolerance”. Space required for the virtual disk is fully
allocated at creation time. Unlike with the zeroedthick format, all of the data blocks
within the virtual disk are zeroed out during creation. Disks in this format might take
much longer to create than other types of disks because all of the blocks must be
zeroed out before it can be used. This format is generally used for Microsoft clusters,
and the highest I/O workload virtual machines because it does not suffer from the
same write penalties as the zeroedthick or thin formats.
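For reference, a virtual disk can be created in each of these formats from the ESX CLI
with vmkfstools; the datastore, directory, and file names below are only placeholders.

    # create 20 GB virtual disks in the zeroedthick (default), thin, and eagerzeroedthick formats
    vmkfstools -c 20G -d zeroedthick /vmfs/volumes/datastore1/vm1/vm1.vmdk
    vmkfstools -c 20G -d thin /vmfs/volumes/datastore1/vm1/vm1_thin.vmdk
    vmkfstools -c 20G -d eagerzeroedthick /vmfs/volumes/datastore1/vm1/vm1_ezt.vmdk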
• Zeroedthick
o Virtual disks will be thin provisioned by the Storage Center
• Thin
o Virtual disks will be thin provisioned by the Storage Center
o There are no additional storage savings while using this format
because the array already uses its thin provisioning (see below)
• Eagerzeroedthick
o Depending on the Storage Center version, this format may or may not
pre-allocate storage for the virtual disk at creation time.
o If you create a 20GB virtual disk in this format, the Storage Center
will normally consume 20GB, with one exception. (See the “Storage
Center Thin Write Functionality” section below.)
We recommend sticking with the default virtual disk format (zeroedthick) unless you
have a specific need to pre-allocate virtual disk storage such as Microsoft clustering,
VMware Fault Tolerance, or for virtual machines that may be impacted by the thin or
zeroedthick write penalties.
When creating virtual disks on these versions of firmware, all virtual disk formats will
be thin provisioned at the array level, including eagerzeroedthick.
However, if you need to use VMware's thin provisioning for whatever reason, you
must pay careful attention not to accidentally overrun the allocated storage. To
prevent any unfavorable situations, you should use the built-in vSphere datastore
threshold alerting capabilities to warn you before running out of space on a datastore.
However, the “Compellent Enterprise Manager Server Agent” contains the necessary
functionality to recover this free space from Windows machines. It does this by
comparing the Windows file allocation table to the list of blocks allocated to the
volume, and then returning those free blocks into the storage pool to be used
elsewhere in the system. It is important to note, though, that blocks which are kept as
part of a replay cannot be freed until that replay expires.
The free space recovery functionality can only be used in Windows virtual machines
under the following circumstances:
For more information on Windows free space recovery, please consult the
“Compellent Enterprise Manager User Guide”.
Within an ESX server, there are three ways in which you can extend or grow storage.
The general steps are listed below, but if you need additional information, please
consult the following documentation pages:
To extend the space at the end of a Storage Center volume as shown above, you can
do so from the Compellent GUI. After the volume has been extended and the host's
HBAs have been rescanned, you can then edit the properties of the datastore to grow
it by clicking the "Increase…" button and then following the "Increase Datastore
Capacity" wizard.
Be careful to select the volume that is “Expandable” otherwise you will be adding a
VMFS “extent” to the datastore (see section below on VMFS extents).
Caution
If you extend a VMFS volume (or RDM) beyond the 2047 GB/1.99 TB limit, that
volume will become inaccessible by the ESX host. If this happens, the most likely
recovery scenario will involve restoring data from a replay or tape.
Screenshot: Growing a virtual disk from the virtual machine properties screen
For Windows machines: After growing the virtual disk from the vSphere client, you
must log into the virtual machine, rescan disks from Windows disk management, and
then use DISKPART to extend the drive.
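A minimal sketch of the DISKPART steps is shown below; the volume number is only
an example, so confirm it with "list volume" first.

    DISKPART> rescan
    DISKPART> list volume
    DISKPART> select volume 2
    DISKPART> extend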
Caution
Microsoft does not support extending the system partition (C: drive) of a machine.
Just as with VMFS datastore volumes, it is also very important not to extend an RDM
volume past the 2047GB/1.99TB limit.
Since the subject of backing up virtual machines is so vast, this section will only
cover a few basics. If you need more information about virtual machine backup
strategies, an excellent resource is the “Virtual Machine Backup Guide” found on
VMware’s documentation pages. Depending on the version of ESX you are using,
this guide is usually found with the “VMware Consolidated Backup” documentation.
After you have completed recovering the file, it is important that you remove the
recovered virtual disk from the virtual machine before unmapping or deleting the view
volume.
If you are moving a VMDK file back to its original location, remember that you must
power off the virtual machine to overwrite the virtual disk. Also, depending on the size
of the virtual disk, this operation may take anywhere from several minutes to several
hours to finish. When moving the virtual disk with the vSphere Client datastore
browser, the time required is greatly increased because the virtual disk being moved
is automatically converted to the eagerzeroedthick format regardless of the original
format type.
Note
If you want to preserve the original VMDK format during the copy, you can specify
either the "-d thin" or "-d zeroedthick" options when using vmkfstools through the
ESX CLI. In addition to preserving the original format, this may also reduce the time
required to copy the VMDK, since vmkfstools will not have to write the "white space"
(zeros) associated with the eagerzeroedthick format.
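As a hedged example, the copy could be performed as follows while preserving a thin
format; the source and destination paths are only placeholders.

    # clone the virtual disk from the recovery (view) volume back to the original datastore
    vmkfstools -i /vmfs/volumes/view_volume/vm1/vm1.vmdk -d thin /vmfs/volumes/datastore1/vm1/vm1.vmdk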
If vCenter detects a duplicate UUID, you may be prompted with the following virtual
machine message:
• I copied it – This option will regenerate the configuration file UUIDs and the
MAC addresses of the virtual machine Ethernet adapters.
If you do not know which option to choose, you should select "I copied it", which will
regenerate a new MAC address to prevent conflicts on the network.
Replication Overview
Storage Center replication in coordination with the vSphere 4.x line of products can
provide a robust disaster recovery solution. Since each different replication method
affects recovery a little differently, choosing the correct one to meet your business
requirements is important. Here is a brief summary of the different options.
• Synchronous
o The data is replicated real-time to the destination. In a synchronous
replication, an I/O must be committed on both systems before an
acknowledgment is sent back to the host. This limits the type of links
that can be used, since they need to be highly available with low
latencies. High latencies across the link will slow down access times
on the source volume.
o The downside to this replication method is that replays on the source
volume are not replicated to the destination, and any disruption to
the link will force the entire volume to be re-replicated from scratch.
o Keep in mind that synchronous replication does not make both the
source and destination volumes read/writeable.
• Asynchronous
o In an asynchronous replication, the I/O needs only be committed and
acknowledged to the source system, so the data can be transferred
to the destination in a non-concurrent timeframe. There are two
different methods to determine when data is transferred to the
destination:
By replay schedule – The replay schedule dictates how
often data is sent to the destination. When each replay is
taken, the Storage Center determines which blocks have
changed since the last replay (the delta changes), and then
transfers them to the destination. Depending on the rate of
change and the bandwidth, it is entirely possible for the
replications to “fall behind”, so it is important to monitor them
to verify that your recovery point objective (RPO) can be met.
Replicating the active replay – With this method, the data
is transferred “near real-time” to the destination, usually
requiring more bandwidth than if you were just replicating the
replays. As each block of data is written on the source
volume, it is committed, acknowledged to the host, and then
transferred to the destination “as fast as it can”. Keep in
mind that the replications can still fall behind if the rate of
change exceeds available bandwidth.
o Asynchronous replications usually have less stringent bandwidth
requirements making them the most common replication method.
Since block changes are not replicated bidirectionally with standard replication, this
means that you will not be able to VMotion virtual machines between your source
controllers (your main site) and your destination controller (your DR site). That being
said, there are a few best practices to replication and remote recovery that you
should consider.
• You will need compatible ESX server hardware at your DR site to map your
replicated volumes to in the event your source ESX cluster becomes
inoperable.
• You should make preparations to have all of your Virtual Center resources
replicated to the DR site as well.
• To keep your replication sizes smaller, you should separate the operating
system pagefiles onto their own non-replicated volume.
For more information, please consult the “Compellent Storage Center Best Practices
for Live Volume” guide available on Knowledge Center.
Appendixes
The general rule of thumb is to set the queue depth high enough to achieve an
acceptable number of IOPS from the back-end spindles, while not setting it so high
that an ESX host can flood the front end or back end of the array.
• iSCSI
o Software iSCSI
Leave the queue depth set to default and only increase if
necessary
o Hardware iSCSI
Leave the queue depth set to default and only increase if
necessary
The best way to determine if you have the appropriate queue depth set is by using
the esxtop utility. This utility can be executed from one of the following locations:
When opening the esxtop utility, the best place to monitor queue depth and
performance is from the “Disk Device” screen. Here is how to navigate to that screen:
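A minimal sketch of the keystrokes is shown below, assuming you are at the ESX
service console (resxtop from the vMA behaves the same way).

    # start esxtop (or resxtop against a remote host)
    esxtop
    # press 'u' to switch to the Disk Device screen
    # press 'f' to add or remove field columns (for example, queue statistics and latency statistics)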
The quick and easy way to see if your queue depth is set correctly is to monitor the
queue depth section in coordination with the latency section.
Generally speaking, if the LOAD is consistently greater than 1.00 on one or more
LUNs, the latencies are still acceptable, and the back end spindles have available
IOPS, then increasing the queue depth may make sense.
However, if the LOAD is consistently less than 1.00 on a majority of the LUNs, and
the performance and latencies are acceptable, then there is usually no need to adjust
the queue depth.
In the screenshot above, the device queue depth is set to 32. As you can see, three
of the four LUNs consistently have a LOAD above 1.00. If the back end spindles are
not maxed out, it may make sense to increase the queue depth.
As you can see, by increasing the queue depth from the previous example, the total
IOPS increased from 6700 to 7350 (roughly a 9% gain), but the average device
latency (DAVG/cmd) increased from 18 ms to 68 ms (nearly four times higher). In this
case, it may not make sense to increase the queue depth because the latencies
became too high for a mere 9% performance gain.
For more information about the disk statistics in esxtop, consult the esxtop man page,
or the VMware document: “vSphere Resource Management Guide – Appendix A”.
To add vCenter credentials into Enterprise Manager, open the Servers Viewer screen,
right-click on Servers, and then select Register Server.
After entering the vCenter credentials, you will be able to see aggregate storage
statistics, as well as automatically provision VMFS datastores and RDMs.
For example, when creating a new volume, if you select the ESX host, it will
automatically give you the option to format it with VMFS.
Conclusion
Hopefully this document has answered many of the questions you have encountered
or will encounter while implementing VMware vSphere with your Compellent Storage
Center.
More information
If you would like more information, please review the following web sites:
• Compellent
o General Web Site:
http://www.compellent.com/
o Compellent Training
http://www.compellent.com/services/training.aspx
• VMware
o General Web Site:
http://www.vmware.com/
o VMware Education and Training
http://mylearn1.vmware.com/mgrreg/index.cfm
o VMware Infrastructure 4 Online Documentation:
http://pubs.vmware.com/vsp40
o VMware Communities
http://communities.vmware.com