February 2017 - Updated VxRail description, various diagrams and cache disk information.
THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL
INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF ANY KIND.
Copyright 2016-2017 Dell Inc. All rights reserved. Dell and the Dell logo are trademarks of Dell Inc. in the United States and/or other jurisdictions. All
other marks and names mentioned herein may be trademarks of their respective companies.
1.1 Purpose
This document addresses the architecture design, configuration and implementation considerations for the
key components of the architecture required to deliver virtual desktops via VMware Horizon on Dell EMC
VxRail with vSphere 6.0 Update 2 on VMware vSAN 6.2.
1.2 Scope
Relative to delivering the virtual desktop environment, the objectives of this document are to:
2.1 Introduction
Dell Wyse Datacenter solutions provide a number of deployment options to meet your desktop virtualization
requirements. Our solution is able to provide a compelling desktop experience to a range of employees within
your organization from task workers to knowledge workers to power users. The deployment options for Dell
Wyse Datacenter include:
Linked Clones (Non-persistent)
Full Clone Virtual Desktops (Persistent)
RDSH with VMware vSAN
The Dell EMC VxRail appliances are powerful hyper-converged infrastructure appliances (HCIA) delivered in 1U/2U rack building blocks. The appliances are built on VMware vSAN technology, VMware vSphere and EMC software. VxRail allows the seamless addition of nodes to an appliance cluster, from the minimum supported three nodes up to 64 nodes.
The Dell EMC VxRail appliance platforms are equipped with Broadwell processors. You can now start a cluster with three nodes at a 25% lower entry price, which is ideal for small deployments and POC environments; the recommended starting block is a four-node appliance configuration. VxRail can now support storage-heavy workloads with storage-dense nodes, graphics-heavy VDI workloads with GPU hardware, and entry-level nodes for remote and branch office environments. Finally, you can upgrade from VxRail 3.5 to 4.0 software with a single click via the VxRail Manager interface.
VxRail allows customers to start small and scale as their requirements increase. Single-node scaling and low-cost entry-point options give you the freedom to buy just the right amount of storage and compute, whether just beginning a project or adding capacity to support growth. A single-node V Series appliance can scale from 16 to 40 CPU cores and holds a maximum of 24 TB raw in the hybrid configuration and 46 TB raw in the all-flash configuration. A 64-node all-flash cluster delivers a maximum of 2,560 cores and 1,840 TB of raw storage.
vSAN
vCenter
ESXi
vRealize Log Insight
VxRail Manager
vSAN
vCenter
vRealize Log Insight
The customer is prompted during deployment to input an existing vSphere license because, although ESXi is installed as part of the factory process, a license for ESXi is not included with VxRail.
Optional Software
VxRail also includes optional licensed software that is not pre-installed and configured, but the customer is entitled to licenses for this software: EMC CloudArray and RecoverPoint.
CloudArray
A cloud gateway that allows you to expand local storage by using capacity in the cloud.
A license is included with every VxRail appliance purchase - 1 TB local / 10 TB cloud.
The 1 TB acts as hot cache; as it fills, colder data is moved to the 10 TB capacity in the cloud.
The license does not include the actual cloud storage, only the ability to manage it. The cloud storage must be purchased separately.
CloudArray is downloaded and installed from the VxRail Manager Marketplace.
RecoverPoint
vSphere Data Protection
vSphere Data Protection is also available to be downloaded and installed via the VxRail Marketplace. This software is licensed via vSphere and does not come licensed with VxRail.
It is fully integrated with VMware vCenter Server and VMware vSphere Web Client, providing disk-based
backup of virtual machines. This software provides full virtual machine restore and file level restore without
the need for an agent to be installed in every virtual machine. The patented, variable-length deduplication
technology across all backup jobs significantly reduces the amount of backup data disk space needed.
Networking Layer
Compute Server Layer
Management Server Layer
Storage Layer (VMware vSAN)
Thin Client Layer (please refer to section 3.6)
These components have been integrated and tested to provide the optimal balance of high performance and
lowest cost per user. The Dell EMC VxRail appliance is designed to be cost effective allowing IT departments
to implement high-performance fully virtualized desktop environments.
2.4.1 Networking
Designed for true linear scaling, Dell EMC VxRail series leverages a Leaf-Spine network architecture. A Leaf-
Spine architecture consists of two network tiers: an L2 Leaf and an L3 Spine based on 40GbE and non-
blocking switches. This architecture maintains consistent performance without any throughput reduction due
to a static maximum of three hops from any node in the network.
The best practices guide for VxRail 4.0 with S4048-ON is located here.
Dell recommends that the VDI management infrastructure nodes be separated from the compute resources; in this configuration, however, both management and compute are in the same vSphere HA cluster. Optionally, the management node can also be used for VDI VMs, with an expected density reduction of 30% for these nodes only. The 30% accounts for the resources that need to be reserved for the management VMs, so this must be factored in when sizing. Compute hosts can be used interchangeably for Horizon or RDSH as required.
VMware vSAN is a software-defined storage solution fully integrated into vSphere. Once enabled on a cluster, all the flash or magnetic hard disks present in the hosts are pooled together to create a shared datastore that is accessible by all hosts in the VMware vSAN cluster. Virtual machines can then be created and a storage policy assigned to them. The storage policy dictates availability, performance and sizing.
From a hardware perspective, at least three ESXi hosts (four recommended) are required for the VMware vSAN cluster. Each host needs at least one SSD for the cache tier and one HDD/SSD for the capacity tier to form a disk group. A disk group consists of cache and capacity devices, with a maximum of one SSD for cache and up to seven devices for capacity. A maximum of four disk groups per node is supported. The SSD acts as a read cache and a write buffer. The read cache keeps a list of commonly accessed disk blocks and the write cache behaves as a non-volatile write buffer. It is essential to the performance of VMware vSAN, as all I/O goes to the SSD first. The higher the performance of the disks, the better the performance of your virtual machines. It is important to determine the number of simultaneous write operations that a particular SSD can sustain in order to achieve adequate performance.
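To make the disk-group rules above concrete, the following is a minimal sketch (not a VMware tool; the function and constant names are illustrative) that checks a proposed per-host layout against the limits described in this section.

```python
# Illustrative check of a per-host vSAN disk-group layout against the limits
# described above: at most 4 disk groups per host, each with exactly 1 cache
# SSD and 1 to 7 capacity devices.

MAX_DISK_GROUPS_PER_HOST = 4
MAX_CAPACITY_DEVICES_PER_GROUP = 7

def validate_host_layout(disk_groups):
    """disk_groups: list of (cache_ssds, capacity_devices) tuples, one per disk group."""
    if not 1 <= len(disk_groups) <= MAX_DISK_GROUPS_PER_HOST:
        raise ValueError("A host supports 1 to 4 disk groups")
    for cache_ssds, capacity_devices in disk_groups:
        if cache_ssds != 1:
            raise ValueError("Each disk group requires exactly 1 cache SSD")
        if not 1 <= capacity_devices <= MAX_CAPACITY_DEVICES_PER_GROUP:
            raise ValueError("Each disk group holds 1 to 7 capacity devices")
    return True

# Example: a host with two disk groups of 1 cache SSD + 5 capacity drives each.
print(validate_host_layout([(1, 5), (1, 5)]))  # True
```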
All virtual machines deployed to VMware vSAN have an availability policy setting that ensures at least one additional copy of the virtual machine data is available; this includes the write cache contents. When a write is initiated by the VM, it is sent both to the local write cache on the owning host and to the write cache on the remote host(s). This ensures that a copy of the in-cache data exists in the event of a host failure and that no data is corrupted. If a block is requested and not found in the read cache, the request is directed to the HDD.
Magnetic hard disk drives (referred to as HDDs from here on) have two roles in VMware vSAN. They make up
the capacity of the VMware vSAN data store as well as making up components for a stripe width. SAS and
NL-SAS are supported.
VMware vSAN implements a distributed RAID concept across all hosts in the cluster, so if a host or a
component within a host (e.g. an HDD or SSD) fails then virtual machines still have a full complement of data
objects available and can continue to run. This availability is defined on a per-VM basis through the use of VM
storage policies.
VMware vSAN 6.2 provides two different configuration options: a hybrid configuration that leverages flash-based devices for the cache tier and magnetic disks for the capacity tier, and an all-flash configuration that uses flash for both the cache tier and the capacity tier. Both deliver enterprise performance and a resilient storage platform.
3.1 Network
The following sections contain the core network components for the Dell Wyse Datacenter solutions. General uplink cabling guidance to consider in all cases: TwinAx is very cost effective for short 10Gb runs, and for longer runs use fiber with SFPs.
For more information on the S3048, S4048 switches and Dell Networking, please visit this link.
In the Local Tier 1 model, VDI sessions execute from local storage on each compute server. The hypervisor used in this solution is vSphere. In this model, both the compute and management server hosts access VMware vSAN storage. The Management, VDI, vMotion and vSAN VLANs are configured across 2 x 10Gb ports on the NDC.
The VxRail portfolio, optimized for VDI, has been designed and arranged in three top-level, overarching configurations which apply to the available physical platforms showcased below.
The A5 configuration is ideal for small-scale, POC or low-density, cost-conscious environments.
The B5 configuration is geared toward larger-scale, general purpose workloads, balancing performance and cost-effectiveness.
The C7 is the premium configuration, offering an abundance of high performance and tiered capacity where user density is maximized.
The NVIDIA Tesla M60 GPU accelerator works with NVIDIA GRID software to provide the industry's highest user performance for virtualized workstations, desktops, and applications. It allows enterprises to virtualize almost any application (including professional graphics applications) and deliver it to any device, anywhere.
4.1 VMware
4.1.1 vSphere 6
The vSphere hypervisor, also known as ESXi, is a bare-metal hypervisor that installs directly on top of your
physical server and partitions it into multiple virtual machines. Each virtual machine shares the same physical
resources as the other virtual machines and they can all run at the same time. Unlike other hypervisors, all
management functionality of vSphere is done through remote management tools. There is no underlying
operating system, reducing the install footprint to less than 150MB.
VMware vSphere 6 includes three major layers: Virtualization, Management and Interface. The Virtualization
layer includes infrastructure and application services. The Management layer is central for configuring,
provisioning and managing virtualized environments. The Interface layer includes the vSphere web client.
Throughout the Dell Wyse Datacenter solution, all VMware and Microsoft best practices and prerequisites for
core services are adhered to (NTP, DNS, Active Directory, etc.). The vCenter 6 VM used in the solution is a
single Windows Server 2012 R2 VM (Check for current Windows Server OS compatibility at
http://www.vmware.com/resources/compatibility) or vCenter 6 virtual appliance, residing on a host in the
management Tier. SQL server is a core component of the Windows version of vCenter and is hosted on
another VM also residing in the management Tier. It is recommended that Composer is installed on a
standalone Windows Server 2012 R2 VM when using the vCenter Server Appliance.
4.1.2 vSAN
This release of VMware vSAN delivers the following important new features and enhancements:
Deduplication and compression: VMware vSAN now supports deduplication and compression to eliminate duplicate data. This technique reduces the total storage space required to meet your needs. When you enable deduplication and compression on a VMware vSAN cluster, redundant copies of data in a particular disk group are reduced to a single copy. Deduplication and compression are available only as a cluster-wide setting, and only on all-flash clusters.
Enabling deduplication and compression can reduce the amount of storage consumed by as much as 7x. Actual reduction will vary, as it depends primarily on the types of data present, the number of duplicate blocks, how much these data types can be compressed, and the distribution of these unique blocks.
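As a rough illustration of how a space-efficiency ratio translates into effective capacity (the sample ratios below are arbitrary; only the "up to 7x" best case comes from this section):

```python
# Effective capacity for a given usable capacity and dedup/compression ratio.
usable_tb = 10  # example usable capacity; not a figure from this document
for ratio in (1.5, 3.0, 7.0):  # 7x is the best case quoted above
    print(f"{ratio}x reduction -> ~{usable_tb * ratio} TB of effective capacity")
```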
RAID 5 and RAID 6 erasure coding: VMware vSAN now supports both RAID 5 and RAID 6 erasure coding to reduce the storage space required to protect your data. RAID 5 and RAID 6 are available as a policy attribute for VMs in all-flash clusters.
Quality of Service: With the Quality of Service addition to VMware vSAN, IOPS limits are now available. Quality of service for VMware vSAN is a Storage Policy Based Management (SPBM) rule; because it is applied through SPBM, IOPS limits can be assigned per virtual machine without interrupting its operation.
The term noisy neighbor is often used to describe when a workload monopolizes available I/O or other resources, which negatively affects other workloads on the same platform.
For more information on what's new in VMware vSAN, please visit this link.
VMware vSAN is licensed via the Horizon Advanced or Enterprise license. The Advanced and Enterprise
Horizon licenses will cover both the Hybrid and All-Flash configurations of VMware vSAN.
RAID-5 uses 1.33x the capacity with FTT=1 and requires a minimum of four hosts in the vSAN cluster. RAID-6 with FTT=2 uses 1.5x the capacity and requires a minimum of six hosts in the VMware vSAN cluster.
The general recommendation for sizing flash capacity for VMware vSAN is to use 10% of the anticipated storage capacity before the number for FTT is considered.
We also need to factor in how much free capacity, or slack space, needs to be preserved when designing the capacity requirement for the VMware vSAN cluster. The recommendation from VMware is that this should be 30%. The reasoning for this slack space size is that VMware vSAN begins automatically rebalancing when a disk reaches the 80% full threshold, and the additional 10% is added as a buffer. This is not a hard limit or set via a security policy, so the customer can actually use this space, but they should be made aware of the performance implications of going over the 80% full threshold. More information can be found on the design and sizing of a VMware vSAN 6.2 cluster here.
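A hedged sizing sketch that strings together the rules of thumb above (FTT/FTM capacity multiplier, 30% slack space, and flash cache sized at 10% of anticipated capacity before FTT); the function name and example values are illustrative and not a VMware formula.

```python
# Rough vSAN capacity estimate: apply the protection multiplier first, then
# keep 30% of the datastore free (slack space); cache is sized at 10% of the
# anticipated capacity before FTT is considered.
FTM_MULTIPLIER = {
    ("RAID-1", 1): 2.0,     # mirroring, FTT=1
    ("RAID-1", 2): 3.0,     # mirroring, FTT=2
    ("RAID-5/6", 1): 1.33,  # erasure coding, FTT=1 (all-flash only)
    ("RAID-5/6", 2): 1.5,   # erasure coding, FTT=2 (all-flash only)
}

def estimate_capacity(anticipated_gb, ftm="RAID-1", ftt=1, slack=0.30, cache=0.10):
    raw_needed_gb = anticipated_gb * FTM_MULTIPLIER[(ftm, ftt)]
    raw_with_slack_gb = raw_needed_gb / (1 - slack)   # keep 30% free
    cache_gb = anticipated_gb * cache                 # 10% rule, before FTT
    return round(raw_with_slack_gb), round(cache_gb)

# Example: 10 TB of anticipated VM data on an all-flash RAID-5 cluster, FTT=1.
print(estimate_capacity(10_000, ftm="RAID-5/6", ftt=1))  # ~19,000 GB raw, 1,000 GB cache
```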
The scenario below uses a VM which consumes 50 GB of space. The hybrid configuration has a default FTT value of 1 and a Failure Tolerance Method (FTM) of RAID-1, which has a 2x overhead; with FTT=2 the overhead is 3x. The FTM of RAID-5/6 is only available with the all-flash configuration; with FTT=1 the overhead is 1.33x, and with FTT=2 it is 1.5x.
Comparing both FTT=1 scenarios below for hybrid and all-flash, we can see a capacity saving of over 33 GB per VM, so with 200 VMs per host that is a capacity saving of over 6,600 GB of usable VM space per host.
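The same 50 GB scenario worked through in code, using the overhead figures from the paragraph above:

```python
# Per-VM capacity comparison for the 50 GB scenario described above.
vm_consumed_gb = 50
hybrid_raid1_ftt1 = vm_consumed_gb * 2.0     # 100 GB written to the datastore
allflash_raid5_ftt1 = vm_consumed_gb * 1.33  # 66.5 GB written to the datastore

saving_per_vm = hybrid_raid1_ftt1 - allflash_raid5_ftt1  # ~33.5 GB per VM
saving_per_host = saving_per_vm * 200                    # 200 VMs -> ~6,700 GB per host
print(round(saving_per_vm, 1), round(saving_per_host))
```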
Prior to VMware vSAN 6.2, RAID-1 (Mirroring) was used as the failure tolerance method. VMware vSAN 6.2 adds RAID-5/6 (Erasure Coding) to all-flash configurations. While RAID-1 (Mirroring) may be favored where performance is the most important factor, it is costly with regard to the amount of storage needed.
RAID-5/6 (Erasure Coding) data layout can be configured to help ensure the same levels of availability, while
consuming less capacity than RAID-1 (Mirroring). Use of erasure coding reduces capacity consumption by as
much as 50% versus mirroring at the same fault tolerance level. This method of fault tolerance does require
additional write overhead in comparison to mirroring as a result of data placement and parity.
Deduplication and compression are two new features that are only available with the all-flash configuration. These features cannot be enabled separately and are implemented at the cluster level. When enabled, VMware vSAN aims to deduplicate each block and compress the results before destaging the block to the capacity layer. Deduplication and compression work at the disk group level, and only objects that are deployed on the same disk group can contribute towards space savings; if components from identical VMs are deployed to different disk groups, there will not be any deduplication of identical blocks of data.
The VMware vSAN read/write processes for hybrid and all-flash are not the same.
VMware vSAN Hybrid Read: For an object placed on a VMware vSAN datastore using a RAID-1 configuration, there may be multiple replicas when the number of failures to tolerate is set to greater than 0. Reads may be spread across the replicas; different reads may be sent to different replicas according to the logical block address. This ensures that VMware vSAN does not consume more read cache than is necessary and avoids caching the data in multiple locations.
VMware vSAN All-Flash Read: Since there is no read cache in an all-flash configuration, the process is quite different from the hybrid read operation. When a read is issued on an all-flash VMware vSAN, the write buffer is first checked to see if the block is present. (This is also the case with hybrid, but the difference is that on hybrid, if the block is located in the write buffer it will not be fetched from there.) If the requested block is not in the write buffer, it is fetched from the capacity tier; since the capacity tier is also SSD, the latency overhead of first checking the cache and then the capacity tier is minimal. This is the main reason there isn't a read cache with all-flash: the cache tier is a dedicated write buffer, which in turn frees up the cache tier for more writes, boosting overall IOPS performance.
VMware vSAN All-Flash Write: The write process on all-flash is similar to the write process on hybrid. The major difference is that with all-flash, 100% of the cache tier is assigned to the write buffer, whereas with hybrid only 30% is assigned to the write buffer and the other 70% is assigned to the read cache.
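A simple sketch of the cache-tier split described above; the 400 GB cache device size is just an example, not a requirement from this document.

```python
def cache_tier_roles(cache_device_gb, all_flash):
    """Split of a vSAN cache device between read cache and write buffer."""
    if all_flash:
        # All-flash: the entire cache tier acts as a write buffer.
        return {"write_buffer_gb": cache_device_gb, "read_cache_gb": 0.0}
    # Hybrid: 70% read cache, 30% write buffer.
    return {"write_buffer_gb": cache_device_gb * 0.30,
            "read_cache_gb": cache_device_gb * 0.70}

print(cache_tier_roles(400, all_flash=False))  # hybrid: 120 GB write buffer, 280 GB read cache
print(cache_tier_roles(400, all_flash=True))   # all-flash: 400 GB write buffer, no read cache
```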
VMware Horizon 7 has a built-in storage policy for VMware vSAN. When creating a desktop pool with Horizon, select the "Use VMware vSAN" option for Storage Policy Management.
Each policy can be edited, but it is recommended to refer to the design and sizing guide for VMware vSAN 6.2 located here before making any changes to the policy.
4.1.3 Horizon
The solution is based on VMware Horizon which provides a complete end-to-end solution delivering Microsoft
Windows or Linux virtual desktops to users on a wide variety of endpoint devices. Virtual desktops are
dynamically assembled on demand, providing users with pristine, yet personalized, desktops each time they
log on.
The Horizon license matrix can be found here. The Horizon Enterprise license covers Just-in-Time desktops and App Volumes, whereas these new features are not covered under the Standard and Advanced Horizon licenses.
Horizon Connection Server (HCS) - Installed on servers in the data center, this brokers client connections. The Connection Server authenticates users, entitles users by mapping them to desktops and/or pools, establishes secure connections from clients to desktops, supports single sign-on, sets and applies policies, acts as a DMZ security server for connections from outside the corporate firewall, and more.
Horizon Client - Installed on endpoints. This is the software for creating connections to Horizon desktops and can be run from tablets, Windows, Linux or Mac PCs or laptops, thin clients and other devices.
Horizon Portal - A web portal for accessing links to download the full Horizon clients. With the HTML Access feature enabled, a Horizon desktop can also run inside a supported browser.
Horizon Agent - Installed on all VMs, physical machines and Terminal Services servers that are used as a source for Horizon desktops. On VMs the agent is used to communicate with the Horizon client to provide services such as USB redirection, printer support and more.
Horizon Administrator - A web portal that provides admin functions such as deploying and managing Horizon desktops and pools, setting and controlling user authentication and more.
Horizon Composer - This software service can be installed standalone or on the vCenter server and enables the deployment and creation of linked clone desktop pools (also called non-persistent desktops).
vCenter Server - This is a server that provides centralized management and configuration of the entire virtual desktop and host infrastructure. It facilitates configuration, provisioning and management services. It is installed on a Windows Server 2008 host (which can be a VM).
Horizon Transfer Server - Manages data transfers between the data center and Horizon desktops that are checked out to end users' desktops in offline mode. This server is required to support desktops that run the Horizon client with Local Mode options. It performs replication and syncing of offline images.
When Instant Clone technology is used in conjunction with VMware App Volumes and User Environment
Manager, administrators can use Instant Clone Technology to rapidly spin up desktops for users that retain
user customization and persona from session to session, even though the desktop itself is destroyed when
the user logs out. Virtual desktops benefit from the latest O/S and application patches automatically applied
between user logins, without any disruptive recompose.
Blast Extreme is network-friendly, leverages both TCP and UDP transports, powered by H.264 to get the best
performance across more devices, and reduces CPU consumption resulting in less device power consumed
for longer battery life.
VMware App Volumes is a transformative solution that delivers applications to Horizon virtual desktops. Applications installed on multi-user AppStacks or user-specific writable volumes attach instantly to a desktop at user login. The App Volumes user experience closely resembles that of applications natively installed on the desktop. With App Volumes, applications become VM-independent objects that can be moved easily across data centers or to the cloud and shared with thousands of virtual machines.
Policy-Managed Client Features enable IT to use policy to define which specific security-impacting features are accessible upon login. These include clipboard redirection, USB, printing, and client drives. All of these can be enforced contextually based on role, evaluated at logon/logoff, disconnect/reconnect and at pre-determined refresh intervals for consistent application of policy across the entirety of the user experience. For example, a user logging in from a network location considered unsecure can be denied access to USB and printing. Additionally, PCoIP bandwidth profile settings allow IT to customize the user experience based on user context and location.
True SSO streamlines secure access to a Horizon desktop when users authenticate via VMware Identity Manager. A short-lived VMware Horizon virtual certificate is generated, enabling a password-free Windows login and bypassing the usual secondary login prompt users would encounter before getting to their desktop.
The total number of required virtual RDSH servers is dependent on application type, quantity and user load.
Deploying RDSH virtually and in a multi-server farm configuration increases overall farm performance,
application load balancing as well as farm redundancy and resiliency.
The general guidance for RDSH NUMA-alignment on the Dell EMC VxRail is as follows:
GRID vGPU is the industry's most advanced technology for sharing true GPU hardware acceleration between multiple virtual desktops without compromising the graphics experience. Application features and compatibility are exactly the same as they would be at the user's desk.
With GRID vGPU technology, the graphics commands of each virtual machine are passed directly to the
GPU, without translation by the hypervisor. This allows the GPU hardware to be time-sliced to deliver the
ultimate in shared virtualized graphics performance.
For more information about NVIDIA GRID vGPU, please visit this link.
M60-8A (8GB): 1 vGPU per GPU, 2 per board, 4 per server (2 boards)
M60-4A (4GB): 2 vGPUs per GPU, 4 per board, 8 per server
M60-2A (2GB): 4 vGPUs per GPU, 8 per board, 16 per server
M60-1A (1GB): 8 vGPUs per GPU, 16 per board, 32 per server
The A profiles support 1 virtual display head at a maximum resolution of 1280x1024 and are licensed as GRID Virtual Application profiles.
* Supported guest operating systems listed as of the time of this writing. Please refer to NVIDIA's documentation for the latest supported operating systems.
GRID Virtual Application: for organizations deploying RDSH solutions. Designed to deliver Windows applications at full performance.
GRID Virtual PC: for users who need a virtual desktop, but also need a great user experience leveraging PC applications, browsers, and high-definition video.
GRID Virtual Workstation: for users who need to use professional graphics applications with full performance on any device, anywhere.
**CUDA and OpenCL only supported with M10-8Q, M10-8A, M60-8Q, or M60-8A profiles
The GRID vGPU Manager, running on the hypervisor and installed via a VIB, controls the vGPUs that can be assigned to guest VMs. A properly configured VM obtains a license from the GRID license server during the boot operation for a specified license level. The NVIDIA graphics driver running on the guest VM provides direct access to the assigned GPU. When the VM is shut down, it releases the license back to the server. If a vGPU-enabled VM is unable to obtain a license, it will run at full capability without the license, but users will be warned each time it tries and fails to obtain a license.
VxRail Manager 2 8 1 32
Events
Composer
Initial placement of all databases into a single SQL instance is fine unless performance becomes an issue, in which case the databases need to be separated into separate named instances. Enable auto-growth for each DB. Best practices defined by VMware are to be adhered to in order to ensure optimal database performance.
Align all disks to be used by SQL Server with a 1024K offset and then format them with a 64K file allocation unit size (data, logs and TempDB).
Give consideration to eventual scaling and to access to components that may live on one or more servers (SQL databases, VMware services) during the initial deployment. Use CNAMEs and the round robin DNS mechanism to provide a front-end mask to the back-end server actually hosting the service or data source. For example, the CNAME VDISQL is created to point to SQLServer1. If a failure scenario occurred and SQLServer2 needed to start serving data, we would simply change the CNAME in DNS to point to SQLServer2. No infrastructure SQL client connections would need to be touched.
The following screenshot shows the VMkernel adapter for the management network (vmk0), vMotion and
VMware vSAN Network (vmk2) on a distributed switch.
dvSwitches should be used as desired for VM traffic, especially in larger deployments, to ease the management burden across numerous hosts. In the VxRail rack model both management hosts connect to shared storage and so require additional VMkernel ports. Network share values should be configured equally among the VMkernel port groups that share a physical set of network adapters.
VMware vSAN traffic can run on 1Gb networks as well as 10Gb networks for the hybrid configuration, but 10Gb is recommended and is required for the all-flash configuration. A standard switch configuration can be used for proof of concept, while a VMware distributed virtual switch configuration is highly recommended for production deployments.
The VMkernel adapter for host management traffic is configured on a 10Gb network with a standard switch. It is recommended that the network configuration for the VMware vSAN storage is a 10Gb network with a distributed switch configuration.
VMware NSX is implemented via a layered architecture consisting of data, control and management planes. The NSX vSwitch exists within, and requires, the vSphere Distributed Switch to abstract the physical network while providing access-level switching in the hypervisor. NSX enables the use of virtual load balancers, firewalls, logical switches and routers that can be implemented and scaled seamlessly to suit any deployed architecture. VMware NSX complements Dell Networking components deployed ToR, leaf/spine or at the core.
Power of Choice: choose from best-of-breed open networking platforms, operating systems and applications.
Accelerated Innovation: take advantage of open networking with open source, standards-based tools and expertise to help accelerate innovation.
Open Networking Platform: all Dell Networking data center switches support the Open Network Install Environment (ONIE), allowing customers to choose between multiple operating systems and meet their unique needs.
Hardware VTEP Gateway: Layer 2 gateway through VXLAN Tunnel End Points (VTEP) bridges virtual and physical infrastructures.
Virtual Switching: VXLAN based network overlays enable logical layer 2 overlay extensions across a routed (L3) fabric within and across data center boundaries.
Virtual Routing: dynamic routing between virtual networks performed in a distributed manner in the hypervisor kernel, and scale-out routing with active-active failover with physical routers.
Load Balancing: L4-L7 load balancer with SSL offload and pass through, server health checks, and App Rules for programmability and traffic manipulation.
For more information on VMware NSX and integrated offers from Dell Networking please see the Dell
Networking Solution Brief and the Reference architecture.
Virtual Desktop Host/Compute Servers: metric is VMs per physical host. Scale horizontally with additional hosts and clusters added as necessary; scale vertically with additional RAM or CPU compute power.
VMware vCenter: metrics are VMs per physical host and/or ESX hosts per vCenter instance. Scale horizontally by deploying additional servers and using linked mode to optimize management; scale vertically with additional vCenter management VMs.
Database Services: metrics are concurrent connections and responsiveness of reads/writes. Scale horizontally by migrating databases to a dedicated SQL server and increasing the number of management nodes; scale vertically with additional RAM and CPU for the management nodes.
File Services: metrics are concurrent connections and responsiveness of reads/writes. Scale horizontally by splitting user profiles and home directories between multiple file servers in the cluster (file services can also be migrated to the optional NAS device to provide high availability); scale vertically with additional RAM and CPU for the management nodes.
Additional switches are added to the existing network, thereby equally spreading each host's network connections across multiple switches.
Additional ESXi hosts are added in the compute or management layers to provide N+1 protection.
Applicable VMware Horizon infrastructure server roles are duplicated and spread amongst management host instances, where connections to each are load balanced via the addition of F5 appliances.
The virtual machines that are deployed via Horizon are policy driven, and one of these policy settings is Number of Failures To Tolerate (FTT). The default value for FTT is 1, which makes a mirrored copy of the virtual machine's VMDK, so if the VMDK is 40 GB in size then 80 GB of virtual machine space is needed. The configuration recommended by VMware for a VMware vSAN cluster with FTT=1 and RAID-1 is four nodes, which ensures that the virtual machines are fully protected during operational and maintenance activities. This configuration can also survive another failure even when there is a host already in maintenance mode.
The number of supported VMs (200*) is a soft limit; this is discussed further in section 6 of this document. If the customer desires, some role VMs can optionally be protected further in the form of a cold stand-by VM residing on an opposing management host. A vSphere scheduled task can be used, for example, to clone the VM to keep the stand-by VM current. Note: in the HA option there is no file server VM; its duties have been replaced by introducing a NAS head.
The following will protect each of the critical infrastructure components in the solution:
For further protection in an HA configuration, deploy multiple replicated Horizon Connection Server instances in a group to support load balancing and HA. Replicated instances must exist within a LAN-connected environment; it is not a recommended VMware best practice to create a group across a WAN or similar connection.
*The soft limit for the number of VMs supported per host is 200; this is due to the total number of objects that are supported per cluster. This is a factor in very large clusters, but for small to medium cluster configurations it should not be an issue.
The detailed validation results and analysis of these reference designs are in the next section.
6.1 Purpose
The purpose of this testing is to validate and provide density figures for the architectural design for Dell EMC VxRail appliance nodes with vSAN 6.2 and VMware Horizon 7. This validation focuses on the V Series platform configurations.
This testing was completed on the V470 Hybrid B5 & C7 node configurations.
Explain the configuration and testing methodology for the density testing.
Provide test results and performance analysis for the configurations.
Provide density limits achieved for the configurations.
Determine maximum density for pooled desktops using LoginVSI Task Worker, Knowledge Worker
and Power Worker workloads.
Analyze the test results and provide information on possible resource constraints that limit the density
or performance.
Avg CPU %: the values shown in the tables are the compute host steady state averages for CPU usage.
Peak Memory Utilization: the figures shown in the tables are the peak consumed memory and peak active memory values per host.
Peak IOPS: results are calculated from the disk IOPS figure at the beginning of the steady state period divided by the number of users.
6.2.1 VM configurations
The following table summarizes the various workload and profile configuration test results performed on the
Dell EMC VxRail V470/V470F platform using Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz and Intel(R)
Xeon(R) CPU E5-2660 v4 @ 2GHz
We also need to factor in how much free capacity, or slack space, needs to be preserved when designing the capacity requirement for the vSAN cluster. The recommendation from VMware is that this should be 30%. The reasoning for this slack space size is that vSAN begins automatically rebalancing when a disk reaches the 80% full threshold, and the additional 10% is added as a buffer. This is not a hard limit or set via a security policy, so the customer can actually use this space, but they should be made aware of the performance implications of going over the 80% full threshold. More information can be found on the design and sizing of a vSAN 6.2 cluster here.
6.3.1 HW configurations
Testing for VxRail with Horizon was completed on the V470 B5 & C7 Hybrid configurations. The B5 figures provided at the start of this section were derived from these results.
6.3.3 SW configurations
The software included in the solution and testing is as follows:
Profile: this is the configuration of the virtual desktop - the number of vCPUs and the amount of RAM configured on the desktop (i.e. available to the user). For this project, instead of using the standard Dell Wyse Datacenter profiles, we used VM profiles specified by customers as described in the table at the start of this section.
Workload: this is the set of applications used for performance analysis and characterization (PAAC) of Dell Wyse Datacenter solutions (e.g. Microsoft Office applications, PDF reader, Internet Explorer, etc.).
Load-testing on each of the profiles described in the VM Profile Used table above is carried out using the Task Worker, Knowledge Worker and Power Worker workloads. Further information for each workload can be found on Login VSI's website. Note that for Login VSI testing, the following login and boot paradigm is used:
PAAC Methodology
At different times during testing, the testing team will complete some manual User Experience Testing while
the environment is under load. This will involve a team member logging into a session during the run and
completing tasks similar to the User Workload description. While this experience will be subjective, it will help
provide a better understanding of the end user experience of the desktop sessions, particularly under high
load, and ensure that the data gathered is reliable.
For all workloads, the performance analysis scenario will be to launch a user session every 10 seconds.
Once all users have logged in, all will run workload activities at steady-state for 30 minutes and then logoffs
will commence.
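For rough planning, the login and steady-state paradigm above translates into a simple duration estimate; the user counts in the sketch below are examples taken from the later test runs, and the helper name is illustrative.

```python
# Estimate the logon-phase length and the active test window (logon + steady state).
def test_window_minutes(users, launch_interval_s=10, steady_state_min=30):
    logon_min = users * launch_interval_s / 60
    return round(logon_min, 1), round(logon_min + steady_state_min, 1)

for users in (130, 170, 230):   # example densities from the results section
    print(users, "users ->", test_window_minutes(users))
```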
The following table shows the key metrics for each workload.
Login VSI testing showed that VSI Max was not reached for the Task, Power or Knowledge Worker workloads. A summary is provided in the table below.
[Charts: compute host CPU usage over each test run; captioned chart: CPU Usage, 130 Knowledge Worker Workload]
Memory usage is monitored on the ESXi host; the memory metrics monitored are consumed, active, balloon and swap used, as swap and ballooning usage would indicate that host memory has reached saturation point and VM performance may start deteriorating. All tests were carried out on hosts with 384 GB of physical memory installed, and no swapping or ballooning was experienced during the tests.
[Charts: host memory utilization over each test run (Granted, Active, Swap used, Balloon, Consumed)]
There were no issues with network usage on any of the test runs. All tests showed that the reboot of the VM
pool before beginning Login VSI testing produced the highest spike in network activity. There is a significant
reduction in activity once the steady state phase is reached after all machines have logged on.
[Charts: host network usage over each test run; captioned chart: Network Usage, 130 Knowledge Worker Workload]
The following table shows the key metrics for each workload. Please note that for the Power Worker test, a 420 user test was run with 140 virtual machines placed on each host. The graphs for this test run include data from all three hosts.
Workload | FTT Policy | Density per Host | Avg CPU % | Peak Memory Consumed GB | Peak Memory Active GB | Peak IOPS/User
Maximum CPU utilization for both the Knowledge and Power Worker workloads was in the region of the 85% threshold, indicating the number of users tested was appropriate. The Task Worker workload reached the 80% mark but, as with the V470/V470F-B5 test run, VMs configured with only one vCPU tend to load the CPU less. The configuration under test is the V470 Hybrid C7 version, so the E5-2698 v4 processor was used.
[Chart: CPU Usage, 230 Task Worker Workload]
[Chart: CPU Usage, 170 Knowledge Worker Workload]
[Chart: CPU Usage, 420 user Power Worker Workload, 3 hosts]
Latency on the datastore spiked temporarily during the reboot phase of the 420 user Power Worker test. It briefly rose above the 20 ms mark and then settled well below that mark thereafter, with no spikes during the login and steady state phases, reaching only 4 ms during these phases. On the Knowledge and Task Worker test runs the latency did not rise above the 20 ms threshold at any point, reaching only 3-4 ms maximum during the logon and steady state phases.
IOPS peaked during the boot phase for each profile test, then settled during the login phase and reduced further once steady state was reached.
These charts are captured from within vSphere; this capability was released with vSAN 6.2, so we no longer need to use vSAN Observer as was the case with past vSAN validations.
The statistics below for the Task and Knowledge Worker runs are on a per-host basis; as vSAN scales linearly, to calculate the total IOPS for a three node cluster you would multiply by three. The Power Worker figures are for three hosts.
Memory usage is monitored on the ESXi host; the memory metrics monitored are consumed, active, balloon and swap used, as swap and ballooning usage would indicate that host memory has reached saturation point and VM performance may start to deteriorate. All tests were carried out on hosts with 512 GB of physical memory installed. There was no swapping or ballooning on the Task Worker test run, but a small amount of ballooning took place on the Knowledge and Power Worker test runs during the steady state phase. No swapping took place on any of the test runs.
[Chart: Memory Utilization, 230 Task Worker Workload (Granted, Active, Swap used, Balloon, Consumed)]
[Chart: Memory Utilization, 170 Knowledge Worker Workload (Granted, Active, Swap used, Balloon, Consumed)]
[Chart: Consumed Memory Utilization, 140 Power Worker Workload per host, Hosts A/B/C]
There were no issues with network usage on any of the test runs. There is a significant reduction in activity
once the steady state phase is reached after all machines have logged on.
[Chart: Network Utilization, 230 Task Worker Workload]
[Chart: Network Utilization, 170 Knowledge Worker Workload]
[Chart: Network Utilization, Power Worker Workload, Hosts A/B/C]
The following testing was completed on the V470-C7 configuration with 2 x M60 GPU cards added to the configuration. The Knowledge Worker workload was used with the M60-1Q GPU profile; with a 2 x M60 configuration the maximum number of VMs is 32. We are currently unable to test with the maximum supported display resolution for this profile, which is 4096x2160, so 2560x1600 was used instead.
[Charts: 32 client Knowledge Worker test run with the M60-1Q profile - VSImax, CPU usage, active memory, consumed memory, network KBytes per second, datastore IOPS and latency, across the boot, logon, steady state and logoff phases]
The test follows the same methodology as previous VDI solution testing, using the Task Worker workload in Login VSI. The graphs below show the resource utilization of the physical host server (red line) and two of the four RDSH virtual machines that reside on the physical host (blue and green lines). Note that there is no reboot phase in the results graphs, as the RDSH VMs provide the user desktops directly and there are no linked clones to reboot as with the earlier testing.
VSI Max was not reached on the test run, indicating there was no decrease in user experience.
The host CPU was pushed to a 92% steady state average during this test run, and the two sampled RDSH VMs to approximately 80% each. The data suggests that 350 Task Worker users are at the upper end of the capabilities of this configuration.
[Chart: CPU Utilization, RDSH, 350 Task Worker Workload, RDSH Host & 2 RDSH VMs]
Latency on the datastore did not approach the 20 ms threshold. There were no latency spikes during the logon or steady state phases of the test run, with 4.5 ms the maximum value reached, and 6.4 ms the maximum during the logoff phase.
As there was no reboot spike during RDSH testing, IOPS increased steadily as more users logged on, reaching a peak just as the steady state phase began. At peak, IOPS reached approximately 2,000, resulting in approximately 5.7 IOPS per user.
These charts are captured from within vSphere; this capability was released with vSAN 6.2, so we no longer need to use vSAN Observer as was the case with past vSAN validations. The statistics below are on a per-host basis; as vSAN scales linearly, to calculate the total IOPS for a three node cluster you would multiply by three.
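A quick check of the per-user and per-cluster arithmetic described above (the 2,000 peak IOPS and 350 users come from this RDSH test; the three-node extrapolation assumes the linear scaling noted in the text):

```python
peak_iops_per_host = 2000
users_per_host = 350
print(round(peak_iops_per_host / users_per_host, 1))  # ~5.7 IOPS per user
print(peak_iops_per_host * 3)                         # ~6,000 IOPS for a 3-node cluster
```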
Memory usage is monitored on the ESXi host and the two sampled RDSH VMs. There was no ballooning on the physical host or the two sampled RDSH VMs, and no swapping took place on the physical host.
[Chart: Active Memory Utilization, RDSH, 350 Task Worker Workload, RDSH Host & 2 RDSH VMs]
[Chart: Consumed Memory Utilization, RDSH, 350 Task Worker Workload, RDSH Host & 2 RDSH VMs]
Network utilization was not an issue in this test, with host usage reaching a maximum of approximately 65,000 KBytes per second.
[Chart: Network Utilization, RDSH, 350 Task Worker Workload, RDSH Host & 2 RDSH VMs]
6.6 Conclusion
The testing completed with the V470 Hybrid B5 & C7 was in line with the results we had previously captured for VMware vSAN 6.2. The testing was completed on a hybrid configuration for both, but there would not be a density increase or decrease with an all-flash configuration. The bottleneck is the CPU/memory configuration, and the better performing disks and additional IOPS that come with the all-flash configuration would not change the number of users we can host per appliance. The resource reservation for the host that runs the management VMs is approximately 30%; taking that into account, a four node V470 C7 Hybrid appliance would host 850 Task Worker profile VMs.
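A hedged back-of-the-envelope check of the 850 figure, assuming the per-node Task Worker density of roughly 230 users from the C7 results above and the ~30% management reservation on one of the four nodes:

```python
per_node_task_users = 230      # per-host Task Worker density from the C7 results
compute_only_nodes = 3         # nodes with no management VMs
mgmt_node_share = 1 - 0.30     # ~30% reserved for management VMs on one node

total_users = (compute_only_nodes * per_node_task_users
               + per_node_task_users * mgmt_node_share)
print(round(total_users))      # ~851, consistent with the quoted 850
```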
Thanks to Mike Hayes from the Limerick CSC team for his help and support with the Graphics Functionality
Testing that was completed on VxRail. Mike is a Solutions Architect working at the Dell Customer Solution
Center in Limerick, Ireland. Responsible for Client Solutions and VDI engagements at the Center in EMEA,
Mike has a strong background in Desktop and Server Virtualization with over 15 years' experience working in enterprise-class IT environments. Highly skilled in Microsoft, VMware and Citrix platforms, Mike primarily works on design workshops and Proof-of-Concept activity around VDI and high performance graphics, including Workstation and VR Technology. Twitter: @MikeJAtDell
Thanks to Kevin Corey from the Limerick CSC team for his help and support with the network setup for this validation. Kevin is a Network Solution Architect with over 17 years' experience working with enterprise environments. Primarily focusing on data center networking, Kevin has experience working with technology from all major network vendors.
Thanks to Gus Chavira for his continued guidance and support for this program. Gus is the Dell CCC Alliance Director to VMware. Gus has worked in the capacities of Sys Admin, DBA, Network and Storage Admin, Virtualization Practice Architect, and Enterprise and Solutions Architect. In addition, Gus holds a B.S. in Computer Science.
Thanks to Andrew Mc Daniel for his support during this program. Andrew is the CTO/Strategy Director with CCC, responsible for managing the team that examines new technologies and research projects to evaluate the potential benefit of internal and external partners' hardware and software to Dell's E2E solutions for EUC, and their strategic integration.
Thanks to Rick Biedler for his support during this program, Rick is the Engineering Director for Datacenter
Appliances at Dell, managing the development and delivery of enterprise class desktop virtualization solutions
based on Dell Datacenter components and core virtualization platforms.
Peter Fine is the Chief Architect and CTO of EUC Enterprise Engineering at Dell. Peter owns the strategy and architecture and leads the engineering of the datacenter EUC product and solutions portfolio. Peter also has extensive experience building and managing data centers for Fortune 500 and financial services companies. Follow Peter @ExitTheFastLane or www.ExitTheFastLane.com.