Abstract

In the last couple of years, VMware has engaged with storage vendors to come up with solutions like VAAI, VASA and VVols. These solutions require close engagement with core partners at both the design and implementation phases, which increases VMware's reliance on partner code drops to release a feature. The quality of VMware code and the ability of QE to test the entire feature depend heavily on the quality of partner code drops and the speed with which partners can fix issues found in testing. Release timeline differences between partners and vSphere often make things harder. Thus, based on the VAAI experience, VMware developed an internal VASA provider (aka SampleVP) for VASA and VVols, which enabled developers to test features before providing periodic drops to partners during development phases. Initially, SampleVP was used as a model VP, which helped partners write their VASA 1.0 providers and also helped VMware QE with VASA 1.0 testing. For VVols, it was extended with SCSI and NFS backends using the Linux Volume Manager. SampleVP was primarily designed for correctness and functional verification rather than for high performance at large scale. SampleVP being a basic implementation, VMware still has to rely on partner drops for large-scale testing, which poses a serious hurdle. xVP re-architects SampleVP for large-scale testing. Primarily, the changes involve using ZFS as the storage backend and a MySQL database rather than XML files for metadata. This provides high performance on data services.

1. Introduction

VMware developed SampleVP as an internal VASA provider for functional testing. It is used for testing vSphere features like VASA and VVols. With the introduction of VVols, key VM life cycle operations such as snapshot and clone are offloaded to the VASA provider. From a scalability point of view, it is necessary to perform these operations fast even for larger disk sizes. The existing architecture of SampleVP performed poorly in both the data path and the control path. This impacts VMware's ability to deliver newer features faster. xVP is a major re-architecture of the existing SampleVP, aimed at providing performance, scalability and ease of development. xVP primarily uses ZFS in its backend to provide instantaneous snapshot and fast clone operations. Apart from supporting ZFS, it allows backend plugins, so that third-party storage test targets like SanBlaze can be integrated with xVP.

Fig 1 shows the basic architecture of xVP. It includes 4 main components: a Java frontend, a generic Python backend, backend plugins and a MySQL database.

[Fig 1: Basic architecture of xVP. VC and a local client connect to the VASA Server (VP), whose Java front end runs as a Tomcat service.]
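The plugin boundary is the architecturally interesting piece of this design. Below is a minimal sketch of what such a backend interface could look like in the backend's own language, Python; the class and method names are our assumptions for illustration, not xVP's actual API.

    from abc import ABC, abstractmethod

    class BackendPlugin(ABC):
        """Hypothetical plugin interface: the generic Python backend
        delegates data path operations to a plugin, so ZFS or a third-party
        test target like SanBlaze can be swapped in behind the same API."""

        @abstractmethod
        def create_vvol(self, vvol_id: str, size_bytes: int) -> None: ...

        @abstractmethod
        def snapshot(self, vvol_ids: list[str], snap_name: str) -> None: ...

        @abstractmethod
        def clone(self, src_vvol_id: str, dst_vvol_id: str) -> None: ...

    class ZfsBackend(BackendPlugin):
        """Implemented on ZFS zvols, as described in section 7.3."""

    class SanBlazeBackend(BackendPlugin):
        """Planned integration with the SanBlaze test target."""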
3. Related Work
4.2 Architecture
Fig 3 shows the detailed xVP architecture. It describes the various xVP frontend and backend components and the interactions between them. Subsequent sections describe the frontend and backend components in more detail.
5. xVP Frontend
Fig 4: ConfigDB schemas for VVol 1.0 and VVol replication

7. xVP Backend

The primary feature of the xVP backend is its support for data path operations using the ZFS backend.

Why ZFS:
The xVP backend is where API processing happens. SampleVP used LVM for its storage backend. One of the major drawbacks of LVM was its slow snapshot and clone performance, which is what the move to ZFS addresses.

7.3 ZFS Backend:
ZFS is a copy-on-write transactional file system that also supports logical volumes and has snapshot and clone capabilities. We utilize ZFS logical volumes (zvols) to store VVol objects. ZFS supports taking snapshots of multiple volumes at the same time. This is important for many VASA workflows, especially when creating points in time for VASA 3.0 replication support. xVP being a test target, it utilizes a single ZFS pool for the entire xVP instance, including all containers and sites (in replication configurations), which allows the use of native clones across containers and for replication purposes.
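To make the zvol mapping and group snapshot concrete, here is a minimal sketch of the backend side in Python. The pool name, helper and sizes are illustrative rather than xVP's actual code; the zfs(8) invocations themselves are standard.

    import subprocess

    POOL = "xvp-pool"  # illustrative name; xVP uses one ZFS pool per instance

    def zfs(*args):
        # Run a zfs(8) command, raising on failure.
        subprocess.run(["zfs", *args], check=True)

    def create_vvol(vvol_id, size="10G"):
        # One sparse (-s) zvol per VVol object; sparse disks are the common case.
        zfs("create", "-s", "-V", size, f"{POOL}/{vvol_id}")

    def snapshot_vvols(vvol_ids, snap_name):
        # A single `zfs snapshot` invocation with multiple targets creates all
        # of the snapshots in one transaction, yielding a crash-consistent
        # point in time across the group, which is exactly what VASA 3.0
        # replication workflows need.
        zfs("snapshot", *[f"{POOL}/{v}@{snap_name}" for v in vvol_ids])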
We had to add a few low-level operations to ZFS to support VASA bitmap operations on VVols. In particular, we added support for retrieving the allocated-block bitmap of a zvol, diffing arbitrary zvols, and copying the differences between arbitrary zvols. These features were implemented by accessing ZFS internal block metadata rather than reading the data itself, which improves performance significantly, especially for sparse disks, which are very common in vSphere environments in general and in test environments in particular.
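The paper does not give the interface of these ZFS extensions, so the sketch below invents one (a diff-bitmap subcommand and a fixed block granularity) purely to show the shape of a metadata-driven copy: the bitmap identifies the blocks that differ, and only those blocks are ever read or written.

    import subprocess

    BLOCK_SIZE = 64 * 1024  # assumed block granularity for the bitmap

    def diff_bitmap(src_zvol, dst_zvol):
        # Hypothetical wrapper around xVP's ZFS extension: returns, from block
        # metadata alone, the indices of blocks where the two zvols differ.
        out = subprocess.run(["zfs", "diff-bitmap", src_zvol, dst_zvol],
                             check=True, capture_output=True, text=True).stdout
        return [int(i) for i in out.split()]

    def copy_differences(src_zvol, dst_zvol):
        # Only differing blocks are read and written; for a mostly sparse disk
        # this touches a tiny fraction of the nominal volume size.
        with open(f"/dev/zvol/{src_zvol}", "rb") as src, \
             open(f"/dev/zvol/{dst_zvol}", "r+b") as dst:
            for idx in diff_bitmap(src_zvol, dst_zvol):
                src.seek(idx * BLOCK_SIZE)
                dst.seek(idx * BLOCK_SIZE)
                dst.write(src.read(BLOCK_SIZE))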
Certain VASA workflows, such as reverting to an arbitrary snapshot, are not natively supported by ZFS, but we were able to overcome these limitations by utilizing native ZFS clones and keeping track of the relationships within the VVol object hierarchy in the xVP database. The overall goal, which we were able to achieve completely in our first implementation, was to implement all VASA workflows, including replication, in a metadata-driven manner, without the need to read, compare and copy volume data, which SampleVP had to do for many workflows.
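A sketch of the clone-based revert idea, under the same caveat: the names and the dict standing in for the xVP database are illustrative.

    import subprocess

    def zfs(*args):
        subprocess.run(["zfs", *args], check=True)

    def revert_vvol(db, vvol_id, snapshot):
        # Plain `zfs rollback` only restores the latest snapshot without
        # destroying newer ones, so an arbitrary revert is done by cloning
        # the desired snapshot into a fresh zvol and repointing the VVol's
        # database record at it; the VVol identity itself never changes.
        old_zvol = db[vvol_id]            # stand-in for a ConfigDB lookup
        new_zvol = f"{old_zvol}-rev"      # illustrative naming scheme
        zfs("clone", f"{old_zvol}@{snapshot}", new_zvol)
        db[vvol_id] = new_zvol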
7.4 VVolFS:
We added a kernel module, VVolFS, which exports a ZFS device as a separate file system with a single file representing that VVol. This is used for mounting and exporting NFS non-config VVols.

The rest of the data path stack remains the same. Files on ZFS are exported over SCST/iSCSI and NFS for IO. Information about the binding of VVols to protocol endpoints is stored in ConfigDB.
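The paper does not spell out the binding table, so the following sketch only illustrates the idea; the column names are assumptions, and sqlite3 is used to keep it self-contained even though xVP's ConfigDB is MySQL.

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("""CREATE TABLE vvol_binding (
                      vvol_id      TEXT NOT NULL,  -- VVol object identifier
                      pe_id        TEXT NOT NULL,  -- protocol endpoint identifier
                      secondary_id TEXT,           -- e.g. a per-bind SCSI secondary id
                      PRIMARY KEY (vvol_id, pe_id))""")

    # A bind records which protocol endpoint a VVol is reachable through;
    # unbind deletes the row.
    db.execute("INSERT INTO vvol_binding VALUES (?, ?, ?)",
               ("vvol-0001", "pe-lun-256", "8001"))
    print(db.execute("SELECT pe_id FROM vvol_binding"
                     " WHERE vvol_id = 'vvol-0001'").fetchone())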
8. xVP Packaging

Apart from other performance limitations, one of the limitations of SampleVP was slow OVF deployment. This was because the SampleVP base image was oversized and SampleVP code was compiled during the installation process. In xVP, we use a VMware Studio-generated VA to create a thin Ubuntu 14.04 OVF. Since xVP will not be shared with any partner, there is no need to compile a stripped version of the code on the xVP VM, so all xVP bits are packaged as Debian packages. It is also possible to pass OVF parameters to the VA, which makes it easier to control the package installation process without regenerating the OVF image. With these optimizations, a standard xVP VM can be deployed in approximately 70 seconds, which is 5-10 times faster than SampleVP.
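As an illustration of consuming those OVF parameters inside the VM: the OVF environment can be fetched through VMware Tools and parsed, roughly as below. This is a generic sketch rather than xVP's installer, and the xvp.config property and package naming are invented.

    import subprocess
    import xml.etree.ElementTree as ET

    OVF_NS = "{http://schemas.dmtf.org/ovf/environment/1}"

    def ovf_properties():
        # Fetch the OVF environment document via VMware Tools and return
        # its key/value properties as a dict.
        xml_text = subprocess.run(
            ["vmtoolsd", "--cmd", "info-get guestinfo.ovfEnv"],
            check=True, capture_output=True, text=True).stdout
        root = ET.fromstring(xml_text)
        return {p.attrib[OVF_NS + "key"]: p.attrib[OVF_NS + "value"]
                for p in root.iter(OVF_NS + "Property")}

    # Invented property selecting one of the pre-canned configurations:
    config = ovf_properties().get("xvp.config", "default")
    subprocess.run(["apt-get", "install", "-y", f"xvp-config-{config}"],
                   check=True)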
Apart from this, xVP comes with a CLI tool (xvp-mgmt) to manage the xVP setup. It also comes with pre-canned configurations as required by various QE teams.
9. Current Status

Currently, xVP 1.0 implements all VASA 2.0 (VVol 1.0) APIs and is actively used by all QE teams in CAT runs. Due to limitations of the SampleVP snapshot implementation, CAT runs for snapshot, clone and migration workflows used to fail frequently with timeout errors caused by slow storage. With xVP 1.0 we do not see any of these intermittent timeout errors, and the overall full test cycle time has also improved relative to SampleVP.

VASA3 testing: For the 2016 release, xVP 1.1 implements all of the VASA 3.0 (VVol replication) APIs and provides a solid platform for testing VVol-based replication solutions. VVol replication Layer 0 and Layer 1 testing is currently performed on xVP.

System Test: In recent runs, where the xVP VM was deployed on an SSD-backed VMFS5 datastore on an ESX host with 128GB of memory, the System Test team was able to power on 400 VMs using 4 ESX nodes. The VMs were a combination of linked clones, full clones and non-persistent VMs. This is a tremendous improvement over SampleVP, which used to fail to power on even a couple of VMs simultaneously. To put these numbers in perspective, the VVol 1.0 design partners Dell and HP support 1024 VVols/200 VMs. Apart from this, System Test is able to test 1024 VASA endpoints using a handful of xVP VMs. The team is currently fixing issues found in other System Test workflows. Based on the above results, the team is well on course to achieve the promised goal of passing System Test on xVP for the VVol 2.0 replication feature with 500 VMs.

10. Future Work

Future tasks once VASA 3 is released:
- Support multiple VASA versions in a single xVP VM
- Support VASA 1.0 APIs and other related functionality so that SampleVP can be fully deprecated
- Support the VAIO-based VASA APIs
- Implement the Request Filter and SanBlaze plugins

Acknowledgements

We would like to thank Patrick Dirks, Derek Uluki, Naga Ullas Vankayala Harinathagupta, Deepak Babarjung and all of the other folks in VASA3 development who have contributed to the xVP project with design input, reviews, and implementation.