This is Part 2 of my previous blog about VAAI (vStorage API for Array Integration) with more details about VAAI. VAAI offloads some of the I/O related functions to the VAAI-enable storage array, hence giving the hypervisor more compute and memory resource to do it other functions. And the storage array, upon receiving the VAAI command, will execute whatever that is required of it.
Why is VAAI important? What does it do that makes it so useful and important to the hypervisor?
VAAI is about a set of new SCSI commands. And there are 3 important ones:
And if you want to see the VAAI Hardware Accelerated Full Copy (aka XSET) in action, here’s a little video showing how it is done in an EMC environment.
The primary significance and noticeable benefit is definitely performance. The secondary benefit, though not so obvious, is allowing VMware and its hypervisor to scale because it does not get bogged down by some of the I/O functions that it is not meant to do.
There were some new additions in vSphere 5.0 for VAAI. From its FAQ, it listed in ESX5.0, support for NAS Hardware Acceleration is included with support for the following primitives:
- Full File Clone – Like the Full Copy VAAI primitive provided for block arrays, this Full File Clone primitive enables virtual disks to be cloned by the NAS device.
- Native Snapshot Support – Allows creation of virtual machine snapshots to be offloaded the array.
- Extended Statistics – Enables visibility to space usage on NAS datastores and is useful for Thin Provisioning.
- Reserve Space – Enables creation of thick virtual disk files on NAS.
So, there you have it folks. Why VAAI? Here’s why.
We are used to block-based approach and also the file-based approach to data. The 2 diagrams below shows the basics of how we access data in both block-based and file-based data on the storage device.
For block-based , the storage of the blocks is merely in arrays of unrelated contiguous blocks. For file-based, as seen below,
there is another layer of abstraction, and this is called the file system. But if you seen both diagrams above, there are some random numbers in light blue and that is to represent the storage device, the hard disk drive’s export of “containers” to the file system or the application that is accessing the storage device. This is usually the LBA (Logical Block Addressing), which is basically set of schematics that defines the locations on the hard disk drives. LBA tells the location of where the data is stored. For more information about LBA, check out this Wikipedia definition. But the whole idea is LBA is dumb. It is pretty much static and exported to file systems and applications so that these guys can do something with it.
There’s something brewing in the background since 1994 and it is one of the many efforts to make intelligent storage devices. This new object-based interface was part of the research project done by Carnegie-Mellon University (CMU). Initially, it was known as Network Attached Secure Disk (NASD) but eventually made its way to the working group in SNIA, and developing it for ANSI T10 INCITS standard. ANSI T10 is the guardian of all SCSI standards. This is called Object Storage Device (OSD). The SCSI architecture diagram below shows the layer where OSD resides.
The motivation for this simple: To make storage devices of today to do more computational work, in particularly I/O, relieving the hosts and the local systems to concentrate other computational processing work. And the same time, the local systems must have some level of interactivity and management between the storage object and the computational hosts.
In the diagram below which compares both block-based and OSD,
you can see the separation of file system management interface that is at the kernel-space of the local host/system and this is replaced by the OSD Management interface at the storage device.
What does this all mean? This means that using LBA type of addressing that we are familiar with in the block-based and file-based storage is no longer the way to go, because as I mentioned before, LBA is dumb.
OSD, in some way, replaces the LBA with OIDs (Object IDs). The existing local system and/or its file system will interact with the storage devices with OIDs and the OIDs links to its respective objects storage. And the object will carry a lot of metadata, that represents the object, giving it the intelligent and management capability of the object.
The prominence of the metadata in the OSD would mean that we can build much more intelligent systems in the future. The OIDs and the objects can be grouped together in a flat design or can be organized and categorized in a virtual, hierarchical model.
Object storage is an intelligent evolution of disk drives that can store and serve objects rather than simply place data on tracks and sectors. And it can bring the following benefits:
- Intelligent space management in the storage layer
- Data aware pre-fetching and caching
- Robust shared access by multiple clients
- Scalable performance using off-loaded data path
- Reliability security