One of the more powerful features of IBM’s PowerVM product suite is the ability to share I/O resources via the Virtual I/O Server (VIOS). VIOS is a virtual appliance running in its own partition(s) and is responsible for managing shared access to network and storage devices. For the purposes of this post, I am going to focus on storage virtualization and the coming changes in VIOS 2.2 SP01 due out later this month.
At present there are two methods for virtualizing disk storage via the Virtual I/O Server. VIOS can either act as a storage virtualizer (vSCSI), or as a storage pass-through by sharing access to Fibre Channel adapters via N_Port ID Virtualization (NPIV). Both methods have their pros and cons, and a single environment may simultaneously use both vSCSI and NPIV to accomplish a given design goal.
The vSCSI storage virtualization method was introduced with VIOS in 2004. With vSCSI virtualization, LUNs are assigned to the VIO server(s) and are re-presented to the client LPARs as generic SCSI disks attached to a vSCSI client adapter. Any specialized storage device drivers or multipath I/O code runs within the VIO server.
When a VIO client accesses a vSCSI disk, the vSCSI initiator makes a hypercall to the PowerVM hypervisor, which in turn notifies the VIO server of the request. The VIO server determines which physical adapter(s) are required to serve the request and sends their hardware addresses to the hypervisor. Finally, the hypervisor maps the adapter physical address to the VIO client’s data buffer address to set up the data transfer directly from the physical adapter to the client. This transfer is done via the Logical Remote DMA protocol and does not require the VIO server to buffer the data transfer. This makes vSCSI data access extremely efficient, as the only virtualization overhead is in the address resolution stage of the data transfer.
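The ordering of those steps can be sketched as a toy model in Python. This only illustrates the sequence described above and is not PowerVM code: the class names (PhysicalAdapter, VioServer, Hypervisor), the LUN-to-adapter table, and the names "fcs0" and "lun7" are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalAdapter:
    """Hypothetical stand-in for a physical HBA owned by the VIO server."""
    name: str
    blocks: dict  # (lun, block_number) -> bytes held on the "disk"

    def dma_into(self, buffer: bytearray, lun: str, block: int) -> None:
        # The adapter writes straight into the buffer it was handed.
        buffer[:] = self.blocks[(lun, block)]


@dataclass
class VioServer:
    """Resolves which adapter serves a LUN; it never stages the payload."""
    adapters: dict            # lun -> PhysicalAdapter
    bytes_buffered: int = 0   # never incremented in this model

    def resolve(self, lun: str) -> PhysicalAdapter:
        return self.adapters[lun]


@dataclass
class Hypervisor:
    vios: VioServer
    trace: list = field(default_factory=list)

    def hypercall_read(self, client_buffer: bytearray, lun: str, block: int) -> None:
        self.trace.append("client -> hypervisor: hypercall")
        self.trace.append("hypervisor -> VIOS: notify")
        adapter = self.vios.resolve(lun)  # address resolution stage
        self.trace.append(f"VIOS -> hypervisor: use {adapter.name}")
        self.trace.append("hypervisor: map adapter to client buffer")
        adapter.dma_into(client_buffer, lun, block)
        self.trace.append("adapter -> client buffer: direct transfer")


hba = PhysicalAdapter("fcs0", {("lun7", 0): b"payload"})
vios = VioServer({"lun7": hba})
hyp = Hypervisor(vios)
buf = bytearray(7)
hyp.hypercall_read(buf, "lun7", 0)
print("\n".join(hyp.trace))
```

In this model the VIO server only answers the "which adapter?" question. The data lands in the client buffer without ever being copied through the VioServer object, which is the property that keeps the overhead low.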
The primary benefit of vSCSI is that it consolidates the device driver setup and configuration into a relatively small number of logical partitions. Instead of having to maintain SDD, PowerPath, or another driver in each of the client partitions, this work is consolidated into the VIO servers. Additionally, if redundant VIO servers are deployed (which is highly recommended), storage device driver code can be updated in the VIO server without affecting the client LPARs. Having a relatively generic storage configuration in the client LPARs reduces the workload on the systems administrators and improves availability.
The most common criticism of the vSCSI approach is that it can become unwieldy to manage in a large environment. Each VIO server that can serve a given client LPAR must have all of the storage that the client LPAR requires mapped to it. In a Live Partition Mobility environment, the VIO servers on each of the physical Power Systems servers that can potentially host the client LPAR must also have the client’s storage mapped. Unless careful documentation is maintained, it is possible to accidentally reassign storage that is in use on one client to another. If this happens, data corruption is the result.
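The bookkeeping this requires can be sketched in a few lines of Python. This is a minimal illustration, assuming the mappings have already been collected into a simple list of (VIO server, LUN identifier, client LPAR) records. The record layout, host names, and LUN identifiers are hypothetical and are not the output format of any VIOS command.

```python
from collections import defaultdict


def audit_vscsi_mappings(mappings, required_vios):
    """Check a vSCSI mapping inventory for two common mistakes.

    mappings: iterable of (vios, lun_id, client) tuples.
    required_vios: dict of client -> set of VIO servers that must see
        all of that client's LUNs (dual VIOS, LPM targets, and so on).

    Returns (conflicts, missing):
      conflicts: lun_id -> set of clients, for LUNs mapped to more than one client
      missing: (client, lun_id) -> set of VIO servers lacking the mapping
    """
    clients_by_lun = defaultdict(set)
    vios_by_client_lun = defaultdict(set)
    for vios, lun_id, client in mappings:
        clients_by_lun[lun_id].add(client)
        vios_by_client_lun[(client, lun_id)].add(vios)

    conflicts = {lun: c for lun, c in clients_by_lun.items() if len(c) > 1}

    missing = {}
    for (client, lun_id), seen in vios_by_client_lun.items():
        absent = required_vios.get(client, set()) - seen
        if absent:
            missing[(client, lun_id)] = absent
    return conflicts, missing


# Hypothetical inventory: two frames, each with a pair of VIO servers.
inventory = [
    ("vios1a", "lun-0001", "db01"),
    ("vios1b", "lun-0001", "db01"),
    ("vios2a", "lun-0001", "db01"),
    ("vios1a", "lun-0002", "app01"),
    ("vios1b", "lun-0002", "web01"),  # same LUN handed to a second client
]
required = {"db01": {"vios1a", "vios1b", "vios2a", "vios2b"}}

conflicts, missing = audit_vscsi_mappings(inventory, required)
print("conflicts:", conflicts)
print("missing:", missing)
```

Here lun-0002 is flagged because two different clients claim it, and db01 is flagged because vios2b, a potential Live Partition Mobility target, has not been given lun-0001.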
NPIV, on the other hand, uses the VIO server to create a synthetic Fibre Channel adapter in the client LPAR. This synthetic adapter has its own WWPN, which appears to exist independently of the physical Fibre Channel HBA in the VIO server. When storage is provisioned to the client LPAR, it is mapped directly to the virtual adapter WWPN. The VIO server does not see the client storage and is not involved in data transfers. Provisioning storage, in effect, looks exactly like provisioning storage to a non-virtual environment.
Since LUNs are provisioned directly from the storage device to the client LPAR, any specialized device drivers and multipath I/O code must be installed in the client LPAR. This makes managing the client LPAR a bit more complicated and can affect availability if updating the driver(s) requires a reboot.
To summarize, vSCSI places all of the intelligence in the VIO server at the expense of increasing complexity in VIO server management. NPIV places all of the intelligence in the VIO client at the expense of complexity in VIO client management. Since, in most environments, the number of VIO clients is an order of magnitude greater than the number of VIO servers, I tend to make use of vSCSI over NPIV. Both methods, however, are equally valid and the choice of which to use should be a pragmatic decision.
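To put rough numbers on that trade-off, here is a back-of-the-envelope sketch in Python. The partition and LUN counts are invented for illustration, and counting "partitions that carry storage driver code" and "LUN mappings to track" is my simplification of management effort, not a measured figure.

```python
def management_footprint(vio_servers: int, vio_clients: int, luns_per_client: int) -> dict:
    """Rough count of the things an administrator has to keep in order.

    vSCSI: vendor drivers live only in the VIO servers, but every VIO
           server must have every client LUN mapped to it.
    NPIV:  LUNs go straight to the clients, but every client carries
           its own vendor drivers.
    """
    return {
        "vscsi_driver_partitions": vio_servers,
        "vscsi_lun_mappings": vio_servers * vio_clients * luns_per_client,
        "npiv_driver_partitions": vio_clients,
    }


# Hypothetical frame: 2 VIO servers, 20 client LPARs, 4 LUNs each.
footprint = management_footprint(vio_servers=2, vio_clients=20, luns_per_client=4)
print(footprint)
# {'vscsi_driver_partitions': 2, 'vscsi_lun_mappings': 160, 'npiv_driver_partitions': 20}
```

With these made-up numbers, vSCSI means maintaining drivers in 2 partitions and tracking 160 mappings, while NPIV means maintaining drivers in 20 partitions. Which cost matters more depends on the environment.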
On October 7, 2010, IBM announced a new set of VIO server based storage management features that look to provide a third way to manage storage in a large virtual I/O environment. VIO server enhancements include thin provisioning of VIOS based storage, shared storage pools, and improved virtual storage management. These enhancements are slated for VIOS 2.2 SP01, which is scheduled to become Generally Available in December 2010.
While I have not yet had the opportunity to test VIOS 2.2 SP01 in my lab, I am told that it provides the ability to explicitly create clusters of VIO servers that are aware of how storage is being used across the cluster. Each VIO server in the cluster will use common device names and metadata. This will allow the creation of a single storage pool that spans all VIO servers that serve a given client population instead of the current process of manually tracking vSCSI assignments. This alone addresses the majority of the problems in large vSCSI environments.
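To show why shared metadata helps, here is a purely conceptual Python sketch. It models only the idea described above, a single allocation table visible to every VIO server in a cluster. It is not the VIOS 2.2 SP01 implementation or its command set, which I have not seen, and every class, method, and name in it is hypothetical.

```python
class SharedStoragePool:
    """Toy model: one allocation table shared by every VIO server in a cluster."""

    def __init__(self, name, vio_servers):
        self.name = name
        self.vio_servers = set(vio_servers)
        self._owner_by_disk = {}  # common metadata: disk name -> client LPAR

    def assign(self, via_vios, disk, client):
        """Assign a disk to a client through any member VIO server."""
        if via_vios not in self.vio_servers:
            raise ValueError(f"{via_vios} is not a member of pool {self.name}")
        owner = self._owner_by_disk.get(disk)
        if owner is not None and owner != client:
            raise ValueError(f"{disk} is already assigned to {owner}")
        self._owner_by_disk[disk] = client

    def owner(self, disk):
        """Return the client that owns a disk, or None if it is unassigned."""
        return self._owner_by_disk.get(disk)


pool = SharedStoragePool("pool1", ["vios1a", "vios1b", "vios2a"])
pool.assign("vios1a", "disk_db01_data", "db01")

try:
    # A different VIO server tries to hand the same disk to another client.
    pool.assign("vios2a", "disk_db01_data", "web01")
except ValueError as err:
    print("rejected:", err)
```

Because every member consults the same table, the second assignment is refused no matter which VIO server it arrives through. With manually tracked vSCSI assignments, that same mistake would go unnoticed until data was corrupted.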
Incorporating thin provisioning and other advanced storage management techniques into the VIO server itself also has great potential. I’m looking forward to seeing how these are implemented and how they can be used to more efficiently manage storage in my virtual environments. There are interesting times coming in the VIO server world.


