It’s no news that EMC acquired Israeli storage start-up XtremIO for $430 million. But what exactly is XtremIO?
The XtremIO storage system is based on a scale-out architecture. The system starts with a single building block, called an “X-Brick”. An X-Brick by itself is a highly available, high-performance, SSD-only SAN storage appliance featuring redundant storage processors, redundant power and cooling, 4 x 8Gbps Fibre Channel and 4 x 10Gbps iSCSI host ports.
The building blocks are available with different amounts of usable flash capacity. The logical amount of storage available is substantially higher than the physical flash due to real-time, inline block de-duplication.
According to XtremIO, the array’s high performance allows desktops to rapidly suspend and resume. Suspend/resume is typically disabled in VDI deployments because the underlying storage sub-system commonly cannot meet the heavy IO demand of the operation. In a persistent desktop model this capability would allow fewer physical servers to be deployed, since at any given time a percentage of desktops could be suspended while people are away from their desks.
High IO performance and low latency also allow the amount of memory allocated to each virtual desktop to be reduced, at the cost of increased Windows pagefile IO operations, which the SAN can absorb. This reduces the cost of each physical host, or allows for higher consolidation ratios. I discussed this very same concept in my article VDI Architectures using Storage Class Memory.
During some validation tests the XtremIO bricks hit some very interesting numbers. The validation tests below used two X-Brick systems and eight ProLiant BL465c G7 servers, each with 24 cores and 256GB of RAM.
A few important modifications to the default vSphere and vCenter configurations were made to increase performance and response time.
- Reviewing statistics from the XtremIO array revealed that the array was not under heavy load during the linked clone operations. To resolve this, the vCenter and View Composer concurrent task settings were increased to allow more simultaneous operations.
- The maximum number of outstanding disk requests was also modified. When two or more virtual machines access the same LUN, Disk.SchedNumReqOutstanding controls the number of outstanding requests that each virtual machine can issue to the LUN, while Disk.SchedQuantum controls the number of consecutive requests an individual VM can issue before the scheduler switches to another VM. Both were adjusted.
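The two parameters above can be set from the ESX host's command line. The values below are hypothetical examples for illustration only; the source does not state which values were used, so tune them for your own array and workload.

```shell
# Hypothetical tuning values -- examples only, not the figures from the test.
# Raise the per-LUN outstanding request limit shared by VMs on the same LUN
# (host-wide advanced setting on ESX/ESXi 5.0):
esxcfg-advcfg -s 64 /Disk/SchedNumReqOutstanding

# Raise the number of consecutive requests a single VM may issue
# before the IO scheduler switches to another VM (default is 8):
esxcfg-advcfg -s 64 /Disk/SchedQuantum
```

These are host configuration commands and take effect without a reboot; the same settings can also be changed under Advanced Settings in the vSphere Client.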
- ESX 5.0, by default, backs guest pages with large physical pages (a 2MB contiguous region in physical RAM as opposed to a 4KB region). Furthermore, when using large pages, ESX does not use the TPS (Transparent Page Sharing) memory management technique. However, when there is contention for memory resources, for instance in a densely populated virtual desktop cluster, ESX will break down the 2MB pages into smaller 4KB pages and then utilize TPS as a way to optimally use memory. VDI clusters tend to be memory bound. In an effort to optimize the TPS process, large pages were turned off on each host. This is accomplished by setting the Mem.AllocGuestLargePage kernel parameter to zero.
In the past I have written about TPS, Large Memory Pages and your VDI environment and Increase VDI consolidation ratio with TPS tuning on ESX. I recommend tweaking Mem.AllocGuestLargePage and Mem.ShareScanTime for higher consolidation rates.
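On the command line, the two settings look like this. The Mem.ShareScanTime value is a hypothetical example; the source only confirms that Mem.AllocGuestLargePage was set to zero.

```shell
# Disable large-page backing so guest memory is backed by 4KB pages,
# which TPS can share without waiting for memory contention:
esxcfg-advcfg -s 0 /Mem/AllocGuestLargePage

# Hypothetical example: shorten the time (in minutes) to scan all guest
# memory for shareable pages, so TPS finds duplicates sooner (default 60):
esxcfg-advcfg -s 10 /Mem/ShareScanTime
```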
- Power-On 1050 Virtual Desktops
Powering on 1050 desktops generated approximately 100,000 IOPS, about 95.2 IOPS per desktop on average. Because all power operations were being controlled by VMware View and View Composer, the total operation time was a little over 10 minutes, leaving total X-Brick node utilization very low, around 20%.
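The per-desktop figure is simple arithmetic on the numbers above, and is a boot-storm average rather than a measured per-desktop value:

```shell
# Back-of-the-envelope: average IOPS per desktop during the power-on storm
awk 'BEGIN { printf "%.1f\n", 100000 / 1050 }'   # prints 95.2
```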
- Refreshing 1100 virtual desktops
A linked clone refresh triggers a revert to a snapshot that was taken after desktop customization completed, when the desktop was initially provisioned. This allows VMware View to preserve the Sysprep or QuickPrep customizations.
During the refresh operation for the 1100 virtual desktops there was a peak of about 45,000 IOPS.
- Re-Composing 1100 virtual desktops
A recompose operation allows the administrator to preserve the persistent disk and all user data inside this disk while changing the OS base disk to a new base image + snapshot. This allows administrators to easily push out OS patches and new software to users. Because a new OS disk is created during a recompose, the clone is also re-customized during the recompose and a new snapshot is taken by View Manager once the customization completes.
Linked clone recompose operations are perhaps among the most IO-intensive operations in the whole VMware View VDI cycle. They involve creating a new replica, re-linking this replica to all virtual desktops, and then powering on all virtual desktops.
During the recompose process for all 1100 virtual desktops, the overall CPU utilization from both X-Bricks stayed below 17%.
- During the simulated lab tests the XtremIO X-Bricks never came close to their saturation limit. In fact, all 1100 virtual desktops, along with all linked clone operations, could be satisfied by a single X-Brick unit.
- The customer running these tests estimates that a single X-Brick would be able to satisfy the capacity and IO performance requirements of 2,200 virtual desktops. Please pay attention here, because your mileage may vary depending on the type of workload and use case.
- During the tests, VMware vCenter and View Composer were often the bottleneck, limiting the maximum number of simultaneous operations.
It is clear to me that XtremIO has the horsepower for the most demanding workloads, and its ability to provide inline block de-duplication can considerably reduce the capacity footprint required for mostly identical workloads.
As for VDI, it certainly solves the performance challenge; but if the cost/performance doesn’t make it cheaper than other solutions, I am not so sure we will see large-scale adoption of the XtremIO solution.
The way I see the industry moving ahead, I tend to believe that a mix of local PCIe host cards or point solutions such as Atlantis ILIO will be doing the heavy IO lifting, while less powerful storage solutions provide the required capacity. Another very interesting approach for VDI is Google File System (GFS)-like appliances with distributed storage, such as Nutanix and VMware Distributed Storage.
Until now the use of such storage technologies would place VDI solutions in the non-persistent deployment bucket. However, with the innovations VMware Mirage is bringing to VMware View, such as application persistence and layering, this is no longer true. I’ll soon write more about it.
Giving the Credits
The tests were conducted by EMC vSpecialists Garrett Hartney, Itzik Reich and XtremIO Support Engineer Amit Levin. They also executed more tests, including the provisioning of new virtual desktops and VAAI Space Reclamation.