In my opinion, if you are doing VDI with Persistent Desktops using Full Clones, you are rowing against the current.
I have previously stated that Floating Pools and Linked Clones (or similar technology) are the way I believe every VDI deployment should be done (whether using VMware View, XenDesktop or any other broker). Persistent Desktops should be treated as exceptions to address specific user or application needs. You will find the reasoning in my article Floating Pools are the way to go…
The idea of using data de-duplication to provide storage savings does not make sense for VDI. The real cost of a VDI solution lies in the number of IO operations you can get per $ spent, not in the usable disk space you get per $.
We need to divide VDI into two large buckets: Full Clones and Linked Clones.
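To make the $ per IO versus $ per GB point concrete, here is a rough sizing sketch. Every number in it (desktop size, IOPS per desktop, disk capacity and disk IOPS) is a hypothetical placeholder chosen for illustration, not a measurement or a vendor figure, and the model deliberately ignores RAID penalties, caching and read/write mix.

```python
import math


def disks_for_capacity(desktops: int, gb_per_desktop: float, gb_per_disk: float) -> int:
    """Disks needed if usable capacity were the only constraint."""
    return math.ceil(desktops * gb_per_desktop / gb_per_disk)


def disks_for_iops(desktops: int, iops_per_desktop: float, iops_per_disk: float) -> int:
    """Disks needed if IO operations per second were the only constraint."""
    return math.ceil(desktops * iops_per_desktop / iops_per_disk)


if __name__ == "__main__":
    # Hypothetical inputs: replace them with figures from your own assessment.
    desktops = 1000
    by_capacity = disks_for_capacity(desktops, gb_per_desktop=10, gb_per_disk=600)
    by_iops = disks_for_iops(desktops, iops_per_desktop=20, iops_per_disk=180)
    print(f"disks needed for capacity: {by_capacity}")
    print(f"disks needed for IOPS:     {by_iops}")
    # Whichever count is larger drives the purchase.
    print(f"disks to buy:              {max(by_capacity, by_iops)}")
```

With inputs like these, the IOPS requirement dictates the disk count, so shrinking the capacity requirement with de-dupe would not reduce what you have to buy.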
De-dupe with Linked Clones
Linked Cloning technology utilizes a single image to provide a unique system disk to all virtual machines running in a View cluster (if you are interested in a deep dive on how Linked Cloning works please read VMware View 4.5 Linked Cloning explained). This single image, called the Replica, will eventually serve hundreds of clone VMs, and its data will often be served from your fastest pool of disks, most likely SSD. Replicas have a very small storage footprint, often just a few GB. De-dupe is not useful for such a small amount of data, which will be kept as hot blocks 100% of the time.
In View 4.5 VMware introduced the concept of Disposable disks. These are .vmdk files created to host Windows temporary files such as logs, the Internet cache and the Windows swap file. Each VM will hold data specific to that particular VM, so there are not necessarily common blocks to be de-duped. If disposable disks are used, a de-dupe solution that is not inline (I am not aware of any storage vendor providing inline de-dupe for primary storage) will not provide any benefit. Either way, just to close this thought: disposable disks are deleted with every VM power off.
It is also possible to have Persistent VMs using Linked Clones. This scenario gives you great operational flexibility, allowing the use of Refresh and Recompose operations. To this end it is common to make use of Persistent Disks. These are disks created to host the user’s personal settings and sometimes user data, such as My Documents. This data is better served from a NAS than from the primary storage array, as it does not require the same performance. If there is an opportunity for de-dupe, this is the place; however, if you offload the data to the NAS, the persistent disk will only be a couple of MB, hosting the user and computer registries.
De-dupe with Full Clones
Full Clones are “full fat” VMs that will persist across sessions. Full Clones can be created and managed by the connection manager (e.g. VMware View Manager) or created and managed by an external entity. If VMs are created by an external entity, they are then added to a Manual Pool through the available APIs and/or CLIs.
The first point I would like to raise awareness of is that Full Clones do not offer a supported and easy way to provide DR. If you are interested in a deeper read on DR for VDI, see my article VMware View Disaster Recovery Scenarios & Options.
Full Clones are the only pool mode that would make use of de-duplication, since all VMs start with exactly the same data in storage. However, most of the block commonality exists only up to the moment the user starts to use the VM. Once the VMs are in use they write their own data, log files and Windows page files. Some block commonality remains; however, as memory starts to swap to disk it becomes much lower.
In my opinion, it doesn’t make sense to have a de-dupe engine running in the background, or scheduled to run outside business hours, when for VDI solutions it is more important and more expensive to buy IOPS than usable disk space.
One of my customers had offline de-duplication scheduled to run overnight, and the task was running into business hours, affecting users’ ability to perform their work.
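The loss of block commonality can be illustrated with a toy model. The sketch below is not how any storage array implements de-dupe; it simply fingerprints fixed-size "blocks" and compares the total block count with the unique block count. The helper names (`make_pool`, `dedupe_ratio`) and the block counts are assumptions made up for this example.

```python
import hashlib


def block_fingerprint(data: bytes) -> str:
    """Fingerprint a block the way a hash-based de-dupe engine might."""
    return hashlib.sha256(data).hexdigest()


def make_pool(vm_count: int, blocks_per_vm: int, changed_blocks: int) -> list:
    """Build a toy pool of full clones.

    Every VM starts as an identical copy of the base image; the first
    `changed_blocks` blocks of each VM are then overwritten with data
    unique to that VM (user data, logs, page file).
    """
    pool = []
    for vm_id in range(vm_count):
        disk = [f"base-{i}".encode() for i in range(blocks_per_vm)]
        for i in range(changed_blocks):
            disk[i] = f"vm{vm_id}-private-{i}".encode()
        pool.append(disk)
    return pool


def dedupe_ratio(pool: list) -> float:
    """Total blocks stored divided by unique blocks after de-dupe."""
    total = sum(len(disk) for disk in pool)
    unique = len({block_fingerprint(block) for disk in pool for block in disk})
    return total / unique


if __name__ == "__main__":
    for changed in (0, 25, 50, 100):
        ratio = dedupe_ratio(make_pool(vm_count=10, blocks_per_vm=100, changed_blocks=changed))
        print(f"{changed:3d} private blocks per VM -> de-dupe ratio {ratio:.2f}")
```

In this model a freshly cloned pool of 10 VMs de-dupes 10:1, and the ratio falls towards 1:1 as each VM overwrites more of its blocks with its own data.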
Either way you know my take, Floating Pools are the way to go….
Reduce your storage footprint
I am not saying that reducing storage footprint is not important, but there are other ways to achieve that without having to sacrifice your VDI infrastructure.
Increasing the VM memory reservation is one of the easiest ways to reduce storage footprint. As an example, a VM with 2GB RAM and no reservation will utilize 2GB of disk space for the VM swap file. Setting the Parent VM with a 50% memory reservation in vSphere will make that footprint drop to 1GB. This little tweak could help you save 1 terabyte of storage in a VDI solution with 1,000 desktops.
Another powerful option is Windows profiling and customization. I recommend reading Mastering VDI Templates updated for Windows7 and PCoIP and VDI Base Image: The Missing Step for additional information on how to profile your guest OS.
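The arithmetic behind that saving is simple: vSphere sizes the VM swap file as configured memory minus reserved memory. A minimal sketch, with function names invented for this example:

```python
def swap_file_gb(configured_ram_gb: float, reservation_fraction: float) -> float:
    """Per-VM swap file size: configured memory minus reserved memory."""
    if not 0.0 <= reservation_fraction <= 1.0:
        raise ValueError("reservation_fraction must be between 0 and 1")
    return configured_ram_gb * (1.0 - reservation_fraction)


def pool_swap_savings_gb(desktops: int, configured_ram_gb: float,
                         reservation_fraction: float) -> float:
    """Disk space saved across a pool compared with having no reservation."""
    without_reservation = swap_file_gb(configured_ram_gb, 0.0)
    with_reservation = swap_file_gb(configured_ram_gb, reservation_fraction)
    return desktops * (without_reservation - with_reservation)


if __name__ == "__main__":
    # The example from the text: 2GB RAM, 50% reservation, 1,000 desktops.
    print(swap_file_gb(2, 0.5))                # 1.0 GB per VM
    print(pool_swap_savings_gb(1000, 2, 0.5))  # 1000.0 GB, roughly 1 TB
```

Keep in mind that reserved memory is memory the host can no longer overcommit, so the reservation trades host RAM flexibility for disk space.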
My personal take is that:
- De-dupe offers neither practical nor operational benefits when using Floating or Persistent Pools with Linked Clones.
- De-dupe may be beneficial in environments that use Persistent Full Clones 100%; however, Full Clones and Manual Pools are not the way to move forward with your VDI strategy, especially in DR situations.
- $ per IO is far more important than $ per GB for VDI deployments.
- Finally, I believe in VMware Linked Cloning and Citrix MCS technologies.