A recent discussion with one of the engineers prompted me to write this article. It’s no secret that one of the most important Nutanix features for VDI deployments is Shadow Clones.
Shadow Clones drastically improve VDI performance and the end-user experience. I have discussed Shadow Clones in the past; however, before explaining how they work, let’s recap how VMware View Linked Clones and XenDesktop MCS work.
To deploy linked clone desktops, the administrator first creates a Parent virtual machine. After the Parent VM is assigned to the desktop pool or catalog, a clone of this VM is created. This clone is called the Replica in Horizon View and is used by all desktops as the single source for the OS drive, which in many cases creates a hot spot on the system.
Nutanix Shadow Clones allow for distributed caching of a particular disk or VM data that is in a ‘multi-reader’ scenario. VDI is the obvious example, but the same applies to any multi-reader scenario (e.g. deployment servers, repositories, etc.).
Data or I/O locality is critical for the highest possible VM performance and is a key construct of the Nutanix Distributed File System (NDFS). With Shadow Clones, NDFS monitors disk access trends. When all of the requests are read I/O, the disk (in our case the replica) is marked as immutable. Once the disk has been marked as immutable, it is cached locally by each Nutanix node making read requests to it. In the background, when that happens, every CVM gets the map of where the immutable disk blocks are. If the disk data is local to the node, great. If not, the node automatically retrieves the data without relying on the original CVM that maintains the replica disk, which eliminates any service degradation caused by multiple access requests to the original CVM to copy the data.
This method allows VMs on each node to read the replica disk locally from the Nutanix Extended Cache (SSD and RAM). In the VDI case, this means each node can cache the replica disk and all read requests would be served locally, drastically improving end-user experience.
VM data is only migrated when there is a read I/O request, so as not to flood the network and to allow for efficient cache utilization. If the replica disk is modified, the Shadow Clones are dropped and the process starts over.
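To make the sequence above concrete, here is a minimal toy model in Python. It is a sketch under stated assumptions, not Nutanix code: the `VDisk` class, the `READER_THRESHOLD` value, and the use of "number of distinct remote reader nodes" as the trigger for marking the disk immutable are all hypothetical stand-ins for the real access-trend heuristics. It also does not model the distributed block map; it only shows read-triggered local caching and the drop-on-write behavior.

```python
class VDisk:
    """Toy model of a replica disk owned by one node (hypothetical, not Nutanix code)."""

    # Assumed trigger: distinct remote reader nodes needed before the
    # disk is marked immutable. The real heuristic is not described here.
    READER_THRESHOLD = 2

    def __init__(self, owner, blocks):
        self.owner = owner
        self.blocks = dict(blocks)   # block id -> data
        self.remote_readers = set()  # nodes (other than the owner) that have read
        self.immutable = False
        self.shadows = {}            # node -> {block id -> data}, the local copies
        self.remote_fetches = 0      # reads that had to cross the network

    def read(self, node, block_id):
        if node == self.owner:
            return self.blocks[block_id]  # already local to the owner

        self.remote_readers.add(node)
        if len(self.remote_readers) >= self.READER_THRESHOLD:
            self.immutable = True

        shadow = self.shadows.setdefault(node, {})
        if block_id in shadow:
            return shadow[block_id]       # served from the node's local copy

        # Not cached locally: fetch over the network. Blocks are copied
        # only when they are actually read, never pre-emptively.
        self.remote_fetches += 1
        data = self.blocks[block_id]
        if self.immutable:
            shadow[block_id] = data
        return data

    def write(self, block_id, data):
        # A write invalidates every shadow copy and the process starts over.
        self.blocks[block_id] = data
        self.immutable = False
        self.remote_readers.clear()
        self.shadows.clear()
```

In this model, repeated reads of the same block from the same node only cross the network once after the disk becomes immutable, and a single write to the replica resets everything.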
Shadow Clones is great with Linked Clones!
Nutanix also offers a data duplication avoidance mechanism (de-duplication is the marketing term here) using the VMware VAAI API and intelligent cloning techniques that operate only on metadata. Josh Odgers wrote a good article explaining this: Cloning VMs – Why less (I/O & throughput) is better!
Intelligent clones are much faster because there is no need to duplicate a VM; the intelligent cloning process simply creates pointers back to the original file (which remains Read Only) and only uses I/O & capacity when new data is created. The size of the VM being cloned is irrelevant.
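The pointer idea can be illustrated with a short Python sketch. The `BaseImage` and `Clone` classes are hypothetical names invented for this example and say nothing about how Nutanix stores its metadata; the point is only that creating a clone copies no blocks, and capacity is consumed solely by blocks a clone writes.

```python
class BaseImage:
    """The original file; read-only once clones point at it (hypothetical model)."""

    def __init__(self, blocks):
        self._blocks = tuple(blocks)  # a tuple so the base cannot be modified

    def read(self, index):
        return self._blocks[index]

    def __len__(self):
        return len(self._blocks)


class Clone:
    """A clone is a pointer to the base plus the blocks it has written itself."""

    def __init__(self, base):
        self.base = base  # metadata only: no block is copied here
        self.delta = {}   # block index -> data written by this clone

    def read(self, index):
        if index in self.delta:
            return self.delta[index]  # the clone's own new data
        return self.base.read(index)  # otherwise follow the pointer

    def write(self, index, data):
        self.delta[index] = data      # new data is the only capacity cost

    def unique_blocks(self):
        return len(self.delta)
```

Creating a hundred clones of a base image in this model costs the same regardless of the base image size, which is why the size of the VM being cloned does not matter.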
Intelligent Cloning is great with Full Clones!
The last concept to understand is the Nutanix In-Memory Performance Block De-Duplication. Nutanix has a de-duplication engine built into the stack that works in real time for data stored in DRAM and Flash.
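As a rough illustration of what a de-duplicated cache means, the Python sketch below keys cached data by a fingerprint of its content, so identical blocks read by different VMs occupy a single cache entry. The `DedupCache` class is hypothetical, and the choice of SHA-1 via `hashlib` is an assumption made for the example rather than a statement about the Nutanix implementation.

```python
import hashlib


class DedupCache:
    """Toy content-addressed cache (hypothetical; not the Nutanix engine)."""

    def __init__(self):
        self.by_fingerprint = {}  # fingerprint -> data, stored once
        self.index = {}           # (vm, block id) -> fingerprint

    def put(self, vm, block_id, data):
        # Assumption for illustration: SHA-1 of the block content as fingerprint.
        fingerprint = hashlib.sha1(data).hexdigest()
        self.by_fingerprint.setdefault(fingerprint, data)
        self.index[(vm, block_id)] = fingerprint

    def get(self, vm, block_id):
        return self.by_fingerprint[self.index[(vm, block_id)]]

    def stored_copies(self):
        return len(self.by_fingerprint)
```

If three desktops each cache the same OS block, `stored_copies()` reports one entry, while each desktop can still read the block through its own `(vm, block id)` reference.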
Content Cache (Dynamic Read cache)
The Nutanix Content Cache is a de-duplicated read cache that spans both the Nutanix Controller VM memory and SSDs. On a read request for data not in the cache, the data is placed into the single-touch pool of the content cache, which sits entirely in memory, where it is tracked by LRU (Least Recently Used) until it is evicted from the cache. Any subsequent read request will “move” (no data is actually moved, just cache metadata) the data into the memory portion of the multi-touch pool, which consists of both memory and SSD.
From here there are two LRU cycles. The first is for the in-memory piece: on eviction, the data moves to the SSD section of the multi-touch pool, where a new LRU counter is assigned. The second is for the SSD section. Any read request for data in the multi-touch pool moves the data to the top of the multi-touch pool, where it is given a new LRU counter.
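The pool transitions described above can be sketched with three LRU-ordered dictionaries in Python. This is a simplified model under assumptions: the `ContentCache` class, its capacity parameters, and the `fetch` callback are hypothetical, capacities are counted in entries rather than bytes, and the rule that data evicted from the SSD section leaves the cache entirely is my assumption.

```python
from collections import OrderedDict


class ContentCache:
    """Toy single-touch / multi-touch read cache (hypothetical model)."""

    def __init__(self, single_cap, mem_cap, ssd_cap):
        self.single = OrderedDict()     # single-touch pool, in memory
        self.multi_mem = OrderedDict()  # multi-touch pool, memory portion
        self.multi_ssd = OrderedDict()  # multi-touch pool, SSD portion
        self.single_cap = single_cap
        self.mem_cap = mem_cap
        self.ssd_cap = ssd_cap

    @staticmethod
    def _insert(pool, cap, key, value):
        """Add key as most recently used; return the evicted (key, value) or None."""
        pool[key] = value
        pool.move_to_end(key)
        if len(pool) > cap:
            return pool.popitem(last=False)  # least recently used entry
        return None

    def _promote(self, key, value):
        # Entry goes to the top of the multi-touch pool (memory portion).
        evicted = self._insert(self.multi_mem, self.mem_cap, key, value)
        if evicted is not None:
            # Memory eviction moves the entry to SSD with a fresh LRU position.
            # Assumption: an entry evicted from SSD is dropped from the cache.
            self._insert(self.multi_ssd, self.ssd_cap, *evicted)

    def read(self, key, fetch):
        for pool in (self.multi_mem, self.multi_ssd, self.single):
            if key in pool:
                value = pool.pop(key)
                self._promote(key, value)  # any repeat read lands in multi-touch
                return value
        value = fetch(key)                 # first touch: read from storage
        self._insert(self.single, self.single_cap, key, value)
        return value

    def location(self, key):
        if key in self.single:
            return "single-touch"
        if key in self.multi_mem:
            return "multi-touch (memory)"
        if key in self.multi_ssd:
            return "multi-touch (SSD)"
        return None
```

Calling `read` twice for the same key moves it from the single-touch pool to the memory portion of the multi-touch pool, and `location` can be used to watch entries drift down to SSD as hotter data arrives.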
Extent Cache (In-memory read cache)
The Extent Cache is an in-memory read cache that resides entirely in the CVM’s memory. The Extent Cache stores non-fingerprinted extents for containers where fingerprinting and de-dupe are disabled. I wrote an article a while back (CBRC-like Functionality For Any VDI Solution with Nutanix) explaining how de-duplicated data blocks help to improve and accelerate user experience in VDI deployments.
Analysis and Conclusion
I wrote this introduction to provide a baseline for a simple question – What are the advantages and trade-offs between using Linked Clones with Shadow Clones versus Full Clones with native VAAI Cloning?
For the Nutanix stack, Native Cloning using VAAI is better than Linked Cloning (even if the latter has the Shadow Clone optimization enabled). In the VAAI case, Nutanix is fully aware of the common parent and doesn’t have to rely on heuristics to decide when to allow caching of the common parent from multiple nodes. An important point is that Linked Cloning technology was designed to support cloning even if the storage vendor doesn’t have native VAAI support for cloning.
It is generally better to use the native cloning support from the storage array rather than build it as a layer on top. From the performance, change tracking, de-fragmentation and coalescing perspectives, it is better to deploy VDI using native Full Clone desktops instead of Linked Clones, for both persistent and non-persistent use cases.
The trade-off is that many organizations use Linked Clones as a way to manage base images and deployments. If you are using this model you may continue to use it, but whenever possible you should prefer the Nutanix native cloning.
This article was first published by Andre Leibovici (@andreleibovici) at myvirtualcloud.net.