Data Locality matters… a LOT! See this video demo.

Nutanix utilizes a shared-nothing distributed architecture and ensures data is always replicated across flash, HDD, nodes, and blocks for high availability. A VM may access data from anywhere in a Nutanix cluster, but the Nutanix Controller VMs (CVMs) will always ensure that, over time, the active data blocks belonging to a VM are present on the host where the VM is currently running. This process is transparent and occurs in the background using free CPU cycles.

Data Locality is a key performance enabler for any workload, ensuring important VM data is always as close as possible to memory and CPU, avoiding multiple network hops and eliminating dependency on a storage area network (SAN).

Some competitors will tell you that 10-gigabit Ethernet is faster than SSD reads and that data locality isn’t important in distributed storage architecture designs. Having data locally in any system prevents the network from becoming a point of congestion. If everything runs over the network and the network gets congested, performance will suffer.

While it may be true that 10-gigabit Ethernet is faster than SSD reads, they forget that in order to transfer data over the 10-gigabit Ethernet link, the source host has to read it from a disk as well. So the real cost is actually transfer time + read time.
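A quick back-of-the-envelope calculation makes the point. All of the figures below (SSD read latency, network round-trip overhead) are illustrative assumptions, not measured values, but the conclusion holds for any positive numbers: a remote read pays the disk read *plus* the wire, so it can never beat the local read alone.

```python
# Remote read = remote disk read + network transfer, so it can never be
# faster than the local read by itself. All figures below are assumed,
# illustrative values — not measurements.

IO_SIZE_BYTES = 8 * 1024        # one 8 KB read
SSD_READ_LATENCY_MS = 0.10      # assumed SSD read latency
NIC_GBPS = 10                   # 10-gigabit Ethernet line rate
NETWORK_RTT_MS = 0.05           # assumed switch/stack round-trip overhead

# Time to push 8 KB over a 10 Gbit/s link, plus the fixed round-trip cost
transfer_ms = (IO_SIZE_BYTES * 8) / (NIC_GBPS * 1e9) * 1e3 + NETWORK_RTT_MS

local_ms = SSD_READ_LATENCY_MS                  # data locality: read locally
remote_ms = SSD_READ_LATENCY_MS + transfer_ms   # no locality: read + transfer

print(f"local read:  {local_ms:.3f} ms")
print(f"remote read: {remote_ms:.3f} ms")
```

Whatever latencies you plug in, the remote path is strictly the local path plus network time.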

Moving forward, when Nutanix starts shipping appliances with NVMe PCIe SSDs that can drive 3,000 MB/s (roughly 24 Gbit/s) of sequential reads, you will fully saturate a 10-gigabit Ethernet link long before you can fully utilize the NVMe PCIe SSD.
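The unit conversion behind that claim is worth spelling out, since MB/s and Gbit/s are easy to mix up. Using an assumed 3,000 MB/s sequential read figure for the NVMe drive:

```python
# Illustrative bandwidth math: an NVMe SSD sustaining ~3,000 MB/s of
# sequential reads versus a single 10-gigabit Ethernet link.
nvme_mb_s = 3000                      # assumed NVMe sequential read throughput
nvme_gbit_s = nvme_mb_s * 8 / 1000    # megabytes/s -> gigabits/s
nic_gbit_s = 10                       # 10 GbE line rate

print(f"NVMe read throughput: {nvme_gbit_s:.0f} Gbit/s")
print(f"Share of that the 10 GbE link can carry: {nic_gbit_s / nvme_gbit_s:.0%}")
```

In other words, a single 10 GbE link can carry well under half of what one such drive can read locally, before accounting for protocol overhead.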

We see so much benefit coming from Data Locality that Nutanix even created a special Data Locality feature for VDI workloads called Shadow Clones.

In any case, I recorded this video demonstrating the power of Data Locality for a single VM workload running IOMeter. If you multiply the effects of this single demonstration by the 30, 50, or 70 VMs running on each host, you will quickly notice that not having this feature is a major deal breaker for hyper-converged infrastructure, software-defined storage, and enterprise workloads.

In the video, the total number of IOPS drops by about 200 after the controller is shut down and data locality is no longer active. From a latency perspective, the minimum value also changes from 0.40 ms to 0.44 ms. It may not look like much at first glance, but over time, with multiple virtual machines per server, congested networks, and ToR (Top of Rack) switches, the performance difference becomes clearly visible. For my next video I will run a fully loaded server to better demonstrate the performance benefits.
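To put those single-VM numbers in perspective, here is the relative change they represent. This is just arithmetic on the figures quoted above, before any multiplication across the dozens of VMs a loaded host would run:

```python
# Relative latency change from the single-VM demo numbers quoted above.
min_latency_local_ms = 0.40    # data locality active
min_latency_remote_ms = 0.44   # after the local controller is shut down

increase = (min_latency_remote_ms - min_latency_local_ms) / min_latency_local_ms
print(f"Minimum latency increase: {increase:.0%}")  # a 10% increase, per VM
```

A 10% minimum-latency penalty on one idle VM is small; the point of the follow-up video is to show what it compounds into under real load.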


(Watch in full screen and 1080p resolution)


[UPDATE] For those wanting to know the workload being executed in IOMeter: it’s a 1 GB working set with an 8 KB IO size, 100% random reads. Please note that this is not a performance test, and the workload, IOMeter, VM, and servers have not been tuned in any way to test performance.
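If you want to reproduce a roughly similar access pattern yourself without IOMeter, a minimal Python sketch of the same idea (random fixed-size reads over a working set) looks like the following. This is not IOMeter and makes no attempt at its accuracy; the file size, op count, and helper name are all my own choices for illustration, and the demo uses a small 4 MB file rather than the 1 GB working set from the video.

```python
import os
import random
import tempfile
import time

def random_read_iops(path, working_set_bytes, io_size=8 * 1024, n_ops=2000):
    """Issue n_ops random reads of io_size across the working set; return IOPS."""
    offsets = range(0, working_set_bytes - io_size, io_size)
    with open(path, "rb") as f:
        fd = f.fileno()
        start = time.perf_counter()
        for _ in range(n_ops):
            os.pread(fd, io_size, random.choice(offsets))  # positional read
        elapsed = time.perf_counter() - start
    return n_ops / elapsed

# Demo on a small stand-in file (the video used a 1 GB working set):
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(4 * 1024 * 1024))  # 4 MB of random data
    demo_path = tmp.name

iops = random_read_iops(demo_path, 4 * 1024 * 1024)
print(f"~{iops:,.0f} random 8 KB read IOPS (mostly served from page cache)")
os.unlink(demo_path)
```

Note that `os.pread` is Unix-only, and a working set this small will sit entirely in the page cache, so the numbers are not comparable to IOMeter hitting real storage.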


This article was first published by Andre Leibovici (@andreleibovici) at
