Nutanix 4.0 Hybrid On-Disk De-Duplication Explained

Nutanix is a distributed scale-out 3-tier platform, utilizing RAM, Flash and HDD. This combination provides access to frequently accessed data in microseconds, instead of the milliseconds seen when exclusively flash devices are used. This is an awesome feature that influences and enhances the end-user experience for any type of workload.

De-duplication allows the sharing of guest VM data on premium storage tiers (RAM and Flash). Performance of guest VMs suffers when active data can no longer fit in the premium tiers. If guest VMs are substantially similar, for example if the Nutanix cluster is used to host numerous Windows desktops, enabling de-duplication substantially improves performance. When used in the appropriate situation, de-duplication makes the effective size of the premium tiers larger so that the active data can fit.

The Nutanix de-duplication engine was designed for scale-out, providing near instantaneous application response times. Nutanix de-duplication is 100% software defined, with no dedicated controllers or hardware crutch; and because Nutanix is platform agnostic, this feature is available with whatever hypervisor or VDI solution you choose to work with (today vSphere, Hyper-V and KVM are supported).

Nutanix added de-duplication for the performance tier (RAM and Flash) in NOS 3.5. The new NOS 4.0 release introduces de-duplication for the capacity tier, the extent store, allowing organizations to achieve greater VM density, with more virtual machines per node.

Capacity tier de-duplication is a post-process de-duplication, meaning common blocks are consolidated by a curated background process, by default every 6 hours, while de-duplication in the performance tier is near real-time, happening as data blocks traverse RAM or Flash. This hybrid de-duplication approach allows the Nutanix CVM to be less intrusive and use fewer CPU cycles to detect common data blocks.

On-disk capacity de-duplication must be enabled per container in the NOS 4.0 GUI (picture below), and is mostly recommended for VDI persistent desktops and server workloads. However, it is also possible to enable and disable de-duplication per VMDK (vDisk) using NCLI. A future version of PRISM will provide the ability to manage de-duplication per VM or VMDK.

[Screenshot: NOS 4.0 GUI showing the per-container on-disk de-duplication setting]

Every write I/O of 64KB or larger is fingerprinted with the US Secure Hash Algorithm 1 (SHA-1), using the native SHA-1 optimizations available on Intel processors, and only a single copy of each identical data block is stored in the Nutanix cluster. In the VDI context this means that persistent desktops can be deployed without the capacity or performance penalties found in most storage solutions.
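
To make the fingerprinting step concrete, here is a minimal sketch in Python, assuming a simple chunk-and-hash model: qualifying writes are split into 16KB chunks (the granularity described below and in the comments) and each chunk gets a SHA-1 fingerprint. The function name and structure are illustrative assumptions, not Nutanix internals.

```python
import hashlib

CHUNK_SIZE = 16 * 1024              # assumed de-duplication granularity (16KB)
FINGERPRINT_THRESHOLD = 64 * 1024   # only writes of 64KB or more are fingerprinted on ingest

def fingerprint_write(data: bytes):
    """Return (offset, sha1_hex) pairs for a write buffer.

    Writes below the 64KB threshold are skipped (no inline fingerprinting);
    larger writes are chunked at 16KB and each chunk is hashed with SHA-1.
    """
    if len(data) < FINGERPRINT_THRESHOLD:
        return []
    return [
        (offset, hashlib.sha1(data[offset:offset + CHUNK_SIZE]).hexdigest())
        for offset in range(0, len(data), CHUNK_SIZE)
    ]

# A 64KB write is four 16KB chunks, hence four fingerprints.
print(len(fingerprint_write(b"\x00" * (64 * 1024))))  # -> 4
```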

The picture below shows the new NOS 4.0 Storage View with On-Disk De-duplication Savings.

[Screenshot: NOS 4.0 Storage View with On-Disk De-duplication Savings]

If the VMs or desktops use linked clones (Nutanix VAAI snapshots), de-duplication happens at a different level: the linked clone tracks the parent/child hierarchy, and in this case no fine-grained de-duplication is required.

How it works

The de-duplication happens via Curator full scans. As I mentioned, full scans happen every 6 hours, and the process looks for common fingerprinted data to be de-duplicated. The de-duplication itself happens in the Stargate component, and during the post-process de-duplication the CVM may show CPU overhead in the 10 to 15% range.

[Figure: Nutanix cluster components]

During the background process Curator scans the metadata for duplicate fingerprints. If a duplicated SHA-1 is found, Stargate re-writes the data in a new location, removing the duplicated copies. Behind the scenes NDFS increments refcounts against what has already been de-duplicated, i.e. against the shared extents.
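
As a rough illustration of the refcount idea, here is a minimal sketch of a post-process pass, assuming a flat list of extent records: the first extent seen for each fingerprint becomes the shared copy, later duplicates become references to it, and the shared extent's refcount grows. This is not the actual Curator/Stargate implementation, just the concept.

```python
from collections import defaultdict

def dedup_scan(extents):
    """Post-process de-duplication over a list of extent records.

    Each record is a dict: {"id": ..., "fingerprint": ..., "ref": None}.
    The first extent seen with a given fingerprint becomes the shared copy;
    duplicates are rewritten as references to it and its refcount grows.
    """
    canonical = {}                 # fingerprint -> id of the shared extent
    refcounts = defaultdict(int)   # shared extent id -> number of references
    reclaimed = 0                  # duplicate copies whose space can be freed

    for extent in extents:
        fp = extent["fingerprint"]
        if fp not in canonical:
            canonical[fp] = extent["id"]
            refcounts[extent["id"]] = 1
        else:
            extent["ref"] = canonical[fp]     # point the duplicate at the shared extent
            refcounts[canonical[fp]] += 1
            reclaimed += 1
    return refcounts, reclaimed

# Two extents sharing one fingerprint collapse to a single copy with refcount 2.
extents = [
    {"id": "e1", "fingerprint": "aa", "ref": None},
    {"id": "e2", "fingerprint": "aa", "ref": None},
]
print(dedup_scan(extents))  # -> (defaultdict(<class 'int'>, {'e1': 2}), 1)
```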

In NOS 4.0 the default de-duplication uses a 16KB granularity for SHA-1 processing. For this reason NOS 4.0 introduces the concept of two extent group types, 16KB and 4KB. De-duplication is done at a 16KB block size, while caching in the performance tier is done at a 4KB extent granularity for better utilization of the caching resources.
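
One way to picture the two granularities is that a de-duplicated 16KB extent can be cached as four 4KB slices, so the performance tier only holds the pieces that are actually read. The sketch below is purely illustrative of that split; the slice keys and helper are assumptions, not NOS data structures.

```python
CACHE_SLICE = 4 * 1024    # performance-tier caching granularity (4KB)
DEDUP_EXTENT = 16 * 1024  # on-disk de-duplication granularity (16KB)

def cache_keys_for_read(fingerprint: str, offset: int, length: int):
    """Map a read inside a 16KB de-duplicated extent to 4KB cache-slice keys."""
    assert 0 <= offset and offset + length <= DEDUP_EXTENT
    first_slice = offset // CACHE_SLICE
    last_slice = (offset + length - 1) // CACHE_SLICE
    return [(fingerprint, i) for i in range(first_slice, last_slice + 1)]

# A 6KB read starting at offset 3KB touches slices 0, 1 and 2 of the extent,
# so only those 4KB units need to be resident in the cache.
print(cache_keys_for_read("ab12", 3 * 1024, 6 * 1024))
# -> [('ab12', 0), ('ab12', 1), ('ab12', 2)]
```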

In the second part of this de-duplication article I will demonstrate how to track and optimize de-duplication for different types of workloads.

This article was first published by Andre Leibovici (@andreleibovici) at myvirtualcloud.net

Comments

    • forbsy on 04/16/2014 at 7:33 am

    Hi. As Nutanix starts adding traditional storage type features like dedupe (what’s next, inline compression?), how is the compute performance affected? All of these SAN/NAS-like features demand processing and memory. Isn’t that coming from the same compute that the guest VMs demand? Are the CVMs going to be demanding more and more resources, effectively lowering the density of VMs per node, as more of these storage features are continuously added?
    Just curious how Nutanix can maintain that density, or hopefully increase it, as features that consume resources are added.

  1. @forbsy, that’s a good question and concern.

    There are a couple of things to consider before answering your question.

    1 – It’s easier to innovate in software than in hardware, and that’s largely why we see a big shift towards software defined infrastructures. Look at network virtualization, storage virtualization, virtual devices etc.

    2 – Processor speed, driven by transistor density, doubles on average every 24 months.

    If you take these two inputs into consideration, it doesn’t make sense to stop innovating because of VM density concerns today. Innovation must go on…

    That said, there must be a clear understanding of current architectures and limitations. If you look at my article “Nutanix 4.0 Features Overview (Beyond Marketing)” you will notice that Nutanix NOS 4.0 has a 20% performance improvement in relation to NOS 3.4.

    NOS uses four CPU cores, and Nutanix engineers always try to work within the boundaries of those four cores. So to your point, the answer is no, NOS is not consuming more assigned CPU than before the addition of de-duplication. However, yes, NOS will use more CPU within those four cores when de-duplication hashing is running. As I mentioned in my article, the CPU overhead is in the 10 to 15% range.

    These new NOS features are optional, and for some workloads where de-duplication is not beneficial it is not recommended to turn the feature on. I will write more about these workloads in the near future.

    I hope that answers your question.

  2. Great post Andre! Can you answer this for me: if all write I/Os larger than 64K are hashed with SHA-1 and written as one block, how does that affect your de-dup, which defaults to 16K granularity? Does that mean that the 64K (or larger) blocks are not being de-duped? Or is this post merely stating a few different facts about the mechanisms within NOS 4.0, which need to be tailored per workload / LUN?

  3. @Jason Turner, great question! A >64K write is chunked and each 16K segment is fingerprinted (e.g. a 64K write is four 16K chunks, so four fingerprints), so the de-duplication occurs at a 16K block chunk on disk.

    • Locca on 08/28/2014 at 9:42 pm

    Great post Andre! With inline de-duplication enabled, if the write I/O is less than 64K, will it be hashed with SHA-1 in real time or done by the later background process?

  4. Locca, sequential streams of data are fingerprinted during ingest using a SHA-1 hash at 16K granularity. For random IO, fingerprinting is done during data ingest when the I/O size is 64K or greater. IO smaller than 64K is not de-duplicated on the capacity tier (a small sketch of this logic appears after the comments).

  5. Thanks for the info, really informative. What happens to data <64K being written to the performance tier with both fingerprint-on-write and on-disk dedup enabled? Per your last comment anything <64K is not written to the capacity tier, so does that mean the data is deduped only on the performance tier?

  6. Naveem, data with an IO size <64K is fingerprinted and stored in the performance tier. Over time, when the data becomes cold it is moved to the capacity tier, but no additional fingerprinting is necessary. Data is de-duped on both the performance and capacity tiers.
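
To pull together the fingerprinting rules discussed in the replies above, here is a minimal sketch assuming a simple ingest-time decision helper; the function name and the way sequential vs. random IO is flagged are illustrative assumptions, not NOS internals.

```python
CHUNK = 16 * 1024                # fingerprinting granularity (16K)
RANDOM_IO_THRESHOLD = 64 * 1024  # random IO fingerprinted only when >= 64K

def ingest_fingerprint_granularity(io_size: int, is_sequential: bool):
    """Return the fingerprinting granularity for a write at ingest, or None.

    Assumed policy, per the replies above: sequential streams are fingerprinted
    at 16K granularity as they arrive; random writes are fingerprinted only when
    they are 64K or larger; smaller random writes are not de-duplicated on the
    capacity tier.
    """
    if is_sequential or io_size >= RANDOM_IO_THRESHOLD:
        return CHUNK
    return None

print(ingest_fingerprint_granularity(32 * 1024, is_sequential=False))   # -> None
print(ingest_fingerprint_granularity(128 * 1024, is_sequential=False))  # -> 16384
```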

