Jun 09 2015

New Nutanix Erasure Coding & How it works?

Erasure coding (EC) is a method of data protection in which data is broken into fragments, expanded and encoded with redundant data pieces, and stored across a set of different locations or storage media. Erasure coding is used extensively in data centers because it offers significantly higher reliability than data replication methods at much lower storage overhead. It is broadly applicable, but especially relevant in large clusters with mission-critical data that opt for RF3 resiliency.

Erasure coding has traditionally been implemented using RAID groups on disks; however, those are commonly bottlenecked by a single disk, constrained by disk geometry, and generally waste space on hot spares. Nutanix EC is done across nodes instead of disks, optimizing availability with faster rebuilds and using the entire cluster, through map-reduce processes, to compute block parities.

Nutanix EC is very easy to enable: a single click, and the work starts in the background.

 

 

How does it work?

Each Nutanix container has a defined replication factor (RF) for data resiliency and availability, either 2 or 3. Once data becomes cold, the data and its copies are thinned down by computing parity for a set of data blocks. This process runs as a distributed background job.

Since only cold data is erasure coded, hot data remains in the RF state. This is a good thing: if a node fails, hot data is simply read from the RF copies elsewhere in the cluster, without any in-flight rebuild penalty.

Important: EC works in concert with de-duplication and compression. All three data reduction methods are complementary to each other.
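To make the idea of "computing the parity for a set of data" concrete, here is a minimal sketch using single XOR parity. This is an assumption for illustration only: the article does not say which code Nutanix EC uses, and the names xor_parity and rebuild_missing are hypothetical. It shows that one parity block is enough to rebuild any one lost block in the set.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized blocks (single parity).

    Illustrative only: XOR parity is an assumed encoding, not necessarily
    the one Nutanix EC uses.
    """
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors plus the parity."""
    return xor_parity(list(surviving_blocks) + [parity])


# Hypothetical cold data blocks, each living on a different node.
data = [b"node-A..", b"node-B..", b"node-C..", b"node-D.."]
parity = xor_parity(data)

# Simulate losing the node that held the third block.
survivors = data[:2] + data[3:]
recovered = rebuild_missing(survivors, parity)
print(recovered == data[2])
```

With replication, every block would be stored two or three times; in this sketch four data blocks need only one extra parity block to survive a single failure.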

 

How efficient is Erasure Coding?

The examples below walk you through various configurations of cluster size, the strips possible, and the resulting savings. The purple-gray nodes are nodes that are avoided when creating the erasure strip (parity), so that they can be used for rebuild if a node were to fail. The EC engine balances capacity savings against the cost and time of a rebuild. A 3-node cluster, while technically possible, is not recommended, since no rebuild nodes are available.
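The savings for a given strip can be estimated with simple arithmetic. The sketch below is a rough illustration, not Nutanix's own sizing method: it assumes a strip of N data blocks plus P parity blocks replaces N blocks stored RF times, that RF2 pairs with one parity block and RF3 with two, and it ignores the nodes held back for rebuild. The function names and the strip widths chosen are hypothetical.

```python
def ec_overhead(data_blocks, parity_blocks):
    """Raw capacity consumed per unit of usable data for an EC strip."""
    return (data_blocks + parity_blocks) / data_blocks


def savings_vs_replication(rf, data_blocks, parity_blocks):
    """Fraction of raw capacity saved compared with keeping rf full copies."""
    return 1 - ec_overhead(data_blocks, parity_blocks) / rf


# Assumed pairings: RF2 -> 1 parity block, RF3 -> 2 parity blocks.
for rf, parity_blocks in ((2, 1), (3, 2)):
    for data_blocks in (2, 3, 4):
        overhead = ec_overhead(data_blocks, parity_blocks)
        saved = savings_vs_replication(rf, data_blocks, parity_blocks)
        print(
            f"RF{rf} -> {data_blocks}+{parity_blocks} strip: "
            f"{overhead:.2f}x raw per usable unit, {saved:.1%} saved"
        )
```

Wider strips save more capacity, but each rebuild then has to read from more nodes, which is the trade-off between capacity savings and rebuild cost described above.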

 

This article was first published by Andre Leibovici (@andreleibovici) at myvirtualcloud.net

 

3 comments


  1. Brian

    Does a copy still migrate to disk on the active-VM node for faster access, in addition to the copies kept above for resilience?

  2. virtualdennis

    Hey Brian – the first copy of data would still be local to the VM- EC doesn’t change our data locality feature.. Thanks!

  3. Flammi

    So basically Nutanix now uses 2/3/4+1+HS on disk.

