It makes HCI expensive and not cost competitive!
Now that we got that out of the way, let me explain. It is not that HCI solutions cannot provide higher levels of availability; most of them can. But vendors frequently steer customers to a resiliency factor that makes their products look cost-effective, that is, good from a financial viewpoint.
Data durability in the presence of failures is table stakes for any organization, and failure tolerance is achieved by data redundancy in some fashion. One way to achieve redundancy is with mirroring. You can mirror 2-way or 3-way.
At any serious scale, you need 3-way replication, or you are rolling the dice on data loss. The reason is not so much that you will lose two drives at the same time. A much more common scenario is this: one drive fails, and the system starts re-mirroring data from the remaining drive. All it takes is a single sector read error on that drive (also known as a Latent Sector Error, or LSE), and you have lost data.
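To put rough numbers on that re-mirroring scenario, here is a minimal sketch. It is illustrative only: the unrecoverable read error rate of 1 per 10^15 bits and the drive sizes are assumptions chosen for the example (they do not come from this article or the studies it cites), and errors are assumed to be independent. The function name `p_lse_during_rebuild` is hypothetical.

```python
import math

def p_lse_during_rebuild(drive_tb: float, ure_per_bit: float) -> float:
    """Chance of hitting at least one unreadable sector while reading a whole drive.

    Assumes independent bit errors; both arguments are hypothetical inputs.
    """
    bits_read = drive_tb * 1e12 * 8  # decimal terabytes -> bits
    # 1 - (1 - p)^n, written with log1p/expm1 so tiny p does not round to zero
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

# Assumed drive sizes and error rate, for illustration only.
for tb in (1, 4, 8):
    print(f"{tb} TB drive: {p_lse_during_rebuild(tb, 1e-15):.2%}")
```

With 2-way mirroring, any such error during the rebuild lands on the only remaining copy, which is why the second failure does not need to be a whole drive.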
Over the past 2 decades, most of the industry has moved beyond 1FT (i.e., 1-drive failure tolerance). Examples of 1FT are RAID-5 in SAN arrays, RF2 from Nutanix and 1FT from VMware VSAN. No serious enterprise SAN array promotes 1FT. However, most HCI vendors still, by default, recommend 1FT. These HCI vendors have regressed in providing resiliency in order to make their products look financially viable.
You can’t argue with math: without 2-drive failure tolerance, the chance of data loss is an astonishing 0.5% per year. Gartner also recommends 3-way replication:
“Use traditional three-way mirroring with RF3 when best performance, best availability and best reprotection time of data are all equally important.” 
THE REAL REASON, THE COST!
Implementing HCI using 3-way replication will incur over 300% overhead due to the capacity required to protect and re-protect data. Furthermore, there’s a higher minimum number of hosts required for cluster protection.
Here is how the capacity math works out. Take a 7-host cluster sized for N+2, which leaves 5 hosts (7 - 2) that must still provide ~118TB of usable capacity. To determine the capacity required per host with 3-way mirroring: 118TB [usable capacity required across the cluster] / 5 [number of hosts online] * 3 [three-way mirroring overhead] = ~71TB per host.
At this point, the relationship becomes clear between the raw capacity and the usable capacity required to ensure that data can always be fully re-protected.
A figure of 3x is typically bandied about when talking about FTT2 or RF3. However, that assumes 100% utilization and no ability to re-protect data. In reality, the system requires 497TB (71TB * 7 hosts) of total capacity to provide 100TB of usable capacity, an overhead of 4.97x.
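The arithmetic above can be reproduced with a short sketch. The function name `capacity_plan` and its parameters are hypothetical, and rounding the per-host figure up to a whole terabyte is an assumption that matches the 71TB quoted here. Note that the text uses ~118TB in the per-host step and 100TB for the final ratio; the sketch keeps both figures as given.

```python
import math

def capacity_plan(usable_tb, total_hosts, host_failures, copies):
    """Raw capacity needed so data can still be fully re-protected after host failures."""
    hosts_online = total_hosts - host_failures  # 7 - 2 = 5 for N+2
    # Round up to whole terabytes (assumption); 70.8 becomes 71 as in the text.
    per_host_tb = math.ceil(usable_tb / hosts_online * copies)
    raw_tb = per_host_tb * total_hosts  # every host is sized the same
    return per_host_tb, raw_tb

per_host, raw = capacity_plan(usable_tb=118, total_hosts=7, host_failures=2, copies=3)
print(per_host, raw)        # 71 497
print(round(raw / 100, 2))  # 4.97, against the 100TB usable figure quoted in the text
```

Changing `host_failures` to 0 gives the naive sizing that ignores re-protection headroom, which is where the commonly quoted 3x comes from.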
On the host count issue, there is a minimum number of hosts required to provide the additional availability and re-protection with a 3-way mirror, and it is higher than with a 2-way mirror. HCI vendors have different architectures, but the math works similarly.
The additional cost of servers and storage capacity is seen as a deal breaker by many organizations considering HCI.
Storing three copies of data with 2FT (examples include RAID 6 for arrays, RF3—Nutanix, FTT2—VMware VSAN) or using erasure-coding techniques that tolerate two failures improves reliability significantly.
To lose data in a system that tolerates two drive failures, there needs to be either three simultaneous drive failures, two drive failures and an LSE, or one drive failure and LSEs in both redundant copies of the same chunk. All of these are very improbable events, and the chance of data loss is reduced by many orders of magnitude.
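A toy per-chunk model gives a feel for the "orders of magnitude" claim. It covers only the last case above (one drive already failed, then LSEs in the surviving copies of the same chunk), assumes those errors are independent, and uses a made-up per-chunk LSE probability; none of these numbers come from the studies mentioned in this article.

```python
def chunk_loss_after_one_failure(p_lse: float, surviving_copies: int) -> float:
    """Chance a chunk is unrecoverable: every surviving copy must have an LSE."""
    return p_lse ** surviving_copies

p = 1e-8  # hypothetical per-chunk LSE probability, for illustration only

one_ft = chunk_loss_after_one_failure(p, surviving_copies=1)  # 2-way mirror: one copy left
two_ft = chunk_loss_after_one_failure(p, surviving_copies=2)  # 3-way mirror: two copies left

print(f"1FT: {one_ft:.1e}  2FT: {two_ft:.1e}  ratio: {one_ft / two_ft:.1e}")
```

Under these assumptions the gap between the two is a factor of 1/p, so the rarer LSEs are per chunk, the larger the advantage of the second redundant copy.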
Datrium has done exhaustive studies using its own data as well as public studies and data provided by Google, Facebook, Nutanix, NetApp and Backblaze. The studies were done by engineers with PhDs on the topic of disk failures; this is serious work. Furthermore, they recently included the math for flash drives, and the results do not look any better: SSDs encounter Latent Sector Errors (also called uncorrectable errors) at an alarming rate.
- Single Failure Tolerance (1FT) For HCI Reliability: Myth And Fact?
- RF2 Configuration For Flash Storage: Myth And Fact
I would not be writing all of this if Datrium did not implement 2FT by default. Datrium uses Double Fault Tolerance Erasure Coding, a Log-Structured Filesystem (LFS), and In-line Integrity Verification and Healing. I also wrote an article discussing our data integrity methodology, but the most important thing to know is that the system protects customer data with higher levels of resiliency at a highly competitive price point, even against HCI RF2 implementations.
This article is not meant to say that HCI is not good, nor is it picking on any specific vendor. Rather, it points at any storage vendor that offers lower levels of data resiliency in exchange for a better solution cost.
HCI provides numerous benefits to enterprises, but with power comes responsibility, and IT teams are responsible for the data in their organizations.
I believe that anyone reading this article will agree that 3-way mirroring is better than 2-way mirroring, so as an industry we should all be advocating for better resiliency, even if the solution costs a little more. Beyond that point we are entering the realm of risk management. As the saying goes, “you pay your money, you take your chances.”
My recommendation to you:
If you are considering Datrium, Nutanix, VMware VSAN, or any Tier-1 storage solution, always ensure that you are comparing apples to apples and that your data is going to be protected with best-of-breed resiliency and data integrity.
Hey, I’m just one opinion here. Do you agree? Disagree? Let me know what you think.
Source of the Gartner quote: “Key Differences Between Nutanix, VxRail and SimpliVity HCIS Appliances – Architecture and Storage I/O.” Published: 26 April 2017. ID: G00319293.
For comments, please use the article version posted on LinkedIn.