Jul 10 2015

This is How Nutanix Protects Your Data

Nutanix is paranoid about data loss and enforces multiple architectural considerations and checks to ensure data is always protected and available.

Nutanix's architectural considerations include zero single points of failure or bottlenecks for management services, building tolerance to failures into the system. Failure tolerance is key to a stable, scalable distributed system, and the ability to function in the presence of failures is crucial for availability.

Techniques like vector clocks, two-phase commit, consensus algorithms, leader elections, eventual and strict consistency, multiple replicas, dynamic flow control, rate limiting, exponential back-offs, optimistic replication, automatic failover, hinted hand-offs, data scrubbing and checksumming, among others, give Nutanix the ability to handle failures and also provide the backbone to recover from them easily.
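
To make one of these concrete, here is a minimal sketch of exponential back-off with jitter in Python; the attempt limits, delays and IOError trigger are illustrative assumptions, not Nutanix internals.

    import random
    import time

    def call_with_backoff(operation, max_attempts=5, base_delay=0.1, max_delay=5.0):
        """Retry a flaky operation with exponential back-off and jitter."""
        for attempt in range(max_attempts):
            try:
                return operation()
            except IOError:
                if attempt == max_attempts - 1:
                    raise  # give up after the final attempt
                # Delay doubles each attempt, capped, with random jitter
                delay = min(max_delay, base_delay * (2 ** attempt))
                time.sleep(random.uniform(0, delay))

The jitter matters in a distributed system: it keeps many retrying clients from hammering a recovering service in lock-step.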

NDFS uses replication factor (RF) and checksums to ensure data redundancy and availability in the case of a node or disk failure or corruption. When a node or disk fails, the data is automatically re-replicated across all nodes in the cluster to maintain the defined replication factor and data SLA; this is called re-protection. Re-protection can also be triggered when a Controller VM goes down.
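
As a rough illustration of the checksum-and-re-protect idea (not Nutanix code; the RF value, node list and extent layout are assumptions), a scrubber can verify each replica's checksum and re-replicate an extent when the healthy copy count falls below the configured RF:

    import hashlib

    REPLICATION_FACTOR = 2  # assumed RF for this sketch

    def checksum(data: bytes) -> str:
        return hashlib.sha1(data).hexdigest()

    def scrub_extent(replicas, healthy_nodes):
        """replicas: list of (node, data, stored_checksum) tuples."""
        good = [(node, data) for node, data, stored in replicas
                if checksum(data) == stored]            # drop corrupt copies
        missing = REPLICATION_FACTOR - len(good)
        if missing > 0 and good:
            source_data = good[0][1]
            used = {node for node, _ in good}
            targets = [n for n in healthy_nodes if n not in used][:missing]
            for node in targets:                        # re-protect on new nodes
                good.append((node, source_data))
        return good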

Node and Block awareness is a feature that enables the NDFS metadata layer to choose the best placement for data and metadata in the cluster, ensuring the cluster tolerates single or multiple node failures, or an entire block failure. This is a critical piece for maintaining data availability across large clusters, ensuring data is not just randomly placed on different hosts in the cluster.
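
A simple sketch of what block-aware placement means, assuming a plain node-to-block map (illustrative only, not the NDFS placement logic): replicas are spread across distinct blocks first, falling back to distinct nodes when there are not enough blocks.

    def place_replicas(nodes, block_of, rf):
        """Pick rf nodes, preferring one node per block (chassis).

        nodes:    list of node names
        block_of: dict mapping node -> block id
        rf:       replication factor
        """
        chosen, used_blocks = [], set()
        # First pass: one replica per distinct block
        for node in nodes:
            if len(chosen) == rf:
                break
            if block_of[node] not in used_blocks:
                chosen.append(node)
                used_blocks.add(block_of[node])
        # Fallback: distinct nodes even if blocks repeat
        for node in nodes:
            if len(chosen) == rf:
                break
            if node not in chosen:
                chosen.append(node)
        return chosen

    # Example: 4 nodes in 2 blocks, RF 2 -> replicas land on different blocks
    print(place_replicas(["n1", "n2", "n3", "n4"],
                         {"n1": "A", "n2": "A", "n3": "B", "n4": "B"}, 2))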

Because NDFS always writes data to multiple nodes, it is extremely important that the consistency model is strict, ensuring that writes are only acknowledged once two or more copies have been successfully committed to disk on different nodes or blocks. This requires a clear understanding of the CAP theorem (Consistency, Availability and Partition tolerance) (http://en.m.wikipedia.org/wiki/CAP_theorem).
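
In strict-consistency terms, the write path only returns an acknowledgement once every replica has committed. The toy sketch below assumes a hypothetical node object exposing commit_to_disk and discard methods; it is not the actual NDFS write path.

    def write_extent(data, replica_nodes):
        """Acknowledge a write only if ALL replicas commit to disk."""
        committed = []
        for node in replica_nodes:
            # commit_to_disk/discard are assumed methods on a hypothetical node
            if not node.commit_to_disk(data):        # synchronous replica write
                # Roll back partial copies rather than ack an unsafe write
                for done in committed:
                    done.discard(data)
                raise IOError("write not acknowledged: replica commit failed")
            committed.append(node)
        return "ack"                                 # safe: RF copies are durable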

Medusa, the metadata layer, stores and manages all of the cluster metadata in a distributed ring built on a heavily modified Apache Cassandra, and the Paxos algorithm is used to enforce strict consistency.

 

[Image: NDFS metadata ring]

 

“Paxos is a family of protocols for solving consensus in a network of unreliable processors. Consensus is the process of agreeing on one result among a group of participants. This problem becomes difficult when the participants or their communication medium may experience failures. Paxos is usually used where durability is required (for example, to replicate a file or a database), in which the amount of durable state could be large. The protocol attempts to make progress even during periods when some bounded number of replicas are unresponsive. There is also a mechanism to drop a permanently failed replica or to add a new replica.” – This service runs on every node in the cluster.
http://en.wikipedia.org/wiki/Paxos_(computer_science)
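
For intuition only, here is a heavily simplified single-decree Paxos round in Python, with in-memory acceptors and no networking or failure handling; Medusa's real implementation is of course far more involved.

    class Acceptor:
        def __init__(self):
            self.promised = -1          # highest proposal number promised
            self.accepted_n = -1        # proposal number of accepted value
            self.accepted_v = None      # accepted value, if any

        def prepare(self, n):
            """Phase 1b: promise not to accept proposals numbered below n."""
            if n > self.promised:
                self.promised = n
                return True, self.accepted_n, self.accepted_v
            return False, None, None

        def accept(self, n, value):
            """Phase 2b: accept unless a higher-numbered promise was made."""
            if n >= self.promised:
                self.promised = self.accepted_n = n
                self.accepted_v = value
                return True
            return False

    def propose(acceptors, n, value):
        """Run one Paxos round; return the chosen value or None."""
        quorum = len(acceptors) // 2 + 1
        # Phase 1: gather promises from a majority
        promises = [a.prepare(n) for a in acceptors]
        granted = [(an, av) for ok, an, av in promises if ok]
        if len(granted) < quorum:
            return None
        # If any acceptor already accepted a value, that value must be proposed
        prior_n, prior_v = max(granted, key=lambda p: p[0])
        if prior_n >= 0:
            value = prior_v
        # Phase 2: ask acceptors to accept the value
        accepted = sum(1 for a in acceptors if a.accept(n, value))
        return value if accepted >= quorum else None

    acceptors = [Acceptor() for _ in range(5)]
    print(propose(acceptors, n=1, value="row=42"))   # -> "row=42"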

 

The larger the cluster, the higher the chance of a double failure (e.g. two drives), which may lead to data loss. Today, with NOS 4.1, NDFS implements RF 2 and RF 3, the latter tolerating up to two simultaneous component failures without data loss. However, it is important to understand that the larger the cluster, the lower the chance of a triple disk failure causing data loss, because the risk of the same data being stored on all three failed drives decreases. Nutanix distributes data across all drives available in the cluster in 1 MB extents, but also always keeps a copy of the data local to where the VM is running for performance reasons; this is called data locality.
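
A back-of-the-envelope way to see this (assuming RF 3 copies spread uniformly across drives, with made-up drive counts): given three simultaneous drive failures, the probability that one specific extent had all three of its copies on exactly those drives shrinks rapidly as the cluster grows.

    from math import comb

    def p_extent_lost(total_drives, rf=3):
        """Given rf drives failing at once, probability that one specific
        extent had all of its rf copies on exactly those drives
        (assumes copies are placed uniformly on distinct drives)."""
        return 1.0 / comb(total_drives, rf)

    for drives in (12, 24, 48, 96):          # assumed drive counts
        print(drives, "drives ->", p_extent_lost(drives))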

The larger the cluster, the faster it can recover from failures too, because all nodes in the cluster effectively contribute to rebuilding the lost data. This also lowers the chance of data loss from a double drive failure, since NDFS does not thrash a small number of disks to recover from a drive loss (i.e. repairing to a hot spare or replacement drive, as RAID groups and some other hyper-converged solutions do). The performance impact during recovery from a drive failure is also much lower on Nutanix than on traditional RAID systems.
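
A simple way to see the rebuild-speed argument, using purely illustrative numbers (the 200 MB/s per-node rebuild rate and data sizes are assumptions, not Nutanix figures): when every node contributes, rebuild time falls roughly linearly with node count instead of being bottlenecked by a single replacement drive.

    def rebuild_minutes(data_to_rebuild_gb, nodes, per_node_mb_s=200):
        """Rough rebuild time when all nodes share the work."""
        total_mb_s = nodes * per_node_mb_s
        return data_to_rebuild_gb * 1024 / total_mb_s / 60

    for n in (4, 8, 16, 32):
        print(n, "nodes ->", round(rebuild_minutes(2000, n), 1), "minutes")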

Nutanix also provides native inter-cluster data replication (synchronous and asynchronous) and backup-to-cloud (AWS now; Azure soon).

 

For more information on Nutanix data protection check out The Nutanix Bible.

 

This article was first published by Andre Leibovici (@andreleibovici) at myvirtualcloud.net

Permanent link to this article: http://myvirtualcloud.net/?p=7207

Jul 05 2015

Welcome to the New Nutanix Native File Services (NAS) & Tech-Preview Demo

During the .NEXT conference we announced a new Native File Services (NAS). Why?

While Nutanix solves the primary storage performance and capacity conundrum, where the key metric is generally $/IO and flash technology plays a key role in continuously driving costs down, many customers still use external NAS for storing unstructured data such as documents and user profiles for VDI use cases, where the key metric is $/GB. I am talking about cheap and deep storage for large quantities of data that do not require first-tier performance for the entire dataset.

Nutanix customers have asked for the same ease of management and resiliency guarantees for delivering NAS services using SMB protocol (aka CIFS). We recognized that just offering a non-integrated NAS service on top of Nutanix NDFS was not a realistic option, and definitely not what customers expect from Nutanix technology.

After releasing compression and adaptive de-duplication, the next step was to introduce the NX-6035C (link), a cheap and deep storage-only node that runs alongside existing nodes but does not run VMs; it only receives data via write I/O replication from Resiliency Factor and Disk Balancing operations. (link)

 

 

We also recently announced the introduction of Erasure Coding (EC-X), a method of data protection in which data is broken into fragments, expanded and encoded with redundant data pieces and stored across a set of different nodes. Erasure coding offers significantly higher reliability than data replication methods with much lower storage overheads. EC-X also works in concert with de-duplication and compression. All 3 data reduction methods are complementary to each other. (link)
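
As a toy illustration of the general erasure-coding idea, here is a single-XOR-parity sketch in Python; this is far simpler than EC-X's actual encoding and is only meant to show data broken into fragments plus a redundant piece that allows a lost fragment to be rebuilt.

    def xor_bytes(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    def encode(data: bytes, k: int):
        """Split data into k equal fragments plus one XOR parity fragment."""
        size = -(-len(data) // k)                    # ceiling division
        frags = [data[i * size:(i + 1) * size].ljust(size, b"\0") for i in range(k)]
        parity = frags[0]
        for frag in frags[1:]:
            parity = xor_bytes(parity, frag)
        return frags, parity

    def recover(frags, parity, lost_index):
        """Rebuild the fragment at lost_index from the survivors plus parity."""
        rebuilt = parity
        for i, frag in enumerate(frags):
            if i != lost_index:
                rebuilt = xor_bytes(rebuilt, frag)
        return rebuilt

    frags, parity = encode(b"nutanix erasure coding demo!", 4)
    assert recover(frags, parity, lost_index=2) == frags[2]

With a single parity fragment the storage overhead is 1/k instead of a full extra copy, which is the intuition behind the lower overheads mentioned above; EC-X uses a more sophisticated scheme than plain XOR.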

 

With this quartet in place, providing a fully integrated solution able to deliver on both $/IOPS and $/GB, it was time to announce the new service.

 

What is Nutanix Native File Services (NAS)?

 

Nutanix File Services is a native scale-out service that is completely integrated into and managed via the PRISM user interface. When more resiliency or performance (throughput or IOPS) is required, the system adds new NAS heads and/or increases head resources across the cluster; when more capacity is required, the system extends the NAS file system using cluster storage resources.

On Hyper-V and ESX clusters you will see the NAS head VMs running as part of the infrastructure, while with the Acropolis hypervisor (link) the deployment of NAS heads is completely seamless and invisible to administrators.

Finally, as you would expect the service supports one-click upgrades.

There’s not much more I can talk about at this point in time, unless under NDA, but I will leave you with a tech-preview video by Dwayne Lessner.

 

 

This article was first published by Andre Leibovici (@andreleibovici) at myvirtualcloud.net

Permanent link to this article: http://myvirtualcloud.net/?p=7189

Jun 28 2015

How to Enable Nutanix Acropolis HA [Video]

By now you are probably aware that Nutanix released its own hypervisor called Nutanix Acropolis, but in case you missed the news check The New Nutanix Xtreme Computing Platform (XCP) & Acropolis Hypervisor. Acropolis comes with many of the features you would expect from an enterprise ready product; including VM Management (VM Operations, Resource Scheduling, Live Migration, Snapshots and Clones, High Availability, IP Management, Analytics, Remote Console) and Host Management (Host Profiles, Virtual Networking, Non-Disruptive Upgrades).

The video below quickly demonstrates how to enable High Availability on a per-VM basis, but also make sure you check out the Nutanix Acropolis Hypervisor Walkthrough video.

 

[Watch in 1080p Full Screen]

 

  • Best Effort works as you might expect: in the event of a node failure, VMs are powered on throughout the cluster if resources are available. If resources (e.g. memory) are not available, some or all VMs may not be powered on.
  • Reserve Space also works as you might expect, reserving enough compute capacity within the cluster to tolerate either one or two node failures. If RF2 is configured then one node is reserved, and if RF3 is in use, two nodes are reserved (a simple sketch of both policies follows this list).
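
Here is a minimal sketch of the two policies, with purely illustrative memory accounting (this is not Acropolis' actual scheduler):

    def reserved_nodes(redundancy_factor):
        """Reserve Space: RF2 keeps one node's worth of capacity free,
        RF3 keeps two (per the behaviour described above)."""
        return 1 if redundancy_factor == 2 else 2

    def best_effort_restart(failed_vms, free_memory_per_node):
        """Best Effort: power on as many VMs as remaining memory allows.

        failed_vms:           list of (vm_name, vm_memory) tuples
        free_memory_per_node: dict mapping host -> free memory
        """
        powered_on, skipped = [], []
        for vm_name, vm_memory in failed_vms:
            host = next((h for h, free in free_memory_per_node.items()
                         if free >= vm_memory), None)
            if host is None:
                skipped.append(vm_name)              # not enough resources left
            else:
                free_memory_per_node[host] -= vm_memory
                powered_on.append((vm_name, host))
        return powered_on, skipped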

 

This article was first published by Andre Leibovici (@andreleibovici) at myvirtualcloud.net

Permanent link to this article: http://myvirtualcloud.net/?p=7183
