
Dec 05 2016

Nutanix 5.0 Features Overview (Beyond Marketing) – Part 3

This is a 4 part blog series.

 

 

Disclaimer: Any future product or roadmap information is intended to outline general product directions, and is not a commitment, promise or legal obligation for Nutanix to deliver any material, code, or functionality.  This information should not be used when making a purchasing decision.  Further, note that Nutanix has not determined whether separate fees will be charged for any future product enhancements or functionality which may ultimately be made available, and may choose to charge separate fees for the delivery of any product enhancements or functionality which are ultimately made available.

 

For official information on features and timeframe refer to the official Nutanix Press Release (here).

 

These are the features introduced with this blog post series:

  • Cisco UCS B-Series Blade Servers Support
  • Acropolis Affinity and Anti-affinity
  • Acropolis Dynamic Scheduling (DRS++)
  • REST API 2.0 and 3.0
  • Support for XenServer TechPreview
  • Network Visualization
  • What-if analysis for New workloads and Allocation-based forecasting
  • Native Self-Service Portal
  • Snapshots – Self Service Restore UI
  • Network Partner Integration Framework
  • Metro Availability Witness
  • VM Flash Mode Improvements
  • Acropolis File Services GA (ESXi and AHV)
  • Acropolis Block Services (CHAP authentication)
  • Oracle VM and Oracle Linux Certified for AHV
  • SAP Netweaver stack Certified for AHV
  • Prism Search Improvements (support for Boolean expressions)
  • I/O Metrics Visualization
  • 1-Click Licensing
  • LCM – Lifecycle Manager
  • Additional Prism Improvements
  • AHV Scale Improvements
  • AHV CPU and Memory Hot Add (Tech Preview)
  • Advanced Compression for Cold Data
  • Acropolis Change Block Tracking (CBT) for Backup Vendors
  • Predictable Performance with Autonomic QoS
  • (New) NCC 3.0 with Prism Integration
  • (New) 1-Node Replication Target
  • (New) Improved Mixed Workload Support with QoS
  • (New) Simplified SATADOM Replacement Workflow
  • (New) Mixed Node Support with Adaptive Replica Selection
  • (New) Dynamically Decreased Erasure Coding Stripes – Node Removals
  • (New) Multi Metadata Disk Support to use available SSDs on the node for metadata
  • (New) Erasure Coding (EC) support for changing the Replication Factor (RF) on containers
  • (New) Inline Compression for OpLog

 

Now that we have the legal disclaimer out of the way… let’s get into it!

 

Prism

NCC 3.0 with Prism Integration

Historically, just about every interaction with NCC has required command-line access on a CVM. This was a frustration for system administrators who are not CLI savvy and for customers who prefer the GUI. As of NCC 3.0 in AOS 5.0, NCC is fully integrated with PRISM and many improvements have been added.

 

  • NCC now takes ~5 minutes to run
  • Many improvements to existing checks
  • Bug fixes and more robust NCC infrastructure.
  • New plugins (~15+ plugins in 2.3 + 3.0)
  • XenServer support

Many aspects of NCC are now available via PRISM:

  • 300+ NCC checks can now be managed through PRISM
  • Alert associated with every check
  • Checks can be manually executed from the GUI, and results can be downloaded
  • Log collector can also be triggered from the GUI

 

 

 

Distributed Storage Fabric (DSF)

1-Node Replication Target

SMB customers need a cost-effective replication solution for branch offices. AOS 5.0 allows a single Nutanix node (NX-1155, 1N2U, 2xSSD + 10xHD) to be used as a fully on-boarded replication target for Nutanix clusters. This is a single-node AHV cluster with FT-1 that doesn’t run VMs, but integrates as a replication target with source Nutanix clusters running any of the supported hypervisors.

[Image: 1-node replication]

 

Improved Mixed Workload Support with QoS

This is one of those deep internal improvements that enormously affect the system’s ability and performance when running multiple diverse applications with different workload profiles on a single Nutanix node. AOS 5.0 separates read and write I/O queues, ensuring that write-intensive workloads (or write bursts) do not starve read operations, and vice versa. This is achieved by replacing the admission controller and OpLog queues with a single weighted fair queue, together with priority propagation and disk queue optimizations.

I won’t bother you with the details because it gets technical very quickly and probably needs a dedicated article, but this new feature ensures that I/O priorities are maintained through the entire I/O path, improving performance and I/O reliability when the system is under stress.
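To make the weighted fair queue idea above a little more concrete, here is a minimal Python sketch. It is not the AOS implementation: the class name, the `read`/`write` weights and the unit cost per operation are all assumptions made for illustration, and the virtual clock uses a simple self-clocked approximation.

```python
import heapq
from itertools import count


class WeightedFairQueue:
    """Toy weighted fair queue (hypothetical, for illustration only)."""

    def __init__(self, weights):
        self.weights = dict(weights)                   # e.g. {"read": 1, "write": 1}
        self.finish = {c: 0.0 for c in self.weights}   # last virtual finish time per class
        self.virtual_time = 0.0                        # advances as operations are dispatched
        self.heap = []
        self.seq = count()                             # tie-breaker: keeps arrival order

    def submit(self, io_class, op, cost=1.0):
        # An operation starts when its class is free, never before "now".
        start = max(self.virtual_time, self.finish[io_class])
        finish = start + cost / self.weights[io_class]
        self.finish[io_class] = finish
        heapq.heappush(self.heap, (finish, next(self.seq), io_class, op))

    def dispatch(self):
        # Always serve the operation with the smallest virtual finish time.
        finish, _, io_class, op = heapq.heappop(self.heap)
        self.virtual_time = finish
        return io_class, op


# A burst of six writes arrives first, followed by two reads.
q = WeightedFairQueue({"read": 1, "write": 1})
for i in range(6):
    q.submit("write", f"w{i}")
for i in range(2):
    q.submit("read", f"r{i}")

order = [q.dispatch()[1] for _ in range(8)]
print(order)
```

In this toy run the two reads are dispatched second and fourth instead of waiting behind all six writes, which is the starvation-avoidance behavior described above. Changing the weights shifts the share each class receives.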

 

 

Simplified SATADOM Replacement Workflow

Replacing the host boot disk (SATADOM) involves an elaborate, lengthy manual procedure that needs to be performed by a Nutanix system engineer. AOS 5.0 automates and simplifies the workflow, allowing the system admin to drive it from within PRISM for a (nearly) one-click experience.

 

 

Mixed Node Support with Adaptive Replica Selection

Another important feature that improves cluster balance and performance. AOS 5.0 places data copies intelligently based on drive capacity and performance utilization, providing consistent performance levels with optimum resource utilization even with heterogeneous nodes in a cluster, e.g. regular nodes + storage-heavy nodes, or NX1000 + NX3000 nodes.

The smart placement uses disk usage and performance stats for each disk in the cluster to compute a disk fitness value. This fitness value is a function of the disk fullness percentage and the disk queue length (number of operations in flight for that disk). The disk for the data write is then selected via a weighted random lottery to prevent herding behaviors.
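The selection step can be sketched in a few lines of Python. The article only says that fitness is a function of fullness and queue length, so the formula below (free fraction multiplied by an idle fraction), the `max_queue` cap and the disk names are assumptions made up for illustration, not the actual AOS calculation.

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass
class Disk:
    name: str
    fullness: float      # fraction of capacity used, 0.0 to 1.0
    queue_length: int    # operations currently in flight


def fitness(disk: Disk, max_queue: int = 64) -> float:
    """Hypothetical fitness: higher for emptier disks with shorter queues."""
    free = 1.0 - disk.fullness
    idle = 1.0 - min(disk.queue_length, max_queue) / max_queue
    return free * idle


def pick_disk(disks, rng=random):
    """Weighted random lottery: fitter disks win more often, but not always."""
    weights = [fitness(d) for d in disks]
    return rng.choices(disks, weights=weights, k=1)[0]


disks = [
    Disk("disk-a", fullness=0.2, queue_length=2),    # empty and idle
    Disk("disk-b", fullness=0.8, queue_length=2),    # nearly full
    Disk("disk-c", fullness=0.2, queue_length=48),   # empty but busy
]

rng = random.Random(0)  # fixed seed so the demo is repeatable
counts = Counter(pick_disk(disks, rng).name for _ in range(10_000))
print(counts)
```

Because the choice is a lottery rather than "always pick the fittest disk", the nearly full and busy disks still receive some writes, while the emptiest, least busy disk receives most of them. That is what avoids every writer herding onto the same disk at once.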

 

 

Dynamically Decreased Erasure Coding Stripes – Node Removals

Erasure Coding (EC) is a method of data protection in which data is broken into fragments, expanded and encoded with redundant data pieces and stored across a set of different locations or storage media.  Each Nutanix container has a defined replication factor (RF) for data resiliency and availability, either RF2 or RF3. Learn more about EC-X here.

Prior to AOS 5.0, if a cluster had EC container(s), removing a node was somewhat restrictive because the EC strip would be distributed across the cluster: at least 7 nodes if the highest EC container RF is 2, and at least 9 nodes if the highest EC container RF is 3. The solution was to turn off EC for the containers, but it took a long time to convert to non-EC bytes, and the cluster must have had enough free space.

AOS 5.0 now maintains EC protection even with node removals, keeping protection overhead limited. It does that by dynamically decreasing the EC strip size once nodes are removed from a cluster, and dynamically increasing the EC strip size once new nodes are added to the cluster.
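A quick back-of-the-envelope calculation shows why a smaller strip still keeps overhead below plain replication. The strip shapes used here (4, 3 and 2 data blocks with 1 parity block) are illustrative assumptions; the article does not state which strip sizes AOS actually selects for a given node count.

```python
def ec_overhead(data_blocks: int, parity_blocks: int) -> float:
    """Raw capacity consumed per logical byte for a data/parity strip."""
    return (data_blocks + parity_blocks) / data_blocks


# Illustrative strips, shrinking as nodes are removed from the cluster.
for data in (4, 3, 2):
    print(f"{data}/1 strip: {ec_overhead(data, 1):.2f}x raw capacity")

# Plain RF2 replication stores every byte twice.
print(f"RF2 replication: {2.0:.2f}x raw capacity")
```

Even the smallest strip in this example costs less raw capacity than RF2 replication, so shrinking the strip trades some space efficiency for the ability to remove a node without disabling EC.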

 

Multi Metadata Disk Support to use available SSDs on the node for metadata

AOS 5.0 now automatically distributes metadata across the available SSDs in a node (maximum of four). The automated distribution of metadata across SSDs helps to accommodate the read/write pressure during peak events, as the metadata disk was also used by other system components. Distributing the read/write load improves IOPS and reduces latencies, removing the single-SSD bottleneck. Another benefit of distributing the metadata writes is uniform wear across SSD media devices.

 

 

Erasure Coding (EC) support for changing the Replication Factor (RF) on containers

Erasure coding (EC) is a method of data protection in which data is broken into fragments, expanded and encoded with redundant data pieces and stored across a set of different locations or storage media.  Each Nutanix container has a defined replication factor (RF) for data resiliency and availability, either RF2 or RF3. Learn more about EC-X here.

With AOS 5.0, EC-X provides the ability to modify the Replication Factor (RF) for containers that have Erasure Coding enabled, giving customers greater flexibility to achieve their desired level of data protection during the application life cycle. EC-enabled containers can now go from RF3 to RF2 or vice versa, and the EC encoding automatically changes to match.

 

 

Inline Compression for OpLog

In AOS 5.0, random writes are automatically compressed inline before hitting the OpLog. The OpLog is like a filesystem journal and is built as a staging area to handle bursts of random writes, coalesce them, and then sequentially drain the data to the extent store.

With dynamic compression the Nutanix cluster gains improved space utilization and improved burst handling for sustained random writes in the OpLog space. The OpLog space can now also absorb sustained random write bursts for a longer duration.
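The effect can be illustrated with a toy staging log in Python. Everything here is an assumption for illustration: the `StagingLog` class is hypothetical, `zlib` stands in for whatever codec AOS uses (the article does not name one), and the highly repetitive payload compresses far better than typical application data would.

```python
import zlib


class StagingLog:
    """Toy fixed-size write log (hypothetical, not the real OpLog)."""

    def __init__(self, capacity_bytes, compress=True):
        self.capacity = capacity_bytes
        self.compress = compress
        self.used = 0
        self.entries = []

    def append(self, offset, payload: bytes) -> bool:
        # Compress inline, before the write consumes log space.
        record = zlib.compress(payload) if self.compress else payload
        if self.used + len(record) > self.capacity:
            return False  # log is full; a real system would wait for a drain
        self.entries.append((offset, record))
        self.used += len(record)
        return True

    def drain(self):
        """Return staged writes in offset order (sequential) and empty the log."""
        out = sorted(
            (off, zlib.decompress(rec) if self.compress else rec)
            for off, rec in self.entries
        )
        self.entries.clear()
        self.used = 0
        return out


def writes_absorbed(log, payload, limit=10_000):
    """Count how many writes fit before the log fills up."""
    n = 0
    while n < limit and log.append(n * len(payload), payload):
        n += 1
    return n


payload = b"log line: status=ok\n" * 200            # 4000 bytes, very compressible
plain_log = StagingLog(64 * 1024, compress=False)
packed_log = StagingLog(64 * 1024, compress=True)

plain = writes_absorbed(plain_log, payload)
packed = writes_absorbed(packed_log, payload)
print(f"uncompressed log absorbed {plain} writes, compressed log absorbed {packed}")
```

The same log capacity absorbs more writes when they are compressed first, which is the longer burst absorption described above. The actual gain depends entirely on how compressible the incoming data is.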

 

That’s it… AOS 5.0 is a huge release with major improvements in the areas of performance, reliability, availability, supportability and user experience. Several other smaller features are also part of the release, but they are not meaningful enough to be featured in this blog series.

I would like to acknowledge the huge effort from our PM, R&D, QA, Release Management and Support teams shipping such a ‘fantastic’ product release, and also for their continuous innovation efforts to bring to customers and partners by far the best HCI product on the market today, without any doubts in my opinion. A big thank you!

Now you must be asking yourself when you can 1-click upgrade your clusters to AOS 5.0. While I don’t control the Release Management train and I also cannot disclose the exact due date, I can say that it will be soon. So, stay tuned!!

 

NEWS UPDATE – A Part 4 post was added to the series – Nutanix 5.0 Features Overview (Beyond Marketing) – Part 4

 

This article was first published by Andre Leibovici (@andreleibovici) at myvirtualcloud.net

10 comments


  1. vikrant

    Great article, I have really enjoyed your article. Actually I was waiting for the 3rd part of this series . I was little bit confused about the Distributed Storage Fabric but now you have cleared my all the doubts in this article. AOS 5.0 is a huge release with major improvements in the areas of performance, reliability, availability, supportability and user experience. Thanks for sharing . The way you explained each and everything about AOS 5.0 is really great. Thanks once again.

  2. Andre Leibovici

    @vikrant you are most welcomed! Thanks for the feedback.

  3. Tom Hardy

    Was there a section with Linux kernel upgrade?

  4. Andre Leibovici

    @tom, good point… I somehow missed the section. Need to write a Part 4. Thanks!

  5. Nutanix Admin

    Hi Andre, Awesome info. Thanks a lot for sharing. The way you explained each and everything about AOS 5.0 is really amazing. With 5.0 can I have 10 node Nutanix cluster with 7 nodes running Vmware and 3 nodes on AHV?

  6. Tom Hardy

    Are those or most of the features listed above AHV only? Any changes or enhancements to Hyper-V?

  7. Andre Leibovici

    @tom, most of the hypervisor independent features are also available for Hyper-V. That said, most of the new features will come first to AHV where Nutanix engineers have full control of the stack.

  8. Andre Leibovici

    @Nutanix Admin, with 5.0 and prior releases you can have a cluster made of ESX and Storage Only nodes running AHV. Full mesh is not yet available, but Nutanix has already announced multi-hypervisor management (not in 5.0), enabling you to have a single pane of glass to manage all your hypervisors’ operational functions.

  9. Tom Hardy

    Performance-wise, how is AHV faring compared to ESXi and Hyper-V? Can we have Dell XC clusters expanded with Lenovo HX nodes or with NX nodes?

  10. Andre Leibovici

    @Tom All hypervisors operate within a very minimal performance differential, and Nutanix continually tests all hypervisors against benchmarks. You will not be able to mix Dell XC and Lenovo HX in the same cluster, but that is a supportability problem between OEM vendors, not a technical restriction. If you want to expand with storage-only nodes, both OEM partners are able to provide you with such nodes. If you want to change the OEM vendor, just create a new cluster and manage everything under a single pane of glass using Prism Central.
