A couple of weeks ago I published the first and second parts of a multi-month announcement for NOS 4.1; this is the third part. If you missed the first two parts, you can read them at Nutanix 4.1 Features Overview (Beyond Marketing) – Part 1 and Nutanix 4.1 Features Overview (Beyond Marketing) – Part 2.
NOS 4.1 delivers important features and improvements in the areas of resiliency, security, disaster recovery, analytics, supportability and management. The first article explained the new Cloud Connect feature for cloud backup; the NX-8150 platform for heavy OLTP and OLAP/DSS workloads such as Exchange and Oracle databases; Data-at-Rest Encryption for secure environments that require compliance, which I complemented with an article describing how it works, New Nutanix Data-at-Rest Encryption (How it works); and finally the One-Click Hypervisor and Firmware Upgrade.
In the second part I focused on smaller improvements that shipped in the NOS releases between 4.0 and 4.1 (4.0.1 and 4.0.2), improving performance, system usability and user experience. These included Volume Shadow Copy Service (VSS) support for Hyper-V hosts, multi-cluster management and a simplified drive replacement procedure, amongst others.
This third part focuses on what I consider perhaps the biggest milestones of this release: the Nutanix NX-9240 All-Flash appliance and the Metro Availability feature.
The Nutanix scale-out solution already handles the large majority of workloads using existing NX models. For workloads with very large active datasets, or write I/O intensive workloads that require additional performance, Nutanix introduced the NX-8150 as part of the NOS 4.1 release. It targets Tier 1 business-critical workloads and can be mixed with existing Nutanix clusters.
The new NX-9240 appliance is built to run applications with very large working sets, such as databases supporting online transaction processing (OLTP), that not only require exceptionally fast storage performance but also demand the predictable and consistent I/O latency that flash can deliver. The new NX-9240 is 100% all-flash storage, offering ~20TB raw per 2U.
Flash capacity is optimized using Nutanix’s scale-out compression and de-duplication technologies that leverage unused compute resources across all nodes in the cluster, avoiding performance bottlenecks.
Unlike other solutions, this is a true scale-out all-flash storage system where capacity and performance are augmented simply by adding nodes, one at a time and non-disruptively, for 100% linear scalability with no maximum limit.
In this first release (NOS 4.1) the NX-9240 all-flash nodes cannot be mixed with other node types because the new nodes have no concept of automated tiering, given that they are all flash. Therefore a new cluster must be created with NX-9240 nodes only; however, all other NOS capabilities such as disaster recovery, backup and even the new Metro Availability can be used between different clusters. A future release of NOS will allow All-Flash and Hybrid nodes to be mixed and matched.
Over the last couple of years Nutanix has introduced many availability and resiliency features to the platform. Today Nutanix has built-in self-healing capabilities, node resilience and tunable redundancy features, virtual machine centric backup and replication, automated backup to the cloud, and many other features vital for running enterprise workloads.
However, business-critical applications demand continuous data availability. This means that access to applications and user data must be preserved even during a datacenter outage or planned maintenance event. Many IT teams use metro area networks to maintain connectivity between datacenters so that if one site goes down the other location can run all applications and services with minimal disruption. Keeping the applications running, however, requires immediate access to all data.
The new Nutanix Metro Availability feature stretches datastores and containers for virtual machine clusters across two or more sites located up to 400km apart. The mandatory synchronous data replication is natively integrated into Nutanix, requiring no hardware changes. During the data replication Nutanix uses advanced compression technologies for efficient network communications between datacenters, saving bandwidth and speeding data management.
For existing Nutanix customers it is good to know that Metro Availability uses the same data protection group concepts that already exist in PRISM for backup and replication across Nutanix clusters. It simply adds a synchronous replication option, with which administrators can also monitor and manage cluster peering, promote containers or break peers.
By default, the container on one site is the primary point of service, and the container on the other site is the secondary, which synchronously receives a copy of all the data blocks written at the primary site. Since this is done at the container level, it is possible to have multiple containers and datastores, and the direction of replication can be defined per container.
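To make the per-container direction concrete, here is a minimal toy model in Python. It is only a sketch of the idea of synchronous replication (a write is acknowledged only once both sites hold the block), not how Nutanix implements it; the `Site` and `StretchedContainer` classes and all names in it are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical stand-in for a cluster at one site."""
    name: str
    blocks: dict = field(default_factory=dict)  # (container, block_id) -> data

    def store(self, container, block_id, data):
        self.blocks[(container, block_id)] = data
        return True


@dataclass
class StretchedContainer:
    """Hypothetical container with a primary and a secondary site."""
    name: str
    primary: Site
    secondary: Site

    def write(self, block_id, data):
        # Synchronous replication: the write is acknowledged only after
        # both the primary and the secondary site have stored the block.
        ok_primary = self.primary.store(self.name, block_id, data)
        ok_secondary = self.secondary.store(self.name, block_id, data)
        return ok_primary and ok_secondary


site_a = Site("site-a")
site_b = Site("site-b")

# The replication direction is chosen per container, so two containers
# can replicate in opposite directions between the same pair of sites.
ctr1 = StretchedContainer("ctr-1", primary=site_a, secondary=site_b)
ctr2 = StretchedContainer("ctr-2", primary=site_b, secondary=site_a)

acked1 = ctr1.write(0, b"hello")
acked2 = ctr2.write(0, b"world")
```

After both writes, each site holds a copy of each block, regardless of which site acted as primary for that container.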
Nutanix Metro Availability supports heterogeneous deployments and does not require identical platforms and hardware configurations at each site. Virtualization teams can now non-disruptively migrate virtual machines between sites during planned maintenance events, providing continuous data protection with zero recovery point objective (RPO) and a near-zero recovery time objective (RTO).
The requirements to enable Metro Availability are simple: enough bandwidth to handle the data change rate, and a round-trip time of <=5ms. A redundant network link is also highly recommended.
- <=5ms RTT
- Bandwidth depends on ‘data change rate’
- Recommended: redundant physical networks between sites
- 2 Nutanix clusters, one on each site
- Mixing hardware models allowed
- ESXi (other hypervisors soon)
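As a back-of-the-envelope aid, the two numeric requirements above can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the 200 km/ms figure is the usual approximation for light in optical fiber (real paths add switching and routing latency), and the bandwidth estimate ignores protocol overhead.

```python
FIBER_KM_PER_MS = 200.0  # approximate speed of light in fiber (assumption)
MAX_RTT_MS = 5.0         # round-trip time limit from the requirements list


def fiber_rtt_ms(distance_km):
    """Best-case round-trip time over fiber, propagation delay only."""
    return 2 * distance_km / FIBER_KM_PER_MS


def required_bandwidth_mbps(change_rate_mb_per_s, compression_ratio=1.0):
    """Link speed in Mbit/s needed to carry the data change rate.

    change_rate_mb_per_s is in megabytes per second; compression_ratio
    is a hypothetical factor (2.0 means data shrinks to half its size).
    """
    return change_rate_mb_per_s * 8 / compression_ratio


def meets_metro_requirements(rtt_ms, link_mbps, change_rate_mb_per_s,
                             compression_ratio=1.0):
    """True if both the RTT limit and the bandwidth need are satisfied."""
    rtt_ok = rtt_ms <= MAX_RTT_MS
    needed = required_bandwidth_mbps(change_rate_mb_per_s, compression_ratio)
    return rtt_ok and link_mbps >= needed


# Example: sites 400 km apart, 100 MB/s of changed data, 1 Gbit/s link.
rtt = fiber_rtt_ms(400)
ok = meets_metro_requirements(rtt, link_mbps=1000, change_rate_mb_per_s=100)
```

Under these assumptions, the propagation delay alone at 400km already uses most of the 5ms budget, which is why measured round-trip time matters more than distance.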
What I like the most about the Nutanix platform is that, using the One-Click NOS, Hypervisor and Firmware Upgrade, customers will be able to start using the new feature as soon as it is available. This is the power of the true software-defined datacenter.
Over the next few articles I will have a chance to really deep dive into the stretch clusters technology, discuss deployment and failure scenarios, site consistency options, recovery options, VM availability etc.
- System Center Operations Manager and System Center Virtual Machine Manager
Nutanix now offers single-pane-of-glass management in Microsoft environments, with full support for the Storage Management Initiative Specification (SMI-S) from SNIA.
“The SMI-S defines a method for the interoperable management of a heterogeneous Storage Area Network (SAN), and describes the information available to a WBEM Client from an SMI-S compliant CIM Server and an object-oriented, XML-based, messaging-based interface designed to support the specific requirements of managing devices in and through SANs” Here is a good introduction to SMI-S by the Microsoft team.
The integration allows Microsoft administrators to monitor the performance and health of Nutanix software objects such as clusters, storage containers, controller VMs and others via SCVMM, and also to monitor Nutanix hardware objects such as server nodes, fans, power supplies and others via SCOM.
Here are some screenshot examples; over time I will write more about each individual management pack.
Nutanix Cluster – Containers View
Nutanix Cluster – Performance >> Clusters
Nutanix Cluster – Disk View
With this post I am almost done with this series of articles about the new capabilities in NOS 4.1. Well, almost… there are still a couple of things that I will discuss later on, but for now I will start to deep-dive a little more into all the new goodness and innovation Nutanix is delivering with this release.
This article was first published by Andre Leibovici (@andreleibovici) at myvirtualcloud.net.