Today Nutanix announced the first part of a multi-month announcement for NOS 4.1. This release is mostly focused on enhancements in the areas of resiliency, security, disaster recovery, analytics, supportability and management. However, even though it is a ‘dot’ release, NOS 4.1 delivers very important features, and in my opinion this version has enough meat to be considered a major release.
It’s been only 5 months since NOS 4.0 was announced with the introduction of Hybrid On-Disk De-Duplication, Failure Domain Awareness, Disaster Recovery for Hyper-V, Snapshot Scheduling, One-Click NOS Upgrade and others. If you missed the NOS 4.0 release announcement, read about it at Nutanix 4.0 Features Overview (Beyond Marketing).
This is the power of software-defined architectures running on standard x86 hardware, with no special-purpose machine doing one thing and one thing only. The software approach allows for faster release cycles, and whenever there are hardware performance improvements you get to enjoy the benefits right away.
Please refer to the product Release Notes for official information from Nutanix.
Let’s start with the New Features…
- Cloud Connect
Nutanix allows administrators to implement and manage VM-centric Disaster Recovery policies across multiple sites and datacenters using a multi-topology architecture.
Whilst the Nutanix built-in DR capabilities allow administrators to specify snapshot retention policies, this approach can be expensive even when data de-duplication and compression are already enabled across the cluster. There must be enough storage capacity available to retain the data and to handle multiple backups and snapshots over a long period of time.
Nutanix has now officially announced the ability to leverage the global, distributed and highly available infrastructure of Amazon Web Services for data backup and restore. That means that an on-premise Nutanix cluster is now able to back up and restore virtual machines to AWS, with organizations billed directly by AWS for EC2 and S3 costs.
With Cloud Connect organizations are able to schedule, manage and use local and remote snapshots and replication for backup and disaster recovery from within the Nutanix PRISM user interface.
- Local snapshots with Time Stream
- Backup to Amazon S3
- Integration with VSS and SRM
- Quick restore and state recovery
- WAN-optimized replication for DR
- Works with ESXi and Hyper-V
- 15 minute Cloud RPO
Administrators can also control and fine-tune the backup schedule and retention policies to meet the needs of the workload. Backups can run as often as every 15 minutes or as rarely as once a day, so you can choose the frequency that best matches the infrastructure capacity and SLAs.
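To make the schedule and retention trade-off concrete, here is a minimal sketch of the arithmetic. It is not Nutanix code: the function names and the simple "keep every snapshot for a fixed retention window" model are my own assumptions, and only the 15-minute and one-day bounds come from the announcement.

```python
from datetime import timedelta

# Bounds quoted in the announcement: backups can run as often as every
# 15 minutes or as rarely as once a day.
MIN_INTERVAL = timedelta(minutes=15)
MAX_INTERVAL = timedelta(days=1)


def validate_interval(interval: timedelta) -> timedelta:
    """Return the interval unchanged, or raise if it is outside the bounds."""
    if not MIN_INTERVAL <= interval <= MAX_INTERVAL:
        raise ValueError(
            f"interval {interval} is outside {MIN_INTERVAL}..{MAX_INTERVAL}"
        )
    return interval


def snapshots_retained(interval: timedelta, retention: timedelta) -> int:
    """Snapshots held at steady state when one is taken every `interval`
    and each is kept for `retention` (hypothetical retention model)."""
    validate_interval(interval)
    return retention // interval


def worst_case_rpo(interval: timedelta) -> timedelta:
    """With periodic snapshots, at most one interval of changes can be lost."""
    return validate_interval(interval)
```

For example, a 15-minute schedule with a one-day retention window keeps 96 snapshots, while a daily schedule with a seven-day window keeps 7. The shorter the interval, the better the RPO and the more capacity you need to plan for.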
A Nutanix NOS instance (software-only) runs on AWS, and each AWS NOS instance is managed as a remote site in Nutanix; it’s all integrated and managed via a single pane of glass. In AWS the remote site requires an m1.xlarge instance, and the AWS NOS instance may be created in an availability zone of your choice, increasing reliability and resiliency as needed by your organization. You can have as many AWS NOS instances as needed.
The NOS cluster services run on this AWS virtual instance and use Amazon EBS for metadata and S3 to store the backup data. All communication is done via VPC or SSH tunnels, with de-duplication-optimized data transfer for both backup and restore operations. Data is de-duplicated and compressed before it is backed up to the public cloud, with net savings of 75% in both network bandwidth usage and storage footprint. It is also possible to throttle the network bandwidth consumed by backups to ensure that application performance is not affected. Note that data transfer throughput over SSH is approximately 25% lower than over VPC, so VPC is the recommended method.
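The two percentages above are easy to turn into a back-of-the-envelope estimate. The sketch below is only planning arithmetic under my own assumptions: the function names are hypothetical, the 75% and 25% figures are treated as flat averages, and the link speed is whatever you actually have between your site and AWS.

```python
# Figures quoted in the announcement; treat them as rough planning numbers.
DEDUP_COMPRESSION_SAVINGS = 0.75  # 75% less data sent and stored
SSH_THROUGHPUT_PENALTY = 0.25     # SSH moves roughly 25% less per second than VPC


def bytes_on_wire(logical_bytes: float) -> float:
    """Data actually transferred after de-duplication and compression."""
    return logical_bytes * (1 - DEDUP_COMPRESSION_SAVINGS)


def effective_throughput(link_bytes_per_s: float, transport: str) -> float:
    """Usable throughput for the chosen transport ("vpc" or "ssh")."""
    if transport == "vpc":
        return link_bytes_per_s
    if transport == "ssh":
        return link_bytes_per_s * (1 - SSH_THROUGHPUT_PENALTY)
    raise ValueError(f"unknown transport: {transport!r}")


def transfer_seconds(
    logical_bytes: float, link_bytes_per_s: float, transport: str = "vpc"
) -> float:
    """Estimated time to ship one backup over the given link."""
    return bytes_on_wire(logical_bytes) / effective_throughput(
        link_bytes_per_s, transport
    )
```

Under these assumptions the same backup takes about a third longer over SSH than over VPC (1 / 0.75), which is why VPC is the better default when you can use it.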
Just like any other cloud service, AWS is subject to failures and outages. For this reason the AWS NOS instance takes automatic periodic snapshots of its EBS volumes, which are stored in S3. An AWS failure or unavailability will automatically raise an alert on the on-premise NOS cluster, maintaining that single pane of glass for managing all your Nutanix clusters and instances.
I will soon be working on a video demo to publish here, but in the meantime you can watch this video with Disaster Recovery – Failover and Failback with Nutanix.
The Nutanix scale-out solution already handles the large majority of workloads using existing NX models, but certain workloads that have very large active datasets or are write-IO intensive can benefit from additional performance. The NX-8150 is the new platform for those Tier 1 business-critical workloads, and it can be mixed with existing Nutanix clusters.
The NX-8150 is the first platform to fully leverage the Nutanix scalable write buffer, which improves performance and drives down latencies for the most demanding applications. In addition, the larger SSD tier for active data satisfies the latency and IOPS requirements for database workloads, without the excessive cost of an all-flash appliance.
The ability to mix different node types in a single cluster makes it practical and easier to eliminate infrastructure silos in datacenters.
The NX-8150 is targeted at Microsoft Exchange, SharePoint, SAP, and high-performance databases such as MS SQL Server and Oracle RAC. It comes with 4 times the number of flash devices compared to previous Nutanix platforms, to accommodate a much larger active data set. The NX-8150 is also the first Nutanix platform that offers flexible configuration of storage and server (compute) resources from the factory, including multiple CPU options, 3 different SSD configurations, a variety of memory profiles, and multiple connectivity options, including the ability to expand to 4x 10GbE ports.
Another benefit of the NX-8150 platform with flexible configuration is the ability to minimize the number of software licenses (applications and hypervisor) to be purchased. Think vSphere and Oracle.
Tests and validations will soon come out of the Nutanix performance and engineering labs. Here is an example of such a validation for Microsoft Exchange workloads: using the new platform, with only a 2U footprint, Nutanix is able to manage 6x more mailboxes per node (3,300+ mailboxes per node, following Microsoft testing guidelines and practices) and can scale linearly to support 100,000s of mailboxes simply by adding NX-8150 nodes.
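Linear scaling makes sizing a one-line calculation. The sketch below is my own illustration, not a Nutanix sizing tool: it takes the quoted 3,300 mailboxes per node as a flat figure and ignores everything else a real design would consider (headroom, failure tolerance, mailbox profile).

```python
MAILBOXES_PER_NODE = 3300  # "3,300+ mailboxes per node" from the validation above


def nodes_needed(mailboxes: int, per_node: int = MAILBOXES_PER_NODE) -> int:
    """Smallest number of nodes whose combined capacity covers `mailboxes`,
    assuming capacity scales linearly with node count."""
    if mailboxes < 0 or per_node <= 0:
        raise ValueError("mailboxes must be >= 0 and per_node must be > 0")
    # Integer ceiling division: round up to a whole node.
    return (mailboxes + per_node - 1) // per_node
```

With these numbers, 100,000 mailboxes works out to 31 nodes, since 30 nodes cover only 99,000.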
Security and Compliance
- Data At Rest Encryption (NX-3060-E, NX-3061-E, NX-6060-E)
Nutanix clusters are deployed in a variety of customer environments requiring different levels of security, including sensitive/classified environments. These customers typically harden IT products deployed in their datacenters based on very specific guidelines, and are mandated to procure products that have obtained industry standard certifications.
Data-at-rest encryption is one such key criterion that customers use to evaluate a product when procuring IT solutions to meet their project requirements.
Nutanix Data-at-Rest encryption satisfies regulatory requirements for government agencies, banking, financial, healthcare and other G2000 enterprise customers who consider data security products and solutions. This new feature allows Nutanix customers to encrypt all or selected partitions on persistent storage using a strong encryption algorithm, and only allows access to this data (decryption) when presented with the correct credentials.
- Compliant with regulatory requirements for data at rest encryption
- Leverages FIPS 140-2 Level-2 validated self-encrypting drives
- Future proof (uses open standard protocols: KMIP, TCG)
To enable Nutanix DRE, a third-party key management server is required. At launch only ESXi is supported and only the SafeNet KeySecure Cryptographic Key Management System is certified; other key management systems will be certified over time. In principle Nutanix supports any KMIP 1.0 compliant key management system, but others have not yet been certified. The key management system can even be a VM running on the Nutanix cluster.
Currently it is not possible to mix Nutanix DRE-enabled nodes with non-DRE nodes in the same cluster, because the platform requires special FIPS 140-2 Level 2 SED drives to meet the data-at-rest encryption requirements. Breaking the homogeneity of the cluster would violate the data-at-rest encryption requirement for copies of data stored on non-SED drives. However, both DRE and non-DRE clusters can be managed via the same PRISM Central UI.
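The homogeneity rule boils down to "every drive that could hold a copy of the data must be self-encrypting". Here is a small sketch of that check; the `Drive` record and function names are hypothetical and only illustrate the reasoning, not how NOS inventories its disks.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Drive:
    """Hypothetical inventory record for one disk in the cluster."""
    node: str
    serial: str
    self_encrypting: bool


def non_sed_drives(drives: Iterable[Drive]) -> List[Drive]:
    """Drives that would hold unencrypted copies of data."""
    return [d for d in drives if not d.self_encrypting]


def can_enable_encryption(drives: Iterable[Drive]) -> bool:
    """True only if the cluster has drives and every one of them is an SED."""
    drives = list(drives)
    return bool(drives) and not non_sed_drives(drives)
```

A single non-SED drive anywhere in the cluster is enough to fail the check, because a replica of encrypted data could land on it.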
- One-Click Hypervisor and Firmware Upgrade
I kinda soft-launched this feature in my article Nutanix One-Click Upgrade now takes care of Firmware and Hypervisor too! after Nutanix CEO Dheeraj Pandey revealed the new feature in a tweet. Love it!
In addition to the already non-disruptive NOS upgrade, the One-Click upgrade feature now ensures that the hypervisor (vSphere and Hyper-V) is also automatically upgraded in a distributed and rolling fashion. Nutanix is solving a huge problem here, one that otherwise requires either manual intervention or external automation tools.
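For readers who have not automated this before, the general shape of a rolling upgrade looks something like the sketch below. This is a generic illustration, not the One-Click implementation: the `upgrade` and `is_healthy` callbacks are placeholders for whatever actually evacuates VMs, upgrades the host and verifies it afterwards.

```python
from typing import Callable, Iterable, List


def rolling_upgrade(
    nodes: Iterable[str],
    upgrade: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Upgrade nodes one at a time and halt at the first unhealthy node,
    so that at most one node is out of service at any moment."""
    done: List[str] = []
    for node in nodes:
        upgrade(node)  # placeholder: evacuate VMs, upgrade, reboot
        if not is_healthy(node):
            raise RuntimeError(
                f"{node} is unhealthy after upgrade; halted with {done} upgraded"
            )
        done.append(node)
    return done
```

The key design choice is the health gate between nodes: the loop never touches the next host until the previous one is confirmed back, which is what keeps workloads running while the cluster is upgraded.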
In addition to the new hypervisor upgrade feature, Nutanix is now enabling NCC (Nutanix Cluster Health) upgrades and hardware firmware upgrades. Let’s talk about firmware upgrades.
Upgrading firmware on servers, controllers, disks and other interfaces is probably one of the major pain points for datacenter administrators. The workflow normally goes as follows: schedule a maintenance outage, go to the manufacturer’s website, download the correct firmware, flash the firmware via a web interface or a Linux CLI tool, reboot the server, and then repeat this task for every server in the datacenter that requires a firmware upgrade. It’s complex and lengthy.
Because NOS runs in hypervisor user space, Nutanix is able to execute complex firmware and hardware upgrade processes across multiple server components without requiring reboots. Using Nutanix’s highly distributed parallel processing, the One-Click upgrade takes care of firmware upgrades for all disk devices in the cluster with zero impact on virtual machines and workloads. No reboots required, true AlwaysOn!
As a bonus, the framework that serves as the engine for Nutanix Cluster Health, also known as NCC, is also part of the One-Click upgrade process.
Stay tuned, Nutanix 4.1 Features Overview (Beyond Marketing) – Part 2 coming soon!
This article was first published by Andre Leibovici (@andreleibovici) at myvirtualcloud.net.