Jun 21 2016

Nutanix 4.7 and Asterix Features Overview (Beyond Marketing)

Another four months have passed since AOS 4.6 was released, and a new major Nutanix release with new features and improvements is about to be made available. The velocity with which Nutanix releases new features and improvements is comparable only to what we see in the consumer space, where apps are updated at an amazing pace for users’ delight.

This blog post covers the features being made available with AOS 4.7, and also some of the features of the upcoming release, codename ‘Asterix’. If you missed previous AOS release announcements, read them at:

Nutanix 4.6 Features Overview (Beyond Marketing)

Nutanix 4.5 Features Overview (Beyond Marketing)

Nutanix 4.1 Features Overview (Beyond Marketing)

Nutanix 4.0 Features Overview (Beyond Marketing)

 

These are the features introduced in this blog post:

  • Broadwell Refresh and All-Flash
  • SX Platform (Nutanix Xpress)
  • Degraded Node Behavior Detection
  • Foundation 3.2
  • Acropolis File Services (Performance Optimization)
  • Acropolis File Services (TRIM Support)
  • Acropolis Block Services (aka ABS)
  • AHV Dynamic Scheduling
  • Acropolis Container Services
  • CPS Standard on Nutanix (Project La Jolla)
  • Network Visualization in Prism
  • What-If Analysis
  • Entity Browser Improvements (Custom Focus, Page Size, Shift Select)
  • Nutanix self-service
  • Pulse Reminder
  • 1-Click LSI HBA Upgrade

 

Note: I have marked the features belonging to the ‘Asterix’ release with our fearless Gaul.

 

Platform

Broadwell Refresh and All-Flash

There is no doubt that Flash technology is changing the shape of data centers. Workloads that require consistent and predictable low-latency responses are the ones that benefit the most from All-Flash platforms. In addition to Flash, Intel’s fifth-generation Broadwell CPU is starting to gain big adoption. With that in mind, Nutanix is refreshing all NX platforms to provide support for Intel’s Broadwell-EP CPUs, and at the same time offering an All-Flash option for most platforms. The table below shows the hardware changes between G4 and G5 hardware for the NX families. Both Dell and Lenovo have also refreshed their platforms to support the Broadwell CPUs.

[Table image: hardware changes between G4 and G5 for the NX families]

 

 

SX Platform (Nutanix Xpress)

Nutanix Xpress is a new NX product line designed to address the IT needs of smaller organizations through ease of management, lower TCO, and simple, risk-free deployment, with the Nutanix world-class support for the entire infrastructure stack just a phone call away. Nutanix Xpress is shipped pre-integrated with the Xpress software edition which contains features catered to SMBs. The official press-release can be found here.

 


 

 

 

Degraded Node Behavior Detection

Nutanix has been designed with service resiliency in mind and implements a fail-safe distributed architecture. In such scenarios clusters and systems are also commonly designed with fail-stop features – and that’s how Nutanix has been designed from the ground up.

However, a partly available (degraded) node can adversely affect the performance of an entire cluster. In a distributed architecture a partly available state can have various causes, such as network bandwidth reduction, network packet drops, soft lockups, partially bad disks, or hardware issues like a flaky DIMM with ECC errors.

As of AOS 4.7, the services running on each node of the cluster publish scores/votes for services running on other nodes. Peer health scores are computed from metrics such as RPC latency, RPC failures/timeouts, and network latency. If the services running on one node consistently receive bad scores for a period of time, the other peers convict that node as a degraded node.
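To make the voting idea concrete, here is a minimal sketch in Python. It is not how AOS implements it: the score formula, the thresholds, the window length and the quorum rule are all assumptions chosen for illustration, as are the names `health_score` and `PeerHealthTracker`.

```python
from collections import defaultdict, deque

# All thresholds below are illustrative assumptions, not AOS values.
BAD_SCORE_THRESHOLD = 0.5   # a score below this counts as a "bad" vote
WINDOW = 5                  # recent scores kept per (voter, target) pair
QUORUM_FRACTION = 0.5       # fraction of peers that must agree to convict


def health_score(rpc_latency_ms, rpc_failure_rate, net_latency_ms):
    """Combine metrics into a score in [0, 1]; 1.0 is perfectly healthy."""
    latency_penalty = min(rpc_latency_ms / 100.0, 1.0)
    failure_penalty = min(rpc_failure_rate, 1.0)
    network_penalty = min(net_latency_ms / 50.0, 1.0)
    return 1.0 - (latency_penalty + failure_penalty + network_penalty) / 3.0


class PeerHealthTracker:
    def __init__(self, nodes):
        self.nodes = list(nodes)
        # scores[target][voter] -> recent scores that voter gave to target
        self.scores = defaultdict(lambda: defaultdict(lambda: deque(maxlen=WINDOW)))

    def publish(self, voter, target, score):
        """Record the score one node gives to another (self-votes are ignored)."""
        if voter != target:
            self.scores[target][voter].append(score)

    def degraded_nodes(self):
        """Nodes that a majority of peers scored badly for a full window."""
        convicted = []
        for target in self.nodes:
            peers = [n for n in self.nodes if n != target]
            bad_voters = 0
            for voter in peers:
                history = self.scores[target][voter]
                consistently_bad = len(history) == WINDOW and all(
                    s < BAD_SCORE_THRESHOLD for s in history
                )
                if consistently_bad:
                    bad_voters += 1
            if peers and bad_voters / len(peers) > QUORUM_FRACTION:
                convicted.append(target)
        return convicted
```

Requiring a full window of bad scores from a majority of peers is one simple way to avoid convicting a node over a single slow RPC or over one peer's own network problem.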

Once a node is detected as degraded, leadership of critical services will no longer be hosted on it, the node will be put into maintenance mode, and the CVM will be rebooted, all without service disruption. An alert will be generated for the degraded node, and if Pulse is enabled Nutanix support is automatically alerted of the condition.

This is the perfect example of one of those hidden platform features that make the best enterprise solutions even better!

 

 

Foundation 3.2

Nutanix Foundation is the tool that allows administrators to bootstrap, deploy and configure a bare-metal Nutanix cluster from start to finish with minimal interaction, in a matter of minutes. Foundation automatically configures IPMI, deploys and configures hypervisors, deploys and configures Nutanix Controller VMs, and creates a cluster with the selected nodes. This release of Foundation introduces:

  1. Intel’s Broadwell support
  2. Nutanix Xpress Family support
  3. Error reporting improvements
  4. Automated Consistent CVM size

 

 

Distributed Storage Fabric (DSF)

Acropolis File Services (Performance Optimization)

Acropolis File Services (or AFS) is an integral and native component of DSF, removing the need for Windows file server VMs or external NAS arrays, such as NetApp or EMC Isilon arrays. In AOS 4.7, AFS received optimizations to support more than 10 million files/directories per node in a file server and to improve latencies for metadata access of NTACL and file attributes. Nutanix AFS in a small 3-node cluster configuration is now able to scale up to 45 million files/directories for the file server. (AFS is still in Tech Preview)

 

Acropolis File Services (TRIM Support)

TRIM is a mechanism for the filesystem to let the underlying storage layer know which blocks are free. AFS runs on top of NDFS, which takes care of storage allocation. TRIM takes care of sending free-block information to Stargate so that it has the most up-to-date information on all free and used storage. (AFS is still in Tech Preview)
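The effect of TRIM can be shown with a toy model. This is only a sketch of the general idea, not of AFS or Stargate internals: the `StorageLayer` and `FileSystem` classes and their methods are hypothetical names invented for the illustration.

```python
class StorageLayer:
    """Stands in for the layer that tracks which blocks are allocated."""

    def __init__(self):
        self.allocated = set()

    def write(self, block):
        self.allocated.add(block)

    def trim(self, blocks):
        # The filesystem tells us these blocks no longer hold live data.
        self.allocated.difference_update(blocks)


class FileSystem:
    """Toy filesystem that maps file names to the blocks they occupy."""

    def __init__(self, storage):
        self.storage = storage
        self.files = {}

    def write_file(self, name, blocks):
        self.files[name] = set(blocks)
        for block in blocks:
            self.storage.write(block)

    def delete_file(self, name, send_trim=True):
        blocks = self.files.pop(name)
        if send_trim:
            self.storage.trim(blocks)
```

Without the `trim` call, the storage layer would keep counting the blocks of deleted files as used, because a delete in the filesystem is otherwise invisible to the layer below it.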

 


 

 

Acropolis Block Services (aka ABS)

Acropolis Block Services provides highly available, scalable and high-performance iSCSI block storage to guests. ABS builds on top of the Acropolis Volume Groups service that has been available since the AOS 4.5 release. Volume Groups provide block storage and are particularly important for enterprise applications that do not support NFS datastores, or applications that require block storage “shared” across deployment instances. The use cases include: Microsoft Exchange on ESXi, Windows 2008 Guest Clustering, Microsoft SQL 2008 Clustering and Oracle RAC.

With Volume Groups in AOS 4.5 and 4.6, guests could only be VMs, but with ABS in 4.7 guests can now be physical servers too. When ABS is enabled, virtual disks are visible as iSCSI LUNs to physical and virtual guests.

ABS effectively enables bare-metal workloads to use Nutanix as their storage solution, while still letting Nutanix be used for hyper-converged workloads. ABS also makes it easier for organizations to embrace hyper-convergence when the existing server infrastructure doesn’t need to be refreshed: it works as a transitional bridge to hyper-convergence.

 

ABS characteristics:

  • Highly available with seamless reads and writes on node failures
  • Exports world-wide unique iSCSI target names
  • Target names persist across restarts, upgrades, snapshots and clones
  • Supports targets shared across multiple guests
  • Scales load even for a single volume group
  • Supports affinity to a CVM node for a target
  • Supports seamless upgrade for existing iSCSI MPIO clusters

 

In AOS 4.7, ABS implements load balancing using a hash function to spread iSCSI targets across cluster nodes; in the Asterix release, ABS will also provide load balancing based on each node’s resource availability. Load balancing occurs during iSCSI session establishment, i.e. during iSCSI login from initiator to target. The iSCSI master redirects the login to the appropriate CVM. The iSCSI client can also set a preferred node for load balancing, and the iSCSI login is redirected to that node as long as it is healthy.
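The login redirection logic described above can be sketched roughly as follows. The actual hash function and node selection used by ABS are not documented in this post, so SHA-256 with a simple modulo is an assumption, and `pick_cvm` is a hypothetical helper.

```python
import hashlib


def pick_cvm(target_name, cvms, healthy, preferred=None):
    """Return the CVM that should serve an iSCSI login for target_name.

    cvms:      ordered list of CVM identifiers in the cluster
    healthy:   set of CVMs currently considered healthy
    preferred: optional CVM the client asked for
    """
    # A client preference wins as long as the preferred node is healthy.
    if preferred is not None and preferred in healthy:
        return preferred
    candidates = [c for c in cvms if c in healthy]
    if not candidates:
        raise RuntimeError("no healthy CVM available")
    # Hash the target name so the same target lands on the same CVM
    # while different targets spread across the cluster.
    digest = hashlib.sha256(target_name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(candidates)
    return candidates[index]
```

Because the decision is made at login time, the initiator only ever needs to know one address: it logs in, gets redirected, and from then on talks to the CVM chosen for that target. A plain modulo remaps many targets when the set of healthy nodes changes; a production design might prefer consistent hashing for that reason.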

 

 

 

 

 

AHV Dynamic Scheduling (Holistic DRS)

Sys admins are very familiar with the DRS concept. DRS balances computing workloads with available resources in a virtualized environment, and DRS is part of most virtualization stacks nowadays. DRS is closely associated with capacity planning, except that capacity planning takes a longer-horizon viewpoint, while DRS optimizations are done within a constrained capacity environment, for a much shorter time window.

AHV Dynamic Scheduling is not much different from existing DRS implementations at first sight, but unlike legacy solutions Nutanix factors in compute, memory and storage for initial VM placement as well as for optimal placement after the VM is created. Moving forward, AHV DRS may also provide pre- and post-sizing recommendations for applications and VMs using the same holistic view of the stack.
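As a rough illustration of why storage belongs in the placement decision, here is a small sketch. The scoring rule (pick the host whose tightest resource has the most headroom) and the `Host`, `VM` and `place` names are assumptions made for the example, not the AHV scheduler's actual algorithm.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    cpu_total_ghz: float
    cpu_used_ghz: float
    mem_total_gb: float
    mem_used_gb: float
    storage_util: float  # 0.0 (idle) to 1.0 (saturated) storage controller load


@dataclass
class VM:
    name: str
    cpu_ghz: float
    mem_gb: float


def place(vm, hosts):
    """Pick the host whose tightest resource has the most headroom, or None."""
    best, best_score = None, -1.0
    for host in hosts:
        cpu_headroom = 1.0 - (host.cpu_used_ghz + vm.cpu_ghz) / host.cpu_total_ghz
        mem_headroom = 1.0 - (host.mem_used_gb + vm.mem_gb) / host.mem_total_gb
        storage_headroom = 1.0 - host.storage_util
        score = min(cpu_headroom, mem_headroom, storage_headroom)
        if score < 0:
            continue  # the VM does not fit on this host
        if score > best_score:
            best, best_score = host, score
    return best
```

A scheduler that looked only at CPU and memory would happily place the VM on a host whose storage path is already saturated; taking the minimum across all three dimensions avoids that.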

 

 

 

 

Application Mobility Fabric (AMF)

Acropolis Container Services

The cloud movement led by AWS enabled on-demand, automated infrastructure and propelled agile development methods. New application architectures began to emerge, such as Micro-services and Unikernels, and the distinction between operators and developers started to disappear, creating a DevOps culture.

Then Docker came in and established separation of concerns. Containers had existed for a long time, but Docker brought in the notion of application containers, wherein the entire application environment (including the developer code, dependencies, packages, etc.) can be captured in an immutable object and shipped around as a container. This is the core innovation of Docker, and it brings two big benefits:

 

  1. The development pipeline can be streamlined and automated much more easily because the drift between environments (Dev, QA, Staging, Pre-Prod) is tightly controlled.
  2. Mobility of applications between different environments (even different clouds) is a lot easier because of the self-contained nature of Dockerized applications.

 

Containers are ephemeral, so any local files written inside the container will not exist outside its lifetime. However, many applications require the ability to persist user session activity, making some aspects of the application stateful. A brief look at the official Docker repository tells us that the apps with the highest interest are the ones with containerized storage requirements.

 


 

 

 

 

Current persistent storage offerings for containers are very complex to set up and use, so many organizations continue to use VMs to maintain persistent state. Enterprises want persistent storage for containers, and they also want to use the same infrastructure to manage Dockerized and traditional workloads.

Nutanix Native Container Management enables customers to seamlessly include Continuous Integration/Delivery, Containers-as-a-Service, Micro-services, Hybrid Cloud and Multi-Cloud solutions as part of the delivery infrastructure.

Most importantly, when using Containers with persistent volumes on Nutanix, organizations can still leverage all the enterprise storage capabilities, such as snapshots, cloning, replication, etc.

How does it work?

Docker Machine Driver – Docker Machine is the provisioning tool that lets you create Docker hosts on your computer, on cloud providers, or inside your own datacenter. It creates servers, installs the Docker Engine on them and then configures the Docker client to talk to them. The Docker Machine support enables end users (both administrators and developers) to provision the Docker Engine on AHV VMs, and to manage the VMs with docker-machine CLI commands.

Docker Volume Plugin – The Docker Engine Volume Plugins enable Docker Engine deployments to be integrated with external storage systems, and enable data volumes to persist beyond the lifetime of a single Engine host.

Docker Datacenter (UCP) is Docker’s management solution for Containers, and with a native volume plugin users can select the driver and utilize Nutanix DSF for Persistent Storage. In AOS 4.7 Container management is only available via CLI, but in ‘Asterix’ it will be completely integrated into Prism.
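For readers who have not used a volume plugin before, the sketch below builds the two Docker CLI invocations involved: creating a named volume backed by a plugin driver, and starting a container that mounts it. The driver name `nutanix` is a placeholder assumption (this post does not give the real plugin name), and the helper functions are hypothetical; the commands are only printed, not executed.

```python
import shlex

# Placeholder plugin name; the real driver name is not given in this post.
VOLUME_DRIVER = "nutanix"


def volume_create_cmd(volume, driver=VOLUME_DRIVER):
    """Command that creates a named volume backed by the given plugin driver."""
    return ["docker", "volume", "create", "--driver", driver, volume]


def run_with_volume_cmd(image, volume, mount_path):
    """Command that starts a detached container with the volume mounted."""
    return ["docker", "run", "-d", "-v", f"{volume}:{mount_path}", image]


if __name__ == "__main__":
    commands = (
        volume_create_cmd("pgdata"),
        run_with_volume_cmd("postgres", "pgdata", "/var/lib/postgresql/data"),
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

The point is that the container only refers to the volume by name. Where the data actually lives is decided by the driver, which is what lets the volume outlive both the container and the Docker host it ran on.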


 

 

CPS Standard on Nutanix (Project La Jolla)

Hyper-V users need to use Windows Azure Pack on Nutanix. Windows Azure Pack is a collection of Microsoft Azure technologies available to Microsoft customers. It integrates with Windows Server, System Center, and SQL Server to offer a self-service portal and cloud services such as virtual machine hosting (IaaS), database as a service (DBaaS), scalable web app hosting (PaaS), and more.

Currently the process to instantiate Windows Azure Pack is manual, cumbersome and highly error-prone. In true Nutanix fashion, the adoption of any hypervisor or cloud should have a 1-click equivalent option to deploy and configure the solution seamlessly.

We are very happy to work with our friends at Microsoft! CPS Standard is a partnership project with Microsoft that eases Windows Azure Pack deployment by providing a simple turnkey solution. Nutanix is shipped with Hyper-V from the factory, with the CPS Standard bits preloaded onto the system and natively integrated into Prism.

Here is the official press-release (Nutanix Expands Relationship with Microsoft through Jointly Engineered Hybrid Cloud Solution)

We made it as seamless as possible so that Azure Pack can be deployed in a matter of hours. CPS Standard will provide customers with the same Microsoft Azure-like experience for their private cloud (namely the Azure admin and tenant portals), and they can also choose to onboard to the public Azure portal to use features like operational insights and site recovery.

 


 

 

 

 

 

 

 

 

 

Prism

Network Visualization in Prism

As you know, if the network is misconfigured, an application’s VMs can stop working or their performance can be severely degraded. For example, VLAN misconfiguration can result in VMs of the same application not being able to talk to each other. And network configuration mismatches, such as MTU mismatch or link speed mismatch can cause severe performance degradation due to excessive packet drops.

What makes troubleshooting network problems hard is that a misconfiguration in any of the switches along a network path can cause a problem, and to troubleshoot that administrators need to have a global view of the network configuration.

This is exactly what Network Visualization is intended to solve.

It provides a view of the entire network: from each individual VM to the virtual switches, to the host physical NICs, to the TOR switches and so on. It displays the config of the network elements, such as VLAN configuration, in an intuitive and easy-to-use interface. And it allows system administrators to easily navigate the network, for example to group information by user, project, or host.

Nutanix uses LLDP to discover and validate the network topology, and also provides support for ESXi’s DVS and the latest versions of the standard vSwitch. SNMP is used to discover configuration information from switches. For example, the VLAN info for each port is gathered using SNMP, along with the network stats.

Once the topology is retrieved along with config and stats info from the virtual and physical network elements, Nutanix presents the information in an easy-to-use interface.
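Once the topology and per-element configuration are in hand, spotting the misconfigurations mentioned earlier becomes a walk along the path. Below is a minimal sketch of such a check. The `Hop` record and `check_path` function are hypothetical and are not part of Prism; the data is assumed to have been collected already (for example via LLDP and SNMP).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hop:
    name: str                         # e.g. "vm-nic", "vswitch0", "tor-1:port7"
    mtu: int
    vlans: frozenset                  # VLAN IDs allowed on this hop
    speed_mbps: Optional[int] = None  # None for virtual hops with no link speed


def check_path(hops, vlan):
    """Return a list of human-readable problems found along the path."""
    problems = []
    # The VLAN must be configured on every element along the path.
    for hop in hops:
        if vlan not in hop.vlans:
            problems.append(f"{hop.name}: VLAN {vlan} not configured")
    # MTU and link speed must agree between neighbouring elements.
    for a, b in zip(hops, hops[1:]):
        if a.mtu != b.mtu:
            problems.append(f"{a.name} -> {b.name}: MTU mismatch ({a.mtu} vs {b.mtu})")
        if a.speed_mbps and b.speed_mbps and a.speed_mbps != b.speed_mbps:
            problems.append(
                f"{a.name} -> {b.name}: speed mismatch "
                f"({a.speed_mbps} vs {b.speed_mbps} Mbps)"
            )
    return problems
```

The check itself is trivial; the hard part, and the reason a global view matters, is having every hop's configuration in one place to compare.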


 

 

 

 

 

 

 

 

What-If Analysis

What-if analysis is a way to specify new future workloads along with the time at which they’ll come in. You can specify it in terms of existing VMs, for example as 10 additional instances of an existing VM. Or you can specify it as a percentage change of the existing workload; this supports both expanding and shrinking existing workloads. Finally, you can specify it as one of the pre-defined common workloads.

For example, you can specify your workload as a business-critical, medium-sized OLTP SQL Server workload, and the what-if tool will estimate the workload size. The what-if analysis tool will produce accurate sizing estimates because it is integrated with the Nutanix sizer, which is what we use for initial deployment recommendations. The what-if analysis tool will support several pre-defined workloads, such as SQL Server, VDI, Splunk, and XenApp.

AOS 4.6 already provides the runway component view, and it uses capacity planning algorithms to predict the runway for the different resources and the overall runway for the cluster.  Based on that, the what-if analysis can give administrators a recommendation as to the nodes that have to be added along with the dates on which they should be added, so that the runway extends all the way to the target runway.
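The relationship between runway, new workloads and node recommendations can be illustrated with a deliberately simple model. It assumes linear growth of a single resource and identical nodes, which is far cruder than real capacity planning algorithms; every function name and number here is an assumption made for the example.

```python
def usage_on_day(day, base_usage, daily_growth, workloads):
    """Projected usage on a given day; workloads is a list of (start_day, size)."""
    extra = sum(size for start, size in workloads if start <= day)
    return base_usage + daily_growth * day + extra


def runway_days(capacity, base_usage, daily_growth, workloads, horizon=3650):
    """First day on which usage exceeds capacity, or horizon if it never does."""
    for day in range(horizon + 1):
        if usage_on_day(day, base_usage, daily_growth, workloads) > capacity:
            return day
    return horizon


def recommend_nodes(capacity, node_capacity, base_usage, daily_growth,
                    workloads, target_days):
    """Days on which one node must be added so capacity lasts target_days."""
    if node_capacity <= 0:
        raise ValueError("node_capacity must be positive")
    plan = []
    for day in range(target_days + 1):
        # Add nodes on the first day the projected usage no longer fits.
        while usage_on_day(day, base_usage, daily_growth, workloads) > capacity:
            capacity += node_capacity
            plan.append(day)
    return plan
```

With a capacity of 100 units, 60 in use, growth of 1 unit per day and a 20-unit workload arriving on day 10, the runway drops from 41 days to 21, and reaching a 60-day target means adding a 25-unit node on day 21 and another on day 46.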

The last component of the what-if analysis is the pane to add and delete nodes. As you add or delete nodes, the target runway is automatically updated.

Once you add the workloads and the HW, and the system gives its recommendations, whatever is shown in the what-if UI can be used as the starting plan that can be tweaked and tuned. For example, you can adjust the start dates for the various hardware recommendations based on your budget constraints to see what the runway looks like and similarly adjust workload start times; maybe some lower priority workload can come later. You can tune it until you get to the workload and HW plan that is optimal for you.


 

 

 

Entity Browser Improvements (Custom Focus, Page Size, Shift Select)

 

 

 

 

 

 

 

 

 

Nutanix self-service

AOS 4.6 introduced full OpenStack support and integration for the AHV hypervisor, offering drivers for Nova, Cinder, Glance and Neutron. While OpenStack has big market adoption and works flawlessly with Nutanix, OpenStack is not a native Nutanix solution and cannot leverage many of the advanced Nutanix capabilities, because OpenStack was built to work with all types of underlying infrastructure.

Nutanix self-service is integrated into Prism Pro and provides access to IT resources, policies and secure tenant-based access. The self-service portal enables tenants to deploy applications (virtual machines, containers, private cloud, hybrid cloud) without IT intervention, enabling organizations to provide developers or tenants with an AWS-like experience.

 

Admin Portal

  • Create/Manage Projects
  • Create/Add users and groups
  • Assign Resources
  • Assign Actions
  • Run Show-back reports

Tenant Portal

  • Deploy Applications from a Catalog (VM Template, vDisk, Images from Docker Hub, App Templates)
  • Monitor Applications
  • Monitor Resource Usage

 

 

 

Other features Included in AOS 4.7:

  • Pulse Reminder
  • AHV affinity and anti-affinity rules
  • 1-Click LSI HBA Upgrade
  • Entity Browser Improvements (Custom Focus, Page Size, Shift Select)

 

That’s a long list of features that we are baking into the next two releases of AOS, but that’s not all. At .NEXT Europe, in November, we will make more announcements related to the ‘Asterix’ release that will blow your mind. I promise you! Stay tuned!

 

This article was first published by Andre Leibovici (@andreleibovici) at myvirtualcloud.net
