Aug 25 2014

Extend Visibility with the Stratusphere APIs


I am a big fan of programmatic interfaces and have written extensively about them while working for Nutanix, EMC and VMware. Programmatic interfaces allow administrators to create custom automation workflows using scripting languages or workflow engines like vCenter Orchestrator, vCAC, Puppet, Chef, BMC and others. This is the ultimate goal for a true Software-Defined Datacenter: a powerful application ecosystem that makes automated decisions about the use of infrastructure resources in a simple and efficient manner.

When I provided Liquidware Labs with space for a sponsored article, I could not have been happier to learn that they are now fully embracing programmatic interfaces within their platform.



Sponsored Article

Byline: Kevin Cooke, Product Director, Stratusphere Solutions for Liquidware Labs

Multi-platform and heterogeneous virtual environments can be tricky to manage. Responsibility is often divided, and the silos are very much interdependent. For example, different IT groups may own the delivered application or service, the virtualization layer, and the supporting hardware; and lastly there is the operations group, accountable for meeting SLAs and keeping things running smoothly. Knowing where constraints may be present, or staying ahead of the curve, can be difficult.

Consider desktop virtualization for a moment. You have the hosts and storage infrastructure, which may be managed by your server and storage team. There’s the hypervisor and allocation of virtual resources, possibly managed by a dedicated virtualization team. Let’s not forget the network team, who provide the pipes, supporting directory services and remote connectivity. Out on the fringe are the folks that manage data center operations, and of course, there are the magicians who provide the desktop VMs, application delivery and image management. And we certainly cannot forget the business units and end users, who expect nothing less than user experience nirvana.

Complicating matters somewhat, each of the above groups has likely adopted its own tools and workflow to best support its area of purview. And while the approach itself is not problematic, it does open cracks between the groups that can pose problems when users complain about poor experience.

“Is it the image?” “The hardware?” “The network … who knows?”

“Storage is showing some performance challenges, but is the issue really due to the data store or is it a lack of IOPs?”

Pressure is mounting. Management is requiring a status report, the storage team is under fire and no one has thought to look at the rogue application processes wreaking havoc on vCPU queueing and overflowing vRAM paging.

Of course, having everyone use the same monitoring solution can help to avoid some of the challenges in this scenario, but the reality is most of the discrete IT teams in the thick of this mess already have an existing tool and workflow to support their troubleshooting approach. So how do you preserve these existing methodologies and approaches, while providing information to the most relevant IT teams in a way that supports their approach and choice of tool?

Enter the Stratusphere API and Shared Visibility

Stratusphere UX from Liquidware Labs is a solution that supports monitoring, performance validation and diagnosis of the complexities highlighted in the very common real-world example outlined above. The solution—which includes the Stratusphere Hub, Database, Network Station and Connector ID Key—gathers extremely detailed metrics and information about all aspects of the virtual desktop user experience.

In an environment of 1,000 or so desktops, Stratusphere UX will gather a few million data points per hour. Details about the machines, users, applications, network (pipes and services), as well as the contributing infrastructure are captured and stored in the Stratusphere Database. This information can be correlated to constraining events and tied to a composite metric that quantifies the user experience. Better still, this information is exposed and easily accessible through the Stratusphere application programming interface (API).

In its most basic use, the Stratusphere API can generate ad-hoc reporting and support the creation of custom-defined tables that can be exported in HTML, CSV and native Microsoft Excel formats. But the real silo-busting power of the Stratusphere API lies in its ability to export information in JavaScript Object Notation (JSON), an open-standard text format that is human-readable and easily transmitted between server and web-based applications.
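To give a concrete feel for working with a JSON export, here is a minimal Python sketch that filters API-style output for constrained desktops. The payload shape, field names and threshold below are illustrative assumptions, not the actual Stratusphere schema; consult the Stratusphere API documentation for the real endpoints and fields.

```python
import json

# Hypothetical JSON payload shaped like a Stratusphere API export; the real
# field names and endpoint are defined in the Stratusphere API documentation.
payload = """[
  {"machine": "vdi-042", "user": "jdoe", "cpu_queue": 3.1, "ux_score": 0.62},
  {"machine": "vdi-117", "user": "asmith", "cpu_queue": 0.4, "ux_score": 0.97}
]"""

records = json.loads(payload)

# Forward only constrained desktops (low composite UX score) to another tool,
# e.g. as the body of a helpdesk ticket.
constrained = [r for r in records if r["ux_score"] < 0.8]
for r in constrained:
    print(f"{r['machine']} ({r['user']}): UX score {r['ux_score']}")
```

Because the output is plain JSON, the same few lines could just as easily feed a ticketing system or another monitoring console instead of printing to the screen.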

In the VDI downtime event above—and with the visibility provided by the Stratusphere API—relevant metrics and information about performance, user experience and the state of other areas of the architecture can flow from Stratusphere UX to the native tools each IT group has previously chosen to gain visibility into its corner of the overall architecture. For example, the JSON-formatted metrics and information can be sent to HP Operations Manager to assist the server, storage and networking groups, while BMC Remedy receives a helpdesk ticket alerting operations to the groups, users and applications affected.

The power of this approach lies in preserving workflow. IT groups can work within their existing monitoring tools and leverage proven methods to troubleshoot and determine how their component of the architecture may be contributing to the poor performance. Further, the Stratusphere API greatly facilitates inter-group trouble ticketing, as a defined composite metric like the VDI User Experience can be leveraged to baseline and correlate activities across functional IT groups.

Stratusphere UX was designed to provide visibility into complex multi-platform and heterogeneous virtual environments. Its user-centric approach to monitoring, performance validation and diagnostics takes much of the complexity out of next-generation desktop workspaces. And with the Stratusphere API, IT groups are able to support these virtual environments in a way that meets business goals, minimizes risk and supports both the organizational and IT changes ahead.


This article was first published by Andre Leibovici (@andreleibovici) at


Aug 25 2014

Not Voodoo, just genuine Datacenter Automation, delivered Today, by Nutanix GSO

I have been extensively discussing programmatic interfaces and their importance in automating the modern datacenter. Programmatic interfaces allow IT administrators to create workflows using scripting languages and workflow engines like vCenter Orchestrator, Puppet, VMware vCloud Automation Center and others.

The final goal for a true Software-Defined Datacenter is to empower applications and tools to make automated decisions about the use of infrastructure resources in a simple and efficient manner.

Software-Defined Datacenter is not an all-in-one solution, as some vendors claim. Furthermore, integrating multiple stacks and systems that were not built to work together is not an easy task. Lastly, integrating the datacenter stack still requires configurable automation for ongoing operational upkeep and efficiency.

Since its inception, Nutanix engineering has ensured that all features and functions are exposed via programmatic interfaces, enabling access to datacenter services within the platform.

As part of the extensible automation that enables application ecosystems, I recently demonstrated the Nutanix Plugin for XenDesktop (Demo Here), a tool that extends Nutanix functionality into third-party products, in this case Citrix XenDesktop. Nutanix customers are already using it in beta, and I will soon disclose other powerful integrations with different applications.

This time around I would like to share with you a powerful integration and automation delivered by Nutanix GSO for automated provisioning of an Infrastructure-as-a-Service solution. Utilizing the extensible Nutanix APIs, Nutanix Foundation (Demo Here), VMware vCloud Automation Center and vCenter Orchestrator, Nutanix has completely automated datacenter provisioning for organizations of any size.


Watch the video below to better understand how this works.



  • Automated Provisioning

Everything starts with the vCAC reference architecture for Nutanix, which includes configurations with pre-determined scenarios and solutions to provide Infrastructure-as-a-Service for the mainstream enterprise. With workflows pre-configured, both vCAC and vCO are used to provision datacenter nodes, blocks and racks using Nutanix Foundation Plus, an extension of the Nutanix Foundation tool. In a common scenario, vCAC issues service orders to vCO, which in turn manipulates the Foundation Plus APIs.

After racking and cabling your Nutanix block, administrators are only a few clicks away from effectively implementing the Nutanix hardware and software as usable organizational resources. These processes can even be kicked off automatically when the system detects high cluster resource utilization.
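As a rough illustration of the kind of service order vCO might translate into a Foundation Plus API call, here is a small Python sketch. Every field name in the payload is a hypothetical placeholder for illustration, not the actual Foundation Plus schema.

```python
import json

# Sketch of the kind of payload a vCO workflow might hand to the Foundation
# Plus API when provisioning a new block. All field names are illustrative
# assumptions, not the actual Foundation Plus schema.
def build_provision_request(block_serial, hypervisor, node_count, cluster_name):
    return {
        "block_serial": block_serial,
        "hypervisor": hypervisor,          # e.g. "esxi"
        "nodes": node_count,
        "cluster": {
            "name": cluster_name,
            "redundancy_factor": 2,        # assumed default
        },
    }

req = build_provision_request("14SM12345678", "esxi", 4, "prod-cluster-01")
print(json.dumps(req, indent=2))
```

In the workflow described above, vCAC would gather these values from the service catalog request and vCO would serialize and submit them, so administrators never touch the payload directly.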


  • Backup as a Service (BaaS)

Another service fully implemented by Nutanix GSO is BaaS. Nutanix BaaS includes backup, snapshot and recovery as a service, while remaining fully configurable by end users. Leveraging the Nutanix APIs, end users are able to select the backup SLA desired for a given workload, either pre- or post-provisioning. There will also be a method to leverage re-hydration/application of a snapshot.
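A minimal sketch of how an SLA picker could map a user's choice onto a snapshot schedule before handing it to the Nutanix APIs. The tier names, intervals and payload fields below are illustrative assumptions, not the actual API format.

```python
# Map user-selected backup SLA tiers to snapshot schedules. The tier names
# and the payload shape are illustrative assumptions; the actual protection
# policy format is defined by the Nutanix REST API documentation.
SLA_TIERS = {
    "gold":   {"interval_minutes": 15,   "retention": 96},
    "silver": {"interval_minutes": 60,   "retention": 48},
    "bronze": {"interval_minutes": 1440, "retention": 14},
}

def schedule_for(workload, tier):
    """Build a schedule payload for the given workload and SLA tier."""
    policy = SLA_TIERS[tier]
    return {
        "protection_domain": f"pd-{workload}",   # hypothetical naming scheme
        "snapshot_interval_minutes": policy["interval_minutes"],
        "snapshots_to_retain": policy["retention"],
    }

print(schedule_for("exchange-01", "gold"))
```

The point of the tier table is that end users pick a business-level SLA ("gold") rather than raw snapshot parameters, which the service translates on their behalf.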

In addition to the current services mentioned above, this automation takes Nutanix to a whole different level, where individual workloads or vDisks can be manipulated independently and coordinated in terms of backups, restores, compression, de-duplication, replication, quality of service, snapshots and so on.


These are offerings available today and delivered via Nutanix GSO. To find out more or see a more detailed demo come to the Nutanix booth and ask for the GSO team, or contact Nutanix Sales directly.


Genuine Datacenter Automation, delivered today, by Nutanix Global Services (GSO) – ensuring customer success by turning days of manual provisioning into predictable, rapid provisioning of web-scale infrastructure within minutes.


I would like to acknowledge the work of the fantastic Nutanix GSO team in pulling all this automation together for Nutanix customers and partners. These are the smart guys behind it all, and I recommend you follow them:

  • Phil Ditzel: @philditzel
  • Artur Krzywdzinski: @artur_ka
  • Magnus Andersson: @magander3
  • Also, special acknowledgements to Bill Hussain @bhussain, Jeremy Launier and Subash Atreya



Aug 19 2014

Nutanix 4.1 Features Overview (Beyond Marketing) – Part 1

Today Nutanix announced the first part of a multi-month announcement for NOS 4.1. This release is mostly focused on enhancements in the areas of resiliency, security, disaster recovery, analytics, supportability and management. However, even being a ‘dot’ release, NOS 4.1 delivers very important features, and in my opinion this version has enough meat to be considered a major release.

It’s been only 5 months since NOS 4.0 was announced with the introduction of Hybrid On-Disk De-Duplication, Failure Domain Awareness, Disaster Recovery for Hyper-V, Snapshot Scheduling, One-Click NOS Upgrade and others. If you missed the NOS 4.0 release announcement, read about it at Nutanix 4.0 Features Overview (Beyond Marketing).

This is the power of software-defined architectures running on standard x86 hardware, with no special-purpose machine doing one thing and one thing only. The software approach allows for faster release cycles, and whenever there are hardware improvements you get to enjoy the performance benefits right away.

Please refer to the product Release Notes for official information from Nutanix.

Let’s start with the New Features…


Data Protection

  • Cloud Connect

Nutanix allows administrators to implement and manage VM centric Disaster Recovery policies across multiple sites and datacenters using a multi-topology architecture.

Whilst the Nutanix built-in DR capabilities allow administrators to specify snapshot retention policies, this approach can be expensive even when data de-duplication and compression are enabled across the cluster. There must be enough storage capacity available to retain the data and to handle multiple backups and snapshots over a long period of time.

Nutanix has now officially announced the ability to leverage the global, distributed and highly available infrastructure of Amazon Web Services for data backup and restore. That means an on-premises Nutanix cluster is now able to back up and restore virtual machines to AWS, with organizations billed directly by AWS for EC2 and S3 costs.


With Cloud Connect, organizations are able to schedule, manage and use local and remote snapshots and replication for backup and disaster recovery from within the Nutanix PRISM user interface.

  • Local snapshots with Time Stream
  • Backup to Amazon S3
  • Integration with VSS and SRM
  • Quick restore and state recovery
  • WAN-optimized replication for DR
  • Works with ESXi and Hyper-V
  • 15 minute Cloud RPO

Administrators also have the ability to control and fine-tune the backup schedule and retention policies to meet the needs of the workload. The backup schedule can be as frequent as every 15 minutes or as infrequent as once a day, enabling you to choose the frequency that best meets your infrastructure capacity and SLAs.

A Nutanix NOS instance (software-only) runs on AWS, and each AWS NOS instance is managed as a remote site in Nutanix – it’s all integrated and managed via a single pane of glass. In AWS, the remote site requires an m1.xlarge instance, and the AWS NOS instance may be created in an availability zone of your choice, increasing reliability and resiliency as needed by your organization. You can have as many AWS NOS instances as needed.

The NOS cluster services run on this AWS virtual instance, using Amazon EBS for metadata and S3 to store the backup data. All communication is done via VPC or SSH tunnels, with optimized data-transfer de-duplication for both backup and restore operations. Data is de-duplicated and compressed before it is backed up to the public cloud, with net savings of 75% in both network bandwidth usage and storage footprint. It is also possible to throttle the network bandwidth consumed by backup to ensure that application performance is not affected in any way. In addition, data transfer throughput with SSH is approximately 25% lower than with VPC, which is therefore the recommended method.
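As a back-of-the-envelope illustration of what the quoted 75% savings means for a backup window, the sketch below estimates the daily transfer for a hypothetical workload. The data size and daily change rate are assumptions for illustration only.

```python
# Back-of-the-envelope estimate of Cloud Connect transfer volume, using the
# 75% combined de-duplication/compression savings quoted above. The dataset
# size and daily change rate are illustrative assumptions.
vm_data_gb = 500          # logical size of the protected data (assumed)
daily_change_rate = 0.05  # 5% of the data changes per day (assumed)
savings = 0.75            # de-dup + compression savings quoted above

daily_transfer_gb = vm_data_gb * daily_change_rate * (1 - savings)
print(f"Estimated daily transfer to AWS: {daily_transfer_gb:.2f} GB")
```

Under these assumptions, 25 GB of changed data shrinks to roughly a quarter of that on the wire, which is what makes a 15-minute RPO over a WAN link plausible.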

Just like any other cloud service, AWS is subject to failures and outages. For this reason the AWS NOS instance takes automatic periodic snapshots of its EBS volumes and stores them in S3. An AWS failure or unavailability will automatically raise an alert on the on-premises NOS cluster, maintaining that single pane of glass for managing all your Nutanix clusters and instances.

I will soon be working on a video demo to publish here, but in the meantime you can watch this video with Disaster Recovery – Failover and Failback with Nutanix.




  • NX-8150

The Nutanix scale-out solution already handles the large majority of workloads using existing NX models, but certain workloads with very large active datasets, or that are write-IO intensive, can benefit from additional performance. The NX-8150 is the new platform for those Tier 1 business-critical workloads, and it can be mixed into existing Nutanix clusters.

The NX-8150 is the first platform to fully leverage the Nutanix scalable write buffer, which improves performance and drives down latencies for the most demanding applications. In addition, the larger SSD tier for active data satisfies the latency and IOPS requirements for database workloads, without the excessive cost of an all-flash appliance.

The ability to mix different nodes types into a single cluster makes it practical and easier to eliminate infrastructure silos in datacenters.


The NX-8150 is targeted at Microsoft Exchange, SharePoint, SAP and high-performance databases such as MS SQL Server and Oracle RAC; it comes with 4 times the number of flash devices compared to previous Nutanix platforms to accommodate a much larger active data set. The NX-8150 is also the first Nutanix platform to offer flexible configuration of storage and server (compute) resources from the factory, including multiple CPU options, 3 different SSD configurations, a variety of memory profiles, and multiple connectivity options including the ability to expand to 4x 10GbE ports.


Another benefit of the NX-8150 platform’s flexible configuration is the ability to minimize the number of software licenses (applications and hypervisor) to be purchased. Think vSphere and Oracle.

Tests and validations will soon come out of the Nutanix performance and engineering labs. Here is an example of such validation for Microsoft Exchange workloads: using the new platform, in only a 2U footprint, Nutanix is able to manage 6x more mailboxes per node (3,300+ mailboxes per node, following Microsoft testing guidelines and practices) and can linearly scale the system to support hundreds of thousands of mailboxes simply by adding NX-8150 nodes.


Security and Compliance


  • Data At Rest Encryption (NX-3060-E, NX-3061-E, NX-6060-E)

Nutanix clusters are deployed in a variety of customer environments requiring different levels of security, including sensitive/classified environments. These customers typically harden IT products deployed in their datacenters based on very specific guidelines, and are mandated to procure products that have obtained industry standard certifications.

Data-at-rest encryption is one such key criteria that customers use to evaluate a product when procuring IT solutions to meet their project requirements.

Nutanix Data-at-Rest Encryption satisfies regulatory requirements for government agencies, banking, financial, healthcare and other G2000 enterprise customers who evaluate data-security products and solutions. This new feature allows Nutanix customers to encrypt all or selected partitions on persistent storage using a strong encryption algorithm, and allows access to this data (decryption) only when the correct credentials are presented.


  • Compliant with regulatory requirements for data at rest encryption
  • Leverages FIPS 140-2 Level-2 validated self-encrypting drives
  • Future-proof (uses open-standard protocols: KMIP, TCG)


To enable Nutanix DRE, a 3rd-party key management server is required. At the time of launch only ESXi is supported and only the SafeNet KeySecure Cryptographic Key Management System is certified, but over time other key management systems will be supported. Nutanix supports any KMIP 1.0-compliant key management system, but others have not yet been certified. The key management system can even be a VM running on the Nutanix cluster.

Currently it is not possible to mix DRE-enabled and non-DRE nodes in the same cluster, because the platform requires special FIPS 140-2 Level 2 SED drives to meet the data-at-rest encryption requirements. Breaking the homogeneity of the cluster would violate the data-at-rest encryption requirement for copies of data stored on non-SED drives. However, both DRE and non-DRE clusters can be managed via the same PRISM Central UI.




  • One-Click Hypervisor and Firmware Upgrade

I kinda soft-launched this feature in my article Nutanix One-Click Upgrade now takes care of Firmware and Hypervisor too! after Nutanix CEO Dheeraj Pandey revealed the new feature in a tweet. Love it!

In addition to the already non-disruptive NOS upgrade, the One-Click upgrade feature now ensures that the hypervisor (vSphere and Hyper-V) is also automatically upgraded in a distributed, rolling fashion. Nutanix is solving a huge problem that previously required either manual intervention or external automation tools.

In addition to the new hypervisor upgrade feature, Nutanix is now enabling NCC (Nutanix Cluster Health) upgrades and hardware firmware upgrades. Let’s talk about firmware upgrades.

Upgrading firmware on servers, controllers, disks and other interfaces is probably one of the major pain points for datacenter administrators. The workflow normally goes as follows: schedule a maintenance outage, go to the manufacturer’s website, download the correct firmware, flash the firmware via a web interface or with a Linux CLI tool, reboot the server, and then repeat this task for every server in the datacenter that requires a firmware upgrade. It’s complex and lengthy.

Because NOS runs in hypervisor user space, Nutanix is able to execute complex firmware and hardware upgrade processes across multiple server components without requiring reboots. Using Nutanix’s highly distributed parallel processing, the One-Click upgrade takes care of firmware upgrades for all disk devices in the cluster with zero impact on virtual machines and workloads. No reboots required, true AlwaysOn!

As a bonus, the framework that serves as the engine for Nutanix Cluster Health, also known as NCC, is also part of the One-Click upgrade process.




Stay tuned – Nutanix 4.1 Features Overview (Beyond Marketing) – Part 2 is coming soon!


