Jan 01 2015

My Top 12 Predictions for 2015 and a Few Shockingly Bad Predictions

Last week I published on Twitter some of my enterprise datacenter predictions for 2015. While predictions are easy to make, because no one is held accountable for them, I personally believe my top 12 will hold true for 2015. Here they are:



[Image: the twelve predictions, as originally tweeted]

And now for a few shockingly bad predictions from the past:

“I think there is a world market for maybe five computers.” — Thomas Watson, chairman of IBM, 1943.



“This ‘telephone’ has too many shortcomings to be seriously considered as a means of communication. The device is inherently of no value to us.” — Western Union internal memo, 1876



“The wireless music box has no imaginable commercial value. Who would pay for a message sent to nobody in particular?” — David Sarnoff’s associates in response to his urgings for investment in the radio in the 1920s.



“There is not the slightest indication that nuclear energy will ever be obtainable. It would mean that the atom would have to be shattered at will.” — Albert Einstein, 1932.



“There will never be a bigger plane built.” — A Boeing engineer, after the first flight of the 247, a twin-engine plane that held ten people.



“While theoretically and technically television may be feasible, commercially and financially it is an impossibility.” — Lee DeForest, inventor.



“We don’t like their sound, and guitar music is on the way out.” — Decca Recording Co. rejecting the Beatles, 1962.



“Where a calculator on the ENIAC is equipped with 18,000 vacuum tubes and weighs 30 tons, computers in the future may have only 1,000 vacuum tubes and weigh only 1.5 tons.” — Popular Mechanics, 1949



“Everyone’s always asking me when Apple will come out with a cell phone. My answer is, ‘Probably never.'”—David Pogue, The New York Times, 2006



“Two years from now, spam will be solved.” — Bill Gates, World Economic Forum, 2004



“Apple [is] a chaotic mess without a strategic vision and certainly no future.”—TIME, February 5, 1996


“Remote shopping, while entirely feasible, will flop.” — Time magazine, 1968.


This article was first published by Andre Leibovici (@andreleibovici) at myvirtualcloud.net.

Permanent link to this article: http://myvirtualcloud.net/?p=6834

Dec 19 2014

VMware AppVolumes and Nutanix Shadow Cloning Marriage

My colleague Kees Baggerman published a brilliant post on how Nutanix Shadow Cloning not only improves performance for Horizon Linked Clones and XenDesktop MCS, but also hugely improves performance for VMware AppVolumes. I recommend reading his article here.

In his single VM test Kees Baggerman saw an improvement of almost 360% for Adobe Reader through the use of Shadow Cloning and data locality.

I would like to shed some light on how important these performance improvements are in the context of distributed storage architectures.

The performance improvements seen by Kees were due to Nutanix being able to analyze the IO access pattern for the AppVolume vmdk and determine that the vmdk was serving only read requests. Based on that, Nutanix creates local copies, in RAM or SSD, of the vmdk for each server accessing the AppVolume stack.

If you look at the picture below, where a large cluster runs VDI with hundreds or thousands of virtual desktops, you will notice that all desktops access the same AppVolume vmdk (e.g. MS Office 2013), hosted and managed by node number 1.

In a distributed storage architecture the vmdk is distributed across servers (the methodology for how the vmdk is distributed varies according to the solution), but there is commonly a single server actively serving the vmdk. In this case server number 1 is actively serving the vmdk.

These operations can not only saturate the network with unnecessary traffic, but also overload the server actively serving the vmdk with requests, throttling resources.


[Figure: all virtual desktops in the cluster accessing the AppVolume vmdk served by node 1]


What kills network performance is not latency but congestion. Congestion comes in many forms: microbursts (for example, a 200 Mb burst arriving within 10 ms is the equivalent of 20 Gbps of traffic on a 10G port for those 10 ms, causing traffic to be dropped if the switches do not have enough buffer), or a misbehaving NIC sending PAUSE frames and slowing down the network.
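To make the microburst arithmetic concrete, here is a quick sketch using the numbers from the paragraph above:

```python
# A microburst: 200 Mb of data arriving within a 10 ms window.
burst_megabits = 200
window_seconds = 0.010

# Instantaneous rate the switch must absorb during the window:
# 0.2 Gb / 0.01 s = 20 Gbps.
rate_gbps = (burst_megabits / 1000) / window_seconds

# On a 10G port, half of that traffic has nowhere to go; it must be
# buffered by the switch or dropped: (20 - 10) Gbps * 0.01 s = 0.1 Gb.
port_gbps = 10
excess_gb = (rate_gbps - port_gbps) * window_seconds
```

Sustained average utilization can look low while these bursts are still overflowing switch buffers, which is why congestion, not latency, is the real killer.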

Caching is also used extensively by distributed storage systems. Caching can use SSD or RAM, with ingest and eviction policies driven by data access patterns and capacity allocation. Once data is evicted from cache it must once again be read from server number 1 over the network, when required. VDI environments commonly have very random IO patterns, causing data eviction to be very frequent, especially if no data de-duplication is available. Virtual desktop performance suffers when the active dataset can no longer fit in premium storage tiers, and that is how data de-duplication helps performance.
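The eviction behavior described above can be sketched with a toy LRU cache (illustrative only; real distributed storage caches are far more sophisticated). When the working set is larger than the cache, a random VDI-like access pattern keeps evicting blocks, and every miss becomes a network round-trip back to the owning node:

```python
import random
from collections import OrderedDict

class NodeCache:
    """Toy read cache with LRU eviction. When a block is evicted, the
    next access must fetch it again from the owning node over the network."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()
        self.remote_reads = 0

    def read(self, block_id):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)  # cache hit: refresh LRU order
            return
        self.remote_reads += 1                 # miss: fetch over the network
        self.blocks[block_id] = True
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)    # evict least recently used

# A random access pattern over a working set (500 blocks) five times
# larger than the cache (100 blocks): most reads miss and go remote.
random.seed(1)
cache = NodeCache(capacity=100)
for _ in range(10_000):
    cache.read(random.randrange(500))
```

With roughly 80% of the working set outside the cache, the vast majority of the 10,000 reads end up traversing the network, which is exactly the pressure de-duplication relieves by shrinking the active dataset.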

Shadow Clones are not subject to data eviction. This approach completely eliminates the need to re-read data over the network. The data is migrated only on read, so as not to flood the network and to allow for efficient cache utilization. If the AppVolume vmdk is modified, the Shadow Clones are dropped and the process starts over.
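The lifecycle just described, detecting remote read-only access, building local copies, and invalidating them on write, can be sketched conceptually. This is an illustrative model only, not the actual Nutanix implementation:

```python
class ShadowCloneVDisk:
    """Conceptual sketch of the Shadow Clone lifecycle: remote readers
    of a read-only vdisk get a local copy; any write drops all copies."""
    def __init__(self, owner_node):
        self.owner_node = owner_node
        self.shadow_copies = set()   # nodes holding a local read-only copy

    def read(self, node, block_id):
        if node != self.owner_node and node not in self.shadow_copies:
            # Remote reader detected on a read-only vdisk: cache the data
            # locally (RAM/SSD) so subsequent reads never touch the network.
            self.shadow_copies.add(node)
        return f"block-{block_id}"   # served from the local copy thereafter

    def write(self, node, block_id, data):
        # Any modification invalidates all shadow copies; the process
        # then starts over as readers touch the vdisk again.
        self.shadow_copies.clear()

vdisk = ShadowCloneVDisk(owner_node="node-1")
vdisk.read("node-2", 0)   # node 2 builds a local shadow copy
vdisk.read("node-3", 0)   # node 3 builds a local shadow copy
vdisk.write("node-2", 0, b"update")  # all shadow copies are dropped
```

The key property is that the local copies are never evicted under cache pressure; only a write to the source vmdk invalidates them.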

Hopefully you now understand how AppVolumes and Shadow Cloning together provide better performance and resource utilization. Now visualize a VDI environment without Shadow Cloning, with 1,000+ users, and the chaos of consistently serving a multitude of AppVolumes stacks across the network amid cache ingests, evictions and network congestion.


In a future article I will introduce you to AppVolumes stack replication to different sites and datacenters. Nutanix is one of the only solutions able to do vmdk-based backups and replication, instead of whole virtual machines.



Permanent link to this article: http://myvirtualcloud.net/?p=6825

Dec 18 2014

Nutanix Metro Availability Operations – Part 3

In the first part of this Nutanix Metro Availability series I discussed the datacenter failure recovery operation (failover) to a secondary site. In the second part I discussed the operational procedure to resume normal operations after a successful failover. In this third and last part I discuss the operational procedure to recover an entire datacenter to a new Nutanix cluster in a new site.

If you missed the announcement of NOS 4.1, please refer to All Nutanix 4.1 Features Overview in One Page (Beyond Marketing).


Datacenter Failure Recovery to a new Cluster – Operation

The example below follows a datacenter failure recovery (failover) as described in my first two articles.

I had two sites replicating distinct containers to each other (bi-directional). After a network or datacenter outage the Metro Availability peering was automatically broken to allow each surviving cluster to operate independently. However, in this case let's assume that Site 1 was completely lost due to flooding or another natural disaster.

In this case Site 2 had all the data for Site 1, and Site 1 is completely down. The administrator decides to move the entire workload belonging to Site 1 to Site 3. (Please note that the administrator may choose to temporarily run Site 1's workload in Site 2 until it's time to move to Site 3.)

Just like the other metro cluster operations, re-establishing operations in a brand new site is just a couple of clicks away. After racking, stacking and configuring the new cluster in Site 3, the administrator needs to establish a connection between Sites 2 and 3. This can easily be done via the Prism UI.

The first picture demonstrates the scenario described, where Site 2 is hosting the workload from Site 1 (blue) and its own workload (green). Now the data must be migrated to a completely new cluster in a new site (1).



[Figure: Site 2 running both the blue and green workloads (1)]

The next step is to enable replication of the blue container between sites (2).


[Figure: container replication enabled between Sites 2 and 3 (2)]

Once enabled, the replication will start to synchronize the data in the container (3) between both sites. Please note that Metro Availability replication works in conjunction with all NOS data management features, such as compression, de-duplication, Shadow Clones, automated tiering and others. Additionally, Metro Availability offers compression over the wire to reduce the amount of bandwidth required.


[Figure: container data synchronizing between both sites (3)]


Once the replication is complete, the next step is to promote the blue container (4) in Site 3. The container promotion tells the clusters that the virtual machines should now run on Site 3. When that is done the virtual machines will automatically restart on the new cluster (5) and operations will resume. The promotion step is a one-click manual procedure, but it can also be fully automated with some basic scripting or run-book automation tools. Please note that this scenario assumes stretched clusters and stretched VLANs are in use.
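As a rough illustration of what that run-book automation might look like, the sketch below chains the two steps together. The helper functions are hypothetical placeholders (in practice they would wrap Prism UI actions, REST calls or CLI commands; the names here are not real Nutanix APIs):

```python
def promote_container(cluster: str, container: str) -> None:
    """Hypothetical wrapper for the container promotion step (4);
    the real call would go through Prism or its API."""
    print(f"Promoting container '{container}' on {cluster}")

def restart_vms(cluster: str, container: str) -> None:
    """Hypothetical wrapper for step (5): power on the virtual machines
    backed by the newly promoted container on the new cluster."""
    print(f"Restarting VMs from '{container}' on {cluster}")

def failover(new_cluster: str, container: str) -> None:
    # Step 4: promote the replicated container on the new site.
    promote_container(new_cluster, container)
    # Step 5: VMs restart on the new cluster and operations resume.
    restart_vms(new_cluster, container)

failover("site-3-cluster", "blue")
```

The point is that the whole recovery collapses into a single idempotent routine that a run-book automation tool can trigger on demand.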


[Figure: the blue container promoted in Site 3 (4)]


[Figure: virtual machines restarted on the Site 3 cluster (5)]


I have been in the technology and infrastructure space for a long time and have managed very large datacenters. I have never seen a solution that allows efficient data migration with failover and failback operations in such a simple and elegant manner.

If you are interested in reading the first two parts of this series: Nutanix Metro Availability Operations – Part 1 and Nutanix Metro Availability Operations – Part 2.



Permanent link to this article: http://myvirtualcloud.net/?p=6816
