Jan 11 2015

The One with the GPU Graphics Performance

Those who follow my articles know that I have always been a big supporter of VDI as a method to simplify desktop administration, reduce costs, improve user experience, increase productivity, provide remote access and enable BYOD, and provide an effective disaster recovery solution for workspaces. Whilst VDI has delivered on these promises, there has previously been room for improvement in a few areas.

Brian Madden cleverly highlights this in his article Desktop Virtualization in 2015: What’s coming, what do we need? He makes the point that vendors have now delivered most of the features needed for VDI to be a successful replacement for physical desktops. In particular, one of the areas that needed the most attention was graphics performance.

These days we have better remoting protocols that support multimedia redirection and cope with lower bandwidth and higher latency. We also have full support for physical GPUs (Graphics Processing Units) installed in each hypervisor that can be used on a 1:1 basis or shared across multiple virtual desktops, increasing graphics performance and improving user experience.

There are industry innovations in other areas, like storage and image management, that are making administrators’ lives much easier, but for this article I would like to focus on graphics performance because there have been some public discussions and perhaps misunderstandings about the use of the technology.

Graphics performance is critical in VDI deployments, and I have written extensively in the past about VDI User Experience and User Acceptance and how they play a big part in successful deployments. Furthermore, it’s no coincidence that the most popular article on my blog for the past couple of years was How to improve VDI with Hardware Accelerated 3D Graphics. Administrators and users are both concerned about obtaining the best user experience, comparable to what the most recent laptops with advanced Graphics Processing Units deliver.

Citrix supports shared vGPU with XenServer today, and VMware will support it with the upcoming vSphere 6.0 release. Additionally, there are some interesting upcoming technologies that will enable the local GPU in our tablets and laptops to decode H.264 in silicon instead of software. Gunnar Berger does an excellent job explaining how this will work here http://youtu.be/OJlj-OMry-M.

For VDI deployments vGPU is a ‘must have’ when working with 3D imaging applications such as CAD, CATIA, MAYA, Lumion and others. vGPU may also be required for medical applications when visualizing three-dimensional images. VDI with vGPU is a fantastic replacement for dedicated physical graphics workstations. Even Microsoft Office and Internet Explorer now make use of vGPU if the card is available on the server running the virtual desktop.

Some say that it’s just a matter of time before GPUs are available in every device, and I agree with them. I believe GPUs will eventually be fully baked into server motherboards and CPUs, but that is just not happening yet.


However, I do not agree with the position that some are taking, saying that vGPU is a ‘must have’ for every single VDI deployment.


Now, before I move forward I need to clarify that everything I am writing here is my own opinion and does not necessarily represent my employer’s position on the subject; as a matter of fact, many of my colleagues think differently than me and I respect that. However, I have always been open about my thoughts and will guide my readers in the direction I believe is the correct one.

Market data shows that GPU penetration is approximately 1-3% of the entire VDI market, and that the VDI market is growing at approximately 10-15% CAGR. This data alone points me to the conclusion that ‘today’ VDI is being successfully deployed without vGPU. That also means that Software 3D Rendering is in use for the large majority of VDI deployments. Maybe GPU market penetration is so low because Shared GPU support has only been available for XenServer; however, in conversation with one of the GPU vendors I was told that the GPU market will peak at 10% of the entire VDI market. I personally think it will be bigger over time.


Screen Shot 2015-01-11 at 9.39.40 PM

Sr. Director, Product Management for Clients and Protocols


Many of the very large deployments I see happening today, with anything between 1K and 90K desktops, do not have major GPU requirements. They are actually being sized for a single vCPU and 1.5-2GB RAM on average, for better or worse!

I clearly see the user experience benefits that GPU provides, but I also see a large majority of VDI workloads where ‘today’ GPU is a nice addition that increases graphics performance, but is not a requirement. Think call centers (the biggest VDI consumer today), hospitals (when imaging is not required), task workers (where only Office is required), Federal and GOV (where security is driving the deployment) etc.

GPUs are somewhat expensive today (approximately $2K per card per server to support 64 desktops with 128MB vRAM per user) and may reduce the overall VDI project ROI. However, I have to admit that I was pleasantly surprised with the price drop. Last time I looked at them they were ~$7K a unit.
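To put those numbers in per-desktop terms, here is a quick back-of-the-envelope calculation. It is only a sketch using the figures quoted above (card price, 64 desktops per card, 128MB vRAM per user); the function names are mine, and the result covers the card alone, not the server, power, cooling or any licensing.

```python
# Figures taken from the paragraph above; 7000 is the older per-card price.
CARD_PRICE_USD = 2000
OLD_CARD_PRICE_USD = 7000
DESKTOPS_PER_CARD = 64
VRAM_PER_USER_MB = 128


def gpu_cost_per_desktop(card_price_usd, desktops_per_card):
    """GPU card cost attributed to each desktop sharing the card."""
    return card_price_usd / desktops_per_card


def total_vram_gb(desktops_per_card, vram_per_user_mb):
    """Frame buffer the card must provide for all desktops, in GB."""
    return desktops_per_card * vram_per_user_mb / 1024


cost_now = gpu_cost_per_desktop(CARD_PRICE_USD, DESKTOPS_PER_CARD)
cost_before = gpu_cost_per_desktop(OLD_CARD_PRICE_USD, DESKTOPS_PER_CARD)
vram_gb = total_vram_gb(DESKTOPS_PER_CARD, VRAM_PER_USER_MB)

print(f"GPU cost per desktop today:  ${cost_now:.2f}")
print(f"GPU cost per desktop before: ${cost_before:.2f}")
print(f"vRAM needed on the card:     {vram_gb:.0f} GB")
```

At full density the card adds roughly $31 per desktop, down from about $109 at the old price. If a server hosts fewer than 64 desktops, the per-desktop figure rises accordingly.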

The use of GPU certainly offloads CPU processing cycles, and some will argue that this allows for increased consolidation. That is true, but to implement GPUs one would need servers with a larger datacenter footprint, additional power consumption and additional cooling. These servers, generally speaking, can support more memory, allowing more virtual desktops to be consolidated, but also increasing the failure domain in case of a server outage. Also, when a single GPU is added per server one would be creating a SPOF (single point of failure) and preventing vMotion operations, therefore creating problems during server maintenance tasks (I believe XenMotion is supported, but someone please correct me if I am wrong).

As of today, even when graphics rendering is offloaded to the GPU, the protocol encoding still happens in software on the CPU. That is expected to change in the future, but for now the offload is not complete.


Another point I would like to highlight is that with more graphics processing your systems will be delivering more graphic data at a faster rate via the network, increasing the amount of bandwidth required per user. This certainly is not a problem when on local networks, but more bandwidth also means additional WAN costs (OPEX).

Here is an interesting paper from VMware on use of GPU with Horizon View, however it doesn’t have consolidation comparisons between a GPU and a non-GPU deployment. Worth reading anyway: VMware Horizon 6 and Hardware Accelerated 3D Graphics – Performance and Best Practices.

Another change about to happen is the move from Ivy Bridge processors to Haswell. Both are very similar chips, with the differences being mostly related to architecture. That being said, there is a 5-7% increase in performance that will help relieve the CPU if your VDI deployment is CPU-bound today.


I am always in favor of better user experience and best of breed graphics and performance, but I am also of the opinion that the large majority of workloads today can do just fine with Software 3D rendering; and market data tells us that this is what is happening today. It’s up to you as a business decision maker to analyze and decide whether you want to make the investment now, later or never.

I could perhaps draw a parallel with electric cars, with the amazing Tesla being best of breed; however, it’s up to you to decide when it’s time to move to an electric car, and you may decide to go with a cheaper model for a number of different reasons.

On the GPU topic, perhaps a different solution would be to have nodes with GPU for users requiring more graphics performance and nodes without GPU for everyone else, but this approach will also increase the administration effort involved in maintaining the platform.


In Summary…

The developments around improving the VDI experience have really matured the technology, and there are now lots of options for how to deploy VDI. This is great news for IT organizations. Having architected and managed many VDI implementations, I think the approach we should take with any deployment is to understand the requirements and do a true opportunity cost/benefit analysis of including vGPU on a case-by-case basis rather than taking a cookie-cutter approach. At the end of the day choice is good, and especially good for consumers.


… For those that didn’t notice, the title is a reference to the American TV series Friends and its episode titles.


This article was first published by Andre Leibovici (@andreleibovici) at myvirtualcloud.net.

Permanent link to this article: http://myvirtualcloud.net/?p=6861

Jan 01 2015

My Top 12 Predictions for 2015 and Few Shockingly Bad Predictions

Last week I published on Twitter some of my enterprise datacenter predictions for 2015. While predictions are easy to make because no one is held accountable for them, I personally believe my top 12 will hold true for 2015. Here they are:



Screen Shot 2015-01-02 at 3.56.28 PM

“I think there is a world market for maybe five computers.” — Thomas Watson, chairman of IBM, 1943.


Screen Shot 2015-01-02 at 3.56.13 PM

“This ‘telephone’ has too many shortcomings to be seriously considered as a means of communication. The device is inherently of no value to us.” — Western Union internal memo, 1876


Screen Shot 2015-01-02 at 3.56.02 PM

“The wireless music box has no imaginable commercial value. Who would pay for a message sent to nobody in particular?” — David Sarnoff’s associates in response to his urgings for investment in the radio in the 1920s.


Screen Shot 2015-01-02 at 3.55.54 PM

“There is not the slightest indication that nuclear energy will ever be obtainable. It would mean that the atom would have to be shattered at will.” — Albert Einstein, 1932.


Screen Shot 2015-01-02 at 3.55.42 PM

“There will never be a bigger plane built.” — A Boeing engineer, after the first flight of the 247, a twin engine plane that holds ten people.


Screen Shot 2015-01-02 at 3.55.34 PM

“While theoretically and technically television may be feasible, commercially and financially it is an impossibility.” — Lee DeForest, inventor.


Screen Shot 2015-01-02 at 3.55.24 PM

“We don’t like their sound, and guitar music is on the way out.” — Decca Recording Co. rejecting the Beatles, 1962.


Screen Shot 2015-01-02 at 3.55.17 PM

“Where a calculator on the ENIAC is equipped with 18,000 vacuum tubes and weighs 30 tons, computers in the future may have only 1,000 vacuum tubes and weigh only 1.5 tons.” — Popular Mechanics, 1949


Screen Shot 2015-01-02 at 3.55.05 PM

“Everyone’s always asking me when Apple will come out with a cell phone. My answer is, ‘Probably never.'”—David Pogue, The New York Times, 2006


Screen Shot 2015-01-02 at 3.54.56 PM

World Economic Forum, 2004: “Two years from now, spam will be solved.”—Bill Gates


Screen Shot 2015-01-02 at 3.54.49 PM

“Apple [is] a chaotic mess without a strategic vision and certainly no future.”—TIME, February 5, 1996


Screen Shot 2015-01-02 at 3.54.32 PM

“Remote shopping, while entirely feasible, will flop.”—Time magazine, 1968.



Permanent link to this article: http://myvirtualcloud.net/?p=6834

Dec 19 2014

VMware AppVolumes and Nutanix Shadow Cloning Marriage

My colleague Kees Baggerman published a brilliant post on how Nutanix Shadow Cloning not only improves performance for Horizon Linked Clones and XenDesktop MCS, but also hugely improves performance for VMware AppVolumes. I recommend reading his article here.

In his single VM test Kees Baggerman saw an improvement of almost 360% for Adobe Reader through the use of Shadow Cloning and data locality.

I would like to shed some light on how important these performance improvements are in the context of distributed storage architectures.

The performance improvements seen by Kees were due to Nutanix being able to analyze the IO access pattern for the AppVolume vmdk and determine that the vmdk was serving only read requests. Based on that, Nutanix creates local copies of the vmdk, in RAM or SSD, on each server accessing the AppVolume stack.

If you look at the picture below, where I have a large cluster running VDI with hundreds or thousands of virtual desktops, you will notice that all desktops access the same AppVolume vmdk (e.g. MS Office 2013) hosted and managed by node number 1.

In a distributed storage architecture the vmdk is distributed across servers (the methodology for how the vmdk is distributed varies according to the solution), but there is commonly a single server actively serving the vmdk. In this case server number 1 is actively serving the vmdk.

These operations can not only saturate the network with unnecessary traffic but also overload the server actively serving the vmdk with requests, throttling resources.


Screen Shot 2014-12-18 at 10.55.34 PM


What kills network performance is not latency but congestion. Congestion can come in many forms: microbursts (a 200Mb burst within 10ms equates to 20Gbps of traffic hitting a 10G port for those 10ms, and traffic gets dropped if the switches do not have enough buffer) or, for example, a misbehaving NIC sending PAUSE frames and slowing down the network.
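The microburst arithmetic is easy to check. The sketch below reads the burst as 200 megabits arriving within a 10ms window, which is the interpretation that yields the 20Gbps figure, and it assumes a deliberately simple model: a single 10G egress port, no other traffic, and a buffer that only has to hold whatever the port cannot drain during the burst. The function names are illustrative, not part of any switch API.

```python
def burst_rate_gbps(burst_megabits, window_ms):
    """Effective line rate of a burst. Mb per ms is numerically Gb per s."""
    return burst_megabits / window_ms


def buffer_needed_megabytes(burst_megabits, window_ms, port_gbps):
    """Buffer required to absorb the burst without drops (simplified model)."""
    drained_megabits = port_gbps * window_ms  # Gb/s * ms gives Mb
    excess_megabits = max(0.0, burst_megabits - drained_megabits)
    return excess_megabits / 8  # megabits to megabytes


rate = burst_rate_gbps(200, 10)
buffer_mb = buffer_needed_megabytes(200, 10, port_gbps=10)

print(f"Burst rate: {rate:.0f} Gbps on a 10 Gbps port")
print(f"Buffer needed to avoid drops: {buffer_mb:.1f} MB")
```

Under these assumptions the port drains only half of the burst, so the switch needs about 12.5MB of buffer for that one port or it starts dropping traffic.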

Caching is also used extensively by distributed storage systems. Caching can use SSD or RAM, and it uses ingest and eviction policies according to data access and capacity allocation. Once data is evicted from cache it needs to be read once again from server number 1 over the network, when required. VDI environments commonly have a very random IO pattern, causing data eviction to be very frequent, especially if no data de-duplication is available. Performance of virtual desktops suffers when the active dataset can no longer fit in premium storage tiers, and that’s how data de-duplication helps performance.
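To illustrate how quickly things degrade once the active dataset no longer fits in cache, here is a toy model. It assumes a plain LRU eviction policy and a repeated scan over the working set rather than truly random IO, purely to keep the result deterministic; real products use more sophisticated ingest and eviction policies, and the class and function names are mine.

```python
from collections import OrderedDict


class LocalReadCache:
    """Toy LRU cache; every miss stands in for a read over the network."""

    def __init__(self, capacity_blocks):
        self.capacity_blocks = capacity_blocks
        self.blocks = OrderedDict()
        self.hits = 0
        self.remote_reads = 0

    def read(self, block_id):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)  # mark as most recently used
            self.hits += 1
            return
        self.remote_reads += 1                 # fetch from the serving node
        self.blocks[block_id] = True           # ingest into the cache
        if len(self.blocks) > self.capacity_blocks:
            self.blocks.popitem(last=False)    # evict least recently used


def replay(capacity_blocks, working_set_blocks, passes):
    """Read the whole working set in order, `passes` times."""
    cache = LocalReadCache(capacity_blocks)
    for _ in range(passes):
        for block_id in range(working_set_blocks):
            cache.read(block_id)
    return cache


fits = replay(capacity_blocks=100, working_set_blocks=100, passes=3)
thrash = replay(capacity_blocks=80, working_set_blocks=100, passes=3)

print(f"Dataset fits:   {fits.remote_reads} remote reads, {fits.hits} hits")
print(f"Dataset spills: {thrash.remote_reads} remote reads, {thrash.hits} hits")
```

With a cache only 20% too small, this access pattern turns every read into a remote read. De-duplication helps precisely because it shrinks the active dataset back under the cache capacity.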

Shadow Cloning does not allow data eviction to happen. This approach completely eliminates the need to go back over the network to re-load data. The data is only migrated on read so as not to flood the network and to allow for efficient cache utilization. If the AppVolume vmdk is modified, the Shadow Clones are dropped and the process starts over.
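The behaviour described above can be sketched as a small simulation. This is a toy model of the idea only, not Nutanix's implementation: it assumes the vmdk has already been identified as read-only, keeps one pinned local copy per node that is filled block by block on first read, and drops every copy when the source vmdk is written. All class and method names are hypothetical.

```python
class ShadowCloneModel:
    """Toy model: per-node pinned copies of a read-only vmdk."""

    def __init__(self, owner_node, blocks):
        self.owner_node = owner_node
        self.blocks = dict(blocks)   # block_id -> data on the owner node
        self.shadow_clones = {}      # node -> {block_id: data}
        self.network_reads = 0

    def read(self, node, block_id):
        if node == self.owner_node:
            return self.blocks[block_id]       # local to the owner
        clone = self.shadow_clones.setdefault(node, {})
        if block_id not in clone:
            self.network_reads += 1            # migrate on first read only
            clone[block_id] = self.blocks[block_id]
        return clone[block_id]                 # never evicted afterwards

    def write(self, block_id, data):
        self.blocks[block_id] = data
        self.shadow_clones.clear()             # drop clones, start over


model = ShadowCloneModel(owner_node=1, blocks={i: f"v1-{i}" for i in range(4)})

# Nodes 2 and 3 each read all four blocks ten times.
for _ in range(10):
    for node in (2, 3):
        for block_id in range(4):
            model.read(node, block_id)
reads_before_update = model.network_reads

# The AppVolume vmdk is updated, so the clones are dropped and rebuilt on read.
model.write(0, "v2-0")
data_after_update = model.read(2, 0)
reads_after_update = model.network_reads

print(reads_before_update, data_after_update, reads_after_update)
```

In this model 80 reads across two remote nodes cost only 8 network reads, one per block per node, and a write costs nothing extra until a node actually reads the data again.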

Hopefully now you understand how AppVolumes and Shadow Cloning together provide better performance and resource utilization. Now, visualize a VDI environment with 1000+ users and no Shadow Cloning, and the chaos of providing consistent access to a multitude of AppVolumes stacks across the network amid cache ingestions, evictions and network congestion.


In an upcoming article I will introduce you to AppVolumes stack replication to different sites and datacenters. Nutanix is one of the few solutions able to do vmdk-based backups and replication, instead of working at the virtual machine level.



Permanent link to this article: http://myvirtualcloud.net/?p=6825
