TPS, Large Memory Pages and your VDI environment

If you have been running VMware View 4.0 or greater, the chances are that you have vSphere 4.0 U2 or vSphere 4.1 deployed in your VDI infrastructure. You may have VMware Infrastructure 3.5 U5, but I am hoping that at this stage you have already been through the upgrade.

If you haven’t yet upgraded, you will experience the behaviour I am explaining below when you upgrade to vSphere 4.0 or greater. If you have already upgraded your ESX hosts and the hosts use Nehalem-based Xeon 5500 series CPUs, you may or may not have noticed that after the upgrade vCenter and esxtop show the hosts consuming more physical memory than before, even though the number of virtual desktops has not changed.

The behaviour is not exclusive to VDI infrastructures. However, in VDI, especially if 32-bit Windows XP is in use, there is normally a significant amount of TPS (Transparent Page Sharing) across all virtual desktops. Because the TPS ratio drops across all hosts, the behaviour is more evident in VDI infrastructures.

I must admit here that I may be out of my depth when compared to other bloggers such as Duncan Epping and Frank Denneman, who have written a book solely based on vSphere memory management techniques (HA and DRS deepdive).

Simply put, vSphere 4.0 introduced a few changes to the memory management techniques. TPS now, by default, only analyses 2MB memory blocks (large memory pages) when looking for common memory blocks. Only when the host server is under memory stress will TPS break those large memory blocks into smaller 4KB blocks to search for additional common memory blocks. This happens because vSphere relies on hardware-assisted memory virtualization, EPT (Intel) or RVI (AMD), to eliminate the need for shadow page tables, reducing kernel overhead.
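To see why the block size matters so much, here is a toy model in Python. It is only a sketch and not how the VMkernel implements TPS: the ten VMs, the eight large pages per VM and the single private 4KB page inside each 2MB block are all made-up numbers, and page contents are represented by simple labels rather than real memory. A 2MB block holds 512 pages of 4KB, so one differing small page is enough to make the whole large block unique.

```python
# Toy model: how much memory can be shared at 2MB versus 4KB granularity.
# All sizes and VM counts below are illustrative assumptions.

SMALL_PAGES_PER_LARGE = (2 * 1024 * 1024) // (4 * 1024)  # 512 small pages per 2MB block


def build_vm(vm_id, large_pages):
    """Return a VM's memory as a list of large pages.

    Each large page is a tuple of 512 labels standing in for 4KB page contents.
    All labels are common to every VM except one private page per large page.
    """
    memory = []
    for lp in range(large_pages):
        block = [("os", lp, i) for i in range(SMALL_PAGES_PER_LARGE)]
        block[0] = ("private", vm_id, lp)  # the one page that differs between VMs
        memory.append(tuple(block))
    return memory


def pages_needed(vms, granularity):
    """Count the 4KB pages of host memory needed if identical units are stored once."""
    if granularity == "large":
        unique_blocks = {block for vm in vms for block in vm}
        return len(unique_blocks) * SMALL_PAGES_PER_LARGE
    unique_pages = {page for vm in vms for block in vm for page in block}
    return len(unique_pages)


vms = [build_vm(vm_id, large_pages=8) for vm_id in range(10)]
total_pages = sum(len(block) for vm in vms for block in vm)
large_granularity = pages_needed(vms, "large")
small_granularity = pages_needed(vms, "small")

print("No sharing:        ", total_pages, "pages")
print("Sharing 2MB blocks:", large_granularity, "pages")
print("Sharing 4KB pages: ", small_granularity, "pages")
```

In this contrived case nothing can be shared at 2MB granularity, while at 4KB granularity almost everything collapses. Real desktops will fall somewhere in between.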

VMware published KB 1020524 about the behaviour: Transparent Page Sharing is not utilized under normal workloads on Nehalem based Xeon 5500 series CPUs.

I recently visited a customer who, after upgrading to vSphere 4.1, noticed this memory consumption behaviour and asked me what was happening. I thought it would be interesting to share.

The customer reported that before the upgrade the memory consumption for each host in the View cluster (8 hosts) was 70%, and that it is now approximately 90%.


The esxtop screenshot above shows a host after the upgrade. A few things to observe here:

  • Memory is in high state (6% or more memory available – no memory contention)
  • MEMCTL (ballooning) is enabled
  • ZIP (memory compression) is in use and almost 1GB is currently compressed
  • SWAP is occurring and currently 637MB are swapped to disk

Interestingly, without memory contention we should not normally see ZIP and SWAP. I did a little research and found out that when TPS is not sharing upfront it leads to memory contention that wouldn’t otherwise have happened. When large pages are broken down into small 4KB pages, TPS can only try to combine these smaller blocks with other small blocks; it cannot share them across all virtual desktops because many large pages have not yet been broken down. This leads to ballooning/swapping and possibly thrashing, as working set memory accidentally gets swapped out.

After disabling large page support for the guest OS by setting Mem.AllocGuestLargePage to 0, the customer immediately started to see memory consumption decrease to the previous level of approximately 70%.
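If you prefer the service console over the vSphere Client, the same advanced setting can be read and changed with esxcfg-advcfg. The small Python helper below only builds the command strings; it does not run anything on a host. The helper names are my own, and the commands assume the usual convention where a dotted option such as Mem.AllocGuestLargePage is addressed as /Mem/AllocGuestLargePage. Check the current value on a test host before changing anything.

```python
# Sketch: build esxcfg-advcfg command lines for an advanced host setting.
# The function names are hypothetical helpers, not part of any VMware tool.


def advcfg_path(option_name):
    """Convert 'Mem.AllocGuestLargePage' to the path form '/Mem/AllocGuestLargePage'."""
    return "/" + option_name.replace(".", "/")


def advcfg_get_command(option_name):
    """Command that prints the current value of the option."""
    return f"esxcfg-advcfg -g {advcfg_path(option_name)}"


def advcfg_set_command(option_name, value):
    """Command that sets the option to the given value."""
    return f"esxcfg-advcfg -s {value} {advcfg_path(option_name)}"


print(advcfg_get_command("Mem.AllocGuestLargePage"))
print(advcfg_set_command("Mem.AllocGuestLargePage", 0))
```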


The resxtop screenshot above shows the host after the parameter change. A few things to observe here:

  • ZIP (memory compression) is reducing
  • Host SWAP is reducing

Instead of changing the host setting, it is also possible to turn off large page support for an individual virtual machine by setting monitor_control.disable_mmu_largepages = TRUE in its .VMX file.
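For a handful of desktops you would simply edit the file by hand, but for many virtual machines a small script helps. The Python sketch below works on the text of a .VMX file and adds or replaces the entry. It rests on a few assumptions of mine: entries have the form key = "value" with the value in quotes, key names are compared case-insensitively, and the function name set_vmx_option is hypothetical. Work on a copy of the file, with the virtual machine powered off.

```python
# Sketch: add or replace one entry in the text of a .VMX file.
# Assumes one 'key = "value"' entry per line (an assumption about the format).


def set_vmx_option(vmx_text, key, value):
    """Return vmx_text with key set to value, replacing any existing entry."""
    new_line = f'{key} = "{value}"'
    result = []
    replaced = False
    for line in vmx_text.splitlines():
        name = line.split("=", 1)[0].strip()
        if name.lower() == key.lower():  # case-insensitive match is an assumption
            if not replaced:
                result.append(new_line)
                replaced = True
            continue  # drop duplicate entries for the same key
        result.append(line)
    if not replaced:
        result.append(new_line)
    return "\n".join(result) + "\n"


sample = 'memsize = "1024"\nguestOS = "winxppro"\n'
print(set_vmx_option(sample, "monitor_control.disable_mmu_largepages", "TRUE"))
```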


Performance Hit when Disabling Large Pages support

Disabling Large Pages support will likely produce a CPU performance hit in your hosts. Duncan Epping discussed and tested this in his article Re: Large Pages (@gabvirtualworld @frankdenneman @forbesguthrie).

In Frank’s article he says: “Using Large pages shows a different memory usage level, but there is nothing to worry about. If memory demand exceeds the availability of memory, the VMkernel will resort to share-before-swap and compress-before-swap. Resulting in collapsed pages and reducing the memory pressure.”

Most VDI environments are memory bound rather than CPU bound. Therefore, if you know that an additional 15% to 20% CPU hit is not a biggie, you may decide to disable Large Pages support. Using only small pages will help to increase the memory consolidation ratio without hitting hard walls such as ZIP and SWAP.

My customer was experiencing SWAP and ZIP and didn’t have visibility of the memory % in use. They decided to keep large pages disabled given that their CPU utilisation was as low as 50%.


Even more Memory Consolidation

I previously wrote an article on how to Increase VDI consolidation ratio with TPS tuning on ESX (read the article for full context). Basically, by reducing Mem.ShareScanTime from the default of 60 to a lower number, virtual machines will have their physical memory pages scanned faster and more common blocks may be found. As always, be aware of the CPU hit when optimizing memory advanced parameters.
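To get a feel for what a lower scan time means, the rough arithmetic can be sketched in Python. I am assuming here that Mem.ShareScanTime is the number of minutes within which the whole memory of a virtual machine is scanned, and that scanning works on 4KB pages. Other limits, such as Mem.ShareRateMax, are ignored, so treat the numbers as an estimate only.

```python
# Sketch: approximate per-VM page scan rate implied by Mem.ShareScanTime.
# Assumes the value is in minutes and that pages are 4KB.

PAGE_SIZE_KB = 4


def pages_per_second(vm_memory_mb, share_scan_time_min):
    """Pages that must be scanned per second to cover the VM within the scan time."""
    pages = vm_memory_mb * 1024 // PAGE_SIZE_KB
    return pages / (share_scan_time_min * 60)


for scan_time in (60, 30, 10):
    rate = pages_per_second(1024, scan_time)
    print(f"ShareScanTime={scan_time}: about {rate:.0f} pages/s for a 1GB desktop")
```

Halving the scan time doubles the scan rate, which is where the extra CPU cost comes from.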


Related recommended reading:
Large Pages, Transparent Page Sharing and how they influence the consolidation ratio
Re: impact of large pages on consolidation ratios



    • PiroNet on 03/17/2011 at 10:35 pm

    Great post from the field!

    The ESXTOP exhibits do show values for metrics ‘SWP’ and ‘curr’ but that doesn’t mean swapping to disk is currently happening.

    The metrics to look at are SWR/s and SWW/s. If they are non-zero, swapping is actively occurring, which is not good.


  1. Great post. It can be scary seeing that for the first time. I have a post that shows TPS kicking in but it is well over 90% usage. I have just left the settings as is. Not a big deal to me but the alarms are misleading with large pages.
    I will be directing my staff to give this post a read.

  2. In the screenshots above, both before and after large pages were disabled, the host is 90% overcommitted on memory. I have to say that I have never seen a host that far overcommitted before, even in VDI environments. Just curious – do you know approximately how many VMs were running on the host at that time?

    I think that large pages should be left enabled as every Windows OS since Vista has used large pages by default. If you’re using an OS that is older than Vista then disabling it may not reduce performance at all, though VMware has shown that guests backed with large pages perform better even if they do not natively support them.

    Every environment is different though and customers have different requirements, so this is good information to share. Thanks!


    • Duncan on 03/18/2011 at 12:23 am

    A couple of things here:

    @Matt 1: Even if the OS has it enabled it doesn’t mean it is using Large Pages. Even with Windows 2008 or Windows 7 most pages will be small, but they are backed by a large page by ESX and that is what Andre is referring to, I guess.

    @Matt 2: How can you see that large pages are disabled in both screenshots?

    1) Aggressive backing with Large Pages started with 3.5 actually. Although only with AMD RVI based proc at that time frame.
    2) So this article also applies to AMD procs that support RVI
    3) Our book doesn’t solely deal with memory management; there is an aspect mentioned in there but that is it.


  3. Duncan – I can’t see that large pages are disabled in both screenshots. I was going by what Andre said in the post that one screenshot was after the upgrade and the other was after disabling large pages.

    One thing that I am a little confused about actually is that TPS was working very well even before disabling large pages. The first screenshot shows 50GB of savings and almost 71GB after disabling. I thought that once TPS kicks in during memory contention it breaks up all pages into small. If that is the case, wouldn’t then disabling large pages have no real effect since they are already in essence disabled because memory is so overallocated? Or does it not break up pages for all VMs at once?

    Sorry Andre, hope I’m not derailing your comments too much.

  4. @Matt Liebowitz
    Unfortunately I don’t have the screenshot from before the upgrade. The upgrade was from ESX 4 to 4.1 and it was an upgrade, not a clean install. The first screenshot shows the state before I disabled large page support.

    TPS was working properly before the upgrade; however, after the upgrade it was not as efficient because of the aggressive overcommitment with large pages enabled. It could be that when breaking down the pages not many small blocks are shareable until all large pages have been broken down. Either way it’s good to be able to disable Large Pages in such memory pressure scenarios if required.

  5. Was the customer experiencing a performance problem with the lower page sharing rate of large pages?

    If the only problem was that memory utilisation appeared higher then it is a cosmetic thing and I wouldn’t disable large page support.

    Disabling large pages makes TLB misses more likely so can hurt memory performance for VMs with very active memory footprints, like VDI.

  6. @Alastair Cooke
    The hosts were indeed using memory compression and swap, and ideally those should be the last resort. Apparently the aggressive Transparent Page Sharing collapsing after breaking down the large pages isn’t aggressive enough, as it jumps to ZIP/Ballooning straight afterwards. Duncan is talking to developers to understand the behaviour.

    So, to answer your question… in this case it isn’t cosmetic.

    • Duncan on 03/18/2011 at 11:37 pm

    Matt Liebowitz :
    The first screenshot shows 50GB of savings and almost 71GB after disabling. I thought that once TPS kicks in during memory contention it breaks up all pages into small. If that is the case, wouldn’t then disabling large pages have no real effect since they are already in essence disabled because memory is so overallocated? Or does it not break up pages for all VMs at once?

    Not sure what you are seeing as it is not the full screenshot. But I suspect that a large portion of those savings are “zero pages”, which are also reflected in the counter Andre shows. It is difficult to really come to a good conclusion based on the provided info.
