Host Swapping (another Case Study)

Recently, Duncan Epping posted a good article named “Is this VM actively swapping?”. Coincidentally, in a recent engagement I stumbled upon an interesting swapping scenario.
Just to put things in context:

  • DRS Cluster with 8 hosts across 2 different blade enclosures
  • Cluster is not memory overcommitted
  • All VMs have latest VMware Tools installed
  • TPS is enabled
  • All VMs hosted on shared storage with low latency
  • No memory reservations set on individual VMs

  • PMEM /MB 16078 free -> currently 16GB of memory available
  • High state -> the hypervisor is not under memory pressure
  • SWAP /MB 7222 cur -> 7GB has been swapped
  • 0.00 r/s -> no reads from swap currently
  • 0.00 w/s -> no writes to swap currently
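
As an aside, the same counters can be captured for offline analysis by running esxtop in batch mode (e.g. esxtop -b -d 5 -n 10 > esxtop-batch.csv). Below is a minimal Python sketch that pulls the memory and swap columns out of such a capture; the file name is a placeholder, and the perfmon-style counter names differ between ESX versions, so treat the keyword list as an assumption:

```python
import csv

# esxtop batch mode writes perfmon-style CSV: the first row holds counter
# names such as "\\esx01\Memory\Free MBytes". Exact names vary per ESX
# version, so the keywords below are assumptions to adapt as needed.
CAPTURE = "esxtop-batch.csv"   # placeholder file name
KEYWORDS = ("Free MBytes", "Swap Used", "Swap Read", "Swap Written")

with open(CAPTURE, newline="") as f:
    reader = csv.reader(f)
    header = next(reader)
    # Map the interesting memory/swap columns to their positions.
    cols = {name: i for i, name in enumerate(header)
            if any(k in name for k in KEYWORDS)}
    for row in reader:                # one row per sampling interval
        for name, i in cols.items():
            print(f"{name}: {row[i]}")
```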


An interesting circumstance here: even with a staggering 16GB (PMEM /MB 16078 free) of free memory available, a memory overcommit average of 0.49, and no VM memory reservations, the host was swapping. The host had swapped a total of 7GB (SWAP /MB 7222).

Paying attention to the MCTL? column, you will notice that the balloon driver is not running for two VMs.

Balloon drivers not running or disabled. If the balloon driver is not running, or has been deliberately disabled in some VMs, the amount of balloonable memory on the host will be decreased. As a result, the host may be forced to swap even though there is idle memory that could have been reclaimed through ballooning.

Two of the VMs, with memory sizes of 16GB and 4GB respectively, were not ballooning and therefore not freeing up memory to satisfy the host’s memory overcommitment. When the host’s memory is scarce, or when a VM hits a limit, the VMkernel needs to reclaim memory and prefers ballooning over swapping. The balloon driver is installed inside the guest OS as part of the VMware Tools installation and is also known as the vmmemctl driver.
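
To spot VMs like these across a whole cluster without walking each host in esxtop, a short pyVmomi sketch along the following lines can report ballooned and swapped memory plus the VMware Tools status per VM. The connection details are placeholders; the quickStats fields are the ones exposed by the vSphere API:

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Placeholder connection details: replace with your vCenter or host.
ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com", user="administrator",
                  pwd="secret", sslContext=ctx)
content = si.RetrieveContent()

# Walk every VM in the inventory.
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)
for vm in view.view:
    qs = vm.summary.quickStats            # values are reported in MB
    tools = vm.guest.toolsRunningStatus   # vmmemctl ships with VMware Tools
    print(f"{vm.name}: ballooned={qs.balloonedMemory}MB "
          f"swapped={qs.swappedMemory}MB tools={tools}")

view.DestroyView()
Disconnect(si)
```

A VM showing swapped memory but zero ballooned memory while the host is reclaiming is a good candidate for a closer look at its vmmemctl status.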

The balloon driver had been disabled by a previous administrator.

It seems that some administrators believe that ballooning does not guarantee the total amount of RAM allocated to a given VM and adds extra disk I/O. That might be true in an overcommitted environment, but in my opinion the gains from being able to overcommit memory in conjunction with DRS functionality outweigh any benefits of disabling ballooning. A good alternative to disabling memory ballooning is to set memory reservations at the VM level to guarantee the minimum amount of RAM available to those VMs.
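
For illustration, such a reservation can also be set programmatically. Here is a sketch using the vSphere ReconfigVM_Task API via pyVmomi; the 4096MB value is arbitrary, and the vm object is assumed to come from an inventory lookup like the one in the previous listing:

```python
from pyVmomi import vim

def reserve_memory(vm, reservation_mb):
    """Guarantee a minimum amount of RAM for a VM instead of
    disabling ballooning outright."""
    alloc = vim.ResourceAllocationInfo(reservation=reservation_mb)
    spec = vim.vm.ConfigSpec(memoryAllocation=alloc)
    return vm.ReconfigVM_Task(spec=spec)  # returns a Task to wait on

# Example with a placeholder value: reserve 4GB for this VM.
# task = reserve_memory(vm, 4096)
```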

In my opinion, the important thing here is to learn how to read the information provided by performance analysis tools such as esxtop and the vCenter performance graphs, and to pay attention to the details.

Good reading on this topic:

Performance Troubleshooting for VMware vSphere 4
Is this VM actively swapping? (helping @heiner_hardt) from Duncan Epping
Swapping? from Duncan Epping

7 comments

  1. “Two of the VMs, with memory sizes of 16GB and 4GB respectively, were not ballooning and therefore not freeing up memory to satisfy the host’s memory overcommitment. When the host’s memory is scarce, or when a VM hits a limit, the VMkernel needs to reclaim memory and prefers ballooning over swapping.”

    Do you imply here that by disabling the balloon driver, the guest will not comply with reclamation techniques?

    If so, this is not true. By installing the balloon driver you are ensuring that the most optimal form of memory reclamation will occur. When the VMkernel reclaims memory it will look at the virtual machines’ memory entitlements and will, if necessary, reclaim memory from specific virtual machines; if no vmmemctl driver is installed, the VMkernel has no other choice than to reclaim the memory by VMkernel swapping.

    If disabling the balloon driver on a virtual machine kept the VMkernel from reclaiming its memory, it would be some sort of implicit way to reserve/guarantee memory or to prioritize memory reclamation among virtual machines.

  2. Frank, are you saying that the host won’t start by reclaiming memory from VMs with balloon drivers but will instead grab a portion of each? I’d expect it to push those with balloons first before swapping, in an attempt to have the least negative impact on the VMs.

  3. @Frank Denneman
    Yes, I meant that if no vmmemctl driver is installed the VMkernel has no other choice than to reclaim the memory by VMkernel swapping. If ballooning had been enabled on the 16GB VM, the swapping might have been avoided.

    Your take on the boot storm is interesting, though.
    The host was up for 41 days. For how long is the SWAP /MB information kept?

    Thanks for your comments.

  4. Andrew, that’s not what I meant. The VMkernel will decide, based on the level of contention, how much memory it must reclaim; it then selects the virtual machines to reclaim memory from. This is based on the resource entitlements of the virtual machines.

    The resource entitlement is based on the resource allocation settings (reservation, shares, limits), configured memory, and the idle memory tax. If the resource allocation settings are identical between two virtual machines, but one virtual machine is idling, the VMkernel will select that virtual machine to reclaim memory from. It tries to reclaim by ballooning; if that’s not possible, it will reclaim by swapping.

    The VMkernel will not decide to back off from the “victim” because it doesn’t have the balloon driver installed and go looking for another virtual machine with the balloon driver activated; it wants to reclaim the memory from that virtual machine and will continue.

    In my opinion, the balloon driver shouldn’t be optional. Most administrators disable the balloon driver without proper knowledge of the mechanism and the impact it has on the performance of the virtual machine itself.

    Andre,

    With boot storm I meant restarting or powering up virtual machines. During the power cycle of Windows, it will touch every page within its configured memory. If many virtual machines, or a few big ones, boot up, the ESX server can become overcommitted. Because the balloon driver runs at the guest OS level, it can’t be used during the boot process, so the VMkernel needs to swap this memory. These swap files contain zeros, and because the VMkernel doesn’t actively pre-fault swapped memory, the pages are never swapped in; the guest OS isn’t really interested in a bunch of zeros.
    That’s why we see a high amount of memory swapped without any impact on the performance of the virtual machines.

  5. Frank, that makes sense; I was overthinking things (happens after midnight). I was imagining a case where two identical VMs could equally be chosen to reclaim from, in which case I would argue it would make more sense to take from the balloon-capable one. But you’re right, one will ALWAYS be preferred over the other, even if it’s only because it has 0.1% more idle memory, so my entire imaginary scenario is garbage. 🙂

    Time for bed.

  6. @Andre,

    You said at the start that the cluster was not memory overcommitted, but this host is. TPS is saving 26GB of RAM and only 16GB is free; that is 10GB of overcommit. Happily, that’s overcommit on configured RAM, not on active RAM.

    A consequence of that overcommit is that restarting a large VM, which writes to its entire memory footprint, causes a low memory state and the swapping you observe.

    Now that TPS has re-shared the large VM’s memory footprint there is no memory stress; the swap target is now 0MB.

    As the swapped VMs touch the pages that were swapped out, those pages will be swapped back in, but not until they are touched by the VM. As Frank says, this is not a performance issue.

  7. Thank you all for your comments; I really appreciate it.
    I will try to post more real-case scenarios for discussion.
