In ESX, the VMkernel scans guest physical memory pages randomly, at a base scan rate governed by Mem.ShareScanTime, which specifies the desired time to scan a virtual machine’s entire guest memory. The default value for Mem.ShareScanTime is 60 minutes.
This means that each virtual machine will have all of its physical memory pages scanned within a 60-minute window. The hypervisor does this to search for identical pages across the host's physical memory. Identical pages are page sharing opportunities (Transparent Page Sharing, or TPS) and are merged in memory to reduce the host memory footprint.
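To get a feel for what the scan time means in practice, here is a rough back-of-the-envelope sketch. It assumes 4 KB pages and that the scanner simply has to cover the whole guest once per scan period; the function name and the 1.5 GB guest size are only illustrative, and the real VMkernel scheduler is certainly more involved than this.

```python
PAGE_SIZE_BYTES = 4096  # assumption: 4 KB small pages


def pages_per_second(guest_memory_mb: float, share_scan_time_min: float) -> float:
    """Average pages/sec needed to cover a guest's memory once per scan period."""
    total_pages = guest_memory_mb * 1024 * 1024 / PAGE_SIZE_BYTES
    return total_pages / (share_scan_time_min * 60)


if __name__ == "__main__":
    # Compare the 60-minute default with a 10-minute scan time
    # for a hypothetical 1.5 GB (1536 MB) guest.
    for minutes in (60, 10):
        rate = pages_per_second(1536, minutes)
        print(f"{minutes:>2} min scan time: {rate:.1f} pages/sec")
```

For a 1.5 GB guest this works out to roughly 109 pages per second at 60 minutes and roughly 655 pages per second at 10 minutes, so cutting the scan time to a sixth means six times the scanning work per guest.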
The 60-minute magic number is a value chosen by VMware to keep the CPU impact of scanning minimal. However, is this number optimal for server workloads, virtual desktop workloads, or both?
VDI workloads differ fundamentally from server workloads. I would like to highlight a couple of key points:
Boot storms: because all virtual desktops run the same guest OS, the memory pages in use during a VDI boot storm have far more in common than during a boot storm of mixed server workloads. However, because ESX is configured to scan for common pages over a period of 60 minutes, TPS will not provide major benefits during a boot storm, and the host may need to swap memory to disk if all guests use their allocated memory at the same time.
Guest OS and applications: just as during a boot storm, there are a large number of memory page commonalities across all virtual desktops that TPS will not necessarily pick up in time to merge them.
User interaction: there is a coherent and predictable usage pattern for virtual desktops. Yes, users are more alike than you imagine, and they will likely fire up and use applications around the same time. As an example, an employee gets into the office at 8:50am, powers on his computer, laptop or thin client, and connects to his virtual desktop session. If you have set up VDI to log off users automatically to save power, the user will log in and start Outlook. While Outlook is downloading messages from the server, the user goes to the cafeteria to get a coffee and chats a little with colleagues. Then they all get back to their workstations at the same time and fire up their financial application.
The interaction pattern is very similar for most users, and this pattern offers a great deal of common memory pages. However, if TPS is not fast enough to exploit these commonalities, the hypervisor loses the opportunity to reduce the memory footprint.
In general, virtual desktop environments are memory constrained rather than CPU constrained, and any memory savings could improve the consolidation ratio. For the reasons above I have been testing Mem.ShareScanTime set to 10 minutes. Initially I thought there would be a large VMkernel CPU overhead, but it proved to be negligible.
Using the setting above I was able to achieve TPS savings on the order of 75% for Windows XP guests running idle.
Because most virtual desktops were idle during the tests, this is not a valid exercise or scientific benchmark. During the tests I observed that TPS savings increased by 30% on average just by increasing the number of memory scan cycles per virtual desktop.
The image below depicts 12 virtual desktops running with 1.5GB RAM each and achieving 13.7GB of TPS savings. This is a 74.4% saving.
The image below depicts 12 virtual desktops running with 1.5GB RAM each and achieving 16.6GB of TPS savings. This is a 90.4% saving.
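If you want to reproduce this kind of percentage from your own numbers, a minimal sketch follows. The function name is mine, and the denominator is an assumption: I use the nominal configured memory (12 desktops at 1.5 GB each), which is not necessarily the exact total behind the figures quoted above.

```python
def savings_percent(saved_gb: float, total_gb: float) -> float:
    """TPS savings expressed as a percentage of total guest memory."""
    return 100.0 * saved_gb / total_gb


if __name__ == "__main__":
    # Assumption: nominal configured memory, 12 desktops x 1.5 GB each.
    nominal_total_gb = 12 * 1.5
    for saved_gb in (13.7, 16.6):
        pct = savings_percent(saved_gb, nominal_total_gb)
        print(f"{saved_gb} GB saved of {nominal_total_gb} GB: {pct:.1f}%")
```

Against the nominal 18 GB this gives about 76.1% and 92.2%, slightly higher than the 74.4% and 90.4% quoted above. Those quoted figures imply a denominator of roughly 18.4 GB, so they were presumably measured against a somewhat larger total than the configured guest RAM alone.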
I suggest that you run your own tests on one of your VDI hosts and compare the results with the other hosts in the cluster. Mem.ShareScanTime is set per host and can be defined through the vCenter Advanced Settings or the rCLI.
It is also possible to change the maximum page scan rate, expressed in MB/sec per GHz of host CPU, through Mem.ShareScanGHz. This would allow you to increase the number of pages processed even further. I have not tested changes to Mem.ShareScanGHz.
If you decide to run your own tests, I would be interested in your results for comparison.
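For anyone scripting the change across test hosts, here is a small sketch that only builds the command line rather than running it. It assumes the host exposes the esxcfg-advcfg tool and that the option lives at the path /Mem/ShareScanTime; the helper name is hypothetical, and you should check the exact command and connection options for your ESX version before relying on it.

```python
from typing import List


def advcfg_set_command(option_path: str, value: int) -> List[str]:
    """Build (but do not run) an esxcfg-advcfg command that sets an advanced option.

    Assumption: esxcfg-advcfg is available on the host and accepts
    '-s <value> <option path>' to set a value.
    """
    return ["esxcfg-advcfg", "-s", str(value), option_path]


if __name__ == "__main__":
    # Set the scan time to 10 minutes, as tested in this post.
    command = advcfg_set_command("/Mem/ShareScanTime", 10)
    print(" ".join(command))
```

Building the argument list separately makes it easy to review or log the command before handing it to whatever remote execution mechanism you use.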
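A quick way to judge whether the Mem.ShareScanGHz ceiling would even come into play is to compare it with the scan rate your desktops need. The sketch below assumes the ceiling is simply the setting multiplied by the host's aggregate CPU GHz, uses 4 MB/sec per GHz as the setting (check the value on your own host), and uses a hypothetical 16 GHz host; none of these numbers come from my tests.

```python
def scan_rate_cap_mb_s(share_scan_ghz: float, host_ghz: float) -> float:
    """Assumed ceiling: MB/sec per GHz multiplied by aggregate host GHz."""
    return share_scan_ghz * host_ghz


def required_rate_mb_s(total_guest_mb: float, share_scan_time_min: float) -> float:
    """MB/sec needed to cover all guest memory once per scan period."""
    return total_guest_mb / (share_scan_time_min * 60)


if __name__ == "__main__":
    total_guest_mb = 12 * 1536  # 12 desktops with 1.5 GB each
    cap = scan_rate_cap_mb_s(share_scan_ghz=4, host_ghz=16)  # hypothetical host
    for minutes in (60, 10):
        needed = required_rate_mb_s(total_guest_mb, minutes)
        status = "within" if needed <= cap else "above"
        print(f"{minutes:>2} min: need {needed:.2f} MB/s, {status} cap of {cap:.0f} MB/s")
```

With these assumed numbers, 12 desktops at a 10-minute scan time need about 30.7 MB/sec, well under the assumed ceiling, which would suggest that raising Mem.ShareScanGHz only matters on much denser hosts.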