The performance benefit of the x64 architecture for Windows XP in a virtual environment can be larger than it first appears. I found this out when a 32-bit application performed dramatically better on 64-bit Windows XP than on the 32-bit edition, even though the application itself is 32-bit.
A customer contacted me to report that an existing 32-bit application was performing poorly in their newly deployed VMware View POC environment compared with the legacy Citrix NFuse environment. My analysis found no resource contention inside the virtual desktops or at the host, storage, or network level, and CPU usage during a certain operation reached only about 50%. I later found out that Citrix NFuse was running on a 64-bit version of Windows 2003, so I decided to deploy a 64-bit Windows XP desktop in the View environment.
The result: performance was remarkably better than on 32-bit Windows XP, and also much better than Citrix NFuse running on 64-bit Windows 2003 on a powerful dedicated server (same type and speed as the VMware View host) with loads of memory.
Initially I expected a 32-bit application to run slower on a 64-bit OS because of WoW64 (Microsoft's emulation layer for 32-bit applications), the larger pointers, and the OS overhead. Also, even with WoW64, 32-bit programs on 64-bit versions of Windows cannot take advantage of the larger 64-bit address space or the wider 64-bit registers of 64-bit processors.
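As a quick illustration of the "larger pointers" point, a process can report its own pointer width. This is a minimal sketch in Python, not something used in the original test; the function names are my own. A 32-bit process reports 32 bits whether it runs on a 32-bit OS or under WoW64 on a 64-bit OS.

```python
import struct


def pointer_bits() -> int:
    """Return the pointer width, in bits, of the running process."""
    # The struct format "P" is a native void* pointer: 4 bytes in a
    # 32-bit process and 8 bytes in a 64-bit process.
    return struct.calcsize("P") * 8


def addressable_bytes(bits: int) -> int:
    """Upper bound on the bytes a pointer of the given width can address."""
    return 2 ** bits


if __name__ == "__main__":
    bits = pointer_bits()
    print(f"This is a {bits}-bit process")
    print(f"A {bits}-bit pointer can address up to {addressable_bytes(bits)} bytes")
```

Running this with a 32-bit interpreter and again with a 64-bit interpreter on the same 64-bit desktop shows that the process bitness, not the OS bitness, decides the pointer size.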
The result piqued my curiosity and led me to run an additional speed comparison between the two versions of Windows XP (32-bit and 64-bit). We created a baseline virtual desktop and used PCMark to compare the two virtual desktops under the same conditions and environment. PCMark was run three times on each virtual desktop, and the numbers below are the averages.
Baseline environment deployed for the test:
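The averaging step is simple enough to sketch. The snippet below assumes that a higher score is better; the function names are my own and the scores are placeholders for illustration only, not the measured PCMark results.

```python
from statistics import mean


def average_score(runs):
    """Average the scores from repeated benchmark runs on one desktop."""
    if not runs:
        raise ValueError("at least one run is required")
    return mean(runs)


def percent_gain(baseline, candidate):
    """Relative gain of candidate over baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


# Placeholder scores for illustration only; NOT the measured results.
xp32_runs = [1000.0, 1020.0, 980.0]
xp64_runs = [1100.0, 1090.0, 1110.0]

xp32_avg = average_score(xp32_runs)
xp64_avg = average_score(xp64_runs)
gain = percent_gain(xp32_avg, xp64_avg)

print(f"32-bit average: {xp32_avg}")
print(f"64-bit average: {xp64_avg}")
print(f"64-bit gain over 32-bit: {gain:.1f}%")
```

Averaging three runs smooths out one-off spikes, but it is still worth looking at the spread between runs before trusting a small difference.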
- Same ESX host running vSphere 4
- Same shares, with no limits or reservations.
- 1 vCPU
- 2 GB RAM
- Windows Page file set to System Managed
- Both VMs located on same LUN and Datastore
- SCSI Controller LSI Logic Parallel
- Network Adapter: E1000 (64-bit VM) and vmxnet03 (32-bit VM)
I have correlated the findings of this test with a Microsoft paper on the benefits of the Windows x64 architecture over x86.
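To confirm that the two desktops differ only where intended, the baseline above can be written down as data and compared. This is a minimal sketch: the dictionary keys and the helper name are my own invention and do not correspond to any VMware API.

```python
# Settings shared by both test desktops, taken from the list above.
BASELINE = {
    "host": "same ESX host (vSphere 4)",
    "vcpus": 1,
    "ram_gb": 2,
    "page_file": "system managed",
    "scsi_controller": "LSI Logic Parallel",
}

# Per-desktop settings layered on top of the shared baseline.
vm_xp32 = {**BASELINE, "os": "Windows XP 32-bit", "network_adapter": "vmxnet03"}
vm_xp64 = {**BASELINE, "os": "Windows XP 64-bit", "network_adapter": "E1000"}


def config_differences(a, b):
    """Return {setting: (value_in_a, value_in_b)} for settings that differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


for setting, (left, right) in config_differences(vm_xp32, vm_xp64).items():
    print(f"{setting}: {left} vs {right}")
```

Apart from the guest OS, the only setting that differs is the network adapter, which is worth keeping in mind when reading any network-related numbers.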
· CPU and memory read/write.
Registers are fast, local slots inside a processor in which applications can store values that will be needed shortly. Data stored in registers is available for reuse at full processor speeds and is faster than even cached data in an on-chip cache. The additional (and wider) general-purpose registers of the x64 architecture allow for significant gains in compiler efficiency and overall application speed. With more registers, there is less need to write out persistent data to memory, only to have to read it back a few instructions later.
Another gain with the additional and wider registers is faster function calls. Up to four arguments can be passed to a function in registers, a big improvement over the x86 approach of pushing and popping arguments onto the stack for every function call.
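The "up to four arguments in registers" rule can be modelled in a few lines. The sketch below covers only integer and pointer arguments under the Microsoft x64 calling convention, which uses RCX, RDX, R8 and R9 (floating-point arguments use XMM0 to XMM3 instead). For x86 it models the stack-based conventions such as cdecl and stdcall. The function names are my own.

```python
# Integer/pointer argument registers in the Microsoft x64 calling convention.
X64_INT_ARG_REGISTERS = ("RCX", "RDX", "R8", "R9")


def x64_arg_locations(arg_count):
    """Where each integer/pointer argument is passed on x64 Windows."""
    locations = []
    for index in range(arg_count):
        if index < len(X64_INT_ARG_REGISTERS):
            # The first four arguments travel in registers.
            locations.append(X64_INT_ARG_REGISTERS[index])
        else:
            # The fifth and later arguments are passed on the stack.
            locations.append("stack")
    return locations


def x86_stack_arg_locations(arg_count):
    """cdecl and stdcall on x86 pass every argument on the stack."""
    return ["stack"] * arg_count


if __name__ == "__main__":
    print("x64:", x64_arg_locations(6))
    print("x86:", x86_stack_arg_locations(6))
```

Note that this benefit applies to native 64-bit code. A 32-bit application running under WoW64 still uses the x86 conventions.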
· Disk: random seek and read/write.
Another performance improvement with the x64 architecture is the gain in overall I/O efficiency and throughput. With support for greater physical memory and memory address space, caches can be substantially larger than in 32-bit Windows, enabling the Windows x64 Editions to fully utilize the improved I/O hardware available to improve overall I/O performance. The larger address space allows more I/O to be in progress simultaneously. Even 32-bit applications can benefit from this improvement, especially those that needed to use the /3GB switch. When using the /3GB switch, Windows is forced into a constrained address space, limiting the amount of non-paged pool available. This can cause non-paged pool to be exhausted when there are several I/O requests outstanding.
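The /3GB trade-off is plain arithmetic on the 4 GB address space of 32-bit Windows: whatever is given to user mode is taken away from the kernel, which is where the non-paged pool lives. A minimal sketch, with names of my own choosing:

```python
TOTAL_32BIT_ADDRESS_SPACE_GB = 4  # 2**32 bytes


def kernel_address_space_gb(user_space_gb):
    """Kernel share of the 32-bit address space for a given user-mode share."""
    if not 0 < user_space_gb < TOTAL_32BIT_ADDRESS_SPACE_GB:
        raise ValueError("user space must be between 0 and 4 GB, exclusive")
    return TOTAL_32BIT_ADDRESS_SPACE_GB - user_space_gb


print("default split :", kernel_address_space_gb(2), "GB for the kernel")
print("with /3GB     :", kernel_address_space_gb(3), "GB for the kernel")
```

Halving the kernel's share is what squeezes the non-paged pool and the other kernel structures on a 32-bit system booted with /3GB.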
Overall, the x64 architecture makes better use of the available resources, with some exceptions, and provides not only a larger virtual address space but also support for more physical memory.
|General Memory Limits||32-Bit||64-Bit|
|Total virtual address space (based on a single process)||4 gigabytes (GB)||16 terabytes (TB)|
|Virtual address space per 32-bit process||2 GB (3 GB if system is booted with /3GB switch)||4 GB if compiled with /LARGEADDRESSAWARE (2 GB otherwise)|
|Virtual address space per 64-bit process||Not applicable||8 TB|
|Paged pool||470 megabytes (MB)||128 GB|
|Non-paged pool||256 MB||128 GB|
|System Page Table Entry (PTE)||660 MB to 900 MB||128 GB|
|Physical Memory and CPU Limits||32-Bit||64-Bit|
|Windows XP Professional||4 GB / 1 to 2 CPUs||128 GB / 1 to 2 CPUs|
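The address-space figures in the table are powers of two, which makes them easy to sanity-check. In the sketch below, the 44-bit and 43-bit widths are my own inference from the 16 TB and 8 TB figures, not numbers taken from the Microsoft paper.

```python
GB = 2 ** 30
TB = 2 ** 40


def address_space_bytes(address_bits):
    """Size in bytes of an address space with the given number of bits."""
    return 2 ** address_bits


# 32 address bits give the 4 GB total of 32-bit Windows.
total_32bit = address_space_bytes(32)

# 16 TB corresponds to 44 address bits; 8 TB per 64-bit process to 43 bits.
total_64bit = address_space_bytes(44)
per_process_64bit = address_space_bytes(43)

print(total_32bit // GB, "GB total on 32-bit")
print(total_64bit // TB, "TB total on 64-bit")
print(per_process_64bit // TB, "TB per 64-bit process")
print("64-bit total is", total_64bit // total_32bit, "times larger")
```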
Should I convert my Windows XP templates to 64-bit?
The answer is: it depends on the applications you currently have deployed in your VMware View environment. 16-bit legacy applications will not run on a 64-bit OS, because 64-bit Windows does not include the 16-bit subsystem. Additionally, there might be compatibility issues, or it might be difficult to find the correct drivers for the x64 architecture.
However, given the increasing number of 64-bit processors and the release of Windows 7 x64 Edition, I would bet my chips that this is the way vendors want us to move. If you are planning a greenfield VDI deployment, I would recommend looking into the x64 architecture as a serious option.
The Windows x64 Editions are offered at the same price as the x86 versions. There is no premium charged for the extra power.
Will Windows 7 x64 provide the same or better improvements in performance and resource utilization? Well, we had better run some more tests.
Have you done a similar test? If so, please share your results.