We recently released a white paper on Microsoft SQL Server. The purpose of this paper is to demonstrate the “real-world” performance of a virtualized Microsoft SQL Server running on the Datrium platform. I emphasize “real-world” because absolute performance, more often than not, is not the only design consideration.
The Benchmark Setup
For this benchmark, we used a Datrium CN2000 server as the compute node, equipped with an Intel Xeon E5 processor running at a 2.4 GHz clock speed.
The first step of any database benchmarking effort should be to decide what you want to measure. Are you trying to compare “raw performance” or “real-world performance”? Quite often, people who want to test raw performance measure and compare extremely low-level metrics such as IOPS. However, industry-standard benchmarks are designed to approximate real-world workloads, where performance is measured in a more meaningful way, such as Transactions per Minute (TPM).
For testing the system, we used HammerDB, an open-source database benchmarking tool widely used in the industry for both transactional and analytic scenarios. We chose the TPC-C benchmark, building a roughly 500 GB database of 5,000 warehouses and using more than 500 concurrent users to drive the database transactions.
For those not familiar with TPC-C, it simulates online transaction processing (OLTP) workloads. The HammerDB workload simulated roughly a 70:30 split of read and write transactions at an 8 KB block size.
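As a rough sanity check on this configuration, the figures above imply a per-warehouse size and a user-to-warehouse ratio. The per-warehouse result below is derived from the article's numbers, not stated in it:

```python
# Back-of-the-envelope sizing for the benchmark configuration described above.
# The warehouse count, database size, and user count come from the article;
# the per-warehouse figure is derived, not quoted.

warehouses = 5_000       # TPC-C warehouses built by HammerDB
db_size_gb = 500         # approximate database size, GB
virtual_users = 500      # concurrent HammerDB virtual users

mb_per_warehouse = db_size_gb * 1024 / warehouses
warehouses_per_user = warehouses / virtual_users

print(f"~{mb_per_warehouse:.0f} MB per warehouse")       # ~102 MB
print(f"{warehouses_per_user:.0f} warehouses per user")  # 10
```

A figure of roughly 100 MB per warehouse is in line with typical TPC-C schema sizing, which suggests the quoted 500 GB is the raw schema size rather than post-deduplication capacity.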
We also chose to perform the tests with a SQL Server VM configured with 12 vCPUs and 256 GB of RAM, since this configuration closely resembles the vast majority of SQL Server deployments.
The Datrium Config
While the minimum configuration is one compute node and one data node, the Datrium DVX solution scales up to 128 compute nodes and 10 data nodes in a single system. In this benchmark, we used a single compute node and a single data node.
If the choice for this benchmark had been to scale out with multiple SQL Server instances and multiple databases, the system-wide performance and capacity would be up to 200 GB/s of read bandwidth, 18 million IOPS, 10 GB/s of write throughput, and 1.7 petabytes (PB) of effective capacity.
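To put those system-wide maximums in per-node terms, we can divide by the maximum compute node count. Note that linear scaling across nodes is an assumption made here for illustration; the article only quotes system-wide figures:

```python
# Rough per-compute-node share of the quoted system-wide maximums.
# Assumes even, linear scaling across nodes (an illustrative assumption;
# the article only states system-wide numbers).

max_compute_nodes = 128
read_bw_gbps = 200        # GB/s, system-wide read bandwidth
read_iops = 18_000_000    # system-wide read IOPS
write_bw_gbps = 10        # GB/s, system-wide write throughput

print(f"{read_bw_gbps / max_compute_nodes:.2f} GB/s read per node")           # 1.56
print(f"{read_iops / max_compute_nodes:,.0f} read IOPS per node")             # 140,625
print(f"{write_bw_gbps / max_compute_nodes * 1000:.1f} MB/s write per node")  # 78.1
```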
Furthermore, with Datrium, data services are always on by default, including checksumming, deduplication, compression, and erasure coding. Therefore, during the benchmark, these features were turned on and working as usual.
However, before I go into the performance numbers, it is important to remember that the Datrium architecture enables performance isolation between compute nodes: it keeps reads local and, most importantly, eliminates writes across servers, avoiding possible noisy-neighbor issues and creating more predictable service levels, unlike HCI solutions.
The picture below shows the HammerDB SQL Server performance numbers achieved in Fast Mode. Fast Mode is the regular operation mode for a Datrium DVX compute node, in which a maximum of 20% of host CPU is allocated to storage IO and data services. Datrium also provides an Insane Mode that allows compute nodes to utilize up to 40% of host CPU to improve storage IO operations. However, this benchmark uses Fast Mode, not Insane Mode.
The SQL database achieved over 2 million transactions per minute with an average read latency of 0.8 ms and a local flash read hit rate of 100%. The average write latency was just 1.9 ms.
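It can help to translate the headline TPM figure into per-second and per-user rates. The TPM and user counts are from the article; the per-user breakdown assumes transactions are spread evenly across the virtual users:

```python
# Convert the headline 2M TPM into per-second and per-user rates.
# Assumes an even spread of transactions across the 500 virtual users
# (an illustrative assumption).

tpm = 2_000_000
users = 500

tps = tpm / 60
tps_per_user = tps / users
ms_per_txn_per_user = 1000 / tps_per_user

print(f"{tps:,.0f} transactions/second")                 # ~33,333
print(f"{tps_per_user:.1f} transactions/second per user")  # ~66.7
print(f"~{ms_per_txn_per_user:.0f} ms per transaction per user")  # ~15 ms
```

With roughly 15 ms of wall-clock budget per transaction per user, storage read (0.8 ms) and write (1.9 ms) latencies consume only a small fraction of each transaction, which is why keeping them low matters.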
When talking about database performance, TPM is important, but perhaps more important is keeping latencies low for both reads and writes.
The next test covers what we call the worst-case scenario, which we define as the period when space reclamation and snapshots are enabled and in use to protect VMs and applications. Generally, space reclamation processes kick in when the data cluster reaches 75% capacity utilization. This test is important because it lets us understand if and how performance is affected by common daily tasks. To that end, we created a protection group to protect the SQL VM, with multiple native snapshots being created during the performance benchmark.
The graph below shows real-time statistics captured while space reclamation and snapshots were in use to protect the SQL Server VM. In this test, the counter reached more than 1.8 million TPM, showing that SQL performance is not significantly affected by those operations. The average read latency was just 0.9 ms, and the local flash read hit rate was again 100%. The average write latency was just slightly above 2 ms.
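Comparing the two runs directly quantifies the impact of the worst-case scenario. The TPM and read-latency figures are from the article; the worst-case write latency is taken as approximately 2.0 ms, since the article only says "slightly above 2 ms":

```python
# Relative impact of the worst-case run versus the Fast Mode run.
# Figures are quoted in the article; worst-case write latency is
# approximated as 2.0 ms ("slightly above 2 ms").

fast = {"tpm": 2_000_000, "read_ms": 0.8, "write_ms": 1.9}
worst = {"tpm": 1_800_000, "read_ms": 0.9, "write_ms": 2.0}

tpm_drop_pct = (fast["tpm"] - worst["tpm"]) / fast["tpm"] * 100
read_increase_pct = (worst["read_ms"] - fast["read_ms"]) / fast["read_ms"] * 100

print(f"TPM drop under worst case: {tpm_drop_pct:.0f}%")   # 10%
print(f"Read latency increase: {read_increase_pct:.1f}%")  # 12.5%
```

A roughly 10% throughput drop with read latency still under 1 ms supports the article's point that snapshots and space reclamation have only a modest effect.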
The results demonstrate that Datrium can host Microsoft SQL Server databases with sub-millisecond latencies for read operations. A single Microsoft SQL Server VM with 12 vCPUs and 256 GB of RAM can achieve 2 million TPM before reaching 100% vCPU utilization.
It is important to remember that this benchmark was not designed to drive maximum IOPS or maximum HammerDB TPM consistently. The focus was to create a real-world performance benchmark, similar to what most organizations would run in-house. As you can see, the Datrium DVX platform provides a solution that is fast and efficient, but most importantly, predictable for running Microsoft SQL workloads.
Download the full white paper using the link HERE.
This article was first published by Andre Leibovici (@andreleibovici) at myvirtualcloud.net