Table of Contents
The fundamental underpinnings of most organizations are their transactions. These must be done well, with integrity, and with good performance. Not only has transaction volume soared of late, but the level of granularity in the transaction details has also reached new heights. Fast transactions greatly improve the efficiency of a high-volume business. Performance is incredibly important.
Recent trends in information management indicate organizations are shifting their focus to cloud-based solutions. In the past, the only clear choice for most organizations was an on-premises database using on-premises hardware. However, costs of scale are chipping away the notion that this is still the best approach for all of a company’s transactional needs. The factors driving operational and analytical data projects to the cloud are many, and the advantages–like data protection, high availability, and scale–are realized with infrastructure as a service (IaaS) deployment. In many cases, a hybrid approach serves as an interim step for organizations migrating to a modern, capable cloud architecture.
This report outlines the results from a GigaOm Transactional Field Test, derived from the industry-standard TPC Benchmark™ E (TPC-E), to compare two IaaS cloud database offerings:
- Microsoft SQL Server on Amazon Web Services (AWS) Elastic Cloud Compute (EC2) instances
- Microsoft SQL Server on Microsoft Azure Virtual Machines (VM)
Both are installations of Microsoft SQL Server, and we tested using Red Hat Enterprise Linux (RHEL).
The results of the GigaOm Transactional Field Test are valuable to all operational functions of an organization, such as human resource management, production planning, material management, financial supply-chain management, sales and distribution, financial accounting and controlling, plant maintenance, and quality management. The data underlying many of these departments today is in SQL Server, which is also frequently the source for operational interactive business intelligence (BI).
With the Azure local cache, Microsoft SQL Server on Microsoft Azure Virtual Machines showed 2.9x better performance over AWS when tested on RedHat Enterprise Linux 8.2. Moreover, SQL Server on Microsoft Azure Virtual Machines had up to 64% better price-performance when comparing both on-demand and pay-as-you-go rates.
Testing hardware and software across cloud vendors is very challenging. Certain configurations favor one cloud vendor over another in feature availability, virtual machine processor generations, memory amounts, storage configurations for optimal input/output, network latencies, software and operating system versions, and the benchmarking workload itself. Our testing demonstrates a narrow slice of potential configurations and workloads.
As the sponsor of the report, Microsoft selected the particular Azure configuration it wanted to test. GigaOm chose the AWS instance configuration closest to it in terms of CPU, memory, and disk configuration. There were tradeoffs, which resulted in an input-output operations per second (IOPS) disadvantage for AWS (which we discuss in the report).
We leave the issue of fairness for the reader to determine. We strongly encourage you, the reader, to look past marketing messages and discern for yourself what is of value. We hope this report is informative and helpful in revealing some of the challenges and nuances of platform selection.
In the same spirit as the TPC, price-performance is intended to be a normalizer of performance results across different configurations. Of course, this has its shortcomings, but at least you can determine that “what you pay for and configure is what you get.”
The parameters to replicate this test are provided. We used the BenchCraft workload driver, which was audited by a TPC-approved auditor who reviewed all updates to BenchCraft. All the information required to reproduce the results are documented in the TPC-E specification. BenchCraft implements the requirements documented in Clauses 3, 4, 5, and 6 of the benchmark specification. There is nothing in BenchCraft that alters the performance of TPC-E or this TPC-E derived workload.
The scale factor in TPC-E is defined as the number of required customer rows per single transaction per second. We did change the number of initial trading days (ITD). The default value is 300, which is the number of 8-hour business days to populate the initial database. For these tests, we used an ITD of 30 days rather than 300. This reduces the size of the initial database population in the larger tables. The overall workload behaves identically with an ITD of 300 or 30 as far as the transaction profiles are concerned. Since the ITD was reduced to 30, any results obtained would not be compliant with the TPC-E specification and, therefore, not comparable to published results. This is the basis for the standard disclaimer that this is a workload derived from TPC-E.
However, BenchCraft is just one way to run TPC-E. All the information necessary to recreate the benchmark is available at TPC.org (this test used the latest version 1.14.0). Just change the ITD, as noted above.
We have provided enough information in the report for anyone to reproduce this test. You are encouraged to compile your own representative queries, data sets, and data sizes, and to test compatible configurations applicable to your requirements.
2. Cloud IaaS SQL Server Offerings
Relational databases are the foundation of an organization’s data ecosystem. While alternative SQL platforms are growing along with the data deluge and have their place, today, workload platform decision-makers usually choose the relational database, for good reason. Since 1989, Microsoft SQL Server has proliferated to near-ubiquity as the relational server of choice for the original database use case—online transaction processing (OLTP)—and beyond. Now SQL Server is available on fully functional infrastructure offered as a service (IaaS), taking complete advantage of the cloud. These IaaS cloud offerings provide predictable costs, cost savings, fast response times, and strong non-functionals.
As our testing confirms, the major difference between SQL Server on Azure and SQL Server on AWS is the storage I/O performance.
Microsoft SQL Server on Azure Virtual Machines Storage Options
For operationally intensive, business-critical workloads, Azure recommends either Premium Managed Disk or Ultra Disk. While Ultra Disk is Azure’s high-end disk, we chose to test Premium Managed Disks, which are high-performance SSDs designed to support I/O-intensive workloads and provide high throughput and low latency, but with a balanced cost compared to Ultra Disk. Premium SSD Managed Disks are provisioned as a persistent disk with configurable size and performance characteristics. They can also be detached and reattached to different virtual machines.
The cost of Premium SSD Managed Disks depends on the number and size of the disks selected, and on the number of outbound data transfers. The various disk sizes provide different IOPS, throughput (MB/second), and monthly price per GiB. Attaching several persistent disks to a VM can support petabytes of storage per VM. Premium four-disk configurations can achieve up to 80,000 IOPS and 1,600 MB per second disk throughput per VM—which translates to less than one millisecond latency for read operations with applications that can take advantage of read caching. Premium SSD Managed Disks are supported by Ds-series, Fs-series, and GS-series VMs. The largest single disk is the P80 with 32TB of storage, IOPS up to 20,000, and 900MB per second of throughput.
For additional performance, Azure offers local cache options of Read/Write and Read-only. The local cache is a specialized component that stores data, typically in memory, so that it can be accessed more quickly. The read cache attempts to reduce read latency, while with the write cache, data to be written to permanent storage is queued in the cache. This feature is not available on all disk types nor is it available for temporary storage disks. Also, not all applications can leverage the cache. According to Microsoft:
Caching uses specialized, and sometimes expensive, temporary storage that has faster read and write performance than permanent storage. Because cache storage is often limited, decisions need to be made as to what data operations will benefit most from caching. But even where the cache can be made widely available, such as in Azure, it’s still important to know the workload patterns of each disk before deciding which caching type to use.
Use of the write cache could cause data loss. Specifically, with regard to write caching, Microsoft cautions:
If you are using Read/Write caching, you must have a proper way to write the data from cache to persistent disks. For example, SQL Server handles writing cached data to the persistent storage disks on its own. Using Read/Write cache with an application that does not handle persisting the required data can lead to data loss, if the VM crashes.
Additionally, for the SQL Server temporary database (tempdb), we used a high-performance NVMe solid state drive. NVMe SSDs deliver high random I/O performance with very low latency.
Storage Options for Microsoft SQL Server on Amazon Web Services Elastic Cloud Compute Instances
Amazon Web Services offers Elastic Block Store (EBS) as an easy-to-use, high-performance block storage service designed for use with Amazon EC2. EBS supports a broad range of workloads, such as relational and non-relational databases, enterprise applications, containerized applications, big data analytics engines, and file systems. With EBS, AWS customers can choose from four different volume types to balance optimal price and performance. You can achieve single-digit-millisecond latency for high performance database workloads.
Amazon EBS has several different types of solid state drives: General Purpose SSDs (gp2 and gp3) and Provisioned IOPS SSDs (io1 and io2). Provisioned IOPS disks are more akin to Azure Ultra Disk, so we chose the latest General Purpose SSD (gp3) volumes to balance price and performance for these workloads. AWS recommends this drive type for most workloads. To create a comparable configuration, we provisioned the gp3 disks with 1,024GB, 5,000 IOPS, and 200MB/s throughput to match the Azure P30 disks we used.
For the test, we chose one of AWS’s Nitro-based instances to give AWS the best possible performance using gp3 volumes. As with Azure, for the SQL Server temporary database (tempdb), we used a high-performance NVMe solid state drive.
One of our main objectives in this benchmark was to test an I/O-intensive workload on Amazon’s and Azure’s speed-cost balanced SSD volume types head-to-head—to understand both the performance and price-per-performance differences of the two leading cloud vendors’ SQL Server offerings. AWS does not offer the local cache feature available in Azure, and we wanted to see the difference in performance when the read-only cache was enabled on Azure data disks compared to AWS without the benefit of the local cache.
3. Field Test Setup
GigaOm Transactional Field Test
The GigaOm Transactional Field Test is a workload derived from the well-recognized industry-standard TPC Benchmark™ E (TPC-E). The workload (the transaction mix) was modified from the standard TPC-E benchmark for ease of benchmarking and thus the results generated are not comparable to official TPC results. From tpc.org:
TPC Benchmark™ E (TPC-E) is an OLTP workload. It is a mixture of read-only and update-intensive transactions that simulate the activities found in complex OLTP application environments. The database schema, data population, transactions, and implementation rules have been designed to be broadly representative of modern OLTP systems. The benchmark exercises a breadth of system components associated with such environments.
The TPC-E benchmark simulates the transactional workload of a brokerage firm with a central database that executes transactions related to the firm’s customer accounts. The data model consists of 33 tables, 27 of which have 50 foreign key constraints. The results of TPC-E are valuable to all operational functions of an organization, many driven by SQL Server and frequently the source for operational interactive business intelligence.
Field Test Data
The data sets used in the benchmark were generated based on the information provided in the TPC Benchmark™ E (TPC-E) specification. For this testing, we used a scaled-down database configured for 1,000,000 customers. This scaling determined the initial data volume of the database. For example, that total of 1,000,000 was multiplied by 17,280 to determine the number of rows in the TRADE table: 17,280,000,000. For ease of benchmarking, we did scale down the number of initial trading days from 300 to 30. This reduced the initial population of the following tables:
- Cash Transaction
- Holding History
- Holding Summary
- Trade History
All of the other tables were scaled according to the TPC-E specification and rules. On our Azure virtual machines and AWS EC2 instances, we allocated 4TB of storage—which was more than enough for this workload. Besides the scale factor of 1,000,000 customers, the test offers a few other “knobs” we turned in order to determine the database engine’s maximum throughput capability for both AWS and Azure, as shown in Table 1.
Table 1. Configuration Changes to Maximize Throughput for Both Microsoft Azure and AWS
We completed three runs per test on each platform, with each test run lasting at least two hours. We then took the average transactions per second for the last 30 minutes of the test runs. Between each run, we restored a full backup to reset the database back to its original state. The results are shared in the Field Test Results section.
Selecting and sizing the compute and storage components for comparison can be challenging, particularly across two different cloud vendors’ offerings. Each offers various choices for transaction-heavy workloads. As you will see in Table 2, at the time of testing and publication, there was no exact match across the offerings, either in processors or memory.
We considered the variety of offerings on AWS and selected the memory-optimized R5 family as we have used it in other testing and believe it to be a solid performer. R5 is also described as very similar to the Azure offering, “optimized for memory-intensive and latency-sensitive database workloads, including data analytics, in-memory databases, and high-performance production workloads.”
On the Azure side, we expect mission-critical-minded customers to gravitate toward the Ev4 family, which is described as “ideal for memory-intensive enterprise applications.” We opted for the ESv4 series, which offers premium storage disks, as opposed to the standard disk offered with the Ev4 series. Thus, we decided on R5d for AWS RDS and E64ds_v4 for Azure SQL VM. Our approach was to find the “nearest neighbor” best fit. The challenge was selecting a balance of both CPU and memory. R5d.16xlarge on AWS has 64 vCPUs and 512GiB memory. Azure offers a 64-core instance in E64ds_v4, but it only has 504GiB of memory, which is slightly less than the r5d.16xlarge. This was our best, most diligent effort at selecting compute hardware compatibility for our testing.
In terms of storage, our objective was to test a data disk with at least 4TB of storage and the same number of maximum IOPS for both Azure Premium Disks and AWS General Purpose (gp3). The read-only cache on Azure was not available for a volume size as large as 4TB, so instead we deployed four 1TB disks, each with the read-cache enabled, and combined them using RAID0 disk striping on RHEL. Striping is not required to enable caching for a disk on Azure—caching is available for a single disk. Striping the disks was recommended to us by Azure because of the design of the platform. Azure Premium Storage is offered in pre-fixed sizes and IOPS caps. Thus, striping of equal-size disks is the publicly documented and recommended best practice to achieve a desired size/IOPS configuration. Each Azure 1TB disk has 5,000 maximum IOPS, yielding a total of 20,000 IOPS when they were striped together.
On AWS, we mimicked the same storage configuration of four gp3 volumes striped with RAID0. This array had a capacity of 4TB and a maximum potential IOPS of 20,000—the same as Azure.
The possibility of SQL Server data loss (a potential downside to RAID0 disk striping) should be countered by having an asynchronous disaster recovery replica or by restoring from a backup.
A configuration difference to note that may have impacted our results is that for both the Azure virtual machine and AWS EC2 instance we used the locally attached temporary storage for the SQL Server tempdb database. The SQL Server tempdb stores internal objects created by the database engine, such as work tables for sorts, spools, hash joins, hash aggregates, and intermediate results. Having tempdb on local temporary storage usually means better I/O performance.
Both the Azure and AWS configurations had 20,000 IOPS available through the storage configurations. We don’t know for sure how much the Read Cache helps Azure. We configured both AWS and Azure to have the same 4TB of storage and the same number of IOPS.
Results may vary across different configurations, and again, you are encouraged to compile your own representative queries, data sets, data sizes, and test compatible configurations applicable to your requirements. All told, our testing included two different database environments.
Table 2. Configurations Used for Tests
For these tests, we assessed the platforms with the tuned SQL Server configurations shown in Table 3 and Table 4, using identical settings for both Azure and AWS:
Table 3. OS Kernel Settings
Table 4. SQL Server Settings
4. Field Test Results
This section analyzes the transactions per second (tps) from the fastest of the five runs of each of the three GigaOm Transactional Field Tests described in Figure 1. A higher tps is better—meaning more transactions are being processed every second.
Figure 1. Azure Premium SSD Managed Disks vs. AWS EBS General Purpose SSD (gp3) in Transactions per Second
For Azure SQL Server 2019 Enterprise on RHEL 8.2, the best tps was slightly less than triple the best tps of AWS SQL Server 2019 Enterprise on RHEL 8.2.
We sought to understand the gap in performance—given the configurations we chose. In reviewing the performance of AWS, we uncovered some interesting findings that may begin to explain the differences we saw. We even sought advice and confirmation from AWS, and they did not recommend any different configuration or provide any conflicting results in the telemetry they observed from their side.
In terms of the BenchCraft tests, the tool does provide output of all transactions that occurred and their timeline. Thus, we looked at Linux SAR CPU and IOSTAT output. The charts in Figure 2 detail the IOPS and CPU utilization we measured during AWS’s best run.
Figure 2. AWS Data Disk IOPS and CPU Utilization
As you can see, the CPU on the AWS EC2 instance spent an average of 38% of its time waiting for I/O bandwidth to become available.
Also, from the Benchcraft output, we were given the response times for the different transaction types, shown in Table 5. The performance metric, transactions per second, is taken from Trade Result transactions.
Table 5. Azure vs. AWS GigaOm Transactional Field Test Transaction Details
On the AWS side, the average response times are high for other important transactions, like Market Feed and Trade Order. Trade Order averages almost 1 second with a standard deviation 0.681. This behavior indicates a backing up—ultimately slowing down the overall Trade Result metric.
To be diligent in our analysis, we performed some other testing to ensure it was not the BenchCraft driver itself that was the bottleneck, even though BenchCraft does not throttle transactions (when No Pacing is set as it was for these tests). We used the Linux utility FIO (Flexible I/O Tester).
Our findings do point to performance below our expectations and may explain why AWS did not perform as well on the GigaOm Transactional Field Test test.
On a 100% random read of block device performance with 8 threads on our Data Disk RAID0 array, we observed:
- Reads: 20.3K IOPS
- Throughput: 159 MB/sec
While the IOPS achieved the 20,000 stated in the AWS specification, the throughput was lower than we expected for software-raided NVMe devices.
On a random 60% read/40% write performance with 8 threads on our Data Disk RAID0 array, we observed:
- Reads: 12.9K IOPS
- Read Throughput: 95.2 MB/sec
- Writes: 8,116 IOPS
- Write Throughput: 63.4 MB/sec
Throughput and latency for reads and writes seems low.
While we cannot know for certain what ultimately caused the performance gap between the two platforms, these statistics do offer some light.
5. Price Per Performance
The price-performance metric is price/throughput (tps). This is defined as the cost of running each of the cloud platforms continuously for three years divided by the transactions per second throughput revealed in our tests 1-3. The calculation is as follows:
Price Per Performance = $/transactions per second (tps) =
[(Compute with on-demand SQL Server license Hourly Rate × 24 hours/day × 365 days/year × 3 years)
+ (Data disk(s) monthly cost per TB × 4 TB × 12 months × 3 years)
+ (Log disk monthly cost per TB × 1 TB × 12 months × 3 years)] ÷ tps
When evaluating price-per-performance, the lower the number, the better. This means you get more compute power, storage I/O, and capacity for your budget.
As shown in Figure 3, we performed this calculation across two different pricing structures:
- Azure pay-as-you-go versus AWS on-demand
- Azure 1-year reserved versus AWS standard 3-year term reserved
The prices in Table 6 were in effect at the time of testing and reflect the US West 2 region on AWS and West US region on Azure. The compute prices include both the actual AWS EC2/Azure VM hardware itself and the license costs of the operating system and Microsoft SQL Server Enterprise Edition. We also included Azure Hybrid Benefit versus AWS License Mobility rates for existing SQL Server license holders.
Figure 3. Price-performance, Azure vs. AWS on RHEL 8.2, Pay-As-You-Go and a 3-Year Commitment Pricing without SQL Server License Mobility (lower means less cost to complete same workload)
Table 6. Azure and AWS Pricing
Note: Prices do not include support costs for either Azure or AWS.
Each platform has different pricing options. Buyers should evaluate all of their pricing choices, not just the ones presented in this paper.
AWS price-performance is much less than the price-performance of Azure SQL Server on RHEL without license mobility, as shown in Figure 4.
Figure 4. Price-performance, Azure vs. AWS on RHEL 8.2, Pay-As-You-Go and 3-Year Commitment Pricing, with SQL Server License Mobility (lower means less cost to complete same workload)
AWS price-performance is more than double the price-performance of Azure SQL Server on RHEL with license mobility.
This report outlines the results from a GigaOm Transactional Field Test to compare equivalent SQL Server IaaS offerings of two cloud vendors: Microsoft SQL Server on Amazon Web Services Elastic Cloud Compute instances and Microsoft SQL Server Microsoft on Azure Virtual Machines.
We learned that the database, along with the cloud and storage, affects latency, which is the killer for important transactional applications. As we saw, Microsoft Azure presents a powerful cloud infrastructure offering for the modern transactional workload.
Microsoft SQL Server on Microsoft Azure Virtual Machines showed 2.9x better performance on RHEL 8.2 than Microsoft SQL Server on Amazon Web Services Elastic Cloud Compute in our configurations, which include similar maximum IOPS. This performance difference is seemingly due to enabling the local cache on the Azure disks and the additional IOPS on the Azure disks compared to the AWS gp3 volumes of the same size. Microsoft SQL Server on Azure Virtual Machines had up to 64% better price-performance when comparing on-demand to pay-as-you-go rates.
Keep in mind that tests are configured to get the best from each platform according to publicly documented best practices. Optimizations on both platforms would be possible as their offerings evolve or internal tests point to different configurations.
Performance is important but it is only one criterion for a business-critical database platform selection. This test is a point-in-time check into specific performance. There are numerous other factors to consider in selection across factors of administration, integration, workload management, user interface, scalability, vendor, reliability, and numerous other criteria. It is also our experience that performance changes over time and is competitively different for different workloads. Also, a performance leader can hit up against the point of diminishing returns and viable contenders can quickly close the gap. The benchmark setup was informed by the TPC Benchmark™ E (TPC-E) specification. The GigaOm Transactional Field Test workload was derived from TPC-E and is not an official TPC benchmark nor may the results be compared to official TPC-E publications.
GigaOm runs all of its performance tests to strict ethical standards. The results of the report are the objective results of the application of queries to the simulations described in the report. The report clearly defines the selected criteria and process used to establish the field test. The report also clearly states the data set sizes, the platforms, the queries, etc. used. The reader is left to determine for themselves how to qualify the information for their individual needs. The report does not make any claim regarding the third-party certification and presents the objective results received from the application of the process to the criteria as described in the report. The report strictly measures performance and does not purport to evaluate other factors that potential customers may find relevant when making a purchase decision.
This is a sponsored report. Microsoft chose the competitors, the test, and the Microsoft configuration. GigaOm chose the most compatible configurations for the other tested platform and ran the testing workloads. Choosing compatible configurations is subject to judgment. We have attempted to describe our decisions in this paper.
8. About Microsoft
Microsoft (Nasdaq “MSFT” @microsoft) enables digital transformation for the era of an intelligent cloud and an intelligent edge. Its mission is to empower every person and every organization on the planet to achieve more.
Microsoft offers SQL Server on Azure. To learn more about Azure SQL Database visit https://azure.microsoft.com/en-us/services/sql-database/.
9. About William McKnightWilliam McKnight
William McKnight is a former Fortune 50 technology executive and database engineer. An Ernst & Young Entrepreneur of the Year finalist and frequent best practices judge, he helps enterprise clients with action plans, architectures, strategies, and technology tools to manage information.
Currently, William is an analyst for GigaOm Research who takes corporate information and turns it into a bottom-line-enhancing asset. He has worked with Dong Energy, France Telecom, Pfizer, Samba Bank, ScotiaBank, Teva Pharmaceuticals, and Verizon, among many others. William focuses on delivering business value and solving business problems utilizing proven approaches in information management.
10. About Jake DolezalJake Dolezal
Jake Dolezal is a contributing analyst at GigaOm. He has two decades of experience in the information management field, with expertise in analytics, data warehousing, master data management, data governance, business intelligence, statistics, data modeling and integration, and visualization. Jake has solved technical problems across a broad range of industries, including healthcare, education, government, manufacturing, engineering, hospitality, and restaurants. He has a doctorate in information management from Syracuse University.
11. About GigaOm
GigaOm provides technical, operational, and business advice for IT’s strategic digital enterprise and business initiatives. Enterprise business leaders, CIOs, and technology organizations partner with GigaOm for practical, actionable, strategic, and visionary advice for modernizing and transforming their business. GigaOm’s advice empowers enterprises to successfully compete in an increasingly complicated business atmosphere that requires a solid understanding of constantly changing customer demands.
GigaOm works directly with enterprises both inside and outside of the IT organization to apply proven research and methodologies designed to avoid pitfalls and roadblocks while balancing risk and innovation. Research methodologies include but are not limited to adoption and benchmarking surveys, use cases, interviews, ROI/TCO, market landscapes, strategic trends, and technical benchmarks. Our analysts possess 20+ years of experience advising a spectrum of clients from early adopters to mainstream enterprises.
GigaOm’s perspective is that of the unbiased enterprise practitioner. Through this perspective, GigaOm connects with engaged and loyal subscribers on a deep and meaningful level.