Cloud Data Warehouse Performance Testing v1.0

Product Evaluation: Cloudera Data Warehouse, Amazon Redshift, Microsoft Azure Synapse, Google BigQuery, and Snowflake

Table of Contents

  1. Summary
  2. Cloud Analytics Platform Offerings
  3. Test Setup
  4. Test Results
  5. Price-Performance
  6. Conclusion
  7. Disclaimer
  8. About Cloudera
  9. About William McKnight
  10. About Jake Dolezal

1. Summary

Big data analytics platforms load, store, and analyze large volumes of data at high speed, providing timely insights to businesses. Data-driven organizations leverage this data for, among other things, advanced analysis to market new promotions, operational analytics to drive efficiency, and predictive analytics to evaluate credit risk and detect fraud. Customers are leveraging a mix of relational analytical databases and data warehouses to gain analytic insights.

This report focuses on relational analytical databases in the public cloud because deployments are at an all-time high and poised to expand dramatically. The cloud enables enterprises to differentiate and innovate with these database systems at a much more rapid pace than was ever possible before. The cloud is a disruptive technology, offering elastic scalability vis-à-vis on-premises deployments, enabling faster service deployment and application development, and allowing less costly storage. For these reasons and others, many companies have leveraged the cloud to maintain or gain momentum.

This report outlines the results of an analytic performance test derived from the industry-standard TPC Benchmark™ DS (TPC-DS). The test compares Cloudera Data Warehouse service (CDW), part of the broader Cloudera Data Platform (CDP), with four prominent competitors: Amazon Redshift, Azure Synapse Analytics, Google BigQuery, and Snowflake. Overall, the results show how query execution performance differs across these platforms.

In terms of price per performance, Cloudera ran the Field Test 20% more cost-effectively than the nearest competitor, Amazon Redshift, 40% more cost-effectively than Azure Synapse, and 80% more cost-effectively than Snowflake. Cloudera ran the Field Test 5.5 times more cost-effectively than Google BigQuery.
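This summary does not state the formula behind these figures. As a rough illustration only, the sketch below assumes that the cost of a run is the platform's hourly price multiplied by the elapsed time of the test, and that "N% more cost-effective" means the competitor's run cost is N% higher than the baseline's. The function names and all input numbers are hypothetical; they are not taken from the report's results.

```python
def run_cost(hourly_price: float, elapsed_hours: float) -> float:
    """Assumed cost of one test run: hourly platform price times elapsed hours."""
    return hourly_price * elapsed_hours


def cost_ratio(baseline_cost: float, competitor_cost: float) -> float:
    """How many times more the competitor's run costs than the baseline's."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return competitor_cost / baseline_cost


def percent_more_cost_effective(ratio: float) -> float:
    """Express a cost ratio as 'baseline is N% more cost-effective' (assumed reading)."""
    return (ratio - 1.0) * 100.0


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
baseline = run_cost(hourly_price=10.0, elapsed_hours=2.0)    # 20.0
competitor = run_cost(hourly_price=8.0, elapsed_hours=3.0)   # 24.0

ratio = cost_ratio(baseline, competitor)
print(f"cost ratio: {ratio:.2f}x")
print(f"baseline is about {percent_more_cost_effective(ratio):.0f}% more cost-effective")
```

Under this reading, a "times" figure and a "percent" figure describe the same ratio on different scales, so a statement such as "5.5 times more cost-effectively" would correspond to a cost ratio of 5.5 rather than to a percentage.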

Introduction

Performance is important, but it is only one criterion in selecting a data warehouse platform, and this report is only a point-in-time check of specific performance. Selection should also weigh administration, integration, workload management, user interface, scalability, vendor, reliability, and numerous other criteria. In our experience, performance also changes over time and differs competitively across workloads. A performance leader can reach the point of diminishing returns, and viable contenders can quickly close the gap.

GigaOm runs all of its performance tests to strict ethical standards. The results of the report are the objective results of the application of queries to the simulations described in the report. The report clearly defines the selected criteria and process used to establish the field test. The report also clearly states the data set sizes, the platforms, the queries used, and more. The reader can determine how to qualify the information for individual needs. The report does not make any claim regarding third-party certification and presents the objective results received from the application of the process to the criteria as described in the report. The report strictly measures performance and does not purport to evaluate other factors that potential customers may find relevant when making a purchase decision.

This is a sponsored report. Cloudera chose the competitors, the test, and the Cloudera cluster size. The default configurations were chosen. GigaOm set up the environments and ran the queries. Choosing compatible configurations is subject to judgment. We have attempted to describe our decisions in this paper.

This writeup includes all the information necessary to replicate this test. You are encouraged to compile your own representative queries, data sets, data sizes, and compatible configurations and test for yourself.
