Table of Contents
- Modernizing Your Use Case
- Performance Comparison
- Total Cost of Ownership
- Appendix: GigaOm Analytic Field Test
- About William McKnight
- About Jake Dolezal
- About GigaOm
Organizations today need a broad set of enterprise data cloud services with key data functionality to modernize applications and utilize machine learning. They need a platform designed to address multifaceted needs by offering multifunction data management and analytics to solve the enterprise’s most pressing data and analytic challenges in a streamlined fashion. They also need a platform that offers a worry-free experience with the architecture and its components.
The chosen platform should bring a multitude of data services onto a single, cohesive space. A key differentiator among platforms is the overarching management, deployment, governance, billing, and security of those services, which can reduce complexity in administration and scaling data pipelines. As more components are added, and more integration points among those components arise, complexity will increase substantially. Greater complexity will lead to more technical debt and administrative burden as organizations cobble together and maintain the flow of data between point solutions.
We analyzed four leading platforms for machine learning. We have learned that the cloud analytic framework selected for an enterprise, and for an individual enterprise project, matters in terms of cost.
By looking at the problem from a cost perspective, we’ve learned to be wary of architectures that decentralize and decouple every component by business domain. This approach enables flexibility in design, but it greatly expands the matrix of enterprise management needs.
Some architectures look integrated but, in reality, may be more complex and more expensive. When almost every additional demand of performance, scale, or analytics can only be met by adding new resources, it gets expensive.
Based on our approach described in the next section, and using the assumptions listed in each section to mimic a medium enterprise application, Azure was the lowest-cost platform. It had a three-year cost of $3M to purchase the analytics stack for a “medium-size” organization. AWS was 19% higher, while Google and Snowflake were more than double the cost.
Highlights of the Azure stack include Azure Synapse Dedicated, Azure Synapse SQL Pool, Azure Data Factory, Azure Stream Analytics, Azure Synapse Spark, Azure Synapse Serverless, Power BI Professional, Azure Machine Learning, Azure Active Directory P1, and Azure Purview.
The AWS stack includes Amazon Redshift ra3, Amazon Redshift Managed Storage, AWS Glue, Amazon Kinesis, Amazon EMR + Kinesis, Amazon Redshift Spectrum, Amazon Quicksight, Amazon SageMaker, Amazon IAM, and AWS Glue Data Catalog.
The Google stack is Google BigQuery Annual Slots, Google BigQuery Active Storage, Google Dataflow Batch, Google Dataflow Streaming, Google Dataproc, Google BigQuery On-Demand, Google BigQuery BI Engine, Google BigQuery ML, Google Cloud IAM, and Google Data Catalog.
We labeled the fourth stack Snowflake, since that is the featured vendor for dedicated compute, storage, and data lake, but it is really a multi-vendor heterogeneous stack. This includes Snowflake database, AWS Glue, Kafka Confluent Cloud, Amazon EMR + Kinesis, Tableau, Amazon SageMaker, Amazon IAM, and AWS Glue Data Catalog.
Azure was also the lowest-cost platform for large enterprises, with an annual purchase cost of $9M. AWS was 32% higher, while Google and the Snowflake stack were more than two times higher.
Dedicated compute is the largest cost component of each configuration, ranging from 54% of the total for the AWS stack to 78% for the Google stack. Data integration is the second-largest cost in all stacks.
A three-year total cost of ownership (TCO) analysis for medium enterprises, which includes labor costs, reveals that Azure is the platform with the lowest cost of ownership at $3 million. AWS is at $4 million, Google at $7.6 million, and Snowflake at $8 million. For large enterprises, the three-year TCO is $8.5 million for Azure, $12.3 million for AWS, $19.2 million for Google, and $22 million for Snowflake (Figure 1).
Figure 1. Three-Year Total Cost of Ownership for Each Platform
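To make the relative gaps between platforms easier to read, the three-year TCO figures quoted above can be expressed as a percentage premium over the lowest-cost platform. The following is a minimal sketch of that arithmetic: the dollar amounts come from the text, while the names `TCO_MILLIONS` and `premium_over_baseline` are illustrative and not part of the report's methodology.

```python
# Three-year TCO in millions of USD, as quoted in the text above.
# The dictionary layout and names are illustrative assumptions.
TCO_MILLIONS = {
    "medium": {"Azure": 3.0, "AWS": 4.0, "Google": 7.6, "Snowflake": 8.0},
    "large": {"Azure": 8.5, "AWS": 12.3, "Google": 19.2, "Snowflake": 22.0},
}


def premium_over_baseline(costs, baseline="Azure"):
    """Return each platform's percentage premium over the baseline platform."""
    base = costs[baseline]
    return {
        name: round((cost / base - 1.0) * 100.0, 1)
        for name, cost in costs.items()
    }


if __name__ == "__main__":
    for size, costs in TCO_MILLIONS.items():
        print(size, premium_over_baseline(costs))
```

Because the inputs are the rounded TCO figures, which include labor, the resulting percentages are approximate and will not match the purchase-only percentages quoted earlier in this summary.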