This GigaOm Research Reprint Expires: Feb 26, 2023

Key Criteria for Evaluating Cloud Management Platforms v1.0

An Evaluation Guide for Technology Decision Makers

1. Summary

Today’s IT infrastructures are growing ever more complex as demand for digital transformation—and response to evolving business needs—increases. Multi-cloud and hybrid cloud infrastructures are becoming the norm, but that hasn’t meant that data centers and legacy applications have disappeared.

Enterprises today must deal with a wide variety of far-flung users and disparate systems, and must be able to provision and support both safely and efficiently. However, few enterprise IT organizations know how to manage multi-cloud and hybrid cloud environments, let alone with the agility today’s businesses demand. Cloud management platforms (CMPs) enable customers to manage their complex environments more efficiently and at lower cost.

Cloud management has three aspects, which can be delivered as separate tools or integrated to some degree: automation, operational optimization, and financial optimization and reporting. These three areas are known as Cloud Management Platform, Cloud Resource Optimization, and Financial Operations (FinOps), respectively. The following graphic shows how these areas relate. This report covers Cloud Management Platforms, and GigaOm has other reports on the other two topics.

Figure 1: Three Aspects of Total Cloud Management

This GigaOm Key Criteria report details the key issues and trends to consider around the use of CMPs. We identify key criteria and evaluation metrics for selecting a management tool platform, and we identify vendors and products that excel. This report gives an overview of the key enabling technology available today and helps decision-makers evaluate existing platforms and decide where to invest.

How to Read this Report

This GigaOm report is one of a series of documents that helps IT organizations assess competing solutions in the context of well-defined features and criteria. For a fuller understanding, consider reviewing the following reports:

Key Criteria report: A detailed market sector analysis that assesses the impact that key product features and criteria have on top-line solution characteristics—such as scalability, performance, and TCO—that drive purchase decisions.

GigaOm Radar report: A forward-looking analysis that plots the relative value and progression of vendor solutions along multiple axes based on strategy and execution. The Radar report includes a breakdown of each vendor’s offering in the sector.

Solution Profile: An in-depth vendor analysis that builds on the framework developed in the Key Criteria and Radar reports to assess a company’s engagement within a technology sector. This analysis includes forward-looking guidance around both strategy and product.

2. Cloud Management Platforms Primer

As organizations increasingly move their applications to the cloud, the need to manage them over time becomes a critical success factor for IT organizations.

In traditional data center deployments, processes like asset tracking and dependency mapping were seen as critical issues in managing the lifecycle of IT solutions. These processes were run in data centers with redundancy to protect against outages and ensure high levels of uptime, and used the asset tag of a physical server to track where an application was running.

However, the move to the cloud requires a new paradigm in which “design for failure” is an application requirement and not an infrastructure mandate. Today’s hardware is ephemeral. The move to virtual machines (VMs) on-premises and now to clouds requires that the focus be on the business problem or solution itself and that other attributes be treated as short-lived. A CMP’s ability to track such processes at a high level is critical to the multi-year management needs of IT organizations as they move to hybrid and multiple clouds.

Dependency mapping is also critical. Many IT departments had to deal with tertiary servers, which were treated like utility servers, where ad hoc processes were run to support business applications, but which were not paid for or maintained by the developers of the business solutions. It was easy to add processes to the utility server, but over time this increased the criticality of a system that was essentially invisible to management. Eventually, these servers either failed (often with a large, negative business impact) or were orphaned and maintained in the data center for years after they were no longer needed. In a few worst-case scenarios, the utility server was running without proper security or even outside of the secure data centers.

In traditional on-premises data centers, the number of utility servers was typically one for every 100-500 servers. When cloud migration began early in the century, this exploded to one for every 10 virtual machines. Complicating management, the number of CPUs and the amount of RAM, storage, and network connections physical servers used was static and sized for peak expected workloads two to five years in advance. This resulted in over-sizing, early wasted capacity, and strained end-of-life performance.

Meanwhile, in the cloud, none of these values are fixed. This is part of the promise of the cloud: to pay only for what you use. To live up to this expectation, management systems must be able to track utilization, performance, and cost and relate them to the solution the business is paying for.

In the cloud, however, VMs are ephemeral and should be treated like a system that will be replaced during the next solution change—which means memory, CPU (count, type, speed, generation), storage, and network properties have to be tracked. Furthermore, there are scale-out requirements or “open for business” timers for systems that are only needed for part of a day or week to support a business need. All of these attributes need to be tracked and priced so they can be managed.
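The “open for business” timers mentioned above can be illustrated with a small schedule check. This is a minimal sketch, not a description of how any CMP implements scheduling; the `BusinessHours` class and the office-hours values are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import FrozenSet


@dataclass(frozen=True)
class BusinessHours:
    """Hypothetical schedule: the days and hours a system must be available."""
    weekdays: FrozenSet[int]  # Monday is 0, matching datetime.weekday()
    opens: time
    closes: time

    def should_run(self, now: datetime) -> bool:
        # The system is needed only on listed weekdays, from opens up to (not including) closes.
        return now.weekday() in self.weekdays and self.opens <= now.time() < self.closes


# Assumed example: a system needed Monday to Friday, 08:00 to 18:00.
OFFICE_HOURS = BusinessHours(frozenset({0, 1, 2, 3, 4}), time(8, 0), time(18, 0))
```

A scheduler could call `should_run` periodically and start or stop the underlying instances accordingly, so that cost is tracked only for the hours the business needs.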

This is where the CMP’s ability to get real-time or near real-time situational awareness of the health and performance of a business solution is critical. This helps businesses to avoid overspending or unplanned outages due to lack of capacity management.

CMPs should be able to automatically remediate common incidents, for example by restarting a hung virtual machine or container. Because every CMP has limits to what it can automatically resolve, it also needs to make log files available so engineers and developers (DevOps) can do a complex review of an incident and determine how to restore operational functionality.

To resolve incidents, the CMP must also be able to deploy a “Product”. The more powerful a CMP, the greater its ability to deploy a Product into an environment with no external tool having to provide oversight and orchestration. The best CMPs are the masters of orchestrators, and they coordinate with other tools to remove 100% of human tasks for a common Product deployment.

To provide segregation of duties, CMPs should be able to work with third-party security information and event management (SIEM) tools that are run by Security and not IT or DevOps. This means security-relevant metrics, events, and log files need to be easily configured for consumption by popular SIEM tools.

CMP tools need to support common ITIL processes. These include IT Service Management (ITSM); Configuration Management Database (CMDB) or IT Knowledge Management (ITKM), which shows the current and historical configuration changes of a Product; and IT Portfolio Management (ITPM), which includes asset management.

Examples of common vendors covering these areas include:

  • ITSM: ServiceNow, Cherwell, BMC, Broadcom (formerly CA), and IBM
  • ITKM: ServiceNow, Micro Focus (formerly owned by HPE), BMC, and Ivanti
  • ITPM: PlanView, BMC, and ServiceNow

Market Categories and Deployment Models

CMPs may be targeted at different sizes of companies and offer one or more deployment models. This report considers four market segments: Small-to-medium businesses (SMBs), mid-market, large enterprises, and managed service providers (MSPs). We also consider three deployment models for hosting the CMP engine:

  • Software as a Service (SaaS): The CMP engine runs as a service, using a cloud-based solution to manage other cloud-based deployments. This may include both hybrid and multi-cloud, which may or may not include private cloud management.
  • Customer Managed: The CMP engine runs on systems traditionally found on-premises but can also run on dedicated cloud environments the customer maintains. These are often traditional enterprise automation tools, now recast as a CMP tool. The software may be able to run in VMs or containers that a customer hosts within their environment on a public cloud. In this category, the customer runs the software where they want and owns all operational responsibilities. In this model, the software is either not available as a SaaS offering or the SaaS offering differs from the customer-managed version.
  • Holistic: In this approach, customers can choose the location of the CMP engine without impacting system functionality. Orchestration is from either the cloud or customer managed resources.

3. Report Methodology

A GigaOm Key Criteria report analyzes the most important features of a technology category to help IT professionals understand how solutions may impact an enterprise and its IT organization. These features are grouped into three categories:

  • Table Stakes: Assumed Value
  • Key Criteria: Differentiating Value
  • Emerging Technologies: Future Value

Table stakes represent features and capabilities that are widely adopted and well-implemented in a technology sector. As these implementations are mature, they are not expected to significantly impact the value of solutions relative to each other and will generally have minimal impact on the total cost of ownership (TCO) and return on investment (ROI).

Key criteria are the core differentiating features in a technology sector and play an important role in determining potential value to the organization. Implementation details of key criteria are essential to understanding the impact that a product or service may have on an organization’s infrastructure, processes, and business. Over time, the differentiation provided by a feature becomes less relevant and it falls into the table stakes group.

Emerging technologies describe the most compelling and potentially impactful technologies emerging in a product or service sector over the next 12 to 18 months. These emergent features may already be present in niche products or designed to address very specific use cases; however, at the time of the report they are not mature enough to be regarded as key criteria. Emerging technologies should be considered mostly for their potential downfield impact.

Over time, advances in technology and tooling enable emerging technologies to evolve into key criteria, and key criteria to become table stakes, as shown in Figure 2. This Key Criteria report reflects the dynamic embedded in this evolution, helping IT decision makers track and assess emerging technologies that may significantly impact the organization.

Figure 2. Evolution of Features

Understanding Evaluation Metrics

Table stakes, key criteria, and emerging technologies represent specific features and capabilities of solutions in a sector. Evaluation metrics, by contrast, describe broad, top-line characteristics—things like scalability, interoperability, or cost-effectiveness. They are strategic considerations, whereas key criteria are tactical ones.

By evaluating how key criteria and other features impact these strategic metrics, we gain insight into the value a solution can have to an organization. For example, a robust API and extensibility features can directly impact technical parameters like flexibility and scalability, while also improving a business parameter like the TCO.

The goal of the GigaOm Key Criteria report is to structure and simplify the decision-making process around key criteria and evaluation metrics, allowing the first to inform the second and enabling IT professionals to make better decisions.

4. Decision Criteria Analysis

In this section, we describe the specific table stakes, key criteria, and emerging technologies that organizations should evaluate when considering solutions in this market sector.

Table Stakes

This report considers the following table stakes—features that we would expect all solutions in this sector to support.

  • System connections
  • Event processing
  • Data correlation
  • Alert management
  • Recovery management
  • Logging
  • Abstraction
  • Security extensions
  • Tool integrations

System Connections
By nature, every CMP provides connections to other systems. But the number and type of connections an organization needs from a CMP will differ according to size of business, deployment, and geographical needs. The more connections and connection types a CMP supports, the greater its potential value to the business. If you need Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP) for public clouds and Hyper-V or VMware for on-premises, then only CMP tools that support these would fit your needs. For this report, CMP vendors must support two cloud providers.

Event Processing
Event processing is the ability to process events properly, including outages and breach responses. While a company may have other tools that collect a subset of events, the CMP must collect and aggregate the events for everything it controls.

The more powerful CMPs can take action based on the information in an event. For example, if an event is a server hang, the CMP may restart the server/VM/Container, or it may deploy a new instance and clean up after the failed instance.

The number and type of events a CMP tool can manage dictates its value and is useful in creating a shortlist. For example, if you need Oracle database metrics and event notices, then look for a CMP that can get and process these events.

Data Correlation
Data correlation is the ability to gather and make sense of management and monitoring data, and then provide guidance and next steps to operators or automation systems. Much of operational management is based on tracking utilization, capacity, or performance metrics.

Examples include:

  • Detecting a memory leak, which is monitoring memory usage over time and correlating that to a pending outage if not addressed before memory is exhausted
  • File system space utilization, especially where log files or business content files are stored
  • Metrics needed to control auto-scale processes to grow the instances to support a product and again to shrink the instances safely when the demand goes away. The correlation of response time to other metrics needs to be automated to support the automation of capacity management.
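
The memory-leak example above can be sketched as a simple trend projection. This is a minimal illustration, assuming timestamped memory samples for one process or VM; the function name `hours_until_exhaustion` and the sample values are hypothetical and not taken from any CMP product.

```python
from typing import Optional, Sequence


def hours_until_exhaustion(
    hours: Sequence[float], used_mb: Sequence[float], limit_mb: float
) -> Optional[float]:
    """Fit a least-squares line to memory usage and project when it reaches limit_mb.

    Returns the hours remaining after the last sample, or None if usage is not growing.
    """
    n = len(hours)
    if n < 2 or n != len(used_mb):
        raise ValueError("need at least two paired samples")
    mean_t = sum(hours) / n
    mean_m = sum(used_mb) / n
    var_t = sum((t - mean_t) ** 2 for t in hours)
    if var_t == 0:
        raise ValueError("timestamps must not all be identical")
    slope = sum((t - mean_t) * (m - mean_m) for t, m in zip(hours, used_mb)) / var_t
    if slope <= 0:
        return None  # flat or shrinking usage: no leak trend to correlate
    intercept = mean_m - slope * mean_t
    t_full = (limit_mb - intercept) / slope
    return max(0.0, t_full - hours[-1])


# Hypothetical samples: usage grows by 100 MB per hour toward a 2048 MB limit.
sample_hours = [0, 1, 2, 3, 4]
sample_used_mb = [1000, 1100, 1200, 1300, 1400]
remaining = hours_until_exhaustion(sample_hours, sample_used_mb, limit_mb=2048)
```

With these assumed samples the projection leaves roughly six and a half hours before memory is exhausted, which is the kind of lead time an operator or automation system would need to act before an outage.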

Alert Management
Alert management is the ability to process alerts and alarms. This includes notifications from all systems controlled by the CMP plus alerts from 3rd party systems.

This is important as it allows operations staff to deal with the critical issues, such as:

  • If a network link fails, the alerts will be about the systems impacted and the status of restoring the link that caused the outage.
  • If a database that supports multiple applications fails, the CMP would suppress false alarms about applications that fail as a result of the database going down and instead provide business and operations staff alerts and status updates about restoring the database.
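
The database example above amounts to suppressing alerts for components whose own dependency has already failed. A minimal sketch follows, assuming the CMP holds an acyclic dependency map; the component names and the `root_cause_alerts` function are hypothetical.

```python
from typing import Dict, List, Set


def root_cause_alerts(failed: Set[str], depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return only the failed components that have no failed dependency of their own."""
    roots = set()
    for component in failed:
        upstream = depends_on.get(component, [])
        # A failure is reported only when none of its direct dependencies has failed.
        if not any(dep in failed for dep in upstream):
            roots.add(component)
    return roots


# Hypothetical topology: two applications share one database.
DEPENDS_ON = {"billing-app": ["orders-db"], "web-shop": ["orders-db"], "orders-db": []}
failed_now = {"billing-app", "web-shop", "orders-db"}
to_alert = root_cause_alerts(failed_now, DEPENDS_ON)
```

Here only the database is surfaced to operations staff; the two application failures are treated as consequences rather than separate incidents.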

Recovery Management
A CMP’s ability to auto-recover or self-heal from common or known incidents is now a requirement. A mature CMP tool uses basic data monitoring to determine the state of a VM, container, and so on, sends alerts, and takes preventive measures as needed to prevent or minimize business impact (for example, by reducing unplanned downtime or fixing a problem before the business is impacted).

Examples include:

  • A VM that goes into a hung state due to a memory leak. The CMP will issue a restart of that VM and clear up any processes while alerting for additional investigation.
  • Filling up disk space due to excessive usage or application error. A CMP will assign additional space as a temporary measure to keep the product functioning while alerting based on correlated data to help identify the root cause.
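
The two examples above follow the same pattern: apply a known fix, then alert for follow-up investigation. The sketch below illustrates that pattern only; the incident names, the `PLAYBOOK` mapping, and the action strings are hypothetical placeholders for calls a real CMP would make.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Incident:
    kind: str      # hypothetical incident names such as "vm_hung" or "disk_full"
    resource: str


def restart_vm(incident: Incident) -> str:
    return f"restart {incident.resource}"


def extend_disk(incident: Incident) -> str:
    return f"extend disk on {incident.resource}"


# Known incidents map to an automatic action; anything else is only escalated.
PLAYBOOK: Dict[str, Callable[[Incident], str]] = {
    "vm_hung": restart_vm,
    "disk_full": extend_disk,
}


def remediate(incident: Incident) -> List[str]:
    """Return the ordered steps taken: an automatic fix if one is known, then an alert."""
    steps = []
    action = PLAYBOOK.get(incident.kind)
    if action is not None:
        steps.append(action(incident))
    steps.append(f"alert: investigate {incident.kind} on {incident.resource}")
    return steps
```

Keeping the alert step unconditional reflects the point made above: the automatic action is a temporary measure, and root-cause investigation still has to happen.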

Logging
Logging is the ability to log all management and monitoring activity. Advanced CMPs go further by supporting various log management tools and extensions, and by collecting logs through either agent-based or agentless methods.

The level of logging is also an important consideration for a CMP tool, as are the ability to collect additional logs during troubleshooting and the impact that log collection has on the system. Administrator views of the logs collected by the CMP tool must also be considered; correlated log data is expected of today’s CMP tool, along with the capability to integrate with third-party logging tools.

Abstraction
A CMP tool must provide an abstraction layer for both external tool integration as well as CMP tool administration. One of the primary functions of CMPs is the ability to use a common user interface across a few systems.

The ability to manage multiple public clouds (AWS, Azure, etc.) from a CMP tool should be abstracted. Once the CMP tool establishes a connection to various clouds at the backend, the ability to create various cloud objects based on business product requirements will be through the CMP portal and consumers will use the same interface to deploy to multi-cloud.

An abstraction layer will avoid any cloud vendor lock-in by creating an agnostic framework that allows for greater application portability. A CMP must aggregate multiple clouds in a single pane of glass.
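
The abstraction layer described above can be pictured as one interface with a small adapter per target. This is a minimal sketch under that assumption; `CloudProvider`, the two fake adapters, and the identifier formats are hypothetical and stand in for real provider SDK calls.

```python
from abc import ABC, abstractmethod
from typing import Dict


class CloudProvider(ABC):
    """Common interface the portal codes against; one subclass per target."""

    @abstractmethod
    def create_vm(self, name: str, cpus: int, memory_gb: int) -> str:
        """Provision a VM and return a provider-specific identifier."""


class FakeAws(CloudProvider):
    def create_vm(self, name: str, cpus: int, memory_gb: int) -> str:
        # A real adapter would call the provider's SDK here.
        return f"aws:{name}:{cpus}cpu:{memory_gb}gb"


class FakeAzure(CloudProvider):
    def create_vm(self, name: str, cpus: int, memory_gb: int) -> str:
        return f"azure:{name}:{cpus}cpu:{memory_gb}gb"


PROVIDERS: Dict[str, CloudProvider] = {"aws": FakeAws(), "azure": FakeAzure()}


def deploy(target: str, name: str, cpus: int = 2, memory_gb: int = 4) -> str:
    """Single entry point: the requester names a target, not a vendor API."""
    return PROVIDERS[target].create_vm(name, cpus, memory_gb)
```

Because requesters only ever call `deploy`, moving a workload to a different cloud means changing the target name rather than the request itself, which is the portability benefit the text describes.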

Security Extension
Security extension is the ability to add security features as needed. Some CMPs will integrate with security information and event management (SIEM) tools to protect hybrid and multi-cloud platforms. A CMP needs to integrate identity management, encryption, and data protection.

CMPs must support compliance and governance as well as the ability to create various compliance templates and enforce them across underlying resource-based security policies. As mentioned before, the demand for CMPs to provide threat detection is also growing.

Tool Integration
Tool integration is the ability to integrate with other business systems through a standard abstraction layer such as a REST API. Common integrations include:

  • Ops tools such as security and compliance
  • ITSM tools such as ServiceNow or Cherwell
  • CI/CD tools such as GitHub and Jira

There should also be integration with external enterprise management systems, including service catalogs, and support for configuration of storage and network resources. This allows for enhanced resource management via service governance and advanced monitoring for improved product performance and availability.

Key Criteria

Here we explore the primary criteria for evaluating solutions, based on attributes or capabilities that may be offered by some vendors but not others. These criteria will be the basis on which organizations decide which solutions to adopt for their particular needs. The key criteria are:

  • Complexity management
  • Heterogeneity
  • Resource management
  • Cost Governance
  • Automation management
  • Partner ecosystem

Complexity Management
Complexity management is the ability to leverage abstraction and automation to remove management complexity for multiple public/hybrid clouds. In a single public-cloud deployment, a CMP is not required unless several applications need a CMP to manage the complexity. This is the primary key criterion for any CMP; lack of support for various cloud vendors will reduce the value of a CMP for smaller enterprises. Support for cloud-native functionalities needs to be part of a CMP’s core capabilities, providing a single pane of glass for multi-cloud and hybrid cloud deployments.

For a single pane of glass dashboard to operate in a cloud or on-premises hybrid environment, the following are necessary:

  • The ability to create an abstraction layer for multiple cloud and/or on-premises deployments. Provisioning compute through a CMP portal should only define (expose) a pre-configured target (AWS, Azure, or VMware on-premises) so that the developer or requestor of the service does not have to manage the unique features of each cloud vendor.
  • An abstraction layer that simplifies the complexity of multi-cloud deployment through a CMP and so avoids vendor lock-in. The goal should be to allow cloud portability so a business unit’s application can be re-hosted by a different cloud vendor if the requirements or hosting costs trigger a change.

Heterogeneity
Heterogeneity is the ability to monitor and manage diverse systems, including legacy and cloud-based. This capability is critical for hybrid deployments. Examples include:

  • Leveraging the storage tier within a data center.
  • Provisioning and managing EMC or NetApp-based SAN/NAS along with cloud-based object storage.
  • The process to register an IP or DNS entry will be different for each cloud, but the CMP should abstract the details from the service requestor.

Resource Management
This addresses where and how storage, compute, and other resources are provisioned and de-provisioned both in the public cloud and internally. In hybrid and cloud-based environments, the full resource management lifecycle must be addressed, so that provisioned compute resources and IP pools are used and assigned and deprovisioned ones are released back to the pool. This activity should likewise free up pre-paid or reserved instance types so other authorized deployments can leverage them if the original project no longer requires them. Finally, updates to DNS or API gateways, as well as to security and monitoring systems, should be cleaned up when a project is decommissioned.
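
The allocate-and-release cycle for IP pools described above can be sketched with Python's standard `ipaddress` module. The `IpPool` class and the project names are hypothetical; a real CMP would also clean up DNS, gateway, and monitoring entries at the same point.

```python
import ipaddress
from typing import Dict, List


class IpPool:
    """Tracks which addresses in a CIDR block are free and which project holds the rest."""

    def __init__(self, cidr: str) -> None:
        self._free: List[str] = [str(ip) for ip in ipaddress.ip_network(cidr).hosts()]
        self._assigned: Dict[str, str] = {}  # ip -> owning project

    def allocate(self, project: str) -> str:
        if not self._free:
            raise RuntimeError("pool exhausted")
        ip = self._free.pop(0)
        self._assigned[ip] = project
        return ip

    def release_project(self, project: str) -> List[str]:
        """Return every address held by a decommissioned project to the pool."""
        released = [ip for ip, owner in self._assigned.items() if owner == project]
        for ip in released:
            del self._assigned[ip]
            self._free.append(ip)
        return released

    @property
    def free_count(self) -> int:
        return len(self._free)
```

Releasing by project rather than by individual address mirrors the lifecycle view in the text: when a project is decommissioned, everything it held goes back to the pool in one step.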

If you have deployed applications in the cloud, whether private or public, you need to know how well they are performing. You need to know where problems exist—be it in the cloud, in your application, in the database server, or the network—even when some application components are hosted in the public cloud and others on the private network. Ultimately, pinpointing root cause comes down to correlating between the performance of your application components, the network, and the public cloud infrastructure.

Cost Governance
This is the ability to monitor costs and set up cost governance policies across hybrid and multi-clouds. Cost governance for public cloud is a key criterion as IT spending moves from a CapEx to an OpEx model.

Organizations should be able to discover anomalous consumption and alert engineers when deviations in spending happen. This provides engineering teams with the freedom to consume public cloud resources as needed, thus preserving overall corporate agility.
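
Detecting anomalous consumption can be as simple as comparing each day's spend with its recent history. The sketch below uses a trailing-window mean and standard deviation; the window size, threshold, and cost figures are assumptions chosen for illustration, and production tools may use quite different methods.

```python
from statistics import mean, stdev
from typing import List, Sequence


def spend_anomalies(
    daily_cost: Sequence[float], window: int = 7, threshold: float = 3.0
) -> List[int]:
    """Return indexes of days whose cost exceeds the trailing mean by `threshold` std devs."""
    flagged = []
    for i in range(window, len(daily_cost)):
        history = daily_cost[i - window:i]
        limit = mean(history) + threshold * stdev(history)
        if daily_cost[i] > limit:
            flagged.append(i)
    return flagged


# Hypothetical daily spend with one obvious spike on day index 7.
costs = [100, 102, 98, 101, 99, 100, 100, 250, 101]
spikes = spend_anomalies(costs)
```

Only the spike is flagged here; the following day is not, because the spike itself widens the trailing window's spread. An alert on the flagged index is what lets engineers keep consuming freely while deviations are still caught.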

In some cases, users can achieve additional, more proactive CMP value, such as cost/quota enforcement and optimized workload placement. Over time, cloud consumption can move from direct to brokered to realize the full value of a CMP. This differs from FinOps, which focuses on the forecast and budget of an application; cost governance here looks at performance and the application’s needs.

Automation Management
Automation management is the ability to kick off autonomous processing and to orchestrate multiple automations.

Infrastructure automation (IA) tools allow DevOps and I&O teams to design and implement self-service, automated delivery services across on-premises and IaaS environments. IA tools also enable DevOps and I&O teams to manage the life cycle of services through creation, configuration, operation, and retirement. These infrastructure services are then exposed via API integrations to complement broader DevOps toolchains or are consumed via a centralized administration console.

Cloud orchestration is a method for automating manual IT processes such as provisioning, installation, configuration management, remediation, maintenance, monitoring, and scaling.

Partner Ecosystem
This addresses the number of third-party technologies that are known to work with the tool, which is an important differentiator. Examples include extending internal security to a public cloud, integrating with ITSM platforms, and leveraging an internal Application Delivery Controller (ADC), such as F5 or NGINX, as part of CMP automation.

Emerging Technologies

Finally, we consider key emerging capabilities in this sector. We expect these technologies to become widely relevant over the next one to two years.

  • Security operations
  • Governance
  • FinOps
  • Cloud resource optimization
  • Artificial intelligence (AI)
  • Security policy as code

Security Operations (SecOps)
Security Operations (SecOps) is the ability to link with security systems. Integrating cloud into your existing enterprise security program involves more than adding a few more controls or point solutions. It requires an assessment of your resources and business needs to develop a fresh approach to your culture and cloud security strategy.

To manage a cohesive hybrid, multi-cloud security program, you need to establish visibility and control. A CMP must act as the controller for a hybrid deployment. There should be a bi-directional relationship in which the CMP feeds alerts and awareness to security systems, such as a SIEM, and accepts requests from a security tool, such as a SOAR, to make an emergency change that remediates a security incident.

Governance
Governance is the ability to link with governance systems, including services and resources. Cloud governance is a set of rules and guidelines for deployment, security, costs, and so on. A CMP provides an abstraction layer of governance applicable to both internal and public-cloud resources. Monitoring and enforcing these rules is integral. Security policies are the most essential element of any cloud governance. A mature CMP not only provides governance but also recommendations on optimization.

FinOps
Financial governance continues to be a top priority as enterprises migrate to the cloud. This includes the ability of a CMP to manage and aggregate financial information across cloud vendors and provide a comprehensive view, so that enterprises can manage costs, set budgetary thresholds, see cost per resource, and target endpoints based on cost.
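
Aggregating spend across vendors and checking it against budgetary thresholds can be sketched in a few lines. The record layout `(provider, resource, cost)`, the function names, and all figures below are hypothetical; real billing exports differ by vendor.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

CostRecord = Tuple[str, str, float]  # (provider, resource, cost), an assumed layout


def total_by_provider(records: Iterable[CostRecord]) -> Dict[str, float]:
    """Sum per-resource costs into one total per cloud provider."""
    totals: Dict[str, float] = defaultdict(float)
    for provider, _resource, cost in records:
        totals[provider] += cost
    return dict(totals)


def over_budget(totals: Dict[str, float], budgets: Dict[str, float]) -> List[str]:
    """List providers whose spend exceeds their threshold; no threshold means no limit."""
    return sorted(p for p, spent in totals.items() if spent > budgets.get(p, float("inf")))


# Hypothetical month of spend across two providers.
records = [("aws", "vm-1", 120.0), ("aws", "bucket-1", 30.0), ("azure", "vm-2", 80.0)]
totals = total_by_provider(records)
breaches = over_budget(totals, {"aws": 100.0, "azure": 200.0})
```

The same per-record data supports the cost-per-resource view mentioned above; the aggregation step is what provides the comprehensive cross-vendor picture.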

Cloud Resource Optimization
Cloud cost optimization provides an additional key metric for managing cloud resources. Optimization of cloud resources is a continuous effort, and a CMP’s ability to provide suggestions and visibility should be carefully reviewed.

Artificial Intelligence (AI)
AI is most commonly used for cloud cost optimization, where it provides an additional key metric for managing cloud resources. Optimization of cloud resources is a continuous effort, and a CMP’s ability to provide suggestions and visibility should be carefully reviewed. The distinction here is that optimization suggestions come from learned behavior rather than from preset rules with thresholds.

Security Policy as Code
Implementing security policies as code is part of the DevOps journey. The complexity of today’s technology landscape demands treating security policy as code. Support for DevSecOps from a CMP will provide visibility, alerting, and continuous improvement. The market seems to be converging on the Open Policy Agent (OPA) standard as a way to link policy makers with actionable code that is versioned and enforced on deployments.
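
To make the idea of policy as code concrete, the sketch below expresses two rules as plain Python functions and evaluates them against a planned deployment. This is an illustration of the concept only: it is not OPA and does not use OPA's Rego language, and the rule descriptions, resource fields, and names are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Resource = Dict[str, object]
Rule = Tuple[str, Callable[[Resource], bool]]

# Hypothetical rules; each check returns True when the resource complies.
RULES: List[Rule] = [
    (
        "storage must be encrypted",
        lambda r: r.get("type") != "storage" or r.get("encrypted") is True,
    ),
    (
        "no SSH open to the world",
        lambda r: not (r.get("port") == 22 and r.get("cidr") == "0.0.0.0/0"),
    ),
]


def violations(resources: List[Resource]) -> List[str]:
    """Evaluate every rule against every resource and describe each failure."""
    found = []
    for resource in resources:
        for description, complies in RULES:
            if not complies(resource):
                found.append(f"{resource.get('name')}: {description}")
    return found


# Hypothetical deployment plan submitted for review.
plan = [
    {"name": "logs", "type": "storage", "encrypted": False},
    {"name": "bastion", "type": "firewall_rule", "port": 22, "cidr": "0.0.0.0/0"},
    {"name": "db-disk", "type": "storage", "encrypted": True},
]
problems = violations(plan)
```

Because the rules live in source files, they can be versioned, reviewed, and run automatically before a deployment is allowed to proceed, which is the property the text attributes to policy as code.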

5. Evaluation Metrics

Our assessment of the solution space continues with an exploration of the strategic evaluation metrics we use to evaluate the impact that a Cloud Management Platform solution might have on an organization. In many cases, these metrics will be consistent across technology sectors, reflecting fundamental aspects like flexibility, ease of use, and total cost of ownership. For purposes of this exploration, we consider the following evaluation metrics:

  • Flexibility
  • Scalability
  • Disaster recovery
  • Ease of implementation and usability
  • Monitorability

Flexibility
What capabilities are available out of the box versus through customization? How easy is it to customize? Does the CMP vendor provide support for the primary public clouds, and what is the level of integration with each? If the CMP’s layer of abstraction limits the cloud functionality that is available, this must be taken into consideration. In addition, evaluate how the CMP handles new feature releases from public cloud vendors.

Scalability
Will the CMP scale with the needs of your business? For example, if you start with AWS, will the CMP support adding a second cloud, like Azure, at a later phase? Scalability in cloud computing refers to the ability to increase or decrease IT resources as needed to meet changing demand. Scalability is one of the hallmarks of the cloud and the primary driver of its exploding popularity with businesses. Scaling can be done quickly and easily, typically with little to no disruption or downtime.

Today, application design patterns are moving from a server or VM deployment pattern to containers (Kubernetes or Cloud Foundry), Function as a Service, or microservices. If your company uses these now or will shortly, this becomes a way to narrow down the vendors to those that can support these patterns.

There is also a scale aspect to CMP tools: some work well at the Fortune 10 level but may have a startup cost that excludes the Fortune 1000 and below. Scale can also relate to the number of projects a tool supports. Early tools became unusable because their GUI retrieved all products and would time out before the page loaded. Some tools have design aspects that make them hard to use with thousands of products, as they were designed for the needs of mid-market companies that have fewer than 100 products in their portfolios.

Disaster Recovery
Does the CMP provide business resiliency through high availability and offer disaster recovery? In a complex hybrid or multi-cloud environment, the CMP portal is a critical component. Successful deployment is directly tied to the automation’s ability to abstract location, so that environments are consistent regardless of where they are deployed.

Usability
What’s the time to market, or how fast can it be implemented? This includes the vendor’s ability to assist with installation and to provide training and ongoing support. An out-of-the-box CMP implementation should be quick and provide business value in days. Cloud is agile; your CMP should not increase the time required to provide business value.

Monitorability
How well does the CMP monitor the environment and feed actionable intelligence to external system monitors and security monitors? Basic cloud object monitoring across various vendors should be included in the single pane of glass. Additional features offered by vendors include a view from a business product perspective and the ability to identify (at a higher level) where the bottleneck resides.

6. Key Criteria: Impact Analysis

This section provides guidance on how the key criteria features described earlier impact each of the evaluation metrics just defined. Table 1 helps the reader understand each feature or capability’s impact on each evaluation metric, making it possible to better assess the value a solution may have to an organization.

Table 1. Impact of Features on Metrics

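| Key Criteria | Flexibility | Scalability | Disaster Recovery | Usability | Monitorability |
|---|---|---|---|---|---|
| Complexity Management | 5 | 4 | 5 | 4 | 4 |
| Heterogeneity | 5 | 2 | 3 | 3 | 3 |
| Resource Management | 2 | 5 | 4 | 3 | 5 |
| Cost Governance | 3 | 4 | 4 | 3 | 4 |
| Automation Management | 5 | 5 | 5 | 4 | 4 |
| Partner Ecosystem | 3 | 1 | 3 | 5 | 3 |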
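
One possible way to apply Table 1, offered here as an illustration rather than as part of the report's methodology, is to weight each evaluation metric by its importance to your organization and compute a weighted impact for each key criterion. The impact scores below are copied from Table 1; the weights are hypothetical and should be replaced with your own priorities.

```python
from typing import Dict, List

METRICS = ["Flexibility", "Scalability", "Disaster Recovery", "Usability", "Monitorability"]

# Impact scores copied from Table 1, in the column order of METRICS.
IMPACT: Dict[str, List[int]] = {
    "Complexity Management": [5, 4, 5, 4, 4],
    "Heterogeneity": [5, 2, 3, 3, 3],
    "Resource Management": [2, 5, 4, 3, 5],
    "Cost Governance": [3, 4, 4, 3, 4],
    "Automation Management": [5, 5, 5, 4, 4],
    "Partner Ecosystem": [3, 1, 3, 5, 3],
}

# Hypothetical weights in percent, chosen only for illustration.
WEIGHTS: Dict[str, int] = {
    "Flexibility": 30,
    "Scalability": 30,
    "Disaster Recovery": 20,
    "Usability": 10,
    "Monitorability": 10,
}


def weighted_impact(scores: List[int], weights: Dict[str, int] = WEIGHTS) -> float:
    """Weighted average of one criterion's impact scores across all metrics."""
    total = sum(weights[metric] * score for metric, score in zip(METRICS, scores))
    return total / sum(weights.values())


# Criteria ordered from highest to lowest weighted impact under the assumed weights.
ranking = sorted(IMPACT, key=lambda name: weighted_impact(IMPACT[name]), reverse=True)
```

Under these assumed weights, Automation Management and Complexity Management rank highest and Partner Ecosystem lowest; an organization that prioritizes usability would see a different order.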

Impact on Flexibility
Flexibility for a CMP is directly tied to the complexity management of hybrid and multi-cloud deployments. Scores for this category are based on how many public clouds the CMP extends to, which legacy on-premises platforms it integrates with, and how easily. Based on deployment metrics, the CMP should be able to shift workloads to different availability zones, regions, or cloud providers (including on-premises). Business units with simple applications should not be burdened with modifying them to support a specific cloud vendor or location. A business’s use of heterogeneous solutions shouldn’t be a constraint on the CMP within the limits of x86 or ARM-based workloads (note that ARM-based workloads are not supported by all cloud providers). The complexity of the cloud vendors should be abstracted from the business code so that the CMP can meet the flexibility requirements of the business unit.

Impact on Scalability
The scalability metric is twofold: resource management and automation. How does the CMP scale as the environment and its cloud endpoints grow? The level of resource management the tool can track and map to a specific application or release is critical to show its ability to scale and to ensure there is no single point of failure. By tracking the performance of configuration items, users can better manage performance issues that would limit scalability. Automation is the other aspect where CMP scalability must be measured. Automation is critical whether deploying to one availability zone or several, and limitations in automation can impact scalability. For example, if only one scale job can be performed at a time and a situation arises where multiple systems need to scale up or down at the same time, this will impact business operations.
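
The effect of a limit on simultaneous scale jobs can be illustrated with a small scheduling sketch. This is a simplified model under stated assumptions, not a description of any vendor's scheduler: the job durations are hypothetical, jobs start in the order given, and `completion_time` is a name introduced here.

```python
import heapq


def completion_time(job_minutes, max_concurrent):
    """Minutes until every scale job has finished.

    Jobs start in the listed order; each one takes the next automation
    slot that becomes free. `max_concurrent` is the number of slots.
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    slots = [0] * max_concurrent  # time at which each slot becomes free
    heapq.heapify(slots)
    finished_at = 0
    for minutes in job_minutes:
        start = heapq.heappop(slots)      # earliest free slot
        end = start + minutes
        finished_at = max(finished_at, end)
        heapq.heappush(slots, end)
    return finished_at


# Hypothetical durations (minutes) for four systems that must scale at once.
jobs = [4, 6, 3, 5]
serial = completion_time(jobs, max_concurrent=1)    # one job at a time
parallel = completion_time(jobs, max_concurrent=4)  # all jobs together
print(serial, parallel)
```

In this example the last system finishes scaling after 18 minutes when jobs are serialized, versus 6 minutes when all four can run together.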

Impact on Disaster Recovery
Disaster recovery (DR) is affected by the complexity of the business systems deployed, and that complexity is compounded when the environment is heterogeneous. Additionally, DR success is directly tied to the automation’s ability to abstract location, so that deployments are consistent regardless of where they run. Automation should support testing DR processes through simulation, without forcing an actual failover. Configuration of new applications should show single points of failure and system dependencies to ensure DR plans include all the services needed to recover in the DR site.
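
The dependency check described above can be sketched as a walk over a service dependency map that reports anything a critical application needs but the DR plan omits. This is an illustrative sketch only: the service names, the shape of the `depends_on` map, and both function names are hypothetical rather than taken from any CMP.

```python
def transitive_dependencies(service, depends_on):
    """Return every service reachable from `service` in the dependency map."""
    seen = set()
    stack = [service]
    while stack:
        current = stack.pop()
        for dep in depends_on.get(current, ()):
            if dep not in seen:  # also guards against dependency cycles
                seen.add(dep)
                stack.append(dep)
    return seen


def missing_from_dr_plan(critical_services, depends_on, dr_plan):
    """List services required for recovery that the DR plan does not cover."""
    needed = set(critical_services)
    for service in critical_services:
        needed |= transitive_dependencies(service, depends_on)
    return sorted(needed - set(dr_plan))


# Hypothetical application map: booking-api depends on auth, which in
# turn depends on identity-db.
depends_on = {
    "booking-api": ["auth", "orders-db"],
    "auth": ["identity-db"],
    "orders-db": [],
    "identity-db": [],
}
dr_plan = ["booking-api", "auth", "orders-db"]
gaps = missing_from_dr_plan(["booking-api"], depends_on, dr_plan)
print(gaps)
```

Here the plan covers the application and its direct dependencies but misses `identity-db`, an indirect dependency that would block recovery in the DR site.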

Impact on Usability
The top usability metric is tied to the partner systems a CMP integrates with. Value should be decided based on how many systems a CMP integrates with as well as how the framework extends. Does the CMP need a third-party layer or a plug-in for the integration? Does the partner network also include any form of marketplace or pre-configured templates for commonly used processes? The more the CMP includes as components, the less users need to pay third parties for add-ons. Add-ons can drive up costs, and usability suffers if customers have to create and maintain their own connectors, plug-ins, or integrations with other systems.

Impact on Monitorability
The most important aspect of a CMP is monitorability of the resources it manages. Important considerations include: How is data monitored and presented? Can the data be shared with external tools? Alerting based on resource policies will also impact monitorability scores. Because cloud vendors require customers to apply the correct tags to configuration items so that billing and monitoring features can report correctly, the CMP must enforce the inclusion of tags on any request for a new resource. Some CMPs, for example those that connect to CI/CD or allow ad hoc interaction with the cloud vendor, can show items that are not tagged so that the administrators of the tool can follow up to get the correct tags applied. Failure to tag will result in failure to provide clear financial controls of cloud spend. The ability to properly inform automation to deploy, change, or remove configuration items, such as a VM, is linked to the ability to monitor the environment.
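
Tag enforcement and the follow-up report on untagged items can be sketched as follows. This is a minimal illustration, assuming a hypothetical tag policy (`cost-center`, `owner`, `environment`) and a simple dictionary shape for requests and inventory; real CMPs and cloud providers define their own schemas.

```python
# Hypothetical tag policy; actual required tags vary by organization.
REQUIRED_TAGS = {"cost-center", "owner", "environment"}


def missing_tags(resource_tags, required=REQUIRED_TAGS):
    """Return the required tag keys that are absent or have empty values."""
    present = {
        key for key, value in resource_tags.items()
        if value is not None and str(value).strip()
    }
    return sorted(set(required) - present)


def validate_request(request):
    """Reject a new-resource request that lacks any required tag."""
    missing = missing_tags(request.get("tags", {}))
    if missing:
        raise ValueError(f"request rejected, missing tags: {missing}")
    return request


def untagged_report(inventory):
    """Map resource id -> missing tags, for resources created outside the CMP."""
    report = {}
    for resource in inventory:
        missing = missing_tags(resource.get("tags", {}))
        if missing:
            report[resource["id"]] = missing
    return report


# Hypothetical inventory discovered from a cloud account.
inventory = [
    {"id": "vm-1", "tags": {"cost-center": "1234", "owner": "team-a",
                            "environment": "prod"}},
    {"id": "vm-2", "tags": {"owner": "team-b"}},
    {"id": "bucket-3"},
]
report = untagged_report(inventory)
print(report)
```

`validate_request` covers requests made through the CMP, while `untagged_report` covers resources created through CI/CD or directly with the cloud vendor, which administrators then need to chase.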

7. Analyst’s Take

The CMP market has grown up over the last decade. It used to be made up of hardware-specific tools sold by blade or hyperconverged system vendors who were focused on on-premises deployments and trying to emulate AWS. Today, the market consists of pure-play cloud management vendors that support multiple public cloud providers as well as on-premises deployments of not just VMs but also containers, Function as a Service, and service meshes hosting microservices.

As this market matures, the CMP vendors are broadening their value by including FinOps functions demanded by the CFO as well as cloud workload planning tools that optimize performance of an application post go-live.

Vendors are integrating their products more with CI/CD tool chains so that they can support intent-based hosting. This is where the CMP leverages application code properties and security settings to identify what the system will need, in order to speed up the deployment process and increase traceability of configurations to requirements. Most of the AI work goes toward this purpose: trying to read the code developers write and derive hosting requirements from it. ML work is more about performance sizing.

With AMD CPUs being added to the public cloud ecosystems, few tools have offered intelligent placement and instance sizing based on whether an Intel or AMD CPU would be better for a specific workload. Vendors that use AI and ML and also do their own testing may be better able to optimize instance type, CPU type, and region or availability zone, and to identify whether a spot market rate, a regular on-demand rate, or a pre-reserved instance rate would be the best solution for each component deployed in an application.
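
The choice among spot, on-demand, and reserved pricing can be illustrated with a deliberately simple cost comparison. This is a sketch under stated assumptions, not how any vendor's placement engine works: the rates are made-up placeholders rather than real cloud prices, the reserved option is modeled as a flat monthly fee, and spot capacity is considered only for components that tolerate interruption.

```python
def cheapest_option(hours_per_month, interruptible, rates):
    """Return (best purchase option, monthly cost of each option considered).

    `rates` holds hypothetical prices: hourly on-demand, hourly spot,
    and a flat monthly fee for a reserved instance.
    """
    costs = {
        "on-demand": hours_per_month * rates["on_demand_hourly"],
        "reserved": rates["reserved_monthly"],
    }
    # Spot capacity can be reclaimed by the provider, so only consider it
    # for components that can tolerate being interrupted.
    if interruptible:
        costs["spot"] = hours_per_month * rates["spot_hourly"]
    best = min(costs, key=costs.get)
    return best, costs


# Placeholder rates for illustration only (not actual provider pricing).
rates = {"on_demand_hourly": 0.10, "spot_hourly": 0.03, "reserved_monthly": 45.0}

always_on_db, _ = cheapest_option(730, interruptible=False, rates=rates)
office_hours_app, _ = cheapest_option(200, interruptible=False, rates=rates)
batch_worker, _ = cheapest_option(200, interruptible=True, rates=rates)
print(always_on_db, office_hours_app, batch_worker)
```

Even this toy model gives different answers for different components of one application, which is why per-component advice matters; a real recommendation would also account for CPU type, region, and measured performance.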

Buyers who need the flexibility to move cloud vendors should choose a CMP that supports at least the top three cloud vendors (i.e., AWS, Azure, GCP). If you’re a large enterprise with existing investments in IBM, Oracle, or SAP software, you may want the flexibility to also deploy to those environments. Few CMPs support vendors outside of AWS and Azure, and even among these, cost and performance differ between locations. This is why a key trend is businesses leveraging their CMP vendor for placement advice to help pick the right cloud and location for their needs.

Additionally, once multi-cloud or multi-region is in scope and there are data provenance or sovereignty issues, CMPs that offer advice or provide guardrails to prevent deploying or moving data in violation of policy provide added value.

The last trend is Policy as Code. While Policy as Code is often thought of as just a security matter, many regulated and international companies have policies that are not about security but about contractual issues, and CMPs can use these policies when suggesting placement or to prevent deployments that violate them.
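
A Policy as Code guardrail of this kind can be sketched as data-driven rules evaluated before placement. The example below is illustrative only: the policy names, data classes, provider names, and regions are all hypothetical, and production systems typically use a dedicated policy engine rather than hand-written checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    app: str
    data_class: str  # hypothetical labels, e.g. "eu-personal"
    provider: str
    region: str


# Hypothetical policies: one sovereignty rule and one contractual rule,
# neither of which is a security control.
POLICIES = [
    {"name": "eu-data-residency", "applies_to": {"eu-personal"},
     "allowed_regions": {"eu-west", "eu-central"}},
    {"name": "partner-contract-provider", "applies_to": {"partner-confidential"},
     "allowed_providers": {"provider-a"}},
]


def violations(deployment, policies=POLICIES):
    """Return the names of the policies this deployment would violate."""
    broken = []
    for policy in policies:
        if deployment.data_class not in policy["applies_to"]:
            continue
        regions = policy.get("allowed_regions")
        providers = policy.get("allowed_providers")
        if regions is not None and deployment.region not in regions:
            broken.append(policy["name"])
        elif providers is not None and deployment.provider not in providers:
            broken.append(policy["name"])
    return broken


def allowed_placements(app, data_class, candidates, policies=POLICIES):
    """Filter (provider, region) candidates down to policy-compliant ones."""
    return [
        (provider, region) for provider, region in candidates
        if not violations(Deployment(app, data_class, provider, region), policies)
    ]


candidates = [("provider-a", "us-east"), ("provider-a", "eu-west"),
              ("provider-b", "eu-central")]
eu_options = allowed_placements("crm", "eu-personal", candidates)
partner_options = allowed_placements("portal", "partner-confidential", candidates)
print(eu_options, partner_options)
```

The same rule set serves both purposes named above: `allowed_placements` narrows the options when suggesting placement, and `violations` can block a deployment or data move that breaks policy.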

8. About Farhad Sayeed

Farhad Sayeed

Farhad Sayeed has been an IT leader in various software development, insurance, and airline industries for more than 25 years. He spent the last 17 years with one of the largest airlines, implementing enterprise technologies in various lines of business, such as cargo, loyalty, employee technology, regional airlines, and more.

Farhad provides vision and technical know-how in a wide variety of areas, including compute (on-prem and cloud), storage, data protection (backup, archiving, and so on), load balancing, virtual desktops, and infrastructure as code. He is a pioneer in hyperconverged/converged and virtualization technologies, and he led an OpenStack implementation to support Pivotal PaaS for vital airline applications.

Farhad architected and oversaw the governance of the travel and transportation industry’s first public cloud (Azure) implementation. His deep technical knowledge extends from designing and operationalizing data center and hybrid cloud solutions through implementing automation via CMP, SDN, CI/CD, and the like.

9. About Michael Delzer

Michael Delzer

Michael Delzer is a global leader with extensive and varied experience in technology. He spent 15 years as American Airlines’ Chief Infrastructure Architecture Engineer, and delivers competitive advantages to companies ranging from start-ups to Fortune 100 corporations by leveraging market insights and accurate trend projections. He excels in identifying technology trends and providing holistic solutions, which results in passionate support of vision objectives by business stakeholders and IT staff. Michael has received a gold medal from the American Institute of Architects.

Michael has deep industry experience and wide-ranging knowledge of what’s needed to build IT solutions that optimize for value and speed while enabling innovation. He has been building and operating data centers for over 20 years, and completed audits in over 1,000 data centers in North America and Europe. He currently advises startups in green data center technologies.

10. About GigaOm

GigaOm provides technical, operational, and business advice for IT’s strategic digital enterprise and business initiatives. Enterprise business leaders, CIOs, and technology organizations partner with GigaOm for practical, actionable, strategic, and visionary advice for modernizing and transforming their business. GigaOm’s advice empowers enterprises to successfully compete in an increasingly complicated business atmosphere that requires a solid understanding of constantly changing customer demands.

GigaOm works directly with enterprises both inside and outside of the IT organization to apply proven research and methodologies designed to avoid pitfalls and roadblocks while balancing risk and innovation. Research methodologies include but are not limited to adoption and benchmarking surveys, use cases, interviews, ROI/TCO, market landscapes, strategic trends, and technical benchmarks. Our analysts possess 20+ years of experience advising a spectrum of clients from early adopters to mainstream enterprises.

GigaOm’s perspective is that of the unbiased enterprise practitioner. Through this perspective, GigaOm connects with engaged and loyal subscribers on a deep and meaningful level.

11. Copyright

© Knowingly, Inc. 2021 "Key Criteria for Evaluating Cloud Management Platforms" is a trademark of Knowingly, Inc. For permission to reproduce this report, please contact sales@gigaom.com.