Table of Contents
- What’s Not Working with Cloud Migration?
- Table Stakes Include Agreements Between Business Units and IT
- Evaluating Your Cloud Provider Against Migration Success Metrics
- Migration Success Means Avoiding Nine Roadblocks
- Analyst Take: It Was Never About Migrating a Single Application
- Call-out: NetApp: Cloud Migration Leadership
- About Michael Delzer
- About GigaOm
- Copyright
1. What’s Not Working with Cloud Migration?
Companies are increasingly transitioning their traditional IT systems to the cloud, including everything from databases and applications to the underlying infrastructure. Behind the movement: the prospect of lower costs and improved agility that enables organizations to respond to change. The benefits are supported by evidence and anecdote:
- You may be able to lower your operating costs or at least slow their rate of growth, while reducing or eliminating the expense of physical data centers.
- The ability to scale resources quickly means that the business can be more agile, responding to changing needs and expanding into new markets with less risk.
The benefits of cloud migration are not a given. For IT shops versed in the rhythms of on-premises operations, the transition to the cloud presents unfamiliar challenges. Selecting a cloud vendor is significantly different from choosing a data center provider. And despite service providers’ long-standing promises of increased agility, lower cost, and reduced risk, migrating even a single complex application to the cloud is a major challenge. The obstacles range from the deeply technical to operations and management concerns, and to managing the change itself.
Whatever the reasons, organizations are constantly looking at their past migrations with the benefit of hindsight to answer the question, “Why didn’t I get it right the first time?” As the old Irish adage goes, “If you want to get there, don’t start from here.” This report helps you start from the right place, exploring the processes and technologies needed for a successful cloud migration. Three areas of focus are addressed:
- Base-level agreements between business units and IT
- Core and differentiating cloud provider capabilities
- Evaluating a provider in terms of required success factors
Success not only means addressing these needs, but also defeating potential roadblocks in advance. Let’s first consider what needs to be in place between IT and the business.
2. Table Stakes Include Agreements Between Business Units and IT
Perhaps the most fundamental area to focus on first is how cloud-based models change the relationship between an organization and its use of technology. This has an impact on multiple aspects of the organization and its structures. Among them:
- IT-business relationships: Technology platforms can respond faster to business needs; as a counterpoint, taking advantage of this requires better communication between IT and the business. Equally, security and governance policies and controls need to be discussed, confirmed, and agreed in advance, as these will change between in-house and cloud-based environments.
- Systems and infrastructure testing: In your own data center, you enjoy full control over when elements of the stack are changed and how they are tested, but with cloud services you lose control over the rate and impact of change. This happens at different levels based on the architecture: with Infrastructure as a Service (IaaS), it’s below the OS, and with Platform as a Service (PaaS), it’s below custom code. This must be accounted for in your staffing and testing plans to avoid outages.
- Technology financing and forecasting: From a business unit perspective, the change to public cloud is defined by how you budget and forecast spend. Traditional IT and its waterfall capital expenditures typically spread outlays over two to seven years, but cloud models have a much shorter horizon.
As a result of these fundamentals, business units and IT need to agree to certain table stakes requirements in advance of any major cloud migration. Failure to do so will just store up problems for later. At the very least, the following base-level agreements need to be reached and funded in order to maximize the chances of success and meet business expectations.
- Cloud contract or master services agreement: This is the agreement made with the cloud provider. It must be approved by legal, finance, engaged business units, and stakeholders for the relevant departments in IT. Included in the agreement should be topics such as limit of liability, breach notification, contract length, renewal process, and exit terms or triggers.
- Network and security processes and controls: IT and security must articulate basic network and security processes and controls in a living document that is affirmed by the business units supporting them. Network and security needs and issues are constantly evolving, so regular review, update, and approval of the document must be committed to by all parties.
- Tag usage: Business units, IT, and security must agree to require, enforce, and remediate the use of tags, which are employed to achieve accurate billing and to support operations and forecasting. All cloud vendors report costs associated with tagged resources. Deployments must use tags from an approved list that ties back to attributes the finance group will recognize. Failure to enforce tag compliance will result in pools of spend that no one can be held accountable for, and the resulting lack of financial accountability will erode trust within the business and with shareholders.
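To make tag enforcement concrete, the minimal Python sketch below checks deployed resources against a hypothetical approved tag list. The tag keys, the resource record structure, and the remediation step are assumptions for illustration, not a prescription for any particular cloud vendor's tagging API.

```python
# Minimal sketch of tag compliance checking (hypothetical tag schema).
# Resource records would normally come from a cloud vendor's inventory API.

REQUIRED_TAGS = {"cost-center", "business-unit", "environment", "owner"}

def find_noncompliant(resources):
    """Return resources missing any required tag, for remediation or escalation."""
    noncompliant = []
    for resource in resources:
        missing = REQUIRED_TAGS - set(resource.get("tags", {}))
        if missing:
            noncompliant.append({"id": resource["id"], "missing": sorted(missing)})
    return noncompliant

if __name__ == "__main__":
    sample = [
        {"id": "vm-001", "tags": {"cost-center": "4200", "business-unit": "retail",
                                  "environment": "prod", "owner": "team-a"}},
        {"id": "vm-002", "tags": {"environment": "dev"}},  # untagged spend risk
    ]
    for item in find_noncompliant(sample):
        print(f"{item['id']} is missing tags: {', '.join(item['missing'])}")
```

A check like this can run on a schedule or as part of deployment pipelines, so untagged spend is caught before it accumulates into unaccountable pools.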
3. Evaluating Your Cloud Provider Against Migration Success Metrics
Once you’ve selected the right platform, attention turns to the most important outcome: migration success. This requires not only that the cloud vendor offers certain capabilities, but also that you can use them correctly. In this section we set out capabilities that differentiate migration planning and outcomes; their value will be specific to the company doing the migrating. You can consider the list below as you put together your migration plan.
- Environment segregation
- Data protection
- Configuration management and auditability
- Change management and serviceability
- Service reliability and testing
- Management and actionability
Environment Segregation
Beyond general security concerns, you may need to consider how you restrict access between internal and external parts of the systems you are migrating. An environment in the cloud may have both internally facing and externally facing IP addresses and DNS entries; however, network authentication and authorization should not be shared across these environments.
For example, you may have both internal and public dev/test/stage/QA/Prod environments, but one environment should not support both internal DNS and non-public IP addresses as well as public DNS and public IP addresses. Failure to maintain separate environments is a breach waiting to happen. If you can’t afford to run both a public-facing and internal-facing environment with no shared components, plan to run only a public-facing environment.
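One way to make this separation testable is to audit each environment for mixed addressing. The sketch below is illustrative only; the environment records and the use of private-range detection as a proxy for "internal" are assumptions about how your inventory is structured.

```python
# Illustrative check that no single environment mixes internal and public addressing.
import ipaddress

def mixes_internal_and_public(addresses):
    """True if the address list contains both private (internal) and public IPs."""
    ips = [ipaddress.ip_address(a) for a in addresses]
    return any(ip.is_private for ip in ips) and any(not ip.is_private for ip in ips)

# Hypothetical inventory of environments and their assigned addresses.
environments = {
    "internal-stage": ["10.20.1.15", "10.20.1.16"],
    "public-prod": ["203.0.113.10", "203.0.113.11"],
    "suspect-qa": ["10.20.2.5", "198.51.100.7"],  # mixed: a breach waiting to happen
}

for name, addrs in environments.items():
    if mixes_internal_and_public(addrs):
        print(f"Environment {name} mixes internal and public addressing; split it.")
```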
Data Protection
When moving systems to the cloud, historic backup or archive copies may need to be recalled and stored for use going forward. These must be quickly accessible and available to be loaded in an appropriate location. Be aware that regulatory or legal requirements may mandate that some systems of record be retained on-site with supporting systems if they can’t be migrated. Examples include vertically integrated solutions on an IBM AS/400 (iSeries) or a mainframe process.
Some data protection solutions can maintain on-premises data protection while allowing the controlling systems and backup records to migrate to the cloud. This can enable you to move from an on-premises backup of what is in the cloud, to the cloud backing up what is on-premises. The process should span at minimum a month-end close, if not a quarter-end close, to ensure that the correct sizing and data volumes are recorded and tested. Multiple sample restores with error detection turned on should be performed in both directions to verify the process and ensure the technology is sound.
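Sample restores are only meaningful if you can prove the restored data matches the source. A minimal, product-agnostic way to verify that is to compare checksums of source and restored files, as in the sketch below; the directory paths are placeholders for wherever your backup product restores its samples.

```python
# Minimal sketch: verify a sample restore by comparing file checksums.
# Paths are placeholders; in practice they would point at the source data
# and the location where the backup product restored it.
import hashlib
from pathlib import Path

def sha256_of(path):
    """Stream a file through SHA-256 so large restores don't exhaust memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_restore(source_dir, restored_dir):
    """Report any file whose restored copy is missing or differs from the source."""
    mismatches = []
    for source in Path(source_dir).rglob("*"):
        if not source.is_file():
            continue
        restored = Path(restored_dir) / source.relative_to(source_dir)
        if not restored.exists() or sha256_of(source) != sha256_of(restored):
            mismatches.append(str(source))
    return mismatches
```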
Configuration Management and Auditability
Software-defined models supported by the cloud allow infrastructure and application configurations to be defined “as code” and rolled out in parallel with any changes to the application itself. This is the concept of IT-as-code or infrastructure-as-code. For best results, plan to adopt this approach wholesale, with proper use of version control, and with processes that ensure you document configurations and changes in advance of deploying them.
All of this enables a process where configuration changes are written and committed to a code repository such as GitHub or GitLab, and an automation process pulls from the repository to execute the change, an approach known as GitOps. This ensures that every change that updates the central repository carries a record of who authorized it and which system made it. In complex deployments, this becomes a chain of custody that shows where each configuration change came from and who authorized its use.
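The chain of custody comes directly from the repository history. As a rough sketch, the snippet below reads commit metadata from a local checkout of a configuration repository; it assumes the `git` command-line tool is installed and that authorization is captured in the commit metadata (for example, through reviewed or signed commits).

```python
# Rough sketch: derive a chain-of-custody record from Git history.
# Assumes the git CLI is available and repo_path is a checkout of the
# configuration repository.
import subprocess

def change_history(repo_path, limit=10):
    """Return (commit, author, committer, subject) tuples for recent config changes."""
    output = subprocess.run(
        ["git", "-C", repo_path, "log", f"-{limit}",
         "--pretty=format:%h|%an <%ae>|%cn <%ce>|%s"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [tuple(line.split("|", 3)) for line in output.splitlines()]

for commit, author, committer, subject in change_history("."):
    print(f"{commit}: '{subject}' authored by {author}, committed by {committer}")
```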
Change Management and Serviceability
The ability to make changes or updates to a component without an outage is critical and often requires enabling diversion of load and transactions away from a component so it can be changed, updated, replaced, or removed without business impact. This need puts constraints on both the cloud provider and on your own ability to deliver change, particularly given that changes can be made by a cloud vendor with as little as two weeks’ notice.
For example, to minimize the risk of change, you can put an abstraction layer between the business applications and the cloud vendor’s specific version of a cloud API or tool. This allows enterprise IT teams to prepare for a change without impacting application code, configurations, or logic. At a network level, IP addresses can be treated as ephemeral elements that can change without impact to security controls, DNS, or business logic. The process of changing an IP address should be programmatic, with DNS configured to use a pool of records so that updating a single address is not a single point of failure while you wait for the change to propagate.
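One way to structure such an abstraction layer is the adapter pattern sketched below. The vendor client classes and their method names are hypothetical stand-ins; the point is that business logic depends only on the internal interface, so a new vendor API version requires a new adapter rather than changes to application code.

```python
# Sketch of an abstraction layer that shields application code from a vendor's
# specific API version. The vendor client classes are hypothetical stand-ins.
from typing import Protocol

class ObjectStore(Protocol):
    """The only interface application code is allowed to depend on."""
    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...

class VendorV1Store:
    """Adapter over the vendor's current API version (hypothetical calls)."""
    def __init__(self, client): self._client = client
    def put(self, key, data): self._client.upload_blob(key, data)
    def get(self, key): return self._client.download_blob(key)

class VendorV2Store:
    """Adapter over a newer API version; swapped in without touching app code."""
    def __init__(self, client): self._client = client
    def put(self, key, data): self._client.objects.write(key, data)
    def get(self, key): return self._client.objects.read(key)

def archive_invoice(store: ObjectStore, invoice_id: str, payload: bytes) -> None:
    """Business logic depends only on the abstraction, not the vendor API."""
    store.put(f"invoices/{invoice_id}", payload)
```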
As you commence your migration, you should review whether the operating systems or Kubernetes versions supported by the provider match the existing requirements, and whether they will be supported beyond when the business plans to upgrade. Keep in mind that if a project runs late, you want to ensure there is still time left in the support contract to run older systems.
Service Reliability and Testing
Reliability is about more than uptime; it’s about predictability: a service in production has a known time to complete and is consistent in delivering results. To achieve this, you need quality control and testing of configuration and infrastructure code alongside application code. And you also need the ability to assess the environment as a whole, determine potential causes of failure, and provide feedback to engineering teams.
You can also review a vendor’s capabilities with service reliability in mind. Optimally, an administrator isn’t required to clean a system after a crash or reboot to return it to operation. If no vendor-automated capability exists, you will need to document a manual process or implement your own automation.
Management and Actionability
The virtual and ephemeral nature of cloud elements requires a specific mindset when monitoring what’s happening in the environment. Thought should be given to path performance issues, which are magnified by cloud designs, and may not be captured by traditional data center approaches that monitor systems in isolation. In parallel, the rate of change in cloud environments is greater than in traditional data centers, and much of that change is outside the control of the enterprise.
Cloud vendors provide metrics and telemetry data that can be ingested into the enterprise’s observability and security awareness tooling. Synthetic probes can trace performance from all major locations the business needs to support, while custom-built applications need error codes that provide actionable intelligence to remediate errors.
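A synthetic probe can be as simple as a timed request that produces an actionable result rather than a raw error. The sketch below uses only the Python standard library; the URL, timeout, and latency threshold are placeholders for your own service and service-level objectives.

```python
# Minimal synthetic probe: time a request and emit an actionable result.
# The URL and latency threshold are placeholders for your own service and SLO.
import time
import urllib.request

def probe(url, timeout=5.0, max_latency=1.5):
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except Exception as error:
        return {"url": url, "ok": False, "reason": f"request failed: {error}"}
    latency = time.monotonic() - start
    if status >= 500:
        return {"url": url, "ok": False, "reason": f"server error {status}"}
    if latency > max_latency:
        return {"url": url, "ok": False, "reason": f"slow response: {latency:.2f}s"}
    return {"url": url, "ok": True, "reason": f"{status} in {latency:.2f}s"}

if __name__ == "__main__":
    print(probe("https://example.com/health"))
```

Run from each major location the business needs to support, probes like this trace the user-facing path rather than the health of any single component.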
Operations staff must gear up to expect and respond to events coming from cloud vendor services or the application under management. For legacy systems, either a third-party tool or internal staff needs to detect errors and identify what caused them, and then create processes to resolve the originating condition.
4. Migration Success Means Avoiding Nine Roadblocks
Once your organization has selected and aligned its business, core IT, security, and operations to a cloud vendor, the task of planning the move begins. Here it is critical that IT leadership recognize and address nine common roadblocks to cloud migration success.
Roadblock 1: Accuracy of Existing Documentation
As a first step, you need to look at how to assure continuity with business users. If you have current and accurate documentation processes, usage models and dependencies should be part of that documentation. If you do not have good documentation or the accuracy of the documentation is in question, successful and quick migrations become significantly more difficult.
Roadblock 2: Save Now, Pay Later (on Infrastructure)
Once you understand the end-user ramifications, the best practice is to review each application to understand if its current architecture is already optimized for the cloud or if changes will be needed, either before or soon after migration, to avoid unnecessary expenses.
If your current environment is completely on VMware or similar, you can move the entire environment “as is” to the cloud using tools that either extend your virtual solution to a cloud vendor or translate your current configurations into that cloud’s native configurations. Newer systems in today’s data centers can more easily be moved to the cloud.
However, this approach can create technical debt. For example: on-premises, your developers and VMware administrators may have over-provisioned infrastructure resources, since it can be easier to run two large, load-balanced VMs so that either one could handle 100% of the peak load. In the cloud you pay for what you request, so maintaining this structure will result in overpaying. At scale, this can result in higher costs over five years than keeping the workload on-premises.
Roadblock 3: Monitoring Framework Mismatch
An existing monitoring framework that is not application-focused may need to be changed before migration to ensure you have a normalized operational model. In that way, when you move to the cloud you can establish a common method to measure performance, health, and costs.
In the cloud, you might choose to autoscale an app with two small members as a minimum, scaling the application to meet peak demand, and then shrinking as demand subsides. The extra capacity could come from spot pricing instead of higher on-demand pricing, reducing operating costs. While this may seem like a trivial change, it means that the application must be auto-deployable to ephemeral nodes, and that means your monitoring system must be able to focus on the health of an application that at any moment may be on different nodes than it was just minutes before.
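In practical terms, health has to be aggregated by an application identifier (typically a tag or label) rather than by host name, because the hosts change between scaling events. The sketch below is a simplified illustration of that aggregation; the instance records are assumed to come from your monitoring or cloud inventory feed.

```python
# Simplified illustration: judge application health by aggregating across whatever
# ephemeral nodes currently serve it, rather than tracking fixed host names.
from collections import defaultdict

def application_health(instances):
    """Group instance health by application tag and report each app's state."""
    by_app = defaultdict(list)
    for instance in instances:
        by_app[instance["app"]].append(instance["healthy"])
    return {app: ("healthy" if all(states) else "degraded")
            for app, states in by_app.items()}

# Instance records would normally arrive from a monitoring or inventory feed;
# note the node names change from one scaling event to the next.
snapshot = [
    {"node": "ip-10-0-3-17", "app": "orders", "healthy": True},
    {"node": "ip-10-0-9-42", "app": "orders", "healthy": True},
    {"node": "ip-10-0-5-88", "app": "billing", "healthy": False},
]
print(application_health(snapshot))
```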
This is a fundamental change in operations. For decades, servers were constant for one to five years, and monitoring tools focused on boxes (or OS instances) running one or more ephemeral applications. Now it is rare that multiple, unrelated applications would be deployed to the same OS instance.
Roadblock 4: Adapting Security
If your current security tooling is based on IP addresses or host names and not on applications, then the way it enforces security or manages security alerts should be changed before you move the application. For example, in traditional data centers, a server’s host name and its IP address can remain unchanged for the life of the server. However, when using public cloud providers, networking and security must be programmatically set using dynamically provisioned and updated values instead of statically assigned IP addresses and unchanging host names.
A traditional server may be on a two-year lease or used for five years, but that should not be the case for a virtual machine. VMs should not be patched; instead, they should be re-deployed and their host name and IP address should be changed upon each deployment or change control event. The fact is that patching a server in a public cloud should be avoided. VMs in the public cloud are ephemeral and can fail, so references to them should be dynamically fed to security systems.
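Because addresses change at every redeployment, references to them should be generated rather than hand-maintained. The minimal sketch below builds the current allow-list for a security control from a live inventory of an application's instances; the inventory function is a stand-in for whatever cloud vendor API or service discovery mechanism you actually use.

```python
# Minimal sketch: regenerate a security allow-list from the current set of
# instances for an application, instead of hard-coding host names or IPs.
# current_instances() is a stand-in for a cloud inventory or discovery API.

def current_instances(app_name):
    """Placeholder inventory; in practice, query the cloud vendor or discovery service."""
    return [
        {"app": app_name, "private_ip": "10.0.4.21"},
        {"app": app_name, "private_ip": "10.0.7.133"},
    ]

def build_allow_list(app_name):
    """Return the addresses a firewall or security group should allow right now."""
    return sorted(instance["private_ip"] for instance in current_instances(app_name))

# Re-run on every deployment or change-control event, then push the result to the
# security tooling programmatically rather than editing rules by hand.
print(build_allow_list("payments-api"))
```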
Again, a common theme emerges: If the business is not willing to fund changes to operations and security to support migration to the cloud, your chances of success are severely diminished.
Roadblock 5: Remote No More
A fundamental change in operations for cloud success is to move away from humans remotely logging onto a server or network device to do their job. Instead, your IT staff should write instructions in a standard format that an automation or orchestration tool can execute on their behalf.
Retraining will be in order, as many IT staff must learn to use infrastructure-as-code tools, or to write code in YAML, JSON, Terraform, Puppet, Chef, or Python, which is then checked into a code repository. From there, an automation tool or orchestration tool calls the code to perform the task, following an approach such as GitOps.
Roadblock 6: Calculating Outage Tolerance
Move a system to the cloud and you can expect an outage. Businesses need to calculate how many outages they can withstand during the migration process.
The larger the number of migrated business applications and processes, the greater the danger. A knife-edge cutover of 1,000 operating system instances in a single move of everything requires one long outage, while cutting over a complete solution one system at a time produces smaller outages per system but increases the total migration time to the cloud. Businesses must determine whether a more complex replication pattern is needed to balance delivery and downtime.
Data is a key limiter. You can only move so much data from one location to another in a set period of time. If that data is critical and dynamic, you will hit a roadblock. Where data is and how quickly an application can get to it becomes a defining issue in a cloud migration.
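A quick back-of-the-envelope calculation makes the limit concrete. The sketch below estimates how long a bulk transfer takes for a given data volume and sustained link speed; the figures are illustrative, and real transfers rarely sustain the full nominal bandwidth.

```python
# Back-of-the-envelope transfer-time estimate. Figures are illustrative;
# real transfers rarely sustain the full nominal link speed.

def transfer_hours(data_tb, link_gbps, efficiency=0.7):
    """Hours to move data_tb terabytes over a link_gbps link at a given efficiency."""
    data_bits = data_tb * 8 * 10**12          # terabytes to bits (decimal units)
    effective_bps = link_gbps * 10**9 * efficiency
    return data_bits / effective_bps / 3600

# Example: 50 TB over a 1 Gbps link at 70% sustained throughput.
print(f"{transfer_hours(50, 1):.1f} hours")   # about 159 hours, or roughly 6.6 days
```

If the result exceeds the outage the business can tolerate, replication ahead of the cutover, rather than a bulk copy during it, becomes the only workable option.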
In response, before the move, set up a data storage solution that supports replication to the cloud. Three types of storage are used—file-level storage with directory structures, block-level storage with mount points, and object storage. The time a business process can be down will determine the requirements and limitations of the migration. When data access or veracity is critical, the technology used must support these requirements.
Roadblock 7: Licensing and the Fine Print
Trouble can crop up when independent software vendors have legacy language in their licensing agreements that does not support migration to a cloud. Others may require that license and compliance systems be added to validate that licensed capacity is not being exceeded.
Be sure to validate that your licenses allow you to run software in the cloud without changes to contract terms or amounts, and also determine whether the software can be run in both locations during the migration period (which could range from one to three months for small projects based on practice cutover testing to three to twelve months for large, tightly coupled systems).
Versioning can be an issue as well. An older version of a solution may need to be updated before it can be moved to the cloud; it may depend on an older version of an OS that is not supported by your cloud vendor; or it might require direct access to storage, video, or network systems.
Other variables exist. Some vendors enforce time-dependent moves, so if you move software to the cloud, it must stay in place for 90 days before you can move it again. You will need an exception to move it again or face the prospect of having to double your software licenses. Other applications may have site licenses that don’t work with cloud availability zones, so that each zone is seen as a new site, effectively doubling your license count. Finally, software licensed per CPU socket often won’t work for virtualized cloud hosting, which is yet another issue.
Roadblock 8: Back It Up, Prepare for Discovery
Electronic Record Discovery (eDiscovery) and data recovery of historical data are two areas that get overlooked—specifically, the process and tools used in the legacy data center to support legal needs for eDiscovery, or your legal and business needs to restore old data. If you are still using magnetic tape for data protection or archiving, you need to devise a plan to convert or subscribe to a third party to provide on-demand access to legacy tape assets before migration. This includes enabling data quality checks to ensure veracity in the data restored.
Security key issues may crop up, so don’t lose access to archived or backup content. Also, the systems used to store content flagged for hold by the eDiscovery system will need to be addressed to see if you can port the data over, or how you will support your historic content needs for regulatory compliance. Of course, you’ll need to forecast your future needs to identify and protect cloud-based content.
Finally, the process for data protection may need to be changed. The limitations of on-premises storage systems often dictate a different solution for data protection and restoration than for cloud-based systems. The resulting re-evaluation may require that the new solution be installed and tested before migration, so you don’t lose your ability to fall back or restore recent data needed to resolve operational issues.
Roadblock 9: Focus on the Bottom Line
Ultimately, cloud migration is not about cost alone, nor agility for that matter. It is about moving your technology-related resources into a model where you can better manage them and derive benefit. As you grow your use of cloud, you may move to different cloud vendors depending on the application. When you use multiple clouds, the number of things that can change and the rate of change more than double. The dynamic way cloud vendors bill and the multiple ways that clouds perform will trigger additional spending events. These include:
- Cloud management platform to abstract complexity and testing from the customer;
- Cloud resource management to deal with post-deployment migration and optimization of customer workloads;
- FinOps tools to track actual spend against budget and perform forecasting for spend optimization.
Together, these tools enable the best use of multiple cloud providers from a financial and flexibility perspective, delivering actual business agility. For example, FinOps lets a business get daily updates on top-line revenue impacted positively or negatively by changes in bottom-line cloud spending. This can get to the point where a company can see the cost of IT per unit sold in near-real time. Where moving data centers was not practical for on-premises applications due to physical assets, moving between regions of a cloud vendor or between cloud vendors becomes doable based on business value.
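The unit-economics view is straightforward to express once daily cloud spend and daily units sold are both available. The sketch below shows the calculation with illustrative numbers standing in for what a FinOps tool and a revenue feed would actually provide.

```python
# Illustrative unit-economics calculation: daily cloud cost per unit sold.
# In practice the spend and sales figures would come from FinOps and revenue feeds.

daily_spend_by_app = {"storefront": 1840.00, "checkout": 920.00, "search": 610.00}
daily_units_sold = 12500

total_cloud_spend = sum(daily_spend_by_app.values())
cost_per_unit = total_cloud_spend / daily_units_sold

print(f"Cloud cost per unit sold: ${cost_per_unit:.4f}")
for app, spend in daily_spend_by_app.items():
    print(f"  {app}: ${spend / daily_units_sold:.4f} per unit")
```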
5. Analyst Take: It Was Never About Migrating a Single Application
The value of migrating traditional on-premises data center operations to the cloud increases every year as public cloud vendors work to exceed the value of on-premises solutions. Improved agility, scalability, flexibility, and cost management are just a few of the benefits organizations can achieve through such a move. However, a cloud migration can also be among the most difficult and potentially risky transitions an IT organization will face, especially if a complete plan as addressed above is not agreed to by all stakeholders.
We find that organizations must do two things to effect a successful cloud migration:
- Develop a comprehensive plan that addresses the state of the IT infrastructure and targets processes such as data protection, eDiscovery, and compliance. This plan should include all the table stakes and relevant key criteria (features) needed to meet the business expectations.
- Establish a cross-functional team that reports to one person with both the spending authority to fund the migration and the executive backing to ensure robust support across business units and IT bodies.
These two actions will help avoid the roadblocks that threaten to derail cloud migration efforts, but they cannot happen in isolation. For success to be achieved, domain experts and business unit leaders must agree on the specifics of the migration before proceeding, and the executive team must buy into the strategy and desired outcome for the migration and commit to funding its long-term success.
IT leadership must keep its vision in mind so it can consider the critical second-year spend during the migration process. This requires a grasp of the details and dynamics of the ongoing migration so that cost expectations match reality as the migration develops. Failure to manage the second-year spend can result in poorly directed funds and resources, leading to damaging “sticker shock” for the executive team.
But as we have seen, cloud migration requires investment, especially to achieve the longer-term benefits of operating across multiple cloud providers. Tools such as cloud management platforms, cloud resource management, and FinOps are not cheap, but deliver returns both in terms of managing costs and enabling flexibility and operational benefit in the longer term.
Overall, cloud migration was never going to be “like for like,” but can be seen as a way of migrating the organization toward a new relationship with the technologies it adopts, consumes, and delivers to its own customers. Cloud migration, even of a single application, offers a gateway to digital transformation and all the benefits organizations are looking to achieve from that broader vision.
6. Call-out: NetApp: Cloud Migration Leadership
If you have not kept up with the changes at NetApp, you may have missed the company’s technical leadership in supporting cloud migration and CloudOps. NetApp has expanded its service and product offerings, leveraging incumbent data protection and replication capabilities with acquisitions that enable organizations to gracefully transition applications from on-premises to one or more cloud providers. These functions can be acquired separately, but NetApp offers the opportunity to stitch its tools together to provide enhanced value and ease of use.
Beyond its cloud solutions, NetApp is working with the FinOps Foundation to advance best practices in cloud financial optimization, enabling organizations to map IT spend to measured top-line performance. This combination of technical and thought leadership allows NetApp to support cloud migration efforts from initial planning to multi-year business success.
7. About Michael Delzer
Michael Delzer is a global leader with extensive and varied experience in technology. He spent 15 years as American Airlines’ Chief Infrastructure Architecture Engineer, and delivers competitive advantages to companies ranging from start-ups to Fortune 100 corporations by leveraging market insights and accurate trend projections. He excels in identifying technology trends and providing holistic solutions, which results in passionate support of vision objectives by business stakeholders and IT staff. Michael has received a gold medal from the American Institute of Architects.
Michael has deep industry experience and wide-ranging knowledge of what’s needed to build IT solutions that optimize for value and speed while enabling innovation. He has been building and operating data centers for over 20 years, and has completed audits of over 1,000 data centers in North America and Europe. He currently advises startups in green data center technologies.
8. About GigaOm
GigaOm provides technical, operational, and business advice for IT’s strategic digital enterprise and business initiatives. Enterprise business leaders, CIOs, and technology organizations partner with GigaOm for practical, actionable, strategic, and visionary advice for modernizing and transforming their business. GigaOm’s advice empowers enterprises to successfully compete in an increasingly complicated business atmosphere that requires a solid understanding of constantly changing customer demands.
GigaOm works directly with enterprises both inside and outside of the IT organization to apply proven research and methodologies designed to avoid pitfalls and roadblocks while balancing risk and innovation. Research methodologies include but are not limited to adoption and benchmarking surveys, use cases, interviews, ROI/TCO, market landscapes, strategic trends, and technical benchmarks. Our analysts possess 20+ years of experience advising a spectrum of clients from early adopters to mainstream enterprises.
GigaOm’s perspective is that of the unbiased enterprise practitioner. Through this perspective, GigaOm connects with engaged and loyal subscribers on a deep and meaningful level.
9. Copyright
© Knowingly, Inc. 2022 "Key Criteria and Strategies for Cloud Migration Success" is a trademark of Knowingly, Inc. For permission to reproduce this report, please contact sales@gigaom.com.