Managing Tech Migration & Replatforming
What is a tech migration? What are the different types of migration?
A migration is a fundamental change to your tech stack. There are two common types:
- Cloud migration – most commonly, this involves moving from on-prem servers to the cloud or a hybrid model; some larger companies make the opposite move from the cloud to a data center.
- Tech stack migration – a significant change in the software your organization uses. You might be migrating from one software platform to another or changing the programming language you use.
What are the different sub-components of a migration?
Infrastructure – the systems your software runs on, including your servers, operating system, and database platform. For example, it could be moving from a data center with VMware to AWS using CloudFormation and CI/CD tooling. These are the fundamental underpinnings of your platform and the metaphorical hardware and tools to manage it. Your customers won’t see these changes unless something goes wrong, but everything you do is built on this foundation.
Software Stack – changes could include language choices, frameworks and architecture, and tooling of the software you have built. This is what your customers directly interact with.
Data – most organizations have anywhere from gigabytes to petabytes of customer data. It’s usually crucially important to your organization to migrate this successfully from one system to another.
Skills – if you have a specialized team, migrating any of the fundamental pieces of your platform or software can result in a big shift in talent needs. You may have to hire new talent or retrain existing talent, and you may need to take stock of the skill sets already on your team, including ones you aren’t yet aware of.
| Rehost | |
| What is it | “Lift and shift”: taking exactly what you have in your existing environment and rebuilding it as closely as possible in a new one. For example, if you have a hand-built data center in VMware, you take the exact same architecture and drop it into AWS. It’s a relevant approach for cloud or data center migrations but is less applicable to tech stack migrations. |
| Advantages | – It’s frequently the easiest way to migrate quickly. Companies often use it when they have a fast-approaching deadline to move off of a platform for contract reasons. – There’s a lower chance of outages and customer impact in the short term. |
| Disadvantages | – You get the least benefit from the migration and likely miss out on many of the tools available in the new environment. – It costs the most in the long run. If you’re moving into the cloud and handling it the same way you would a data center, it’s going to cost more. – There’s a higher risk of outage or data loss in the long term, as your architecture isn’t optimized for the different constraints of the new environment. |
| Replatform | |
| What is it | Shifting your approach to accommodate the new platform but not committing to fully invest right out of the gate. For example, if you’re moving to AWS, you would use RDS rather than running your own database on EC2 instances. Then, you make iterative changes to your architecture that allow you to move with relative ease. When evaluating wholesale changes to databases or programming languages, you’ll typically look for plug-and-play equivalents first. This applies well to various migration types. A lot of organizations start with replatforming and then invest in a full rearchitect down the road. |
| Advantages | – It may not lock you into investing in a specific vendor, and you can still invest to get optimizations. – You get some moderate benefits from the cloud for moderate investment and can be more selective and invest in high-value changes. |
| Disadvantages | – You don’t get the full benefit of investing in the cloud when you have one foot in and one out. There are optimizations in performance and cost savings that you miss out on by not rearchitecting. |
| Rearchitect | |
| What is it | Committing to a full rewrite of your application: rethinking and rearchitecting everything to take advantage of the new platform. You’ll use the new platform’s offerings heavily (e.g., on AWS, tools like EventBridge, Lambda, AWS Batch, or ECS). In every area, you return to first principles and make an active decision on languages, databases, etc. You’ll likely keep your original setup running in parallel while you rearchitect a new system. |
| Advantages | – You can get the most benefit from investing in the platform you’re moving to. – If you keep your original platform running in parallel, it lowers your chances of data loss. |
| Disadvantages | – It takes a lot of work and there is a high risk of failure. It’s easy to underestimate the effort it takes, and there’s a risk of not seeing it through and losing the work already invested. – It comes with higher chances of outages and stability issues as you move to the new platform. – The impact on the skills of your existing team can be significant. |
What approach you take depends upon:
- The benefits you’re trying to realize – rehosting will end up with more moderate improvements, while rearchitecting can lead to significant leaps in performance.
- Current state of your platform – if you’re far behind and playing catch up with your platform, a rearchitect might make sense to capture big performance and cost improvements.
- Engineering resources available – if you need to get your platform aligned with the market to find talent and skills, you may make a bigger leap with your migration.
Most of the time, replatforming makes the most sense – it’s the most iterative approach that allows for more dynamism, and you can continue to develop your product linearly during the change. You can still add features while you’re replatforming, and you can integrate your new host or platform.
What are the reasons that you might need to migrate?
To enhance your ability to automate – your whole organization will be held back if your current state can’t take advantage of new automation and capabilities on more modern platforms. Your approach and operations will become stale in ways you wouldn’t necessarily expect.
To expand your talent pool for recruiting – if your organization is far ahead of or behind standard industry practices, it can present a real obstacle to finding talent. With legacy systems, your talent ages out or changes career path over time; it becomes hard to hire those skills, and people are usually unwilling to train on older technologies if they don’t see a future in them. If you’re far ahead of the curve, time is on your side as long as you bet on the right technology.
To realize cost savings (depending on your traffic pattern) – if you have highly variable traffic patterns, a move to the cloud will likely lead to cost savings because the cloud allows you to only pay for what you’re using. If you have a very consistent workload, the cloud may end up having a negative cost impact.
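The traffic-pattern point can be made concrete with a little arithmetic. The sketch below compares a fixed fleet sized for peak load against a fleet resized every hour. All numbers are illustrative assumptions (including the premium on the cloud unit price), and the function names are hypothetical, not taken from any provider's pricing model.

```python
import math

def fixed_fleet_cost(hourly_load, unit_capacity, unit_hourly_cost):
    """Cost of a fleet sized for the peak hour and paid for every hour."""
    peak_units = math.ceil(max(hourly_load) / unit_capacity)
    return peak_units * unit_hourly_cost * len(hourly_load)

def elastic_fleet_cost(hourly_load, unit_capacity, unit_hourly_cost):
    """Cost of a fleet resized each hour to match load (minimum one unit)."""
    return sum(
        max(1, math.ceil(load / unit_capacity)) * unit_hourly_cost
        for load in hourly_load
    )

# Assumed inputs: one server handles 100 requests/s; the cloud unit price
# carries a 50% premium over the equivalent data center unit price.
CAPACITY = 100
DATA_CENTER_PRICE = 1.0
CLOUD_PRICE = 1.5

spiky_day = [100] * 20 + [1000] * 4   # quiet most of the day, short peak
steady_day = [1000] * 24              # flat load all day

spiky_fixed = fixed_fleet_cost(spiky_day, CAPACITY, DATA_CENTER_PRICE)
spiky_elastic = elastic_fleet_cost(spiky_day, CAPACITY, CLOUD_PRICE)
steady_fixed = fixed_fleet_cost(steady_day, CAPACITY, DATA_CENTER_PRICE)
steady_elastic = elastic_fleet_cost(steady_day, CAPACITY, CLOUD_PRICE)
```

Under these assumptions the spiky workload is cheaper on the elastic fleet (90 versus 240), while the steady workload is more expensive (360 versus 240), which matches the direction of the argument above. Real pricing has many more variables, so treat this as a way to frame the question rather than an estimate.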
For security improvements – when migrating to the cloud you give up some control in order to benefit from the security expertise of the cloud provider. An increase in security awareness over the last several years means newer software platforms often have security concerns baked in as well. Be sure to understand where their responsibility ends and where yours begins—it requires care as security always does.
For performance improvements – cloud providers spend so much time optimizing performance that your own investments won’t be able to keep up. Migrating lets you take advantage of those improvements at a lower cost.
Due to a merger or acquisition – typically, the acquired company transitions to match the parent company. Sometimes, a smaller company is acquired for particular platform expertise and the parent company will move to their platform. The instinct is to make everything as similar as possible for ease of management, but you need to evaluate before making changes. For example, authentication stacks almost always get standardized. But if you buy a company on Azure and you’re using AWS, a full infrastructure migration is a big decision.
What are some common myths about migration?
Myth: that cloud setups are less secure – some people associate more control with more security. However, security is a core competency for cloud providers. Providers like AWS can do security far better than most of their customers, while freeing up more of your time to focus on differentiating your business.
Myth: that more control = better performance – a lot of orgs believe the more control they have the better performance they can squeeze out. While this can be true in some ways, it’s easy to miss the forest for the trees and lose sight of opportunity costs. Giving up some control to an opinionated platform can free up a lot of engineering time that could be better spent on other parts of your business.
Myth: that replatforming will necessarily save you money (or cost more) – platform expense is often misjudged in both directions. Some people automatically assume a new platform will save money, others assume moving up-stack will be more expensive. You need to evaluate the cost considerations of your particular situation.
What are some real risks of migration?
The potential for heavy investment without the results you want – if you invest in a rearchitecture that you’re unable to complete for whatever reason, you can end up with massive sunk costs.
Security risks – if you don’t take the time to understand the new security model, you can open yourself up to threats you didn’t realize were there. While the new platform might be more secure, the transition presents a risk.
Damaging relationships with your engineering team – if you migrate without getting buy-in from employees, you may see a loss in productivity, or talent leaving outright.
Switching to a new platform that ultimately doesn’t survive – it’s a particularly relevant risk if you invest in a very new platform. It could die off, forcing you to migrate for the second time in a short period.
Data loss or outages – both of these are very real risks during a migration. Data loss is worse because the information you lose is likely valuable and irretrievable, while outages can be remedied.
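One common way to reduce the data loss risk is to verify that what arrived in the new system matches what left the old one. The sketch below is a minimal illustration, assuming both sides can be exported as lists of JSON-serializable records; the `fingerprint` helper and the sample records are hypothetical.

```python
import hashlib
import json

def fingerprint(rows):
    """Return (row count, order-independent SHA-256 digest) for a set of records."""
    digest = hashlib.sha256()
    # Serialize each record deterministically, then sort so row order is irrelevant.
    for line in sorted(json.dumps(row, sort_keys=True) for row in rows):
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return len(rows), digest.hexdigest()

source_rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
]
# Same records, different order and key order, as a new system might return them.
migrated_rows = [
    {"email": "b@example.com", "id": 2},
    {"email": "a@example.com", "id": 1},
]

matches = fingerprint(source_rows) == fingerprint(migrated_rows)
```

A check like this only tells you that something differs, not what. In practice you would run it per table or per partition so a mismatch points at a small, inspectable slice of data.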
What do you need to consider in preparing your organization for migration?
Prepare a few things before you undertake a migration:
- Understand your goals for migration and how best to achieve them – migration isn’t a one-size-fits-all process.
- Get buy-in and understand the impact on employees – understand how the changes you make will integrate into your overall culture and how it will impact individual contributors.
- Consider what new talent or skill sets you need to add – some of the skills you need might not exist on your current team, in which case you need to hire for them or seek outside help.
- Understand how your changes integrate with the process – your engineering and software development processes should alter along with new possibilities of your platform.
- Ensure that disaster recovery is in place – a migration involves digging into your application and infrastructure at a deeper level than you normally would. This dramatically increases the risk of disaster.
How might you change your architecture to get the most value out of your migration?
Align your architecture with the capabilities and expectations of your new platform – if you’re migrating to the cloud from a data center, you should shift your thinking to be more elastic. Other kinds of migrations have different specifics, but every platform is designed with certain metaphors in mind, from the classic 7-layer computing metaphor to event-driven computing, data brokers, or microservices. Understand those metaphors and align your thinking and processes with them.
Shift to horizontal scaling and get rid of ‘pets’ – rapid horizontal scaling can be a big benefit for your engineering org. Removing special patches and increasing elasticity provides a consistent benefit across different clouds. You can respond much more quickly to problems when servers are disposable.
What engineering process changes might you make alongside a replatforming?
Take the chance to evaluate all of your processes – a migration (and particularly a rearchitecture) offers a chance to step back and evaluate what works best across your entire organization in the new environment. Be intentional. Ask what the intended purpose of a given process is, and evaluate whether it’s achieving that goal.
Implement infrastructure as code – IaC allows you to understand your infrastructure at a more fundamental level and increase your flexibility. Making changes in the cloud requires less of a long-term commitment and less upfront expense than changes in a data center. Embrace that flexibility and integrate infrastructure configuration into your software development lifecycle.
Adjust change management processes to match the new reality – the flexibility of IaC calls for a shift in how you think about change management. Existing change management processes are designed for environments where changes are an expensive commitment that can be difficult to roll back. With IaC, changes can be made or rolled back in minutes, and can be managed with your existing code review process. Embracing this can allow for significant efficiency improvements, and reduce or eliminate engineering time spent waiting on change management.
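The core idea behind infrastructure as code is that you declare the state you want and let tooling work out the changes. The sketch below illustrates that idea in plain Python with a hypothetical `plan` function and made-up resource names. It is not the API of any real IaC tool, only a picture of why changes and rollbacks become cheap and reviewable.

```python
def plan(current: dict, desired: dict) -> dict:
    """Compute the changes needed to move from the current state to the desired one."""
    shared = current.keys() & desired.keys()
    return {
        "create": sorted(desired.keys() - current.keys()),
        "delete": sorted(current.keys() - desired.keys()),
        "update": sorted(name for name in shared if current[name] != desired[name]),
    }

# Hypothetical resource definitions; in practice these live in version control.
running = {
    "web": {"size": "small", "count": 2},
    "legacy-cron": {"size": "small", "count": 1},
}
proposed = {
    "web": {"size": "small", "count": 4},
    "queue": {"size": "medium", "count": 1},
}

change_plan = plan(running, proposed)
# A rollback is the same operation with the arguments swapped.
rollback_plan = plan(proposed, running)
```

Because the proposed state is just a file, the change plan can be attached to a pull request and reviewed like any other code change, which is what lets the code review process stand in for a heavier change management gate.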
Find the necessary talent – get the right people and skills in your organization to manage the future state of your platform; this could involve hiring new employees or retraining existing ones.
Shift to a culture of dynamism and experimentation – if you can make changes quickly with minimal impact, you can have a culture of flexibility and experimentation around your platform and infrastructure. Your engineers can experiment earlier in the process to validate what works and what doesn’t.
What components of a shift to cloud computing typically have the worst ROI?
Usually, underutilized and abandoned resources – it’s common to overprovision resources to play it safe rather than putting in the work to right-size compute and manage scaling. It’s also common for old, unused servers and services to stick around for a long time, or to fail to deprovision infrastructure after completing an experiment. This can raise costs substantially.
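A periodic report is often enough to surface this kind of waste. The sketch below classifies resources from two metrics, average CPU use and days since the last request. The `Resource` fields, the thresholds, and the sample inventory are all assumptions for illustration; real data would come from your provider's monitoring and billing exports.

```python
from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    avg_cpu_percent: float
    days_since_last_request: int

def classify(resource, cpu_threshold=10.0, idle_days=30):
    """Label a resource as 'abandoned', 'underutilized', or 'ok' (assumed thresholds)."""
    if resource.days_since_last_request >= idle_days:
        return "abandoned"
    if resource.avg_cpu_percent < cpu_threshold:
        return "underutilized"
    return "ok"

inventory = [
    Resource("api-prod", avg_cpu_percent=55.0, days_since_last_request=0),
    Resource("batch-oversized", avg_cpu_percent=4.0, days_since_last_request=1),
    Resource("experiment-2021", avg_cpu_percent=0.5, days_since_last_request=120),
]

report = {resource.name: classify(resource) for resource in inventory}
```

The thresholds matter less than running the report regularly and assigning someone to act on it.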
What can you do to change the way your application is structured to save on costs?
There are three areas of cost savings you should consider:
- Take advantage of elasticity – unlike in the data center where you need to provision for your peak load and pay for those resources up front, in the cloud you can pay for only what you need moment to moment. This can be as simple as horizontally scaling in and out based on load, or as complex as re-architecting for an event-driven serverless infrastructure.
- Dynamically leverage “spot” compute – just like being overprovisioned costs you money, it costs the cloud providers money as well. In order to mitigate this, most cloud providers offer spot compute, which is a way for cloud providers to sell excess capacity at significantly reduced prices. The downside is that these instances are ephemeral by nature and can be terminated with little warning. If you design your software to be able to handle that disruption cleanly, you can see significant cost savings by dynamically moving to whatever the least expensive option is at a given moment.
- ARM servers are typically less expensive – many cloud providers now offer ARM servers. These are significantly less expensive, but since ARM is a totally different architecture from the typical x86, some software may not run as well when compiled for ARM. Most modern software isn’t going to be significantly impacted by this, though, so it’s worth evaluating.
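Handling spot interruptions cleanly usually comes down to saving progress often enough that a replacement instance can pick up where the last one stopped. The sketch below simulates that pattern with an in-memory checkpoint. The `process_batch` function, the `interrupted_at` parameter (which stands in for the provider reclaiming the instance), and the doubling "work" are all hypothetical; a real worker would persist the checkpoint to durable storage.

```python
def process_batch(items, checkpoint, interrupted_at=None):
    """Process items from the checkpointed position; return True when the batch is done."""
    results = checkpoint.setdefault("results", [])
    start = checkpoint.get("next", 0)
    for index in range(start, len(items)):
        if interrupted_at is not None and index == interrupted_at:
            # Simulated spot termination: stop here, progress is already saved.
            return False
        results.append(items[index] * 2)   # placeholder for real work
        checkpoint["next"] = index + 1     # record progress after each item
    return True

work = [1, 2, 3, 4, 5]
checkpoint = {}

finished_first_run = process_batch(work, checkpoint, interrupted_at=3)
progress_after_interrupt = checkpoint["next"]

# A replacement instance resumes from the same checkpoint.
finished_second_run = process_batch(work, checkpoint)
```

After the second run the checkpoint holds results for all five items with none processed twice. The finer the checkpoint granularity, the less work is repeated after an interruption, at the cost of more frequent writes.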
When should you consider switching to a different cloud provider to save on costs?
In most cases, cost savings aren’t significant enough to justify the upfront cost – if you’re already in the cloud, the math on switching to a different cloud provider typically doesn’t work. But there can be exceptions:
- You’re a small startup and have been offered significant credits to switch.
- You have a huge infrastructure and can secure a significant discount by signing a long-term contract with another cloud provider.
- Your infrastructure is significantly cloud-agnostic (e.g., running everything in Kubernetes), making migration a simpler undertaking.
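The math referred to above is a break-even calculation: how many months of savings it takes to pay back the cost of switching. The sketch below uses made-up figures and a hypothetical `months_to_break_even` helper, and it models credits as a simple reduction in the switching cost.

```python
import math

def months_to_break_even(switching_cost, current_monthly, new_monthly, credits=0):
    """Months of savings needed to recover the switching cost, or None if there are no savings."""
    monthly_savings = current_monthly - new_monthly
    if monthly_savings <= 0:
        return None
    net_cost = max(0, switching_cost - credits)
    return math.ceil(net_cost / monthly_savings)

# Assumed figures: a 300,000 migration effort to save 5,000 a month.
typical_case = months_to_break_even(300_000, 50_000, 45_000)

# The same move with 250,000 in credits from the new provider.
with_credits = months_to_break_even(300_000, 50_000, 45_000, credits=250_000)
```

With these assumed numbers the switch takes 60 months to pay for itself without credits and 10 months with them, which is why the exceptions listed above can change the answer.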
What tools can you use?
Third-party migration advisors – you can find advisors offering all levels of involvement—some take your specifications and handle the entire project for you, others train your employees on the technical skills and knowledge required, and others help you plan and evaluate a migration without involving themselves in the technical side at all.
The platform you’re moving to should have tools to help you move – for example, Amazon has tools to help organizations migrate from expensive proprietary databases to more open ones that work well on Amazon—so if you’re migrating from Oracle DB to MySQL, Amazon can help you get there.
What are the most important things to get right?
Manage communications with your team – if you take a top-down approach, you’re going to incur a lot more trouble and might end up with a lot of turnover in the process. Involving frontline engineers leads to better retention and more visibility into problems as they arise.
Understand the talent and skills you have in the organization – know what you have in-house and what you need to hire for or what you need to get a third party for.
Enforce a tagging convention from the start – this will help significantly in being able to understand later what is running and why. Trying to add tagging down the line is a lot more difficult.
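Enforcing a convention is easier when the check is automated, for example as a step in a deployment pipeline. The sketch below is one minimal way to do that. The required tag names and the sample resources are assumptions for illustration, not a recommended standard.

```python
# Hypothetical convention: every resource must carry these tags with non-empty values.
REQUIRED_TAGS = ("owner", "environment", "cost-center")

def missing_tags(resource_tags: dict) -> list:
    """Return the required tags that are absent or empty on a resource."""
    return [tag for tag in REQUIRED_TAGS if not resource_tags.get(tag)]

resources = {
    "orders-db": {"owner": "payments", "environment": "prod", "cost-center": "cc-104"},
    "tmp-loadtest": {"owner": "platform", "environment": ""},
}

violations = {
    name: missing
    for name, tags in resources.items()
    if (missing := missing_tags(tags))
}
```

Here only `tmp-loadtest` is flagged, for an empty `environment` and a missing `cost-center`. Failing the deployment when `violations` is non-empty keeps untagged resources from being created in the first place.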
What are some common pitfalls?
Over- or under-planning – early in the process, you can’t pin down exactly how long the migration will take or what it’s going to cost. You can spend a lot of time and resources trying to project out just to be wrong. Conversely, under-planning can result in chaos, shadow IT, and all kinds of problems.
Being overly optimistic – start with the expectation that things may not go as planned. You can get better data during the process and use that to extend out your projections and adjust them.