Mainframe modernization antipatterns

FEB 25, 2021

Posted by Travis Webb

This blog post describes common pitfalls and antipatterns to watch for when migrating your mainframe workloads, and helps you understand and avoid them. Migrating or modernizing your mainframe workloads is complex and challenging, even under ideal conditions. If you avoid the antipatterns discussed in this post, you increase the odds of a successful transformation.

This blog post is useful whether you're planning to migrate your mainframe workloads to Google Cloud, to on-premises virtual machines, or to another cloud provider. It demonstrates how to remedy certain mainframe migration antipatterns using technology offerings from Google. In principle, however, you could apply these remedies to many kinds of transformations with different target platforms and architectures.

This blog post describes three common antipatterns: the big bang rewrite, the lift-and-shift migration, and the in-place modernization.

These approaches can work in some narrow circumstances when migrating mainframe workloads. Avoid them, however, because they have a high probability of failure. For each antipattern, this post gives an overview, the typical rationale used to justify it, and the business and technical reasons that lead to failure.

Big bang rewrite antipatterns

In a big bang rewrite, you or your team manually rewrite and re-architect the legacy mainframe code into a modern language using modern design patterns. For example, you might form a development team to build a new Java application that replicates the business logic from a collection of legacy COBOL programs. Senior engineers who are familiar with the system often teach junior engineers the rationale behind the business logic to preserve institutional knowledge. The result is a new codebase using new programming languages and new documentation on a new platform.
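
As a concrete, if simplified, illustration of what such a rewrite produces, the sketch below reimplements a small COBOL-style billing rule as a Java class. The rule, class names, and amounts are invented for this example and don't come from any particular system.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/**
 * Hypothetical rewrite of a legacy COBOL paragraph that applies a
 * volume discount to an invoice total. The business rule is invented
 * for illustration only.
 */
public class InvoiceDiscountRule {

    private static final BigDecimal THRESHOLD = new BigDecimal("10000.00");
    private static final BigDecimal DISCOUNT_RATE = new BigDecimal("0.02");

    /** Returns the discounted invoice total, rounded to two decimal places. */
    public BigDecimal apply(BigDecimal invoiceTotal) {
        if (invoiceTotal.compareTo(THRESHOLD) < 0) {
            // Below the threshold, the total passes through unchanged.
            return invoiceTotal.setScale(2, RoundingMode.HALF_UP);
        }
        BigDecimal discount = invoiceTotal.multiply(DISCOUNT_RATE);
        return invoiceTotal.subtract(discount).setScale(2, RoundingMode.HALF_UP);
    }
}
```

Each such rule looks trivial in isolation; the cost of a big bang rewrite comes from reproducing the full collection of such rules, along with the batch jobs, data definitions, and interfaces around them, faithfully enough that downstream systems don't notice the switch.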

Of the three antipatterns discussed in this post, the big bang rewrite requires the largest investment of capital and time to succeed. It is capital-intensive and time-intensive largely because most organizations can’t resist the temptation to re-engineer and improve the business logic along the way.

Rationale

Re-engineering your systems using modern technologies allows for future innovation. Your senior engineers are moving on—to management, competitors, or retirement—and you need to transfer institutional knowledge to incoming staff. You expect those incoming staffers to re-engineer the system using the latest programming best practices. These less experienced engineers can rewrite module by module and take advantage of current development methodologies and tools. Because you have all the code, you have an exact specification for what the new software needs to do, and you can test against it. Access to the original code lets you compress the decades of investment in your original mainframe software into a modern application. At the same time, you are transferring institutional knowledge from your senior engineers to your junior engineers. At the end of the process, you'll have a new system consisting of well-engineered software built against modern design patterns and best practices.

This case is compelling and can help convince your IT decision-makers. Though the approach appears rational, there are hidden pitfalls and risks that your team might not recognize at the outset. Budget overruns, unanticipated complexity, and staff turnover can derail a significant rewrite before it delivers any benefits. As a result, big bang rewrites rarely live up to the best-case scenarios presented to stakeholders. Often, they fail outright.

Risks, pitfalls, and outcomes

Big bang rewrites often suffer from the second-system effect. Early in the project, they fall behind schedule and over budget. While you can quickly develop prototypes, getting them to behave in the same way as the original code is a long-tail effort that most teams underestimate. This unanticipated setback leads to the first major decision point in your project: How do I overcome these challenges but still achieve the outcomes that I need to make the project successful?

The first option: Continue to plod diligently along the long path and adhere exactly to the original functionality. However, matching the new system precisely to the original functionality always takes longer than expected. This is because the original code provides little or no improvement in productivity over a conventional specification. That means making a significant engineering investment to understand the original code and reproduce its behavior.
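
To see why faithful reproduction is expensive, consider the hypothetical sketch below. It assumes the legacy program stores a computed monthly fee in a fixed-point field that silently truncates extra decimal digits, so the rewritten Java code has to preserve that truncation rather than round; the field behavior, formula, and names are invented for this illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/**
 * Hypothetical example: the legacy program stores the computed fee in a
 * field equivalent to PIC 9(7)V99, which truncates extra decimal digits.
 * The rewrite must reproduce that truncation or downstream reconciliation
 * reports will no longer match the mainframe's output.
 */
public class LegacyFeeCalculator {

    public BigDecimal monthlyFee(BigDecimal balance, BigDecimal annualRate) {
        BigDecimal raw = balance.multiply(annualRate)
                                .divide(new BigDecimal("12"), 9, RoundingMode.DOWN);
        // RoundingMode.DOWN mimics the legacy field's truncation to two
        // decimal places. RoundingMode.HALF_UP would look "cleaner" but
        // would differ from the mainframe by a cent in some cases.
        return raw.setScale(2, RoundingMode.DOWN);
    }
}
```

Discrepancies like this are exactly the long-tail work: each one is small, but finding and matching them across the codebase is what makes the rewrite take longer than the early prototypes suggest.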

The second option: Implement the business logic differently. However, changes in business logic necessarily require changes to the business processes and downstream systems that depend on the original behavior. For example, you could have a web application that depends on the idiosyncratic behavior of your mainframe applications. Rather than incorporate these idiosyncrasies into the new, rewritten application, it is tempting to simplify and improve this behavior. However, doing so adds scope to the project. The chain reaction of further changes required in downstream systems introduces additional risk and prolongs the rewrite effort.

If your production mainframe system requires ongoing maintenance or updates during the rewrite, these problems compound. For example, you might have a rules engine that powers a billing system on your mainframe. To support a new product launch, you need to add a feature to the rules engine to accommodate a new customer billing type. You also need to implement this new type in the current system and replicate it in the new system, possibly after the billing component has already been rewritten and tested. This maintenance and update scenario can occur many times during a big bang rewrite, setting the project back at each step and increasing the odds of failure.
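
To make the dual-maintenance burden concrete, the hypothetical sketch below shows what adding one new billing type to the rewritten Java rules engine might look like. All type names and prices are invented; the point is that an equivalent change also has to be coded, tested, and released in the legacy COBOL rules engine until cutover.

```java
import java.math.BigDecimal;

/** Hypothetical billing types supported by the rewritten rules engine. */
enum BillingType {
    STANDARD,
    PREPAID,
    // Added for the new product launch. The legacy COBOL rules engine
    // needs an equivalent change, delivered and tested separately, so
    // that both systems produce the same invoices during the rewrite.
    USAGE_BASED
}

class BillingRulesEngine {

    BigDecimal monthlyCharge(BillingType type, BigDecimal usageUnits, BigDecimal unitPrice) {
        switch (type) {
            case STANDARD:
                return new BigDecimal("49.99");   // flat monthly fee
            case PREPAID:
                return BigDecimal.ZERO;           // charged up front, nothing monthly
            case USAGE_BASED:
                // Duplicates the rule that must also exist in the legacy
                // system until the billing component is cut over.
                return usageUnits.multiply(unitPrice);
            default:
                throw new IllegalArgumentException("Unknown billing type: " + type);
        }
    }
}
```

Every change like this has to be kept in sync and regression-tested on both platforms, which is how each maintenance cycle pushes the rewrite further behind.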

Even for companies that have the tenacity to see a multi-year transformation effort through, the raw cost of a rewrite is often prohibitive. Compared to all other approaches, a big bang rewrite is the costliest way to modernize your mainframe software. It often has the least convincing return on investment (ROI) when you factor in the risks, unanticipated costs, and delays.

Lift-and-shift migration antipatterns

A lift-and-shift migration is an established method of moving an application from one system to another with minimal changes and downtime. It's commonly used to migrate virtual machines running on commodity hardware to virtual machines in a public cloud. You can take a similar approach with your mainframe migration.

Mainframe platforms are based on proprietary hardware rather than x86-based commodity hardware, so moving your applications directly from the mainframe into the cloud, as you would with virtual machines, requires emulating your mainframe environment on x86-based machines. To run your applications in the emulated environment, you recompile them using a compiler provided by your emulation vendor.

Rationale

Lift-and-shift migration is often seen as the quickest way to get from an on-premises environment to the cloud, and you can apply the same thinking to mainframe workloads. Strategic IT decisions are often most palatable when you're facing a key transition, such as a hardware refresh. Mainframe hardware investments are capital-intensive, and financing the purchase often adds debt or lease liabilities to your company's balance sheet. By moving to the public cloud, mainframe workloads can scale both up and down to optimize resource use and operational cost. Compared to other migration or modernization options, you can make a strong business case that a lift-and-shift migration provides the quickest ROI and carries the lowest risk.

Risks, pitfalls, and outcomes

The business risks of a lift-and-shift migration appear small compared to other approaches, but the potential benefits are even smaller. The benefits of migrating off the mainframe platform to the cloud don’t materialize, because you remain locked into the same mainframe ecosystem, but now with an extra dependency on an emulation layer. That dependency can result in a new set of technical challenges that are often unfamiliar to the teams maintaining the mainframe software, and that unfamiliarity can lead to additional reliance on a new, single-vendor cloud ecosystem.

By not changing your mainframe software, you avoid solving many important problems: scarce and shrinking mainframe talent, a static ecosystem, a lack of agility, and an inability to innovate. You're now running your legacy workloads in the cloud, but remain locked out of cloud innovations due to your continued reliance on proprietary platforms.

In this antipattern, the cost benefits that you relied on to justify the investment don’t materialize. While you might spend less after combining your cloud infrastructure costs with your new, ongoing emulation software license fees, the savings don’t justify the investment. The outcome is that you've taken all the risks inherent in any migration, but have realized few of the benefits, if any.

In-place modernization antipatterns

In an in-place modernization, you focus on improving the quality, maintainability, and testability of your software while keeping it on your mainframe computers. You might choose this approach because you see mainframes as part of your future and know that you must modernize your application software accordingly.

You can rewrite your application software to use modern languages that run on the mainframe, or you can re-architect it in place. For a partial cloud-like experience, you can install orchestration technologies such as Kubernetes.

Rationale

Mainframe software presents challenges related to maintainability, innovation, agility, and extensibility. By re-architecting and re-engineering this software to align with modern standards and design patterns, you can avoid many of the pitfalls that disrupt large replatforming efforts. Moving off the mainframe is the single largest risk. By avoiding that move, you can improve the odds that your project succeeds. Of all the mainframe modernization approaches you might consider, an in-place modernization appears to be the lowest risk. There's no migration component, so there's no risk of downtime.

There is an ecosystem of vendors offering tools to help with mainframe development using modern methodologies. Therefore, the risk of being left to support the software on your own is low. An in-place modernization often takes longer than a lift-and-shift migration or a code conversion. By modernizing slowly, however, you afford your teams the time they need to learn new development processes. When you re-engineer and re-architect the codebase, you can perform a more rational analysis to better understand whether the mainframe is the appropriate long-term platform.

Risks, pitfalls, and outcomes

An in-place modernization suffers from many of the same challenges as the big bang rewrite. Any approach that involves manually updating your mainframe software can run over budget and behind schedule, and these efforts also often suffer from the second-system effect. Performance and correctness issues inevitably arise because rewriting business logic in a new language requires extensive testing before it matches the previous functionality. As management learns how modest the benefits of running updated software on the same mainframe platform really are, expect their willingness to see such a drawn-out and costly transformation through to wane.

The biggest issue with an in-place modernization is that the ideal outcome leaves you with many of the same problems that you started with. The mainframe is more than a piece of hardware; it comes with a talent pool, a software platform, and a vendor ecosystem. Each of these is trending in the wrong direction: every year the talent pool shrinks, the software platform becomes more isolated, and the vendor ecosystem consolidates.

Finding help

Google Cloud offers various options and resources for you to find the necessary help and support to best use Google Cloud services.

You can find more resources to help you migrate workloads to Google Cloud in the Google Cloud migration center.

For more information about these resources, see the finding help section of Migration to Google Cloud: Getting started.

What’s next?