Release Train in mobile development

Max Kach on 2023-12-10

Release Train in mobile development

Does the team or app size affect the release process? Well, it depends. Let’s imagine a startup with one small team. In this case, the team usually makes a feature, and then just releases it. Now let’s imagine a large project, for example, a banking app, with many teams working on it. In this case there should probably be a process, release cycles, and maybe some bureaucracy. Without that, there will be chaos.

So when does it become clear that it’s time to set up such a process for your app?

In this article, I’ll share my experience of implementing Release Train for the Dodo Pizza app (Android and iOS) and problems we faced that made out team implement Release Train.

If you are Team Lead/Tech Lead of an Android or iOS project that is growing fast and you have not managed the release process yet, I hope that our experience will help you.

How it used to be

In 2021 we already were using a Trunk-based Development (TBD) approach in our teams. We covered the code with feature-toggles, decomposed tasks, ran unit and UI-tests. Our feature branches didn’t live long, and we had CI working.

The release process was very simple: whoever was ready to roll their feature out, rolled it.

Ideal scenario

Here’s roughly what our branch workflow looked like. Several teams (grey, blue, orange, and green) worked on different features. Since we were working according to TBD, each feature could live through several consecutive branches. For example, the gray team made their feature in four steps, the blue and orange teams made theirs in one step, and the green team made theirs in two steps.

When a team finished a feature, they could roll out a release. For example, if the blue team finished a feature they could do a release. Then the orange team would finish a feature and do another release.

We had the perfect flow, as it seemed then. It worked great up to a point, but all good things come to an end.

Something went wrong: tough, crowded, and unpredictable

Mammoth

The first problem we encountered was that releases started accumulating a lot of features and became too big.

The teams didn’t always want to release their features right away. The release and regression process was time-consuming and took 3–4 days. So if your feature was small and not urgent, you didn’t always get to release it yourself, because probably some other team would do a release soon and it will be included in that release. Roughly it looked like this:

This was quite a fragile arrangement, especially when the number of teams started to grow. Many teams developed many small features, and the total volume of new code in each new release became huge. When someone got to release their big feature, they had to release a whole mammoth together with it.

The mammoth-releases resulted in:

Delayed regression;
Higher risk of regression bugs;
Higher risk of getting a bug in production.

Bottlenecks

Every team is unique, and every feature is different. Sometimes things happened in such a way that several teams would finish their features around the same time. In this case, there were a lot of “wait for me, I’ll merge it tomorrow morning, I promise!” going around.

In the end, such bottlenecks resulted in:

Releases turning into mammoths;
Delayed releases negatively affecting the releasing team’s plans especially if everybody else’s needs were met.

The releasing team shouldn’t need to wait for anyone;
Every other team should know when the next release is expected.

Imagine, the blue team made a small feature, and expect the orange team to release soon. But something went wrong, and the orange team didn’t roll out the release either because of some problems of their own. As a result, the blue team told the business that the feature will be in production soon, but it turned out to be not soon enough. As a result, it is impossible to predict when the feature will be in production.

This doesn’t mean that the blue team is irresponsible. If they have a super important or urgent feature, then of course they will release it themselves. But in other cases, there is no way to tell exactly when the feature will be available to the users.

As you can guess, we were experiencing such issues quite often. We needed to be able to tell exactly when customers would get a feature in production regardless of its size or urgency. All 3 problems (mammoth releases, bottlenecks and lack of predictability) are closely related and complement each other. However, probably the most fundamental and important of them all is the lack of predictability. It generates other problems.

Release Train

We’ve had enough, it was time to make a change. The Release Train was supposed to help us do that.

The term Release Train means different things: a scheduled release process, or a dedicated team that manages the release process. Here we’re going to talk about the scheduled release process. I like the way Release Train is described by Martin Fowler in the “Patterns for Managing Source Code Branches” article, and the definition given by Thoughtworks in their tech radar (maybe it belongs to Martin as well).

This is how we have defined Release Train for ourselves:

Release Train is the process of coordinating releases between teams. All releases happen on a fixed schedule, independent of whether the features are ready or not. The train doesn’t wait for anyone. If you are late, you have to wait for the next one.

Let’s break it down with a couple of examples using our color-coded teams.

Solving the mammoth problem

Release Train happens on schedule and does not depend on who has merged what into the main branch. In the example below, the features from the blue and orange teams will be released. The rest will wait for the next train. We could wait a bit longer, but then we’d get a mammoth.

Resolving bottlenecks

At the same time, the Release Train helps us plan our work more efficiently. Let’s say that the blue team originally planned to finish a feature later. But since everyone knows the release date, they can slightly rearrange their plans to finish the feature early. Or, on the contrary, they can realize that they will definitely not be on time for the next train and hence they can finish the feature safely, because they know the whole schedule.

In the example below, the blue team wanted to get to the release and merged all their changes before the release. Otherwise, there might have been a bottleneck.

Most importantly, Release Train gave us predictability by design.

These examples may seem obvious to some, but we solved problems as they arose. When there were no problems with releases, we didn’t bother using Release Train. When problems accumulated, we realized that the time had come.

How the Release Train was introduced

The first thing we did was write an RFC. An RFC refers to both the process itself and the design document that many companies use before they start working on a project. Some use RFCs specifically, some use Architecture Decision Records (ADRs), some just call them by the more generic term Design Doc. At Dodo Engineering, we use both RFCs and ADRs.

Our RFC process looked like this:

We drafted an RFC document;
We discussed it in a small group, collected comments and made adjustments;
Then the RFC was communicated to a wider group;
Then we implemented it;
After that, we gathered feedback, tracked metrics, and evaluated results.

Description of the Release Train process;
What teams are involved, what they are doing;
What the schedule will be;
Metrics.

First, we came up with this process:

Release every week;
Create a release branch on Wednesday morning;
Complete regression, and send the app for review on Friday;
Start rolling out the release on Monday.
Release team: - One iOS and one Android developers from one of the feature teams; - Two QA engineers.
Create a new release branch on Wednesday;
Make a schedule one quarter ahead, indicating when it’s each team’s turn to release. After a quarter get together and extend the schedule.

It didn’t all go smoothly

After a month it became clear that although the first experience felt great, but:

It was really hard to do a regression every week and be done by Friday;
There was no time for hotfixes, and they sometimes happened.

The takeaway from this is that as long as you have regression tests taking a few days to complete, it will likely be uncomfortable to run a release cycle every week.

The final solution

We ended up moving to two week release cycles and Release Train now looks like this:

Release every two weeks;
Create release branch on Wednesday morning;
Regression, and send app for review on Friday;
Start rolling out the release on Monday.
Release team: - One iOS and one Android developers from one of the feature teams; - Two QA engineers.
Make a schedule one quarter ahead, indicating when it’s each team’s turn to release. After a quarter get together and extend the schedule.
Roll the release out gradually;
Do the hotfixes if needed, now when we have time for them;
A week later on Wednesday create a new release branch.

It all looks like a weekly cycle, except there’s plenty of time left for potential hotfixes. This is what it will look like in case of extended regression tests:

No big deal either, there’s still time even for hotfixes.

How did the new process affect predictability

The main goal for us was to increase predictability. It can be broken down into two parts:

When will the application be released;
In which release my feature will get into.

To further confirm this, we conducted surveys amongst mobile developers, QA and product managers, where together with other questions we asked:

When is the next release? (100% answered this question).
Did Release Train help you plan your teamwork? (75% answered positively, but some perfectly predicted their work even without a Release Train).

Impact on metrics

Beyond just solving problems, we also measured metrics. Let’s take a look at the main ones.

Lead time

The first important metric we measured was Lead Time commit to release.

This is what the graph looks like. I marked the point when we started the Release Train process with arrow.

The graph shows that Lead Time went down to somewhere around six days. Is six days a long or short time?

Google benchmarks

There are benchmarks from Google for this metric, but it’s mostly for the backend. On their scale, they distinguish the following groups:

Elite: less than an hour
High: 1 hour to 1 week
Medium: 1 week to 6 months
Low: 6 months or more

Bugs per regression

Another metric we track is the number of bugs per regression. It describes how stable the release candidate. If the previous release was a long time ago, then we probably created a lot of new code that might potentially contain a high number of bugs, and we might spend more time on regression and fixes.

At one point this metric was down to three bugs. The specific numbers aren’t that crucial, but overall you can see that the trend has gone down.

More metrics

I’ll briefly discuss what metrics were also monitored as part of the Release Train.

Crash-free. We always keep an eye on this metric. There was a hypothesis that it would drop due to our attempts to fit the regression into a tighter timeframe. Well, no drop happened.
We wondered if frequent (weekly) releases would have an impact on customer churn, and on deleting the app. As a result, we detected no impact.

We like the current process, as we think it has achieved its goals. We also know how to improve it further:

We continue to work on automating the regression to make it simpler and faster.
So far we have left out the part on work automation (scripts for branching), but this would also be a great point of growth in the future.
Our app works in 20 countries, and we need to translate the application into many different languages. There is an internal service for this, but developers still have to participate in this process manually before release. Automating this process potentially might improve the release cycle even further.

While we were relatively small, we didn’t need a Release Train. When we faced the fact that we couldn’t predict releases, their size and number, we decided to implement Release Train. At first, we tried weekly release cycles, but because of time-consuming regressions we had to switch to two week release cycles. We’ve been living this way ever since.

Now we have predictability of releases and metrics show positive dynamics. We plan to increase coverage of regression cases with e2e-tests, automate the process of working with branches and optimize the process of translations.

I hope this article and our experience will help you, especially if you have already faced the similar issues and they made you think about the release process.

Thank you so much for reading my article. I hope you enjoyed it. If you have any questions or suggestions, drop me a line in the comments and I’ll be sure to read it!

And follow me on Twitter. Usually, I post about Android development and software engineering in general.