Saturday, January 05, 2008

Multi-Stage Continuous Integration Part III

In Part I and Part II I discussed the problems associated with Continuous Integration when combined with mainline development. In this part I’ll discuss a solution: Multi-Stage Continuous Integration.

Have Some Self Integrity

As an individual developer, you practice two simple best practices which you probably don’t even think about any more.

While you make frequent changes to your workspace which means its stability goes up and down rapidly, you only check-in your changes when you feel like you’ve done enough testing of those changes that you've sufficiently reduced the risk of impacting the stability of the mainline. Second, you only update your workspace when you are at a point that you don’t mind taking whatever the impact is from other people’s changes. These two simple practices act as buffers which protect other people from the chaos of your workspace while you prepare to check-in and protect you from problems that other people may have introduced during that same period. It also produces a variant version of the software for each developer which must all be integrated together. The longer you put off this integration, the more painful it will eventually be. Thus the need for Continuous Integration.

When you as an individual are working on a change, you are often changing several files and a change in one file often requires a corresponding change in another file. While it may seem like a bit of a trivial case, you can think of this process as self-integration. The reason that you work on those files on your own instead of having several people work on it is because the tightly coupled nature of the changes requires a single person.

It's All for One and One for All
The next level of coupling is at the team level. There are many reasons why a set of changes are tightly coupled, for instance there may be a large feature that can be worked on by more than one person. As a team works on a feature, each individual needs to integrate their changes with the changes made by the other people on their team. For the same reasons that an individual works in temporary isolation, it makes sense for teams to work in temporary isolation. When a team is in the process of integrating the work of its team members, it does not need to be disrupted by the changes from other teams and conversely, it would be better for the team not to disrupt other teams until they have integrated their own work. But just as is the case with the individual, there should be continuous integration at the team level, but then also between the team and the mainline.

Multi-Stage Continuous Integration
So, how can we take advantage of the fact that some changes are at an individual level and others are at a team level while still practicing Continuous Integration? By implementing Multi-Stage Continuous Integration of course! For now, I’ll just cover the basics, but in a future post I’ll go into more detail and cover advanced topics as well.

For Multi-Stage CI, each team gets its own branch. I know, you cringe at the thought of per-team branching and merging, but that’s probably because you are thinking of branches that contain long-lived changes. We’re not going to do that here.

There are two phases that the team goes through, and the idea is to go through each of them as rapidly as is practical. The first phase is the same as before. Each developer works on their own task. As they make changes, CI is done against that team’s branch. If it succeeds, great. If it does not succeed, then that developer (possibly with help from her teammates) fixes the branch. When there is a problem, only that team is affected, not the whole development effort.

On a frequent basis, the team will decide to go to the second phase: integration with the mainline. In this phase, the team does the same thing that an individual would do in the case of mainline development. The team’s branch must have all changes from the mainline merged in (the equivalent of a workspace update), there must be a successful build and all tests must pass. Keep in mind that integrating with the mainline will be easier than usual because only pre-integrated features will be in it, not features-in process. Then, the team’s changes are merged into the mainline which will trigger a mainline CI. If that passes, then the team goes back to the first phase where individual developers work on their own tasks. Otherwise, the team works on getting the mainline working again, just as though they were an individual working on mainline.

Multi-Stage CI allows for a high degree of integration to occur in parallel while vastly reducing the scope of integration problems. It takes individual best practices that we take for granted and applies them to the team level.

With Agile, you don’t have the luxury of doing big-bang integration at the end of your development cycle, but when you are scaling Agile beyond a single small collocated team, mainline CI doesn’t mix well with short iterations. Multi-Stage CI solves both problems.

Per-team CI is one form of Multi-Stage CI. In the next part of this series, I’ll introduce some more opportunities for Multi-Stage CI.

Next: Advanced Multi-Stage Continuous Integration

No comments: