Sunday, July 12, 2009

Guilty Confessions of a Software Developer

I tried to look like I was still mulling over the question when Mr. J continued his interrogation: "The description of this feature is incomplete. What should happen when the -t option is combined with -V and there are no other options given?" I was sweating and nervous, my chest felt tight, and I knew I was right on the edge of panic.

I couldn't remember the answer to Mr. J's question. I'm not sure I ever knew it, and the customer who had asked for this new feature had gone out of business 2 months after we implemented it, which was 3 months ago. I was helpless and trapped. The harsh light and the look in his eyes weren't helping. He knew that I didn't know. I wouldn't be able to hold out for much longer; it felt like the room was starting to spin.

Finally I couldn't take it any more. I cracked under the strain and confessed what he already suspected. "I don't know. I wrote that code so long ago, I just can't remember. I'm sorry." Mr. J shook his head and walked out of my office saying, "I'll figure it out."

OK, so this is a dramatization loosely based on a real event. But needing information about something that happened months earlier in the development process, or being the person who is asked for it, is fairly typical in a traditional development project.

Perhaps you are the developer in this situation or the person trying to write the tests for the functionality. Or perhaps you were trying to write the documentation, do the demo for the user, market it, sell it, or... use it? As an industry, we've become numb to this problem. Sure, we blame specific situations or individuals and we often try to take corrective action... but it seems to keep happening again and again.

As an industry, we try to write better requirements, but it doesn't seem to help. The tough questions keep coming. We try to take a bit more time on the design, but things just get worse. The questions get more complicated. We try to do more testing and hire "better" people, but we keep running into situations where we just don't have the answer, there's no supporting documentation, and we promise we'll try harder next time.

Why does this keep happening? What can be done to prevent it? It turns out that there is a simple explanation as well as a practical solution.

Leveraging Human Strengths in the Development Process

In the previous post we saw two ways that traditional development is like the game of telephone. First, communication suffers from translation errors at each stage of the development cycle. Second, the frequent attempts to detect, prevent, and recover from translation errors require people to remember details about work they did weeks or months in the past. This is unfortunate because people are much better at remembering something they did within the past couple of days than they are at remembering something they did weeks or months ago.

One of the advantages of Agile development is that it leverages human strengths and minimizes the need for humans to do things that humans do poorly. In the case of communicating important concepts throughout the development cycle, Agile relies on four tools: user stories, one piece flow, conversations, and reduced documentation.

A user story is a simple statement in plain English of a value proposition that the end user of the software is interested in. For instance, "As a user, I want to purchase an item I am looking at with a single click." The main value of user stories for communication is that they are used by everybody involved in the development of that story, from user to tester. While the story may be translated into a design document or test cases, the original intent of the user is always available as a double-check via the user stories.

An important part of increasing communication effectiveness is a concept called "one piece flow" which comes from Lean. The idea of one piece flow is that each aspect of developing a user story happens in rapid succession, and that each team member focuses on a single user story at a time. The result of one piece flow is that the time between when the team first commits to doing a user story and when they can ship it fully developed, tested, and documented is very short. The timeframe is generally on the order of a week at most and usually days.

Because user stories go from start to finish in such a short period of time, the bulk of the communication that occurs on behalf of a user story is done via conversations. Of course, Agile teams do document their work, but it is for the purpose of creating an audit trail, not for the purpose of short-term communication between team members. Conversations are a much better way to communicate technical concepts than documentation.

Conversations between Agile teams and their customers

From the perspective of a customer, the time between when they provide detailed information about a feature and when they can see a demo of the result is often as short as a month. Also, the scope of functionality covered in each cycle is going to be much smaller. The reduction in timeframe and scope means that it is much more likely that a customer will remember exactly why they wanted something and that they will be much more able to provide useful feedback on the result.

That doesn't mean that all customers have to be involved every month, only that there is an opportunity to go full circle with customers over a much shorter period of time than with traditional development. Consequently, miscommunication can be caught earlier and corrected faster.

Conversations in an Agile team

From the perspective of individual team members, interaction is now focused on a specific user story at any given time and the timeframe of its development is on the order of days rather than months. Work can be initiated, rapidly completed, and then confidently considered done instead of having an ever growing list of work in progress. With Agile development, the amount of work in progress is very small and is constantly changing. There is much less of a need to rely on piles of documentation or on the long term memory of oneself or others.

Are you an Agile Do It Yourselfer? Check out the web-only book

Traditional Development == The Game of Telephone?

In theory, traditional development is a good idea. The work of development is broken up into stages in order to reduce risk and increase efficiency. But in reality a significant proportion of projects are shelved, cancelled, or delivered with low quality. In part, traditional development has problems for the same reasons that it is fun to play the game of telephone.

I remember fondly playing a game called telephone in elementary school. In the game of telephone, a message is whispered from person to person. The fun part is that the last person says what he heard to the whole group and everybody gets a laugh out of how a phrase such as "Send reinforcements, we're going to advance" gets inadvertently translated into "Send three and fourpence, we're going to a dance". This is very similar to what happens in traditional software development projects.

Figure: The Traditional Development Telephone Game

The figure above represents a typical traditional development process and some typical times between handoffs. At each stage, instead of whispering the information along, the participants create and use a series of documents, each of which serves a different purpose. As the information moves from stage to stage it needs to be translated into an appropriate form for each stage of development. The documentation also serves as a record of the information that was discovered, the decisions that were made, and the justification for those decisions along the way.

The product manager talks to customers and those conversations are translated into Marketing Requirements Documents (MRDs) which are translated into engineering requirements which are translated into specifications which are translated into designs which are eventually translated into code. The end result is then given to the user who often says something pithy like "that's not what I wanted."

At each stage of development there is a risk that the intent from the previous stage is mistranslated. Another way to think of the mistranslation is data corruption. The larger the amount of work that is flowing from stage to stage, and the longer it takes for work to pass from stage to stage, the more likely it is for information to become "corrupted" and the greater the extent of the corruption. Of course, people don't just blindly translate information without interacting with people from earlier stages; they ask for clarification when they think they need it.
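To make the compounding effect concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that each handoff independently preserves a fixed fraction of the original intent. The function name `surviving_intent`, the 90% figure, and the independence assumption are all hypothetical; they are not measurements from any real project.

```python
# Hypothetical model: every handoff keeps the same fixed fraction of the
# customer's original intent, and losses at each handoff are independent.
HANDOFFS = [
    "customer conversation -> MRD",
    "MRD -> engineering requirements",
    "engineering requirements -> specification",
    "specification -> design",
    "design -> code",
]


def surviving_intent(fidelity_per_handoff: float, handoffs: int) -> float:
    """Return the fraction of the original intent left after the handoffs."""
    if not 0.0 <= fidelity_per_handoff <= 1.0:
        raise ValueError("fidelity_per_handoff must be between 0 and 1")
    if handoffs < 0:
        raise ValueError("handoffs must not be negative")
    return fidelity_per_handoff ** handoffs


if __name__ == "__main__":
    assumed_fidelity = 0.9  # illustrative assumption, not measured data
    for count, step in enumerate(HANDOFFS, start=1):
        remaining = surviving_intent(assumed_fidelity, count)
        print(f"{step}: {remaining:.0%} of the original intent remains")
```

Under these assumed numbers, a bit less than 60% of the original intent survives the five handoffs. The sketch deliberately ignores clarifying conversations, which recover some of what is lost.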

When clarification is needed, such as when somebody is writing a test and has a question about how a feature is supposed to work, it requires a conversation with the author of the documentation. There is no guarantee that the author is available when needed, knows the answer themselves, or can find the answer in the supporting documentation that they themselves used. That then requires another conversation with the architect, and then the architect with the product manager and then the product manager with customers. The customers may have given their feedback based on ideas they had during the discussion with the product manager but have long since forgotten. The further back in the chain you go, the harder it will be for people to remember the details required to answer clarifying questions.

People aren't very good at remembering the exact details of the work that they did weeks or months in the past. That’s one of the main reasons for creating the documents in the first place. When people are routinely put in the position of having to do something they are not good at, it leads to stress, lower morale, and mistakes. [See "Guilty Confessions of a Software Developer"]

Ok, so now we have a good idea of how customer intent can get lost in translation and how the people developing the software can get stressed out. While that's all very interesting, it isn't very useful unless we can figure out how to act on this information.

Next: Leveraging Human Strengths in the Development Process

Tuesday, July 07, 2009

Do It Yourself Agile

Originally, my blog was focused on Software Configuration Management. Then I described in a blog post how I realized that I'm not really an SCM person, I'm a software development process person that had been focused on SCM. So, I changed the name of the blog to "Agile Development Thoughts." Today I realized that this blog has changed its focus from random thoughts about Agile Development to a blog about Agile for the Agile DIY'er. So as of today, this blog is now "Do It Yourself Agile."

I still believe that the quickest route to Agile success is to get people involved who have done Agile before. But, that option is not always available and even when it is, it may not be possible to retain an Agile coach for the multiple years that it will take to reach your full Agile potential.

So, the idea of this site is to give the Agile DIY'er a resource to lean on. A good place to start is the web-only book "Do It Yourself Agile." This is an evolving book which is expanding and changing every week in response to questions and discussion from readers like you. If there is a topic that you don't see covered or that you would like to explore more fully, let me know! Just post a comment wherever you see fit or use the Q&A page. I'll answer your question or create a new blog post on the topic as soon as I can.

Check out the web-only book "Do It Yourself Agile".

Monday, July 06, 2009

Agile Exposes Scaling Problems

One of the first questions that seems to come up in any discussion of Agile is "How Well Does Agile Scale?" Sometimes this is asked explicitly, but more often there is an underlying assumption that Agile does not or cannot scale very well. When I was first exposed to Agile, my first impression was that Agile didn’t scale beyond a small team, say 1-12 people.

I used to try to tackle the question head on, but then I realized that there's something else going on here. Let’s take a step back. What do we currently assume about traditional development? I think we come to the discussion thinking that just because there are teams of 100 engineers, 500 engineers, and even 2,000 engineers doing traditional development, that traditional development is a proven quantity when it comes to scaling. Let’s first ask the question: “How well does traditional development scale?”

To answer that question we first have to define "scaling." I think a good working definition is that as you add new resources, there is a linear increase in effective output with little or no decrease in efficiency. For software, output and efficiency translate to new functionality and productivity. That would mean that if you have a 50-person engineering team and you add 50 more people, you get twice the output. But when was the last time you saw that? Have you ever seen it? In my experience, after talking to hundreds of development organizations doing traditional development, productivity falls and thus output does not increase linearly with additional resources. In many cases, output actually decreases as more resources are added.
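The working definition above can be sketched as a small calculation in Python. This is only an illustrative toy model, not a description of any real organization: it assumes each person contributes one unit of output and that every pair of people costs a fixed amount of coordination effort. The function name `effective_output` and the overhead values are hypothetical.

```python
def effective_output(team_size: int,
                     per_person_output: float = 1.0,
                     overhead_per_pair: float = 0.0) -> float:
    """Toy model: raw output minus a coordination cost for every pair of people.

    With overhead_per_pair == 0 the model scales linearly, which matches the
    working definition of scaling. Any positive overhead is an assumption
    made purely for illustration.
    """
    pairs = team_size * (team_size - 1) / 2
    raw_output = team_size * per_person_output
    return max(0.0, raw_output - overhead_per_pair * pairs)


if __name__ == "__main__":
    for overhead in (0.0, 0.01, 0.02):
        before = effective_output(50, overhead_per_pair=overhead)
        after = effective_output(100, overhead_per_pair=overhead)
        print(f"overhead per pair {overhead}: "
              f"50 people -> {before:.2f}, 100 people -> {after:.2f}")
```

With no overhead, doubling the team doubles the output. With a small assumed overhead, doubling the team yields well under twice the output, and with a slightly larger one the output of 100 people drops below that of 50.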

What do you actually do to "scale" your development organization? Do you have meticulously updated Gantt charts and estimates? Do you schedule lots of meetings? Do you spend a lot of time making sure that requirements and designs are just right? Do you reserve 30% of the development schedule for testing (and fixing)?

In my experience, after talking to hundreds of development organizations, the answer to the question “How well does traditional development scale?” is “not very well.” We've just suffered along with this problem, in large part because while the pain was there, the root causes were difficult to trace, let alone address.

The main point here is that if you are in the process of rolling out Agile to a large organization, don't be discouraged when you run into scaling problems. The problems were always there; they just weren't as obvious. Now, instead of wondering why things aren't coming together as expected at the end of the project, you may find out right away that your organization doesn't really have a good way to coordinate API changes for APIs that are used by multiple teams. As Agile exposes problems like this, you can take steps to solve the problems and thus create a more scalable development organization that just happens to be doing Agile development.