Friday, January 27, 2006

What Exactly is a Stream Anyway?

From time to time, people ask "what is a Stream?" At this point, pretty much anybody associated with software development has heard of branches, but streams are a relatively new concept: similar to branches, but at a higher level.

Here's my $.02 on streams, coming from the AccuRev point of view. First, I think that at some level the various SCM systems which talk about "streams" are probably all trying to achieve something fairly similar: a "stream of development."

My initial exposure to "streams" was from hearing folks talk about "streams of development" independent of the SCM system that they were using (which did not have such a concept). The idea was that you had work directed toward a particular purpose, such as new development, maintenance, or a team working together on a sub-project. Each of these was a "stream of development".

I have also heard the terms "codeline", "development effort", and "line of development" used in the same context. At the end of the day, the folks who initiate these things (managers, business people, etc.) don't really care how they are implemented; they just want to ask questions like "how is 4.0 coming along?" and "are all of the fixes from maintenance in the latest release?" Somebody else then translates that into the appropriate queries, which may be in terms of branches, scripts, streams, or something else.

Prior to AccuRev, I found the need to translate somewhat mystifying. Why should there be any difference between the mental model of "streams" and the implementation model?

In AccuRev, the mental model and the implementation model are the same. Streams are the basic building block of the architecture. There are no branches or labels, just streams. There are streams for releases, streams for active development, streams for end users, etc. Each stream except the root stream is defined in terms of a parent stream and inherits everything from the parent (recursively).

So, if you want to do maintenance on the 4.0 release, you would create a new stream based on 4.0. Through inheritance, it starts out the same as 4.0. The definition itself is all that is required, and it is simply this: "4.0_maint" is everything in "4.0" plus all of the changes in "4.0_maint".
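To make the inheritance idea concrete, here is a toy model in Python. It is only a sketch of the concept described above, not AccuRev's actual implementation or API: the `Stream` class, its `changes` dictionary, the `contents()` method, and the file names and version labels are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Stream:
    """Hypothetical toy stream: a name, an optional parent, and its own changes."""
    name: str
    parent: Optional["Stream"] = None
    # path -> version label, for changes made directly in this stream
    changes: Dict[str, str] = field(default_factory=dict)

    def contents(self) -> Dict[str, str]:
        """Everything inherited from the parent chain, overlaid with own changes."""
        inherited = self.parent.contents() if self.parent else {}
        return {**inherited, **self.changes}


root = Stream("root", changes={"main.c": "v1", "util.c": "v1"})
rel_40 = Stream("4.0", parent=root, changes={"main.c": "v2"})
maint = Stream("4.0_maint", parent=rel_40)

# With no changes of its own, 4.0_maint is the same as 4.0.
assert maint.contents() == rel_40.contents()

# A fix made in 4.0_maint shows up there, but not in 4.0.
maint.changes["util.c"] = "v1-fix"
print(maint.contents())   # {'main.c': 'v2', 'util.c': 'v1-fix'}
print(rel_40.contents())  # {'main.c': 'v2', 'util.c': 'v1'}
```

Notice that creating `4.0_maint` copies nothing. The stream is just a definition (a parent plus its own changes), and its contents are worked out from that definition when you ask for them.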

Since streams are first-class objects, you can act on them directly. You can assign security attributes, lock them, define other streams in terms of them, compare them directly, do queries on them without having to understand how the streams themselves are composed, etc.
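As a rough illustration of what "acting on a stream directly" can look like, here is another toy Python sketch. Again, this is an assumption-laden model and not AccuRev's API: the `locked` flag, `record_change()`, and `diff()` are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Stream:
    """Hypothetical toy stream that carries its own attributes."""
    name: str
    parent: Optional["Stream"] = None
    changes: Dict[str, str] = field(default_factory=dict)
    locked: bool = False  # an attribute set on the stream itself

    def contents(self) -> Dict[str, str]:
        inherited = self.parent.contents() if self.parent else {}
        return {**inherited, **self.changes}

    def record_change(self, path: str, version: str) -> None:
        """Record a change in this stream, unless the stream is locked."""
        if self.locked:
            raise PermissionError(f"stream {self.name!r} is locked")
        self.changes[path] = version


def diff(a: Stream, b: Stream) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Compare two streams by name, without caring how each one is composed."""
    ca, cb = a.contents(), b.contents()
    return {
        path: (ca.get(path), cb.get(path))
        for path in sorted(ca.keys() | cb.keys())
        if ca.get(path) != cb.get(path)
    }


rel_40 = Stream("4.0", changes={"main.c": "v2", "util.c": "v1"})
maint = Stream("4.0_maint", parent=rel_40)
maint.record_change("util.c", "v1-fix")

rel_40.locked = True          # lock the release stream itself
print(diff(rel_40, maint))    # {'util.c': ('v1', 'v1-fix')}
```

The caller of `diff()` only names the two streams. A question like "what is in maintenance that is not in 4.0?" becomes a question about two objects, rather than about whatever branches or labels might sit underneath.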

For the curious, I've written a whitepaper which describes AccuRev's stream architecture at an even deeper level:

http://accurev.com/product/docs/AccuRev_Streams.pdf

And if you want to go even deeper than that, the basis of AccuRev's streams is TimeSafe:

http://accurev.com/accurev/info/timesafe.html