Intuitively Obvious to the Most Casual Observer

Thoughts on Distributed Bug Tracking

One of the unsolved, but seemingly obvious, problems associated with the widespread growth of DVCSs is the need for correspondingly structured bug tracking tools. The approach GitHub and others tend to take is a standard (centralized) bug tracker attached to each repository, but that’s pretty obviously inadequate: bugs filed against one fork do not automatically propagate to other associated repositories, so humans have to watch manually for bugs filed elsewhere.

A few tools have been presented as distributed bug tracking solutions; they store their database in a DVCS and call it a day. That doesn’t really count, since bugs end up associated with the commit at which they were discovered, rather than the commit at which they were introduced. That means that, as with GitHub’s solution, they won’t necessarily propagate to the right repositories. I’m going to take it as axiomatic that any distributed bug tracking system that doesn’t directly address the problem of propagating bugs correctly is worthless - it’s effectively equivalent to a centralized bug tracker instance for each repository.

Let’s first consider that there are actually two types of issues that one would want to file in an issue tracker associated with code: bugs, and feature requests. The latter, of course, are easy to do in a distributed way - any feature request can immediately be filed in all associated repositories (at least after being triaged as valid).

Bugs are harder to handle, since the commit which introduced them is uncertain. (Feature requests, it turns out, can be considered identical to bugs, but implicitly introduced at the null commit, before all others.) You don’t want to propagate a bug to repositories that don’t have it, but you do want to make sure that other developers know about it. Unfortunately, the task of figuring out which commit introduced a bug is genuinely hard - usually as hard as actually fixing the bug, and sometimes harder.
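To make the propagation rule concrete, here is a minimal sketch of what a tracker could do if the commit of introduction were somehow known. Everything in it is an assumption for illustration: the commit graph is a plain dictionary rather than a real DVCS, the names (ancestors, affected_repos, PARENTS, HEADS) are hypothetical, and a fix is assumed to reach a repository only by having the fix commit in its history, which ignores cherry-picks and rewritten commits.

```python
from typing import Dict, List, Optional, Set

# Hypothetical commit graph: commit id -> list of parent commit ids.
PARENTS: Dict[str, List[str]] = {
    "a": [],        # root commit
    "b": ["a"],     # commit that introduces the bug
    "c": ["b"],
    "d": ["a"],     # branched off before the bug existed
    "f": ["c"],     # commit that fixes the bug
}

# Hypothetical repositories (forks) and the commit each one currently points at.
HEADS: Dict[str, str] = {"upstream": "c", "fork_x": "d", "fork_y": "f"}


def ancestors(parents: Dict[str, List[str]], head: str) -> Set[str]:
    """Return head plus every commit reachable from it through parent links."""
    seen: Set[str] = set()
    stack = [head]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen


def affected_repos(
    parents: Dict[str, List[str]],
    heads: Dict[str, str],
    introduced_at: Optional[str],
    fixed_at: Optional[str] = None,
) -> Set[str]:
    """Repositories whose history contains the introducing commit but not the fix.

    introduced_at=None stands for the null commit, so the issue (for example
    a feature request) applies to every repository.
    """
    result: Set[str] = set()
    for repo, head in heads.items():
        history = ancestors(parents, head)
        has_bug = introduced_at is None or introduced_at in history
        has_fix = fixed_at is not None and fixed_at in history
        if has_bug and not has_fix:
            result.add(repo)
    return result


# Bug introduced at "b" and fixed at "f": only upstream still has it.
bug_targets = affected_repos(PARENTS, HEADS, "b", "f")

# Feature request, introduced at the null commit: filed everywhere.
feature_targets = affected_repos(PARENTS, HEADS, None)
```

The lookup itself is trivial. All of the difficulty is hidden in the introduced_at argument, which the sketch simply takes as given.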

But that’s exactly the problem that has to be solved to build a distributed bug tracking system - propagation can never be addressed effectively as long as finding the commit that introduced a bug remains hard. Which of course it always will.

So it seems that distributed bug tracking is a pipe dream. Which in turn means that truly distributed software development (such as GitHub aspires to facilitate) is a pipe dream, since one of the core associated tasks cannot be done efficiently in a distributed manner. This mirrors what one sees in the various open-source communities as they stand today - the vast majority maintain a centralized code repository. (Even if they use Git or Mercurial for technical reasons, there’s still a single repository viewed as the “master copy”.) That’s not a coincidence, or because they’re behind the times; it’s because that’s what’s necessary for building software effectively.