Why Can't Developers Estimate Time?

A few interesting points came up on a mailing list thread I was involved in. Here are a few of them. The original comments are presented as sub-headers / quoted blocks, with my response below each. This isn't a thorough look at the issues involved, just the responses I thought were relevant. Note: I've done some editing to improve the flow and to clarify a few things.

Why can't developers estimate time?

We can't estimate the time for any individual task in software development because the nature of the work is creating new knowledge.

The goal of software development is to automate processes. Once a process is automated, it can be run repeatedly and, in most cases, in a predictable time. Source code is like a manufacturing blueprint, the computer is like a manufacturing plant, the inputs (data) are like raw materials, and the outputs (data) are like finished goods. Development is the work of drawing up the blueprint, not of running the plant, so it gets none of that predictability. To use another analogy, Starbucks makes drinks so quickly and repeatably because they invested a lot of time in the design of the process, which was (and still is) a complex and expensive task. Individual Starbucks franchises don't have to re-discover this process; they just buy the blueprint. I'll leave it as an exercise to the reader to infer my opinion of the Costa coffee-making process.

It's not always a problem that development time is unpredictable, because the flipside is that so is the value returned. A successful piece of software can make or save vastly more than its cost. Tom DeMarco argues for focussing on the high-value projects for exactly this reason. Note that this does require a value-generation mindset, rather than the currently prevalent cost-control mindset. Making that shift is a non-trivial problem.

By far the best explanation I've read of variability and how to exploit it for value is Don Reinertsen's Principles of Product Development Flow, which is pretty much the adopted "PatchSpace Bible" for day-to-day process management. And when I say "by far the best", I mean by an order of magnitude above pretty much everything else I've read, apart from the Theory of Constraints literature.

Here is the data from my last development project. (Histogram generated in R with 5-hour buckets: the horizontal axis shows the duration in hours for the user stories - 0-5 hours, 5-10 hours, etc; the vertical axis is the number of stories that took that duration.) We worked in 90-minute intervals and journaled the work on Wave, so we knew task durations to a pretty fine resolution. (We did this for both client communication and billing purposes.) The result: our development times were about as predictable as radioactive decay, but they were very consistently radioactive. Correlation with estimates was so poor that I refused to estimate individual tasks, as it would have been wilfully misleading, but we had enough data to make sensible aggregates.
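To illustrate why aggregates can be sensible when individual estimates are not, here is a minimal sketch in Python. It does not use the project's data: the durations are simulated, and the exponential distribution, the 8-hour mean and the batch size of 25 are all assumptions chosen only to mimic the "radioactive decay" shape. The function names are hypothetical.

```python
import random
import statistics
from collections import Counter

BUCKET_HOURS = 5  # same bucket width as the histogram described above


def bucket_counts(durations, width=BUCKET_HOURS):
    """Count durations per bucket, keyed by the bucket's lower bound in hours."""
    counts = Counter(int(d // width) * width for d in durations)
    return dict(sorted(counts.items()))


def coefficient_of_variation(values):
    """Standard deviation relative to the mean: higher means less predictable."""
    return statistics.pstdev(values) / statistics.mean(values)


def simulate(mean_hours=8.0, batch_size=25, samples=2000, seed=1):
    """Draw simulated story durations, singly and summed in batches.

    ASSUMPTION: durations are exponentially distributed. This is an
    illustrative stand-in, not a claim about the real project data.
    """
    rng = random.Random(seed)
    draw = lambda: rng.expovariate(1 / mean_hours)
    singles = [draw() for _ in range(samples)]
    batches = [sum(draw() for _ in range(batch_size)) for _ in range(samples)]
    return singles, batches


singles, batches = simulate()

# Text version of the histogram: bucket range, then number of stories.
for start, count in bucket_counts(singles).items():
    print(f"{start:>3}-{start + BUCKET_HOURS:<3}h {count}")

print(f"single story CV:   {coefficient_of_variation(singles):.2f}")
print(f"25-story batch CV: {coefficient_of_variation(batches):.2f}")
```

Under these assumptions the spread of a single story is about as large as its mean, while the total for a batch of stories varies far less in relative terms. That is the sense in which a forecast for a batch can be honest when a forecast for one story is not.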

Rule of thumb: take a developer's estimate, double it and add a bit

The double-and-add-a-bit rule is interesting. When managers do this, how often are tasks completed early? We generally pay much more attention to overruns than underruns. If a team is not completing half of its tasks early, it is padding the estimates, and that means trading development cycle time for project schedule predictability. Cycle time is usually much more valuable than predictability, as it means getting to market sooner. Again, see Reinertsen's work: the numbers can come out an order of magnitude apart.
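The "how often are tasks completed early?" question is easy to check against a team's own records. The sketch below is one way to do it in Python, assuming you have estimate and actual durations per task. The numbers are invented for illustration, the function name is hypothetical, and "add a bit" is taken to mean one extra hour.

```python
def early_fraction(estimates, actuals):
    """Return the share of tasks whose actual time came in under the estimate."""
    pairs = list(zip(estimates, actuals))
    early = sum(1 for estimate, actual in pairs if actual < estimate)
    return early / len(pairs)


# Invented figures in hours, for illustration only.
raw_estimates = [4, 8, 6, 10, 3, 5]
actual_hours = [3, 12, 5, 9, 7, 6]

# The rule of thumb: double the estimate and add a bit (ASSUMED to be 1 hour).
padded_estimates = [2 * e + 1 for e in raw_estimates]

print(early_fraction(raw_estimates, actual_hours))     # 0.5
print(early_fraction(padded_estimates, actual_hours))  # 5 of 6 tasks "early"
```

With the raw estimates, half of these invented tasks finish early. With the padded ones, nearly all of them do, and the schedule has been stretched to buy that comfort.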

Padding is also the starting point for Critical Chain project management, which halves the "safe" estimates to condense the timescale, and puts the remaining time (the padding removed from individual tasks) at the end, as a "project buffer". This means that Parkinson's Law doesn't cause individual tasks to expand unduly. I'm unconvinced that Critical Chain is an appropriate method for software though, because the actual content of development work can change significantly as feedback and learning improve the plan.
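The buffer arithmetic described above can be sketched in a few lines of Python. This follows the description in this post, where all the padding removed from the tasks goes into the project buffer. Variants of the method size the buffer differently, so treat this as an illustration rather than a definitive implementation. The function name and the sample estimates are hypothetical.

```python
def critical_chain_plan(safe_estimates):
    """Halve each "safe" estimate and pool the removed padding as a project buffer."""
    aggressive = [estimate / 2 for estimate in safe_estimates]
    project_buffer = sum(safe_estimates) - sum(aggressive)
    return aggressive, project_buffer


# Invented "safe" task estimates in days, for illustration only.
tasks, project_buffer = critical_chain_plan([8, 4, 12])
print(tasks)           # [4.0, 2.0, 6.0]
print(project_buffer)  # 12.0
```

The total planned time is unchanged, but the safety margin now sits in one shared pool at the end instead of being attached to each task, where it would tend to get used up whether needed or not.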

People in general just make shit up

It's not just developers that are bad with estimates either. Everyone at some point is just winging it because it's something they've never done before and won't be able to successfully make a judgement until they have.

As a community we need to get away from this. If we don't know, we don't know, and we need to say it. Clients who see regular progress on tasks they knew were risky (and chose to invest in anyway) have much more trust in their team than clients whose teams make shit up. It's true! Srsly. Don't just take my word for it, though - read David Anderson's Kanban.

Estimating is a very important skill and should be taught more in junior dev roles

I propose an alternative: what we need to do is teach junior devs the meaning of done. As if estimation problems weren't bad enough, finding out at some indeterminate point in the future that something went out unfinished (possibly in a rush to meet a commitment … I mean - estimate!) blows not only that estimate out of the water, but the schedule of all the current work in process too. This is very common, and can cause a significant loss of a development team's capacity.