Systems Thinking Sheffield 2: Why Won't My Car Start?

These are the slides for the presentation and interactive session at Systems Thinking Sheffield 2, held in February 2011 at the GIST Lab.

Note that the slides were prepared quite quickly, which means some of the examples are not as tight as they could be. Also, the output of the "story of the hosed monkeys" interactive tree-drawing session isn't included. I need to write a separate post about that one, as it raises interesting points both about behaviour in organisations and how to model it. (If you'd like to know more about this, please request it in the comments.)

This is the first time I've tried to present these ideas in this format, so I learned a lot. A few key points:

  • Many people's instinctive reaction to figuring out why a situation plays out the way it does is to gather facts, rather than to ask "why do we see this?" and challenge assumptions. That, I suspect, is because we think primarily by pattern matching, rather than analysis.
  • It's easier to introduce logic trees by presenting a partially-complete one (and they're all partially complete) and having people raise informal objections, than to teach by building one from scratch.
  • People value the emphasis on externalising and de-personalising problems, and questioning, rather than directly criticising, logic. I included a reference to the Agile Retrospective Prime Directive, which went down well even with a largely non-software audience.

If you have any questions, please feel free to comment. I want to refine my presentation of logic trees over time. Many people are put off them at first, but everyone who has humoured me long enough to draw one said afterwards that they found the activity valuable.

The Mars Lander (without integration tests) in Ruby

At the Agile 2009 Conference in August, J B Rainsberger gave a talk called Integration Tests are a Scam, which you can see in video. The session is well worth watching. While it's long, and takes a while to get to the core issues, it's a very thorough analysis of the costs of slow test runs and (failed) attempts to enumerate all application behaviour from too high a level.

J.B. wrote an example of how focused tests can be used to detect integration issues in the blog post Surely we need integration tests for the Mars rover!. The example is worked through in pseudo-code. I find it hard to read extensive pieces of code, so I turned it into a coding exercise. Here is the pseudo-code translated into Ruby, with comments about the order in which it was built up (more interesting than replaying the SCM patches).

J.B.'s methodical approach to collaboration/contract tests is simple and powerful, and the Mars Lander makes a good concrete example. I highly recommend working through it in your language of choice; I learned a lot.
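As a rough illustration of the collaboration/contract idea (this is not J.B.'s code, and not the Mars Lander exercise itself), here is a minimal Ruby sketch. Every name in it (`Lander`, `Altimeter`, `StubAltimeter`, `deploy_parachute?`, the 1000 m threshold) is hypothetical. The collaboration tests check that the client behaves correctly given answers its collaborator might return; the contract tests check that the real collaborator really does answer in that form.

```ruby
# Hypothetical names throughout; a sketch of the idea, not J.B.'s example.

# The real collaborator: converts a raw sensor reading (cm) to metres.
class Altimeter
  def initialize(raw_reading_cm)
    @raw_reading_cm = raw_reading_cm
  end

  # Contract: returns altitude as a non-negative Float in metres.
  def altitude_m
    [@raw_reading_cm, 0].max / 100.0
  end
end

# A stand-in used only by the collaboration tests.
class StubAltimeter
  attr_reader :altitude_m

  def initialize(altitude_m)
    @altitude_m = altitude_m
  end
end

# The client: decides when to deploy the parachute.
class Lander
  DEPLOY_BELOW_M = 1000.0

  def initialize(altimeter)
    @altimeter = altimeter
  end

  def deploy_parachute?
    @altimeter.altitude_m < DEPLOY_BELOW_M
  end
end

# Collaboration tests: given the answers the stub supplies,
# does the client make the right decision?
def collaboration_tests_pass?
  Lander.new(StubAltimeter.new(999.0)).deploy_parachute? &&
    !Lander.new(StubAltimeter.new(1000.0)).deploy_parachute?
end

# Contract tests: does the real Altimeter answer in the form
# the stub assumed (a non-negative Float in metres)?
def contract_tests_pass?
  reading = Altimeter.new(99_900).altitude_m
  reading.is_a?(Float) && reading == 999.0 &&
    Altimeter.new(-5).altitude_m == 0.0
end

puts "collaboration: #{collaboration_tests_pass? ? 'green' : 'red'}"
puts "contract:      #{contract_tests_pass? ? 'green' : 'red'}"
```

When both sets pass, the assumptions baked into the stub and the behaviour of the real object agree, which is the integration check, made without running the two together in a slow end-to-end test.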

Customer Input and the Russian Doll of Software Development

While replying to a mailing list post, I realised I was doing a terrible job of articulating where I thought the value of communication from customers is in the software development cycle(s).

The start of the thread was "is it normal for customers to have no contact with developers?". I said this is a terrible thing, and customers should always be able to talk to developers. This is simplistic - so I refined it to saying that having a primary point of contact before the developers is not a bad thing. This is unclear - so I tried to refine it, and in the process decided it was time to dust off OmniGraffle.

As an initial attempt, the model is that software development is a set of nested cycles, each of which involves specifying the problem in such a way that you can test the solution, developing something to meet that specification, and refactoring to improve design and understanding.

Now, a few unintended things spring out of this, but let's tackle the initial problem - where should customers provide input? My current position is that the maximum value of customer input is during test case preparation, as identifying what problem to solve is almost always the hardest part of software. At the other stages, the focus is technical, and, with possible exceptions, customer input is of little value.

I once worked with a guy who constantly sat down beside developers and badgered them when they were trying to work. Little of his input was useful, and much of it caused delay and multi-tasking. A good deal of it was blue sky daydreaming that probably had no benefit in the next 6 months, at least. It doesn't have to be this bad, but incoming communication that interrupts developers and is not part of a feedback loop is waste, in my experience.

However, developer contact with the customer is of immense value, as the ability to clarify and mine for insight enables simplification of code and reduces rework.

The line is thicker between the inner-most development and the customer because my experience is that when developers have questions during the coding phase, it's often about unexpected costs stemming from technical limitations. These tradeoff conversations enable economic decisions about what is feasible, rather than a build-at-all-cost mentality (yet another issue of fixed-price contracts).

I'm making no claims that this is generally applicable, and counter-examples are welcome.

What is a Market Test?

Watch out - the following is more conjecture than fact. I am only now going through the first iteration of customer development, so my opinion in relation to Lean Startup matters should not be given much (if any) weight.

I coined the term Market Test because I couldn't think of anything better to represent the idea that what you fundamentally need is to specify something that will sell (or be used, if it's free/internal etc). An example of what I have in mind is an analytics system that monitors signup rate. It's analogous to the Customer Development Engineering on slide 23 of The Lean Startup slides (Eric Ries and Steve Blank). I nested it because I'm naturally uneasy about any segmentation or conflict (read "discordant" rather than "antagonistic") introduced into a development cycle. The idea that Customer Development Engineering is segmented from or in conflict with development may be a misinterpretation; it may be presented that way for visual impact.

But: A team that can't invalidate its own assumptions is lacking a core self-improvement skill.

How many cycles are there?

It has been pointed out to me that the cycles in this diagram are all fundamentally the same, and that each one falls out of the next. Exactly what needs to be done at each layer is a technicality. That means the diagram simplifies to this:

And let's put the customer inside the process FTW. This implies that all you need to develop valuable software is:
  • someone capable of identifying/proposing a problem and proposing a solution
  • an iterative development process that incorporates self-improvement
  • a development team capable of applying this recursively to solve problems at all levels
Which fits with another idle thought I have at the moment - that the only valuable design principles are those that apply recursively - but that one needs to be worked out first.

Comments welcome. Especially any that explain how I ended up at this conclusion simply by asking "where should customers provide input?".

Testing Software is not Expensive - It's Free

A common criticism of (aka excuse for not doing) test-driven-development is that it's too expensive in terms of developer time. Critics who take this position usually point to the time developers spend writing test cases, which at first seems like a sensible observation. There are (at least) two problems with this.

First - the same people that label TDD as waste are often people who will happily spend - or allow their staff to spend - hours or days at a time in a debugger. Testing to find defects is waste.

Second - and more importantly - writing test cases is not the same as running them to test the software.

At some point, somebody has an idea. They say, I have this problem (for our purposes here, we'll assume they know this exactly), and if I can write a program to do these things, then my problem will be solved. That person has the ultimate test case for the as-yet-unwritten software: if it behaves how they want it to, their problem will go away.

Now, a developer takes over, and turns this conversation into a set of ideas about what the code should do to implement this behaviour. At the very least, having written something, he should run it and inspect what it does, to verify that it behaves as he expects. (Some don't even do that much...) This is simple manual testing. But note two things:
  • if he doesn't think about what it must do, he has zero chance of designing the right solution
  • if he doesn't test his assumptions about the software's behaviour, he pushes errors downstream, where they become slower and more expensive to correct
Now given that the developer here must think about what he is doing - the most effective way to think about it is to express it in an unambiguous form. A form that something stupid and mindless can understand - say, a computer. If he can specify the problem in a way a computer can understand, the only source of error is in getting this spec right in the first place. But fortunately, as he's thinking about what he's doing, this is usually not a large source of errors. (If it is, you have a bigger problem on your hands.)

How does our developer know if the computer has understood the spec for this code? The only way is to make the computer able to verify the program against the spec. Otherwise, the spec is about as useful as a stray Word file, such as a signed-off requirements document. We want booleans here. Flashing lights. Possibly red and green.

When this developer runs his spec program against his solution program, the computer is doing what he should do anyway before releasing it to his customer. The only difference is that it can do it many orders of magnitude faster than he can. So fast, in fact, that it is effectively instant. How much does it cost to fire off the test run? A few seconds of developer time. Or, if you're using an automatic test runner, exactly nothing.
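To make "a spec the computer can verify" concrete, here is a minimal sketch in plain Ruby with no test framework. The problem (`slugify`, which turns a title into a URL slug) and the helpers `spec_examples` and `verify` are made up for illustration; a real project would more likely use RSpec or similar.

```ruby
# Hypothetical "solution program": turn a title into a URL slug.
def slugify(title)
  title.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-+|-+\z/, "")
end

# Hypothetical "spec program": each entry pairs an input
# with the output we require for it.
def spec_examples
  {
    "Hello World"            => "hello-world",
    "  Why Won't It Start? " => "why-won-t-it-start",
    "TDD"                    => "tdd"
  }
end

# Checks every example, prints GREEN or RED for each one,
# and returns true only if all of them passed.
def verify(examples)
  results = examples.map do |input, expected|
    passed = slugify(input) == expected
    puts "#{passed ? 'GREEN' : 'RED'}  slugify(#{input.inspect})"
    passed
  end
  results.all?
end

puts(verify(spec_examples) ? "all green" : "something is red")
```

The spec is just data plus a boolean check, so re-running it after every change costs nothing more than the time it takes to type the command.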

Up to this point, we've established two things:
  • writing test cases is the process of formalising a spec so that a computer can be employed for testing
  • running tests is effectively free
But, how free?


The inspiration for this post came from chapter 4 of Don Reinertsen's Managing the Design Factory (It's All About Information). The purpose of this chapter is to explain ways to efficiently generate valuable information. The examples in the chapter are largely from circuit engineering, but even there, there exists a continuum. From page 76:
[Testing costs] could be twice as high with four iterations instead of two. This means that when testing costs dominate the economics we should concentrate on quality per iteration. We do not want to incur extra, expensive trials when the cost of a trial is high. In contrast, when testing costs are lower, we will get to higher quality faster by using multiple iterations.
So, if testing software is essentially free, how many iterations should we have? The answer is hinted at on page 74: this is an economic order quantity (Wikipedia) problem in disguise[1]. Being too lazy to get an equation editor working, I'll reuse the slightly arcane, CC-licensed Wikipedia equation:
  • Q* is the optimal order quantity - how many tests you should batch before you start a test run
  • C is the order cost - the cost of a test run
  • D is the rate at which the product is demanded - arguably requests for features (this is not explained in MtDF, presumably because you can demand features arbitrarily fast) 
  • H is the holding cost - the cost of running tests late in development, when change is more expensive
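With the symbols defined above, the standard EOQ formula reads:

```latex
Q^{*} = \sqrt{\frac{2 C D}{H}}
```

Note that $Q^{*}$ scales with $\sqrt{C}$, so quartering the cost of a test run halves the optimal batch.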

The key, though, is that if C, the cost of running tests, is at or near zero, and H, the cost of making changes late, is high (and every developer's experience is that tracking down bugs in old code is much harder than in freshly-written code), the optimal batch size of tests to hold is also at or near 0. Which in reality means:

You should strive to keep the cost of testing software at effectively 0,
and to run all your tests every time you make a change

If you've done TDD for a while, you'll know this intuitively. But expressing it in terms of existing economic models, already in use in other forms of engineering, puts it on solid ground.
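To put rough numbers on this, here is a small Ruby sketch of the standard EOQ formula, Q* = sqrt(2CD/H). The function name `optimal_batch_size` and all the input values are invented for illustration; only the trend matters.

```ruby
# Standard EOQ formula: Q* = sqrt(2 * C * D / H).
# c: cost of one test run, d: demand rate, h: holding cost (must be > 0).
# The name and the keyword arguments are assumptions made for this sketch.
def optimal_batch_size(c:, d:, h:)
  raise ArgumentError, "holding cost must be positive" unless h.positive?

  Math.sqrt(2.0 * c * d / h)
end

# Made-up values: demand and holding cost stay fixed
# while the cost of a test run shrinks towards zero.
[100.0, 1.0, 0.01, 0.0].each do |run_cost|
  q = optimal_batch_size(c: run_cost, d: 50.0, h: 4.0)
  puts format("cost per run %7.2f -> optimal batch %6.2f", run_cost, q)
end
```

With these made-up inputs the optimal batch falls from 50 to 5 to 0.5 and finally to 0 as the run cost drops, which is the "run everything on every change" conclusion in numeric form.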

I'll leave it open to interpretation exactly what I include in the scope of a "test", but that will be touched on in my next post. And if you doubt just how free software testing can be, take inspiration from IMVU's continuous deployment: Doing the impossible fifty times a day.

[1] You mean you didn't spot it either? :)

NWRUG October 2009: Uses & Abuses of Mocks & Stubs

These are the slides for the NWRUG presentation on mocks, from July 2009.

Note that most of the slides were written in the middle of the night, and I didn't have much time to trim them down. And I didn't get to beta test them on a real live human being. So the presentation goes on a bit long, and some things look a bit strange without me there explaining them. I've corrected the slide that I noticed was spectacularly wrong (i.e., the spec didn't even pass), but otherwise it's as presented.

Also my opinions on some things may have changed since, so consider this an archive…

GeekUp Sheffield 6: From Specification to Success

These are the slides for the GeekUp Sheffield presentation on developing software with user stories.

The structure of the huddle was like this:

  • Intro - 10 mins
  • Audience writing stories - 10 mins
  • Audience prioritising - 15 mins (after it overran)
  • Break for coding - 45 mins (there was another talk here which gave me just enough time to code up the top-voted feature)
  • Demo of Cucumber, Celerity, RSpec using the code from the break - 15 mins (for full details and links, grab the slides).