This article originally appeared on Madelyn Freed’s personal blog, “The Marzipan Hub,” in March 2015.
Today I’m going to be talking about agile story estimates. Please don’t go.
However, this post assumes you know a little about how agile software development is done, with user stories and story estimates. As long as you know the definition of those words, you’ll be good.
So here’s the brief thought. In Daniel Kahneman’s Thinking, Fast and Slow, he describes two characters who play a role in everyone’s mind, System One and System Two. He spends many chapters distinguishing their roles, strengths, and failures. System One is the super-active under-conscious that inhales tons of data and organizes it based on highly optimized heuristic methods. You don’t even notice it working. It makes judgments and assumptions so quickly that you don’t feel it, and it’s easy to ignore its influence on your analysis of the world.
System Two is the more energy-intensive character that only starts working when it needs to. This is your conscious mind, or at least the one that you feel is you. When you feel yourself thinking, that’s System Two. To think rationally is to understand the logical mistakes that System One is primed to make, and to control its feed of information to you during critical moments.
For example, if you need to make an important decision, you might need to talk to a number of your colleagues who are trying to convince you that their data is the most influential. You should be extra sensitive to the fact that you hate Donna and admire Francesca, and that your System One will color everything Donna says with suspicion and halo everything Francesca says with the benefit of the doubt. System Two should be cautious to ignore that feed, and listen carefully to both.
System One is very good at certain things. Some things it can’t even help but notice. If I show you a short enough word, you will read it, even if instructed not to. System One is very good at relative sizing. Here are some lines. Which is the biggest?
It didn’t take thinking. You just knew. Thanks, System One.
What if I ask you how long the line would be if you added all the lines end to end? You can do it; it just takes some thinking. You need to keep the information of the lines in your mind, and add rather than size. System One is not as good at counting, and it passes off the responsibility to System Two (this is an elaboration of Thinking, Fast and Slow, so I may be butchering it). And the answer you get will not be as accurate as the sizing answer, because the counting and the adding are harder, and we’re bullshitting a lot more.
I think this applies to estimating the size of work. In software development, we are always fighting about how to estimate story cards. At ThoughtWorks, where I work, we usually teach that estimates should be points-based, and not counted in “ideal developer days.” We have a lot of rationalizations as to why that is, but it usually boils down to this:
Estimating in developer days puts undue pressure and scrutiny on a dev team by uneducated ogre managers, resulting in fighting and bad feeling and wasted time. A dev team should be allowed to make mistakes in estimation—that’s why they’re estimates. So estimate in abstract points and leave us alone.
That’s a good reason. Another reason is that estimates should be a tool to help you better predict work, and we’ve found that developer days are just wrong more often. Even though estimating in abstract points feels like you’re just making things up, we find that they’re usually more accurate than estimates in developer days.
Why is that?
I hypothesize it has to do with what our System One is good and bad at, and another little human failure called optimism. As we discussed before, we are good and natural at relative sizing. Our System One is incredible at it. Whether it be sticks or agile story cards, we can look at two of them and say which one is bigger. We’re usually right. We happen to assign the “bigger” cards to five points, and the “smaller” cards to two points, and then we go to pub night.
What we’re not so good at is counting. And that’s what estimating in developer hours is. You’re looking at a single card, engaging your System Two, and saying you can finish the card by Thursday. Just like adding up the sticks, System Two is doing its best to lay work along a timeline that it’s imagining, and it’s predicting a future where it’s all completed. And of course, even though it’s trying its best, System Two is getting it wrong.
Compounding this problem is optimism, which is usually very nice but in the case of estimating is an enemy. The Science shows that even if you have completed a task many times before, if you are asked how long it will take you to do, you will always underestimate. You and all your hairless ape friends are so optimistic that it’s nearly impossible to get anyone to overestimate how long it will take them to do a task. The only way you can get close is if you ask how long it took you the last time to get that task done. Then you get a little closer and are a little more truthful with yourself. But people just don’t look at past reality to predict the future unless you force them to.
When you estimate in developer hours, you are giving yourself the opportunity to do the two things we’ve discussed that lead to wrong guessing: counting instead of relative sizing, and being optimistic. So you’ll always be wrong. So why is relative sizing with abstract points better, again?
We sometimes have to count when we do relative sizing, too. It’s when we ask the developers, “How many points can the team complete in an iteration?” And they subsequently give some dumb answer that is inevitably wrong, because they’re being optimistic, like idiots. But here’s the beauty! It doesn’t matter if they’re wrong! Reality will give you the answer. Have the team estimate and work on stories for an iteration, and at the end, the devs will have completed some story points. And that will be the answer to “How many points can the team complete in an iteration?”
No human optimism or counting allowed. You’ll never have to ask the devs to guess how many points they guess they hope they’ll deliver, because they will show you in each iteration how many points they can deliver. And they’ll never have to stack up their work bit by bit to tell you how much they can complete by Thursday. They can just keep doing what they’re good at doing, which is relative sizing. And writing code, presumably.
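Here is a rough sketch in Python of what "letting reality answer" could look like. It is only an illustration under some assumptions: the function names, the three-iteration averaging window, and all the numbers are made up for this example and are not a prescribed method. The idea is simply that the capacity figure comes from points the team actually finished, not from anyone's guess.

```python
import math
from statistics import mean


def observed_velocity(completed_points, window=3):
    """Average the points actually completed in the most recent iterations.

    `window` (how many past iterations to average) is an assumption made
    for this sketch; a team would pick whatever fits its own history.
    """
    if not completed_points:
        raise ValueError("need at least one finished iteration")
    return mean(completed_points[-window:])


def iterations_remaining(backlog_points, velocity):
    """Whole iterations needed to finish the backlog at the observed velocity."""
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return math.ceil(backlog_points / velocity)


# Invented history: points the team finished in each past iteration.
completed = [11, 14, 12, 13]

velocity = observed_velocity(completed)
print("observed velocity:", velocity)
print("iterations left for a 40-point backlog:", iterations_remaining(40, velocity))
```

Nobody in this sketch is asked how many points they hope to deliver. The only inputs are the relative sizes the team assigned to the cards and the count of points that were actually done.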
What if you’re wrong about developer hours (which you will be)? There are maybe 30 ideal developer hours in a week-long iteration, and so you put 30 hours’ worth of work in the hopper. And then, guess what, the devs do only 26 hours’ worth of work. And suddenly you can’t tell whether the estimate was wrong or devs are stupid and lazy. If you’re an ogre manager, you decide on the latter and make everyone stay late so you can yell at them. This always works to make people more productive and much smarter.
Just kidding. The way to make people smarter is to give them that Flowers for Algernon juice and relax while everything goes good forever and never gets super tragic. No, wait—that’s wrong, too. Just estimate using relative sizing and past experiences, and you should be good.