Estimation in software is a topic that is covered ad nauseam on the internet. There is no shortage of people giving their opinions on how to make good estimates. There is also no shortage of consulting companies trying to sell you services to help you with this admittedly difficult topic. Since I’m rather passionate about project management and Scrum principles (which are often misunderstood or poorly implemented), I figured it was time to start dropping opinions on how to do estimation properly.
All Estimates are Lies
I don’t know who said this, but one of the aforementioned consultancies hired by a prior company I worked at used this quote, and it has stuck with me to this day. When you’re asked to give an estimate for a software project, you often don’t have enough information to provide anything close to the truth of how long it will take. Yet the business folks need this information so they can promise functionality to clients by X date. You’re also likely being asked to estimate a project that is just too big for any reasonable estimation.
The longer I’ve worked in the industry (admittedly only 9 years at this point), the more I’ve come to believe this adage - all estimates truly are lies, and most companies don’t really care. They just need to get a number down.
One joke during my time at Microsoft was the estimate of “2-4 weeks”; a team member was pulled into a project on another team for additional help/expertise, with the caveat that it would only take that long. Over a year later they were still involved in that project, and the 2-4 week estimate hadn’t changed.
So what do we do? We under-promise and over-deliver. Estimates are usually doubled and padded out. If you think something will take a couple weeks, you double it to 4, and add a bit of buffer time. That means something that potentially only takes 2 weeks could take up to 5. This is a fine system, if you don’t really care about providing a meaningful estimate. Or, if you want to be able to kill some time without actually working.
Accuracy vs Precision
A quick note on some terminology that will be helpful to consider. There is a big difference between accuracy and precision, and the two are not necessarily prerequisites for each other.
Accuracy means that your estimates or expected values very closely map to actual values when repeating a process. In software, this might mean that you give estimates that are regularly close to the actual amount of work.
Precision means that your estimates or expected values are very closely clustered together. Precise estimates for a software project would repeatedly be within a very low tolerance for deviation. That means sprint goals are regularly met without needing to carry work over, and without developers finishing work early and having nothing to do for the remainder of the sprint.
An accurate estimate could be less precise if, for example, a developer had 2 extra days in the sprint with nothing to contribute - their work finished early, but the estimate of how much work they had was not maximized to give them something to do for the entire sprint.
A precise estimate could be inaccurate if a team regularly pulls in less work for a sprint than their velocity. They consistently hit their sprint goal and finish all their work, but developers are not doing as much as they could be doing because they finish it so quickly. This is a contrived example and I don’t expect that teams regularly operate like this, but it’s not completely implausible.
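To make the distinction a bit more concrete, here is a minimal sketch in Python. It reads accuracy as "the average error is close to zero" and precision as "the errors are tightly clustered", which is one common way to interpret the two terms. The function name `estimate_stats` and all of the numbers are made up for illustration.

```python
from statistics import mean, pstdev

def estimate_stats(estimated, actual):
    """Return (bias, spread) for paired estimates and actuals.

    bias:   mean signed error; close to 0 suggests accurate estimates
    spread: standard deviation of the errors; small suggests precise estimates
    """
    errors = [a - e for e, a in zip(estimated, actual)]
    return mean(errors), pstdev(errors)

# Hypothetical team A: always off by about the same amount (precise, not accurate)
bias_a, spread_a = estimate_stats([20, 20, 20, 20], [30, 31, 29, 30])

# Hypothetical team B: right on average, but all over the place (accurate, not precise)
bias_b, spread_b = estimate_stats([20, 20, 20, 20], [10, 30, 15, 25])

print(f"A: bias={bias_a}, spread={spread_a:.2f}")
print(f"B: bias={bias_b}, spread={spread_b:.2f}")
```

Team A has a large bias but a small spread, and team B is the reverse. A team that is both accurate and precise would keep both numbers small.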
Getting Better Estimates
Now to the “meat and potatoes” of the topic - how do we go from bad estimates (inaccurate and imprecise) to good estimates (hopefully accurate and precise)? I’ll cover the theory here for how it might work in a perfect world, assuming complete buy-in from your organization and the dev team, support of a competent Product Owner or Product Manager, and investment from everyone on actually caring about making a proper estimate.
Time Estimates are a Very Bad Thing™
When asked for an estimate, most people immediately want to provide the time it will take. This is the natural instinct of human beings, and stems from a problem in how the question is usually formed - management is asking for the wrong thing. Most managers will say “how long will this thing take?” and while well intentioned, it sets you up for immediate failure (and for confidently lying). What they should be asking instead is “how much effort is this thing to implement?”
In a mature Scrum-based organization, the team should be aware of their velocity - that is, the amount of effort they consistently deliver in a given sprint. This velocity should be relatively stable, although after onboarding to a Scrum-based workflow it should improve steadily. This velocity is the sum of effort for all tasks assigned to all team members for a given sprint. It is a magic made-up number, based on nothing concrete, because the source of this number is a set of magic made-up numbers based on nothing concrete.
What Does “effort” Even Mean
Instead of tracking the “time” associated with a task, teams should instead focus on the effort to complete said task. The reason time should be avoided is that it invites speculation at every front. What if a developer that is competent at a task is unavailable, or a new developer is onboarding while learning via said task? What if someone gets randomized on an unplanned project? What if the subject matter expert for a task is in a car crash and ends up in a coma? All of these things are very real concerns, and while some might be less likely than others, they could all potentially happen. Time is relative to a single developer, and not a team - making it a terrible thing to use for estimating work that should be shared and delivered by a self-organizing Scrum team.
Instead, the unconventional wisdom is to use “Effort”. Effort is an arbitrary scale, usually based on the Fibonacci sequence (1, 2, 3, 5, 8, 13, 21, 34, 55…), that allows you to size work items relative to one another. Relative sizing of work items means you pick a reference work item, usually something that is well-understood by the entire team, and size things larger or smaller relative to that work item. This reference work item may change as time goes by (the team composition changes, people leave, forget, etc.), but should always be a nice “medium” sized item that everyone can use as a reference point.
Effort is a nice metric because it is resilient to the problems mentioned above with time estimates:
- Effort works independently of the expertise required to finish a task. It is always relative to your reference work item.
- Effort is resilient to onboarding new team members. If anything, adding a team member should actually (eventually) increase the team’s velocity.
- Effort doesn’t care if a team member is randomized. Another team member should be able to pick up the work. Assuming a healthy organization, randomization should not happen, or will at least respect sprint boundaries, but this is mostly a pipe dream.
- Effort values don’t change when the implausible happens. Self-organizing, cross-functional scrum teams have the tools and skills necessary to continue delivering work even if a team member is incapacitated for reasons outside of work.
Effort is also great because of the nature of relative sizing. The Fibonacci sequence has a very nice property - each item in the sequence is roughly (operative word) 60% larger than the previous value in the sequence, because the ratio between neighboring values settles near 1.6. That means bumping a work item up one size represents a bit more than half again as much effort, and bumping it up two sizes represents a bit more than a doubling.
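The step sizes are easy to check for yourself. The sketch below (Python, with a hypothetical helper named `fibonacci_scale`) builds the effort scale and computes the ratio between each value and the one before it.

```python
def fibonacci_scale(count):
    """Return the first `count` values of the 1, 2, 3, 5, 8... effort scale."""
    scale = [1, 2]
    while len(scale) < count:
        scale.append(scale[-1] + scale[-2])
    return scale[:count]

scale = fibonacci_scale(9)  # [1, 2, 3, 5, 8, 13, 21, 34, 55]

# Ratio of each value to its predecessor: how much bigger is "one size up"?
ratios = [round(b / a, 2) for a, b in zip(scale, scale[1:])]

for size, ratio in zip(scale[1:], ratios):
    print(f"{size:>2} is {ratio}x the previous size")
```

The first couple of steps are uneven (1 to 2 is a full doubling), but from 5 onward the ratio stays close to 1.6.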
Time is the Most Important Factor!
All of the concepts around effort, planning, estimation, and velocity rely on one important component: time.
Team velocity, as mentioned previously, is the sum of all effort values that can be completed in a given sprint. However, a team that is new to Scrum does not know how much work they can reasonably complete in a given sprint. In fact, their estimates may not be very accurate yet either!
It takes time to build up this knowledge - hopefully within three or four 2-week sprints, a team might have a better idea of their velocity, and a healthy team will have improved enough to give accurate estimates for their work.
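As a rough sketch of the bookkeeping involved (Python, made-up numbers), velocity for a sprint is just the sum of the effort values the team finished, and a usable planning number comes from averaging the last few sprints. The names `sprint_velocity` and `average_velocity`, and the three-sprint window, are my own assumptions rather than anything Scrum prescribes.

```python
def sprint_velocity(completed_efforts):
    """Sum the effort values of all work items completed in one sprint."""
    return sum(completed_efforts)

def average_velocity(sprints, window=3):
    """Average velocity over the most recent `window` sprints."""
    recent = sprints[-window:]
    if not recent:
        raise ValueError("need at least one completed sprint")
    return sum(sprint_velocity(s) for s in recent) / len(recent)

# Hypothetical history: effort values of the items finished in each sprint
history = [[3, 5, 8], [5, 5, 8, 2], [8, 3, 5, 5], [13, 5, 3]]

velocity = average_velocity(history)
print(f"Planning velocity: {velocity:.1f} points per sprint")
```

The first sprint (16 points) drops out of the window here, which is the point: early sprints are noisy, and the number only becomes trustworthy after a few iterations.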
Putting It Into Practice
Now that we know all about what effort is, let’s talk about how to use it properly.
The most important factor in properly leveraging estimates is having a competent Product Owner who can serve as a Subject Matter Expert on what the customer actually wants, and help hammer out requirements. Developers absolutely love to add unnecessary complexity to their work, build the wrong thing, and/or do too much all at once. The Product Owner’s job (and to some extent, the Scrum Master’s) is to rein the development team in and help them focus on small work items that deliver useful functionality the user actually wants.
Knowing What to Build
It might seem kind of obvious, but having a backlog of work (the responsibility of the Product Owner) helps significantly when trying to estimate a project. Usually this requires a lot of work by the development team. The Product Owner should start with some big, poorly-defined, poorly-scoped idea, and with the help of the development team, break it down into many small sprint-sized work items that the team can deliver on. This set of work items turns into the Product Backlog - an artifact in Scrum.
I’m not going to cover priority of work items - that’s largely the job of the Product Owner, although the development team should have say if there are dependencies between different work items.
Relative Sizing
Once the product backlog has been defined, the team needs to slap some effort values on each of the work items.
To start, it requires some courage - the scale makes no sense, all the values are made up, and it’s hard to begin from zero. One work item that seems both a) a medium workload, and b) easy to understand by anyone on the team, should be selected as the reference work item. If some additional help is needed here, it might help to break things down by “T-shirt” size: XS, S, M, L, XL. Assign one of these T-shirt sizes to each item in the backlog. Break things down into smaller items where possible. Finally, pick something that is in the ‘M’ bucket and just run with it from there.
Nothing is permanent - a poorly picked reference work item will be sussed out soon enough, and can be changed for future Sprints (hopefully via feedback from a Retrospective).
Now that your reference work item has been chosen, you can either assign effort values with a 1:1 mapping between T-shirt sizes and the Fibonacci sequence, or redo the planning process and assume each T-shirt size maps to a set of values from the sequence. This requires more work on the part of the team but allows for more granularity in estimation. All estimates should be relative to the reference work item’s size.
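Both options can be sketched in a few lines of Python. The specific mappings below are assumptions for illustration (your team should pick its own), as are the backlog item names and the helper functions.

```python
# Assumed 1:1 mapping from T-shirt size to the first five values of the scale.
TSHIRT_TO_EFFORT = {"XS": 1, "S": 2, "M": 3, "L": 5, "XL": 8}

# Assumed alternative: each size allows a small set of values for more granularity.
TSHIRT_TO_RANGE = {
    "XS": (1,),
    "S": (2, 3),
    "M": (5, 8),
    "L": (13, 21),
    "XL": (34, 55),
}

def size_backlog(backlog):
    """Map {item name: T-shirt size} to {item name: effort value} using the 1:1 table."""
    return {item: TSHIRT_TO_EFFORT[size] for item, size in backlog.items()}

def is_valid_effort(size, effort):
    """Check that a refined effort value stays inside its T-shirt bucket."""
    return effort in TSHIRT_TO_RANGE[size]

# Hypothetical backlog that has already been sorted into T-shirt sizes
backlog = {"login page": "M", "password reset": "S", "audit log export": "XL"}
efforts = size_backlog(backlog)
print(efforts)
```

With the second table, the team would re-discuss each item and choose a value within its bucket, using `is_valid_effort` as a sanity check that nobody has quietly moved an item to a different size.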
Planning Poker
Deciding on the effort value for a work item might lead to disagreements in invested teams. This is a sign of a Very Good Thing™ - the team is invested enough to care that effort values are precise and accurate.
One method for resolving differences in opinions on effort values is to do planning poker. Each person on the team puts their personal estimate on a work item, without showing other members. Once everyone has an estimate ready, they’re revealed at the same time. It is important not to spoil the surprise - you don’t want one expert opinion coloring the results of others. The ensuing discussion could prove helpful in making an estimate that everyone can agree to.
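A single round can be modeled roughly like this in Python. The rule that the lowest and highest voters speak first is a common convention that I am assuming here, not something required by the description above. The function name and team member names are hypothetical.

```python
def poker_round(votes):
    """Evaluate one round of simultaneously revealed votes.

    votes: {team member: effort value}
    Returns (agreed value or None, members to hear from first).
    """
    values = set(votes.values())
    if len(values) == 1:
        # Everyone picked the same card: consensus, nothing to discuss.
        return values.pop(), []
    low, high = min(values), max(values)
    # Assumed convention: the lowest and highest voters explain their reasoning.
    outliers = sorted(name for name, v in votes.items() if v in (low, high))
    return None, outliers

agreed, speakers = poker_round({"ana": 3, "ben": 5, "chi": 13})
if agreed is None:
    print("No consensus; hear from:", ", ".join(speakers))
```

After the discussion, the team simply votes again until a round returns an agreed value.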
Interfacing With Management
How does all this translate into what managers really want: the original question of “when will this thing be done?”
Over time (see: Time is the Most Important Factor!), the team’s velocity will be well-understood enough that a project burndown chart will start to make sense. For release-level planning (not covered in this post), the backlog will likely span more than one team and more than a few features. A burndown chart allows you to accurately state the following:
Based on the team’s velocity, and based on the effort values remaining, we will be done with all planned work in {x} sprints
Because of the framework we’ve built up, it’s simple arithmetic to come up with a date, and it is based on empirical data.
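That arithmetic looks something like the following Python sketch. The function names, the 14-day sprint length, and the sample numbers are all assumptions for illustration. It also assumes velocity stays constant and nothing is added to the backlog.

```python
import math
from datetime import date, timedelta

def sprints_remaining(remaining_effort, velocity):
    """Number of whole sprints needed to burn down the remaining effort."""
    return math.ceil(remaining_effort / velocity)

def projected_finish(start, remaining_effort, velocity, sprint_days=14):
    """End date of the last sprint needed, counting from the start of the next sprint."""
    sprints = sprints_remaining(remaining_effort, velocity)
    return start + timedelta(days=sprint_days * sprints)

# Hypothetical numbers: 130 points left, velocity of 21 points per sprint
x = sprints_remaining(130, 21)
finish = projected_finish(date(2024, 1, 1), 130, 21)
print(f"Done in {x} sprints, around {finish.isoformat()}")
```

Rounding up matters here: 130 points at 21 per sprint is a little over six sprints of work, and the leftover still costs a seventh sprint.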
If management responds with “That’s not good enough! We need it by {y} date!”, the answer becomes:
Based on the team’s velocity, we can reach {y} date if we cut features {1}, {2} … {n}
You know immediately how much effort to cut if you want to reach the date. The PO should decide which features can be cut, and management can decide if that is acceptable. If not, prepare for a disappointed business or overworked developers serving in crunch time. If crunch time happens, consider finding a different place to work - this is an unhealthy response to being told to do the impossible!
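Here is a minimal Python sketch of that calculation. The feature names and numbers are invented, and the greedy "cut from the bottom of the priority list" approach is only one possible way to propose cuts. The actual choice still belongs to the PO.

```python
def effort_to_cut(remaining_effort, velocity, sprints_until_deadline):
    """Effort that will not fit before the deadline at the current velocity."""
    capacity = velocity * sprints_until_deadline
    return max(0, remaining_effort - capacity)

def suggest_cuts(features, needed):
    """Walk features from lowest priority up until enough effort is freed.

    features: list of (name, effort), ordered lowest priority first
    Returns (names to cut, total effort freed).
    """
    cut, freed = [], 0
    for name, effort in features:
        if freed >= needed:
            break
        cut.append(name)
        freed += effort
    return cut, freed

# Hypothetical: 130 points left, velocity 21, deadline in 5 sprints
needed = effort_to_cut(130, 21, 5)
features = [("dark mode", 8), ("csv export", 13), ("audit log", 5), ("sso", 21)]
cuts, freed = suggest_cuts(features, needed)
print(f"Need to cut {needed} points; proposal frees {freed}: {cuts}")
```

If `effort_to_cut` returns 0, the date is already reachable and no conversation about cuts is needed.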
Closing Thoughts
It’s not easy to get started with Scrum planning. Admittedly, a lot of what I describe above is extra cruft on top of Scrum, which is extra cruft on top of Agile. Much of it is designed so that consultants can sell it to you.
However, I don’t think that completely disproves its usefulness. I worked at a small company where we were very close to a healthy implementation of the above process. There were some organizational issues that prevented us from truly meeting the goals set out by the above process, but I can very easily see a path to how it could have worked very well for us.
My best recommendation is to have courage to speak up when things aren’t working, or when you disagree. Trust your team to make the right decisions. Trust your organization to have your back, and trust that those in power are making decisions in the best interest of delivering software, and ultimately, making your company money.