S/W Estimation Principles

version 1.1 of 24 Nov 2012
Mark Kampe

1. Introduction

One of the greatest books ever written on the subject of strategy begins:

The art of war is of vital importance to the State. It is a matter of life and death, a road either to safety or to ruin. Hence it is a subject of inquiry which can on no account be neglected. (Sun Tzu, The Art of War)

The same thing can be said of software project estimation. Estimates are a pre-condition for any non-trivial software project, in that time and cost are key considerations in deciding whether or not to pursue a project. Good estimates lay the foundation for successful projects, by identifying the required resources and enabling us to predict how long it will take to develop the required functionality. Bad estimates can ensure the failure of a project ... if it cannot be built with the assigned resources, or cannot be delivered within the required time frame. If we are to be successful at software construction, we must develop skill at software project estimation. Unfortunately, as another great commentator on the human condition (Yogi Berra) once observed ...

Predictions are tough, particularly when they involve the future.

Project estimation involves many imponderables:

We may not be entirely sure what the requirements are, and hence the scope of the problem to be solved.
We aren't sure what we will wind up building, and hence how big it will be.
We aren't sure how we are going to build it, and hence what difficulties may arise in the process.
We aren't sure what resources will be available (and hence what skills they will have) or when (which gates when the work can begin).
We don't know what other distractions will arise, and prevent those resources from being fully applied to this problem.

These simple facts alone would seem to preclude the possibility of accurate estimates. Because the preparation of estimates is so difficult, the results so disappointing, and the consequences so dire, estimation is an activity that most developers and managers avoid like the plague. While accurate predictions of the future are not generally possible, there are techniques that can help us to make reasonable predictions about the work that will be required to build a piece of software. Because of the importance and difficulty of estimation, the study of these techniques is a subject of inquiry, which can on no account be neglected.

2. Principles of Estimation

There are numerous tools and techniques for estimating the size of a software task, and numerous books written about each. More important, however, are a few basic principles that should guide all estimation efforts.

2.1 Estimates are not Guesses

They are the result of an analytical process. The quality of the estimate is determined by the sophistication and diligence with which the analysis is performed. Better (deeper and more methodical) analysis inevitably yields better estimates. Your skill in analyzing projects and preparing estimates will improve with experience and study.

Estimates are based on knowledge and data: an understanding of the project to be undertaken, and about previous experience with the same team, tools, and methodology. The more complete your understanding of a project, the better will be your estimates. If you have access to data on the productivity of a team, on various problems, using various tools, this will enable you to make better predictions about their productivity on the next project. Data you collect on this project will help you with future projects.

2.2 Estimate at the lowest possible level of detail

A single estimate for a 20 staff-year project could easily be high or low by a power of ten. An estimate for a 50 line routine will probably be accurate to within an hour:

smaller tasks are better defined and leave room for less doubt
smaller tasks are simpler and more easily envisioned
estimates for smaller tasks can be easily sub-contracted out to groups or individuals with expertise in that particular area.
the work to decompose a large project into smaller sub-tasks tends to expose tasks and problems that might not have been obvious when the project was being viewed at a higher level of abstraction.
the work to decompose a large project into smaller sub-tasks will tend to resolve many issues that were unclear when the project was being viewed at a higher level of abstraction

The result is that bottom-up estimates (summing the estimates for all of the sub-tasks) are inevitably more accurate than whole-project estimates. Not only are the individual sub-task estimates more accurate, but the enumeration of tasks to be performed is more complete and the errors in these smaller estimates tend to cancel out.
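
The cancellation effect can be illustrated with a little arithmetic. The sketch below assumes that each sub-task estimate carries an independent, unbiased error, expressed as a standard deviation. Under that assumption the variances add, so the relative uncertainty of the total is smaller than that of a typical sub-task. The task names and numbers are invented for illustration, and roll_up is a hypothetical helper, not part of any standard tool.

```python
import math

# Hypothetical sub-tasks: (name, estimated days, standard deviation in days).
# All figures are invented; each estimate is uncertain by 30%.
SUBTASKS = [
    ("parser", 10.0, 3.0),
    ("storage layer", 15.0, 4.5),
    ("user interface", 20.0, 6.0),
    ("test harness", 5.0, 1.5),
]


def roll_up(subtasks):
    """Sum sub-task estimates, assuming their errors are independent."""
    total = sum(days for _, days, _ in subtasks)
    # Independent errors: variances add, so deviations combine in quadrature.
    sigma = math.sqrt(sum(sd ** 2 for _, _, sd in subtasks))
    return total, sigma


total, sigma = roll_up(SUBTASKS)
print(f"total: {total:.1f} days, +/- {sigma:.1f} days ({sigma / total:.0%})")
# prints "total: 50.0 days, +/- 8.2 days (16%)"
```

Each sub-task here is uncertain by 30%, yet the total is uncertain by only about 16%. This only holds while the errors are independent: a systematic bias (such as everyone being optimistic) does not cancel.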

Hierarchical decomposition of a problem is a rational process for exploring its difficulty, but it can lead to false confidence. Not unlike contemplating chess moves, each successive step takes us further out onto hypothetical limbs, where expanding sets of unknowns may make it difficult to accurately assess the envisioned situations ... casting ever-greater doubt on the validity of the resulting assessments.

The SCRUM process recognizes and attempts to address these dangers by declining to recognize the validity of any task that cannot be completed in a small number of staff days. SCRUM holds that anything larger than that is too big to be clearly understood, and not ready to be pulled into a sprint. The tasks that are ready to be pulled into a sprint are sufficiently well defined and of a small enough size to enable fairly accurate estimation.

2.3 Estimates are neither Precise nor Deterministic

Because we do not have complete knowledge, it is not possible to accurately predict how long it will take to complete a project. Because of this uncertainty, an estimate is not a number, but rather a probability density function. The confidence estimates are every bit as important as the time estimates ... and so they must be included along with the estimate. There are two very different types of uncertainty:

Imperfect Knowledge (the known unknowns)
Uncertainty about exactly what will be done, how productive we will be, and the problems that will be encountered along the way.
External Events (the unknown unknowns)
Changes in requirements or resources. Problems with external dependencies. Other commitments that will divert our attention from the planned work.

2.3.1 Imperfect Knowledge

These are situations where we know what we are estimating, but we may not have yet developed sufficient understanding to enable us to produce high-confidence estimates. The best way to deal with such uncertainties is to attempt to quantify them (e.g. with probability bands). In the early stages of project inception, crude estimates that might be off by a power of ten may be entirely acceptable. By the time resources are being committed, it may be important for estimates to be correct to within 5-10%.
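
One way to turn such uncertainty into probability bands is a small simulation. The sketch below is only an illustration under stated assumptions: each task is described by a hypothetical three-point estimate (optimistic, most likely, pessimistic), each task's duration is assumed to follow a triangular distribution, and the tasks are assumed to be independent. None of these choices comes from the text above; other distributions would serve as well.

```python
import random

# Hypothetical three-point estimates in days:
# (optimistic, most likely, pessimistic).
TASKS = [(3, 5, 12), (8, 10, 20), (2, 4, 9)]


def simulate_totals(tasks, trials=10_000, seed=1):
    """Draw one total project duration per trial; return them sorted."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    totals = []
    for _ in range(trials):
        # Random.triangular takes its arguments as (low, high, mode).
        totals.append(sum(rng.triangular(lo, hi, mode) for lo, mode, hi in tasks))
    return sorted(totals)


def percentile(sorted_values, fraction):
    """Simple rank-based percentile of an already sorted list."""
    index = min(len(sorted_values) - 1, int(fraction * len(sorted_values)))
    return sorted_values[index]


totals = simulate_totals(TASKS)
p10, p50, p90 = (percentile(totals, f) for f in (0.10, 0.50, 0.90))
print(f"10% / 50% / 90% confidence: {p10:.1f} / {p50:.1f} / {p90:.1f} days")
```

Reporting the 10%, 50%, and 90% points together conveys both the expected duration and how much confidence it deserves, which a single number cannot do.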

If, in preparing an estimate, you find areas where uncertainty is unacceptably high, these are the areas on which you need to focus more effort and reduce your ignorance:

seek opinions from people with more experience in this area
do research on the problems and approaches
elaborate the design to gain a better understanding of the proposed solution
construct models and prototypes to explore potential problems and approaches.

Estimates should be regularly revisited and refined to monitor the progress of the project, and update our predictions of when we expect it to complete. As investigations, plans, and implementations proceed, unknowns will be resolved, probability functions will collapse, and the estimates should converge. If the estimates do not converge, but rather continue to move outwards, this is a warning sign that something is very wrong.
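
The convergence check described above is easy to mechanize if each re-estimate is recorded as a (low, high) band. The following is a minimal sketch with invented data; the function names and the strict "never wider than the previous band" rule are assumptions made for illustration, and a real project might tolerate small fluctuations.

```python
def band_widths(history):
    """Width (high - low) of each successive estimate band."""
    return [high - low for low, high in history]


def is_converging(history):
    """True if no re-estimate is wider than the one before it."""
    widths = band_widths(history)
    return all(later <= earlier for earlier, later in zip(widths, widths[1:]))


# Hypothetical (low, high) estimates in weeks, recorded at successive reviews.
healthy = [(10, 40), (14, 30), (18, 26), (20, 23)]
troubled = [(10, 40), (12, 45), (15, 60)]

print(is_converging(healthy))   # True
print(is_converging(troubled))  # False
```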

Again we see that the SCRUM process recognizes and explicitly addresses these issues by:

only preparing detailed estimates for (small, well defined, and well understood) tasks near the top of the backlog. As we gaze farther into the future our understanding of both the tasks and the required work becomes cloudy, and the estimates correspondingly degenerate to cruder (T-shirt size) small, medium, large characterizations.
doing design and estimation in a progressive process, continuously refining descriptions, breaking larger tasks into smaller sub-tasks, exploring designs, and prototyping solutions for the tasks near the top of the backlog. This process is both easier and more accurate than all-up-front estimates.
applying past measured velocity to the remaining backlog to obtain successively better extrapolations of the expected time to complete the required features.
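
The velocity extrapolation in the last item amounts to simple division. The sketch below assumes the backlog is sized in story points and that velocity is the number of points completed per sprint; the function name and the data are hypothetical. Using the best, average, and worst past velocities gives a range rather than a single number.

```python
import math


def sprints_remaining(backlog_points, velocities):
    """Return (best, expected, worst) sprint counts from past velocities."""
    if not velocities or min(velocities) <= 0:
        raise ValueError("need at least one positive measured velocity")
    average = sum(velocities) / len(velocities)
    best = math.ceil(backlog_points / max(velocities))
    expected = math.ceil(backlog_points / average)
    worst = math.ceil(backlog_points / min(velocities))
    return best, expected, worst


# Hypothetical data: story points completed in each of the last five sprints.
past_velocities = [18, 22, 20, 25, 15]
print(sprints_remaining(120, past_velocities))  # (5, 6, 8)
```

Re-running the calculation after every sprint, with the newest velocity added and the backlog updated, yields the successively better extrapolations described above.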

2.3.2 External Events

Here the problem is not so much a matter of estimating how much work something will take as trying to guess what the future constraints on the problem might be:

requirements negotiations might take much longer than expected, delaying our ability to start serious work.
key requirements could be changed after the start of work, requiring significant rework.
planned hiring may go more slowly than expected, or people with required skills may be unavailable.
other emergencies may preempt people who are expected to be working on this project.

The schedule implications of such events could range from imponderable to devastating, and trying to incorporate them into the schedule as expectancies would (in most cases) be neither practical nor useful. Rather, these should simply be called out as risks (where possible) with associated management plans.

2.4 Get Multiple Independent Estimates

There are many approaches to estimating the size of a project:

Highly interactive projects can often be estimated based on the number of specified use cases.
Given an architecture, we can attempt to estimate the size of each component.
If we have worked on similar projects, we can attempt to assess the relative difficulty of this one, and to estimate the size delta relative to other benchmark projects.
Object oriented implementations can often be estimated based on the number and richness of classes.
Given a component design, we can attempt to estimate the number of required lines of code or Function Points.

Different techniques are appropriate to different projects at different stages of specification ... but each of these techniques yields a completely independent assessment of how much work will be involved.

Different people have had different experiences. When they look at a project description, they will see different problems, and different approaches. Assessments from different sources can be just as independent as assessments from different techniques (function points, use cases, comparisons to past projects).

If 3-5 independent estimates cluster well, you can feel good about the estimate. If estimates are widely scattered there are two possible explanations:

the problem is not yet well enough understood or specified
some of the estimators or estimation techniques are ill suited to this problem.

Whichever the cause, a group discussion about the disputed estimates usually resolves the question quickly. If the diversity of opinion results from an inadequate understanding of the problem, the answer is additional research, design, or prototyping. If the diversity results from the application of an inappropriate estimation technique, find better ones.
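
Whether a set of estimates "clusters well" can be judged by eye, but a rough numeric check is easy to sketch. The example below uses the coefficient of variation (standard deviation divided by mean) as the measure of scatter. Both that measure and the 25% threshold are assumptions chosen for illustration, not rules from any estimation method, and the data are invented.

```python
import statistics


def spread(estimates):
    """Coefficient of variation: sample standard deviation over the mean."""
    return statistics.stdev(estimates) / statistics.mean(estimates)


def cluster_well(estimates, threshold=0.25):
    """True if the estimates agree to within the (assumed) threshold."""
    return spread(estimates) <= threshold


# Hypothetical estimates, in staff-months, from independent sources.
close = [11, 12, 14, 13]
scattered = [6, 12, 30, 14]

print(cluster_well(close))      # True
print(cluster_well(scattered))  # False
```

A failed check does not say which explanation applies; it only signals that the discussion described above is needed.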

2.5 Be Honest and Clear

Do the best analysis you can to develop your estimates. Projects that begin with estimates that are slanted to bolster some agenda, or to tell the executives what they want to hear, usually end badly. The best decisions are the ones that are based on the best possible data. A project that begins with a bad estimate is on the road to ruin.

Executives will want to know what the bottom line is. When can I have it and how much is it going to cost? Clearly state your assumptions, and make sure that you always give your times and costs within a confidence band. If they tell you that a confidence band is too wide, you need to be prepared to suggest investigations to reduce that uncertainty.

Management will almost certainly "push back" on your estimates, telling you that they are too high. Don't get defensive! Be prepared to explain the process you used to develop your estimates, the data sources on which you drew, the confirmation that you obtained, and how much confidence you have.

2.6 Keep an Open Mind, but Keep it Real

Just because I did a bunch of work to estimate that it would take six months to build a particular component, doesn't mean that I should have any great attachment to that estimate. Project size estimates can often be reduced by changing:

requirements ... exactly what we have to do
Requirements are not always well written, and not all parts of the requirements are equally important. What seems a very small requirement (only a few words) can often create some very large problems. Don't be afraid to explore requirements changes that have the potential to greatly change the difficulty.
scope ... how much we build in the first release
Requirements often describe an eventual vision of how the system should work ... but those visions can often be achieved in phases. Don't be afraid to explore possibilities for phased delivery that reduce the amount of work that has to be done in a schedule-challenged release.
design ... adopting a simpler approach
As engineers, we are drawn towards general and elegant implementations. Some generality pays for itself very quickly, but some things are over-designed. Is there a substantially simpler implementation that would actually meet the requirements for this release, and do so without painting us into an architectural corner?
methodology ... using different tools or techniques
Brooks, in "No Silver Bullet", made a distinction between essential difficulties (intrinsic in the problem) and non-essential difficulties resulting from the use of weak tools and methodology. If a problem is outside of our areas of expertise, we may have chosen a poor approach to solving it. Can we find other people, more experienced in this area, who can suggest better tools and approaches?
our understanding of the problem
When working on a hard problem, it is easy to fall into a conceptual rut. We have come to a particular understanding of the problem, which has driven us towards a particular solution. We may have made false assumptions about the problem, or the range of possible solutions. Look for people with very different perspectives and approaches, and be open to the possibility of renouncing your current visions.

But don't keep your mind so open that your brains fall out! Most software project managers believe that software development is constrained by a crude but none-the-less inescapable equation:

time × people = functionality × quality

more functionality means more work (specification, design, implementation)
greater usability, performance, robustness and reliability mean more work (design, review, testing)
more work means more time and/or people

We can change the functionality by changing the requirements or scope. Perhaps a humbler project would still meet our near-term needs. We can change the quality by reducing the time spent on design, review, error handling, and testing. Perhaps the intended uses for this software do not require much in the way of reliability, robustness, performance, usability, maintainability, etc. With appropriate planning, we may be able to accelerate the project by adding people to it. But in the end, this equation is going to hold, and trying to change the time term without making corresponding changes to the other terms is self-deception (and unlikely to end well).

3. Summary

Estimation is difficult, messy, and does not admit of closed form (or even correct) solutions. This does not, however, mean that estimation methodology is futile or that all estimates are equally bad. People who take project estimation seriously will find their efforts rewarded. During the age of exploration (when the proceeds of expeditions had to be divided among investors) anyone who could do long division found that his career was made. Today, when so much value is created by successful software projects, the same can be said about someone who can do project estimation.