Data Driven Decisions

Mark Kampe

Introduction

We have all heard questions of the form: "How can you put a value on X!" This is usually a rhetorical question, with the implication that the value of X is infinite, or at least an order of magnatude greater than the value of all the other factors combined ... making the decision a no-brainer. Relatively few real-life decisions turn out to be no-brainers ... which means we will often be forced to compare apples and Orangutans, to put values on each, and to find a solution that optimizes our "goodness function".

Different stakeholders will argue, at length, for and against various approaches, and very few of the arguments will be so persuasive as to decisively resolve the question. The decisions usually wind up being made by some combination of:

the loudest voice.
the most charismatic advocate.
the largest constituency.
the view that most impresses the leader.

Unfortunately, experience has shown the first two to be (at best) uncorrelated go good solutions, and the second two to be highly hit-and-miss. We would like to be able to reduce such decisions to linear programming problems ... but there are a a few major obstacles to that approach:

The "goodness function" is seldom specified.
Most of the attributes of most proposals are non-numerical, (or have inconformable units) and thus difficult to compare.
The degrees to which the various alternatives contribute to terms of the goodness function may not be completely understood.

Goodness Functions

In our personal lives, we get to define our own "goodness functions" ... and I like to think that as we mature we gain an ever-better understanding of their key terms (be they "career advancement", "time with friends and family" or "readiness to have our hearts weighed against Maat's feather of Truth"). These understandings result from a lifetime of experiences. How can a group of very different people possibly agree on a "goodness function" in a reasonable period of time? Fortunately, it is much easier for a group! It takes most people a lifetime to figure out the goals for their lives. When a group of people band together, however, it is usually to achieve fairly clear and specific goals (e.g. mastering a skill, getting a good grade on a project, solving a problem, building a product, becoming better musicians, etc). Those goals form the basis for the group's goodness function.

In the Wiegers paper on Prioritizing Requirements he suggested making a list of the requirements, and assigning a relative weight to each. The same approach can be used to deciding how to evaluate proposals. It is often easiest to do this in a top-down manner:

65% of the weight should be for satisfying basic customer requirements.
1. 50% of the weight should be for the must-have requirements.
  1. 20% should be for core functionality requirements
  2. 10% should be for competitive requirements
  3. 10% should be for ease-of-use requirements
  4. 10% should be for robustness and reliability requirements
2. 15% of the weight should be for the should-have requirements.
25% of the weight should be for achieving organizational goals.
1. 15% of the weight should be for progress against our technology roadmap.
2. 10% of the weight should be for the development of desired capabilities.
10% of the weight should be for achieving personal goals.
1. 5% of the weight should be for established educational/developmental goals.
2. 5% of the weight should be on fun and interesting.

What amazes me, whenever I go through the process of agreeing on weights, is how well it works. Even though different stake-holders may have very different perceptions of what is important, converging on a set of weightings generally goes smoothely and quickly. Why?

everybody wants the group to succeed, and even though different people may have very different hopes and fears, their views on what constitutes success tend to be well clustered.
at each level, it is usually obvious what the most important 2-3 items are, and (because of their importance) their weights tend not to be very controversial.
secondary items are pretty easy to add to the list. We all probably agree on the list of secondary considerations.
it is pretty easy to organize the items into three crude buckets (critical, important, and valuable).
the weights of the critical items tend not to be very controversial, and the weights of the valuable items turn out not to matter (their coefficients being sufficiently small that tweaking them makes no difference).

Not only does this process tend to converge fairly quickly, but having gone through this process leaves everyone with a better understanding of all of the properties that must to combine in order to achieve success. The process of defining goodness by a top-down weight assignment and refinement is an effective way to guide complex decisions. People sometimes fear that "decision methodology" will suppress the views of indvidual stake-holders. Quite to the contrary, it creates a framework within which all of the competing concerns can be combined and balanced.

This appoach is widely used, and not merely in business settings. It is common for grant proposals to be scored in this way (with points awarded for specific weighted factors) ... and it is the process that has been used to determine the formula for computing your grade in this course.

Comparing Un-Like Quantities

A proposal must be evaluated in a great many dimensions, most of which do not have neatly calibrated axes. There are three basic approaches that can help with this problem:

reducing characteristics to common units
criterion-referenced scores
don't get hung up on meaningless precision

Reduction to Common Units

In every endeavor, there are a few fundamental units of cost and benefit. Whenever possible, all costs should be converted into fundamental cost units, and all benefits into fundamental benefit units. In commercial enterprises, time and money are often the obvious fundamental units:

pursuing any option costs time and money
risks carry likely costs in time and money
having a product by a certain date, is likely to result in a monetary reward. Having a different product, on a different date, is likely to result in a different monetary reward.

This provides a relatively objective basis for assigning a value every option. Where probabilities are involved (with either costs or benefits) we can use expectancies ... or if we want to get really tricky, we can do Monte-Carlo simulations against distributions.

But everything doesn't reduce to money. There are other kinds of costs (e.g. hours away from fun/friends/family, angst/aggrivation, telescope-hours) and other kinds of benefits (e.g. publishable papers, grades, hours of fun). The point is not to reduce everything to dollars, but merely to comparable units. If we could distil three project alternatives down to:

15 days of work, likely to earn 80 points
20 days of work, likely to earn 90 points
40 days of work, likely to earn 95 points

Would this help you to make the choice?

Criterion-Referenced Scores

Some things are never going to be numerically reduceable to cost and reward units ... but having decided on weighting actually goes a long way towards making them quantifiable.

Suppose we have decided that meeting a certain requirement is worth ten points. We could then define a scoring system:

10: clearly and completely satisfying the requirement.
8: likely to satisfy the requirement.
6: likely to address most instances of the requirement
4: likely to address most important instance(s) of the requirement
2: responds to the requirement

It is not hard to come up with a set of criteria for evaluating each product characteristic. But how do we decide what scores to assign to partial satisfactions. A non-arbitrary basis would be the estimated probability that the product could still be successful, given the degree of requirement satisfaction. If we are pushing on the state-of-the-art, even partial progress against the goal may represent a major success. If we are talking about a characteristic for which the market already has high expectations, even a small shortfall might make the result completely unacceptable.

The shape of the scoring function is driven by an understanding of the nature, importance, and meaning of each requirement. Fortunately, having gone through a requirements elicitation and validation process, you are prepared to offer informed opinions on these subjects.

Meaningless Precision

None of the above-suggested processes are particularly precise. These processes are intended to use our best undersanding of our goals and proposals to assess the relative costs and probabilities of success. It is highly unlikely that any of the results will be "accurate" (what ever that might mean) to within 10%. It is a relatively crude process that is intended to:

separate the contenders from the non-contenders
highlight the differences between the contenders

Sometimes a clear star emerges. Sometimes there are a couple of options that all seem equally good ... in which case we can shift our attention to a more precise analysis of their key differences. Sometimes there are no survivors ... in which case we at least know how they all fell short, and what a winning proposal would need to address.

Because of the relatively large uncertainty in these numbers, there is no point in arguing about small differences in secondary and tertiary weights or scores. They won't make a difference in the final decision. Once you have your numbers in the right ball-park, move on. Later, if there are multiple contenders, we can try to improve the precision of our estimates in the areas where they differ.

Attributes of the Proposals

So we have a matrix, with columns for each proposal, and (weighted) rows for (10-100?) characteristics of each proposal. All we have to do is fill in all the cells, and we have our answer ... but there are a lot of cells to fill in. How are we going to do it?

A good proposal is one that makes it easy to fill in most of the cells of a comparison matrix:

it should explain the manner in which it responds to each of the major requirements.
it should (crudely) enumerate the major work items, and estimate the likely size (e.g. power of 2 hours) for each.
it should articulate the major risks, associating a (crude) probability (e.g. high/medium/low) and cost (e.g. hours/days/weeks/failure) with each.

This would seem to presume that the decision matrix was known before the proposals were written. Often this is the case (e.g. it is common procedure for grant RFPs), but much of it is obvious:

the requirements to which the proposal is responding should be well known.
the likely costs and risks follow from the proposal, and estimating them is part of preparing the proposal.
each approach is likely to have numerous advantages beyond the manner in which it responds to basic requirements. These should be clearly articulated in the proposal ... and those that are found to be compelling, will become rows in the comparison matrix.

The point here is that the values that go in most of the cells will be quick and clear. There will surely be discussions, but they are likely to be:

explorations of specific cost and risk estimates.
explorations of how well a particular proposal responds to a particular requirement.

and these are very valuable discussions to have.

Summary

It is not the case that a sound decision-making methodology will eliminate all arguments or inevitably lead to a quick convergence on a good decision. It can, however, change the arguments, and result in a well-justified decision. Instead of arguing about "which proposal is best", the arguments will focus on the key elements of success, the probabilities of success and failure of various sub-tasks, and the degrees to which particular characteristics are likely to contribute to our overall success. Those are much better arguments because:

they focus on issues that are critical to our success, encouraging us to "keep our eyes on the ball".
they are sufficiently fine-grained that they are more likely to be argued with data, or at least with more readily justified/evaluated opinions.

To the extent that you can arrange for most of your arguments to be resolvable questions about key issues, backed-up by the best available information, your success (in all your endeavors) is "in the bag".