Judgment and Decision Making, Vol. 1, No. 2, November 2006, pp. 162–173
Making decision research useful — not just rewarding
Rex V. Brown*
School of Public Policy
George Mason University
Abstract

An experienced decision aider reflects on how misaligned priorities produce decision research that is less useful than it could be. Scientific interest and professional standing may motivate researchers — and their funders and publishers — more powerfully than concern to help people make better decisions.

Keywords: decision analysis, decision aids.
“Doing science is like making love. Some good may come of it, but that’s not why we do it.” (Richard Feynman)

1 Introduction

40 years ago many of us thought we were on the brink of a new era: thanks to emerging decision analysis tools, we could look forward to a brave new world, where we would no longer make foolish mistakes that ruin our lives. So far, nothing like that is remotely in sight, and I don’t expect it ever will be.

However, I still believe that decision aiding, and prescriptive decision analysis in particular, can become a major force for good in the world; but it will take an upheaval in how decision tools are fashioned and how they are used. It won’t come about spontaneously, because the root problem is not technique, but motivation, which is notoriously difficult to correct.

Today, I want to talk about how we can make decision aiding more successful, by making usefulness a top priority in the decision aiding community and among decision researchers in particular. A lot of important decision research is being done, but not very much of it is helping decision aiding to get used and be useful. I think that can be turned around, but it won’t be easy.

2 The problem

2.1 Decision aiding in the doldrums

People make terrible mistakes all the time. We marry the wrong person, our government takes misguided military action, and we pay for it dearly. Human welfare has greatly suffered through poor decisions.

For half a century sound tools of rational choice have been widely available and used, notably PDA (prescriptive decision analysis), involving the quantification of personal judgments of uncertainty and preference. These tools have certainly had some success, and I am confident that structured decision aiding in general has a promising future.

I was a junior member of the Raiffa-Schlaifer team that developed PDA at Harvard Business School in the 1960s. We were convinced these tools would revolutionize how people everywhere go about their business; but so far, it hasn’t happened.

Twenty years ago, the US National Academy of Sciences had leading decision scientists study the effectiveness of risk analysis and decision making techniques (Simon, 1988). They reported that the use of PDA and other decision aiding tools was still negligible compared with the great need and potential for them. Since then, the situation has improved, but not dramatically, even for PDA, which is arguably the most promising form of decision aid.

There are certainly encouraging signs. Numerous PDA success stories in various fields have been reported (Corner & Kirkwood, 1991; Keefer et al., 2004; Clemen & Kwit, 2001). However, the reports are typically brief and do not really document how a decider’s actions were influenced and how beneficial the results were. My experience (which may not be representative) is that the satisfied clients are often staffers who commission the aid, not deciders who stand to benefit. More systematic research is needed to establish the facts (and hopefully suggest what distinguishes more from less successful decision aid). In any case, the successes cannot be more than a small drop in a large potential bucket.

*Keynote address to UNESCO Conference on “Creativity and Innovation in Decision Making and Decision Support,” London School of Economics and Political Science, June 30, 2006. [email protected]

There are independent indicators that all is not well. General Motors, which had been in the vanguard of PDA supporters, has backed off (Lieberman, 2002). Harvard Business School, the cradle of PDA, no longer makes it an MBA requirement. (However, Howard Raiffa tells me that HBS is planning to re-introduce PDA into the core curriculum, which would do much to restore its professional standing.)

Credible authorities have expressed to me serious skepticism about PDA. They include: James March, Daniel Kahneman and Herbert Simon, noted descriptive decision theorists, two of them Nobel laureates; Jackson Grayson (1960, 1973), a PDA pioneer who later headed the Federal Price Control Board; Stephen Watson (1992), PDA text-book author and later principal of Henley College of Management; and policy advisors to senior Italian, Russian, British and Israeli government officials.1

It is true that some of our original Harvard team have become highly successful deciders in business and government.2 Two of my own students got to head a billion-dollar corporation3 (which bought out my own decision aiding company, which I suppose is a tribute of sorts). They all told me, however, that they make little explicit use of PDA tools, although they find their decision analysis training helps their informal decision-making.

2.2 Useful decision aiding

2.2.1 “Decision aiding”

“Decision aiding” is often used to mean explicit use of a quantitative decision model to help someone make a better decision, and here is where progress has been most modest. But if we broaden the interpretation of decision aiding to include any use of quantitative models, the picture is more encouraging.4

For example, training in decision modeling often enhances a decider’s informal decision making (as with my Harvard Business School colleagues). My own advice to executive clients has usually been informal, but honed, I trust, by my decision analysis training. The decision analysis course I now teach (Brown, 2005c) is designed to educate the intuition of deciders-to-be, not to have them rely on formal models in their future professional choices.

Another productive use of prescriptive quantitative models is to justify or communicate a choice, rather than to make the choice in the first place. Much of my decision consulting business has been of this kind, as for others in my field. For example, regulators regularly use decision analysis models to defend in court controversial rulings against conflicting commercial interests.

In any case, I will be illustrating my argument from my own experience, mainly at my previous research and consulting company, DSC (Decision Science Consortium, Inc.). It deals mainly with using PDA on choices among a few clear-cut (if perplexing) options, rather than with complex decision processes, like oil refining.

2.2.2 Useful decision aiding

The usefulness of decision aiding depends most directly, of course, on how sound the decisions it produces are. What do they promise to contribute, at least in the long run, to human welfare? This depends, of course, on whose interests are affected (such as doctors, patients and tax-payers, in the case of medical decisions).

However, I am not counting as useful aiding clients to make up other people’s minds in the clients’ own favor. (They tend to cancel projects as soon as they appear to produce the “wrong” answers. The US Navy approached us to “help” Congress decide whether to buy aircraft carriers or bombers. When I insisted that our findings, whatever they proved to be, should be made public, they lost interest in using us!)

2.2.3 Essential requirements

To be at all useful, decision aiding must meet certain essential behavioral and logical requirements. It must:

1. Address the decider’s real concerns.

2. Draw on all the knowledge he has.

3. Represent reality accurately.

4. Call for input that people can provide.

5. Produce output that the decider can use.

6. Fit the institutional context.

Decision aiding is useless if any of these essentials is lacking, which is often the case.

2.3 Impediments to useful aiding

The main impediments to useful aiding are deficient methodology and its misapplication.

1 Respectively, Edward Luttwak, Ivan Yablokov, Herman Bondi, Yezekiel Dror.
2 Including Ed Zschau, congressman and company CEO; Andrew Kahr, business strategist cited in Nocera (1994) as “one of the great financial visionaries”; Bob Glauber, Assistant Secretary of Treasury.
3 Bill Stitt, president, and Jim Edwards, chairman of ICF-Kaiser Inc.
4 I am not concerned here with decision aiding that does not involve prescriptive models. They include qualitative techniques, such as lateral thinking and group brainstorming; and decision support systems that do not indicate a specific choice, such as computerized management information systems. What I have to say may not apply to these other types of decision aid.

[Figure 1 (diagram). Boxes linked left to right: 1. Controllable factors (a. organization of aiding priorities; b. choice of aider; c. aider incentives; d. decider’s involvement in aiding; e. resources applied) → 2. Aider priorities (a. intellectual comfort; b. professional standing; c. economic gain; d. service to decider) → 3. Aid essentials (a. right question; b. all knowledge used; c. structure sound; d. input sound; e. output useful; f. output communicated) → 4. Aid usefulness (a. sound decisions; b. clear rationale; for a. decider and b. institution) → 5. Aid adoption; influenced throughout by 0. Uncontrollable factors (a. problem type; b. decider training; c. institutional setting; d. state of the art).]

Figure 1: Effect of aider priorities on decision aid usefulness (from Brown, 2005a).
2.3.1 Aider priorities

I have addressed the misapplication impediment in a companion paper (Brown, 2005a). I argued there that aid is often misapplied because decision aiders do not give high priority to being useful. They are under little pressure to do so and therefore to assure that all those essential usefulness requirements are met.

Figure 1 shows the structure of that argument. Whether some decision aid is useful, and therefore adopted (last column on right), is influenced by whether all essential requirements are met (column three). This, in turn, is significantly influenced by the aider’s priorities (column two), such as intellectual comfort and professional standing. Aider priorities can be partly controlled (column one), for example by how the aiding is organized and who the aiders are.

2.3.2 Ford depot case

A number of cases in Brown (2005a) illustrate the harm that misaligned aider priorities can do. They include plenty of recent failure stories; but I will cite you here an old one which shows with stark clarity what can go wrong. Deciders are not so easily led astray today, because they have learned to be more wary; but the same sort of thing still goes on, in a less egregious form. The case also has a certain piquancy for this audience, because our host, LSE, was involved (though no-one who is still here).

Ford UK suspected it had too many parts depots in South Eastern England, and engaged an LSE operational research group to advise them. The group developed a sophisticated transportation model, which determined an “optimal” number and location of depots. It indicated that, of the seven existing depots, three should be closed. Ford trustingly did so, with disastrous results. The capacity of the four remaining depots proved so inadequate for demand that trucks had to circle the depots endlessly, waiting for space to open up.

It turned out that the analysts had used fatally flawed input (requirement 3d). They had calculated depot capacity as width-times-height-times-breadth, in effect treating it as an empty box to be filled to the top, ignoring unavoidable dead space. They could easily have avoided this gross capacity overestimation by checking with any Ford stock controller. But getting that input right may have been a lower priority than technical satisfaction, and not worth diverting much effort to.

3 Research on decision aiding art

Today, however, I want to concentrate on the first impediment to useful decision aiding, inadequate state-of-the-art, and to reflect on how decision research could help remove it. Just as aiding decisions has not been the main motivation for “decision aiding,” so making decision aid useful has not been the main motivation for decision research.

3.1 Attractive vs. needed research

3.1.1 The record

Following the disappointing Academy report on decision aiding practice that I referred to, DSC got prominent decision scientists and decision aiders together to review the actual and potential impact of decision research on decision aiding (Tolcott & Holt, 1988).

The results were disturbing. The participants did report productive descriptive research on how people do make decisions and normative research on how they would make decisions if they were logical. But they had trouble thinking of recent research that had done much to advance the applied art of decision aiding, with the major exception of influence diagrams. Nor could they cite much research that was addressing problems that decision aiders were currently facing.

There have certainly been major innovations in decision aiding technique, but they have tended to come from practitioners. For example, in the 1970s decision aid pioneer Cam Peterson introduced the social dynamic technique of decision conferencing, which is now widely used in business. Academic researchers, however, have often followed through on these innovations, for example, Olson and Olson (2002) with decision conferencing. (Unfortunately hard-pressed practitioner-innovators, such as Peterson, having no academic agenda, rarely publish their work, which would have helped others to build on it. Here is where aider motivation works against developing the state-of-the-art.)

3.1.2 Why? Motivation

So, why hasn’t decision research been more useful? Richard Feynman once said, “Doing science is like making love. Some good may come of it, but that’s not why we do it.” The good that may come of decision research is that it improves decisions. The “why we do it” (that is, why researchers do the research that they do) is that it is rewarding professionally and personally. The question is: would more good come of decision research if usefulness were the reason we did it? I think so. The other priorities are quite legitimate, but their dominance has created a not particularly useful decision research scene.

(The same imbalance is true of other research fields. The US Department of Energy has spent billions (sic) of research dollars on siting a nuclear waste repository. In the course of working on the project, I learned that a major federal research agency was diverting contract money into under-funded research projects more central to their regular scientific mission.)

3.1.3 Gaps

There are serious research gaps in terms of a decision aider’s interest (though some may have been filled since I retired from active practice). The gaps are of three types: specialty research; practice-driven research; and aid development.

3.2 Specialty research

Specialty research is specific to a discipline, such as statistics or psychology. It is generally convergent, in that it aims for well-specified and authoritative scientific findings; it usually addresses a single aspect of a problem; it seeks universal, rather than topical, findings; and it is usually done by university faculty (with their own agendas). Specialty research accounts for most decision research, and certainly produces many useful, even critical, findings, as I will note later. Some research is logical or normative, and some is behavioral or descriptive (which can temper the normative, to produce usefully operational methods).

3.2.1 Logical

Logical specialty research studies what a decider would do if he met certain logical norms, like if he were an “economic man.” It includes major work by Savage (1954) on axioms of rationality, Fishburn (1970) on utility theory, Dantzig (1957) on linear programming, and many other models of optimal choice.

Neglected logical topics include:

1. What exactly does decision theory contribute to optimizing choice, beyond testing judgments for consistency?

2. Is there a place in the PDA armory for a construct of impersonal probability (Brown, 1993)?

3. In everyday life, we progressively develop knowledge about uncertainties in a way that doesn’t seem to fit the conventional value-of-information paradigm. Can this common-sense process be productively formalized?

4. How viable is the construct of “ideal” judgment that would result from perfect analysis of a person’s available knowledge?

3.2.2 Behavioral

Behavioral research describes decision processes, including what is wrong with them and why. It includes Tversky and Kahneman (1974) on judgmental biases; March and Simon (1958) on bounded rationality in organizations (1979); and Klein (1997) on naturalistic decision processes.

Neglected behavioral topics include:

1. Systematic review of a sample of past decision-aiding efforts. How good were they? Why were the bad ones bad? What changes might have helped?

2. How can people integrate analytic results into their informal thinking, without disrupting it?

3. We have a good fix on how to make people think smart; but how do we get them to act smart?

4. How can training in formal analysis educate intuitive and informal decisions?

5. How does the institutional context motivate—or mis-motivate—deciders, decision aiders and decision researchers?

3.3 Practice-driven research

Secondly, there is practice-driven research. This is open-ended exploration of decision-aiding problems and solutions, prompted by lessons learned in the field. The counterpart in medicine is clinical research (contrasted with experimental research). It is divergent in having no predefined end-product, it draws on whatever disciplines the practical need calls for, and often leads to specialty research or aid development (which I will be coming to).

Little practice-driven research gets deliberately planned—or at least funded—mainly, I think, because it is untidy and lacks academic appeal. However, as political analyst George Kennan has said, “Tentative solutions to major problems are worth more than definitive solutions to minor problems.” It has been argued that practice-driven research will still get done, because researchers will invest in it and get adequate return from fundable follow-up research. However, the researchers in each case are different. Decision aiders are naturals to do practice-driven research (though they may not have the time or qualifications needed). They produce what I. J. Good has called partly-baked ideas, for specialty researchers to finish baking. (He proposed a Journal of Partly-baked Ideas, where papers were characterized by p as their degree of bakedness.)

DSC was unusually lucky in having an enlightened sponsor at the Office of Naval Research, Marty Tolcott, who was prepared to fund us to do practice-driven research. We interleaved it with our regular decision aiding practice; and it enabled us to prepare a number of successful, more conventional research proposals (in which we often sub-contracted the specialty research parts).

3.4 Aid development research

Thirdly, there is aid development research, which is often prompted by practice-driven research. But, unlike that research, it is convergent, in that it has a clearly defined objective, which facilitates funding. But, unlike specialty research, it addresses the here-and-now rather than the eternal — which discourages funding.

3.4.1 Generic

Some aid development is generic and addresses a single aspect of decision methodology. Typically it is carried out by an academic specialist, such as Schachter (1986) on modeling influence diagrams.

Neglected generic questions include:

1. What is the most appropriate form to elicit the utility of a prospect? Should the informant judge utility holistically; or as additive components of utility; or by decomposing such components into factual impact and importance weight, for additive linear MUA (multiattribute utility analysis)?

2. What errors in evaluating options result from common modeling approximations (Brown & Pratt, 1996), such as additive linear MUA?

3. Empirically, what has the experience of past decision-aiding efforts been? Did they change what the decider did? Did they help, as far as we can tell?

4. How accurately can people make hypothetical factual judgments, both in general and in specific operations, such as the likelihood assessments called for in Bayesian updating?

5. Which decision tools, including non-PDA approaches (such as AHP, traditional OR and behavioral techniques), produce closest to ideal action, when cognitive accessibility, logical soundness and implementation are traded off?

3.4.2 Method-specific

Other aid development is method-specific, which focuses on designing a usable tool, such as Henrion’s (1991) influence diagram software. Much of it is done by companies who can justify it as a business investment, so funding is less of an issue.
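The additive linear MUA that recurs in generic questions 1 and 2 above is easy to make concrete. The sketch below scores two hypothetical options on invented attributes and importance weights; it illustrates the arithmetic only, not any elicitation procedure from this paper.

```python
# Minimal sketch of additive linear MUA (multiattribute utility analysis):
# an option's utility is the importance-weighted sum of its per-attribute
# impact scores. Attributes, weights and scores are all hypothetical.

def mua_utility(scores, weights):
    """Additive linear multiattribute utility: sum of weight * score."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights should sum to 1"
    return sum(weights[a] * scores[a] for a in weights)

weights = {"cost": 0.5, "safety": 0.3, "speed": 0.2}   # importance weights
option_a = {"cost": 0.9, "safety": 0.4, "speed": 0.6}  # impact scores on 0-1
option_b = {"cost": 0.5, "safety": 0.9, "speed": 0.8}

for name, scores in (("A", option_a), ("B", option_b)):
    print(name, round(mua_utility(scores, weights), 2))
```

Generic question 2 then amounts to asking how often rankings from such an additive approximation diverge from the choices a fuller, non-additive model would prescribe.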

Method-specific aid development usually takes the form of what design engineers call “build-test-build-test.” You use whatever tools you have to solve a problem, see what goes wrong, try to fix the tools, and try them on the next problem. In this spirit, we arranged back-to-back funding from the Nuclear Regulatory Commission (to work on their practical problems) and from the National Science Foundation (to develop methodology as needed).

Neglected method-specific questions include:

1. Decision processes commonly consist of incremental commitments, but we analyze them as if they were once-and-for-all choices. Is there a practical alternative to cumbersome dynamic programming?

2. How can the reconciliation of plural evaluation models be conveniently computerized (for example, by “jiggling” inputs)?

3. Cam Peterson has a dictum, “Model simple, think complex!” How complex, or structure-intensive, should decision models be, as opposed to judgment-intensive?

4. What is the best balance of decision effort between unaided reasoning based on what you know, getting new information, and formal modeling?

3.5 Overall Pattern

My best guess at the proper mix of effort on the three types of decision research, taking into account usefulness and other legitimate criteria, would be to spend about a third on each. More systematic consideration might change this split; but I’d be most surprised if it did not shake the virtual monopoly of specialty research.

An analogy: artistic evolution may have turned a simple gothic arch into magnificent Rheims cathedral. But it takes a more pedestrian utilitarian revolution, like modular building, to house the masses. In decision research, an evolutionary counterpart would be influence diagrams, where a powerful new idea has been continuously developed over the past 30 years, past (I believe) the point of diminishing practical returns, and is still center stage in the PDA world (Decision Analysis, 2005). A revolutionary counterpart would be plural evaluation, whose present primitive development (Brown & Lindley, 1986) may achieve most of what a greatly refined version could do.

4 Considerations in evaluating usefulness

The research suggestions I am making are based largely on intuitive judgment. Systematic, but still informal, study is needed to check them out and firm them up. It would have to address causal links between research projects and human welfare.

Figure 2 presents a schematic scheme of such causal links. Starting at the bottom, it addresses questions like:

1. What decision tools will a given research project enhance, and how? By improving the tool or how it is applied? (“Direct research impacts” row in Figure 2)

2. How much room for improvement is there in existing decision aiding or in the decision practices it aids? That is, how deficient are tools and practices now? (“Prescription factor” row)

3. As used now, how much will the tools reduce any logical or behavioral deficiencies in “prescription quality” or in “action on prescription”? (“Action factor” row)

4. Will the project only benefit “action quality” or also, say, “cost” (of the decision process) or “institutional values”? (“Benefit type” row)

5. How do benefits to various classes of decision and population aggregate into total “human welfare”? (Top four rows)

How the various items combine is important. The top levels are usually independent and additive, weighted by importance. At lower levels, however, item contributions may be dependent and non-additive. For example, “prescription quality” and the degree of “action on prescription” may need to be multiplied (rather than added) to get “action quality.”

For all its complexity, Figure 2 is by no means complete. It does not, for example, address the usefulness of seeding future research. Nor does it account for who is doing the evaluating. For example, a responsible citizen may consider that a project that improves environmental management world-wide just a little is more useful than research that helps a businessman to prospect for oil a great deal. The American Petroleum Institute may not agree.

4.1 Adapting evaluation to the nature of research options

We only need to consider those items in the causal scheme that are affected by a particular project evaluation.
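The combining rules described for the causal scheme (multiplicative where lower-level items are dependent, additive and importance-weighted at the top) can be sketched in a few lines. Every number and weight below is invented for illustration; this is not Brown's own model.

```python
# Sketch of the aggregation logic in the causal scheme: "prescription
# quality" and "action on prescription" multiply to give "action quality";
# benefit types then add, weighted by importance. All figures hypothetical.

def action_quality(prescription_quality, action_on_prescription):
    # Dependent lower-level items: a perfect prescription that is never
    # acted on contributes nothing, hence a product rather than a sum.
    return prescription_quality * action_on_prescription

def project_usefulness(benefits, weights):
    # Independent top-level items: additive, weighted by importance.
    return sum(weights[b] * benefits[b] for b in weights)

benefits = {
    "action quality": action_quality(0.8, 0.5),  # 0.8 * 0.5 = 0.4
    "cost": 0.2,                                 # decision-process cost savings
    "institutional values": 0.1,                 # e.g., better communication
}
weights = {"action quality": 0.7, "cost": 0.2, "institutional values": 0.1}

print(round(project_usefulness(benefits, weights), 2))
```

The product at the bottom captures the dependence the text describes: halving either prescription quality or the degree to which it is acted on halves the resulting action quality.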

[Figure 2 (diagram). A causal hierarchy read from bottom to top: RESEARCH PROJECT → Direct research impacts (decision tool quality; decision tool application) → Prescription factor (existing deficiency; reduction in deficiency, for each impact) → Action factor (prescription quality; action on prescription) → Benefit type (action quality; cost; institutional values) → Beneficiary type (doctor; hospital manager; patient; taxpayer) → Beneficiary role (decider; constituency) → Domain (medicine; government; business) → Ultimate objective: Human Welfare.]

Figure 2: Contributors to research usefulness.
4.1.1 Designing a single tool

Suppose the contending projects simply address different aspects of the same aiding tool. The research options may only affect the quality of the choices that this tool prescribes. In that case, we only need to judge which research option improves “prescription quality” most. Attention can thus be limited to the two arrows at bottom left of Figure 2.

4.1.2 Comparing dissimilar projects

However, if research options are more dissimilar than this, more of the causal scheme needs to be considered. Suppose projects address the same decision task, but different aiding tools. (One project may study decision conferencing and the other expert systems, both for medical therapy purposes.) Suppose, further, that the tool choice affects not just “prescription quality,” but also “action on prescription” and “institutional values” (e.g., communication). Then all four of the bottom rows of Figure 2 will be affected.

Taking dissimilarity among projects further, suppose they address different domains, different decision tasks and different tools. (The choice might be between research on recognition-primed decision for nuclear risk management and research on career planning for the deaf.) Then virtually the whole of the causal scheme would need to be addressed.

5 Quantifying usefulness

5.1 The value of a measure of research usefulness

How is the appropriate refocusing of research effort to be achieved?

5.1.1 Inadequacy of exhortation

Publicizing the above informal reasoning on research usefulness might be all that is required to stimulate useful research practice. In fact, I originally thought that all decision aiders had to do was to tell researchers what research we needed and wait for it to get done. I campaigned rather vigorously for a reformed research agenda, by pitching it to decision science groups5 around the USA and by publishing articles in psychology and operations research journals (Brown, 1989; Brown & Vari, 1992). Not much came of it. My issues were not the researchers’ issues; and at DSC we were not in a position to do much of the research ourselves. Exhortation is not enough.

5 These included Harvard, Stanford, Duke, Wharton and Carnegie-Mellon.

5.1.2 Need for motivation

Motivation is therefore needed. The decision research community has had the luxury of indulging priorities other than usefulness, because it could get away with it. I am now convinced that decision researchers, funders and

journal editors will only pay real attention to usefulness
upper end-point could be a project that produces a per-
if they are held accountable for it—or at least get credit
fect decision aid (that is, one that makes perfect use of
for it.
the decider’s knowledge), or the greatest contribution that
The National Science Foundation does have its pro-
any decision aid could make. For example, in a medical
posal referees comment on something like usefulness, un-
context the evaluator might reason: “I project that this re-
der the heading “issue importance.” But this criterion is
search will move the quality of surgical decisions 10% of
swamped by others, such as technical soundness and orig-
inality, and, since the evaluation is qualitative, it does not constrain referees much in selecting proposals.

5.2 Grading research projects on usefulness

I now believe that nothing short of reporting a credible and highly visible quantitative measure of research usefulness will move researchers and sponsors to take it seriously. The purpose of the measure would be not so much to improve informal judgment of research usefulness as to communicate and justify the judgment to others.

5.2.1 Existing precedent

There is some limited precedent for funders giving credit for a quantitative measure of usefulness. NSF's SBIR (Small Business Innovation Research) program does have referees score proposals on usefulness on a five-point scale, under the heading "anticipated technical and economic benefits." This score is added to scores for four more conventional academic criteria. This is fine. But I would like to see that practice extend to all decision research procurements.

5.2.2 Credible usefulness measures

The first step in any quantification is to specify the measure. For many research planning situations, such as comparing small-budget proposals, the loose measure SBIR uses may be sufficient. However, more precise measures are called for in high-stakes evaluations, especially where usefulness has to be traded off against other criteria. A critical consideration would be whether the user of the evaluation can understand the measure and check it intuitively for plausibility.

A natural metric (like money) may be the most promising usefulness measure. It could be the maximum that the evaluator would consider paying. A funding agency officer might say, "The most I could approve awarding for this proposal is $50k. They're asking $100k, so I'm declining it." However, there may be no natural measure that fits the circumstances.

The default measure could be an all-purpose rating scale. The end-points might be zero for present performance and 100 for some ideal. The range of the scale would be the room for improvement in existing aid. The score would be the percentage of the way from current practice to some ideal. I am not sure how well such a measure would work, but I will be trying one out presently.

5.2.3 Quantifying the measure

Any measure, however defined, can be evaluated holistically with direct judgment, and that will often be enough. It could be derived from a decision analysis model, but I would not give that high priority. The content of the evaluation, and the very fact of quantifying it, is usually more important than how convincingly it is quantified.

6 Real examples

To be more concrete, here are a couple of real research planning choices that I have had to make, with some thoughts on how they might be evaluated.

6.1 Different aspects of one tool

6.1.1 Elicitation vs. logic for Bayesian plural evaluation

My first example involves only the bottom of the causal scheme, and is one of the simpler examples of how a research planning choice might be quantified (if it were worth the trouble). It is a comparison of two method-specific aid development projects. I was preparing a research proposal to develop a Bayesian tool for plural evaluation (that is, making a judgment different ways and reconciling the results). The research design issue was whether to refine the logic of an existing model or to improve the elicitation of inputs.

6.1.2 Informal evaluation

I decided in favor of elicitation, on the following informal grounds. Bayesian updating in its current form is almost useless for enhancing intuitive plural evaluation, because people can't provide the likelihood assessments it needs as input. On the other hand, the logic is already quite passable and has only modest room for improvement. Most of the tool deficiency would be cured if elicitation were effective. Since we could make comparable improvements in either aspect for the same cost, elicitation research appears more cost effective.
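The two candidate measures proposed in Section 5.2.2 (a natural money metric and an all-purpose 0-100 rating scale) can be made concrete in a few lines. This is a minimal sketch; the function names and all numbers are hypothetical illustrations, not taken from the paper.

```python
# Hypothetical sketch of the two usefulness measures from Section 5.2.2.

def money_metric_decision(max_award: float, amount_requested: float) -> str:
    """Natural (money) metric: decline if the request exceeds the most
    the evaluator would consider paying."""
    return "fund" if amount_requested <= max_award else "decline"

def usefulness_score(current: float, proposed: float, ideal: float) -> float:
    """All-purpose rating scale: percentage of the way from present
    performance (scored 0) to some ideal (scored 100)."""
    return 100.0 * (proposed - current) / (ideal - current)

# The funding officer's example: the most approvable is $50k, they ask $100k.
print(money_metric_decision(50_000, 100_000))   # -> decline
# A project expected to move aid quality from 40 to 55 on a 40-to-100 range.
print(round(usefulness_score(40, 55, 100)))     # -> 25
```

The point of the scale version is only that the end-points anchor the judgment; the holistic scoring itself remains a matter of direct judgment, as Section 5.2.3 notes.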

[Figure 3: Research on different aspects of a decision tool. A right-angle triangle: the left side is labeled "Room for improvement in logic" and the bottom side "Room for improvement in elicitation"; "NOW" marks the current state, and dashed hypotenuses labeled "After E" and "After L" show the reduced total deficiency after improving elicitation or logic, respectively.]
My original impulse, however, had been to work on the logic, because my background made me more comfortable with the decision theory involved in the logic issue than with the psychology of elicitation. Moreover, a logic study would give us a better chance of getting funded and getting published. In effect, I was swayed by the same distorting priorities that I have been imputing to others. I managed to overcome that impulse.

6.1.3 Quantified evaluation

If we had quantified this reasoning, the measure of usefulness could be "potential improvement in plural evaluation." We could judge directly which project scored higher, or try something more ambitious, like the following.

Imagine an ideal plural evaluation methodology where the modeling and elicitation deal perfectly with what the evaluator knows. Now consider how far short of this ideal the present state of the art falls, i.e., the room for improvement. It seems to me that elicitation and logic relate to that deficiency in a roughly Pythagorean way (rather than, say, multiplicatively). Figure 3 shows that relationship in the context of a right-angle triangle.

The triangle sides are deficiencies in the two aspects. The logic side (on the left) is shorter than the elicitation side (at the bottom), reflecting my view that the logic is less deficient. The hypotenuse gives the resulting total deficiency. If the same effort on either aspect cuts its deficiency by half, the new hypotenuse (dashed line) is shortened by about ten times as much for the elicitation as for the logic project, a great advantage.

My other, private reasons for initially favoring logic were not enough to overcome this advantage for elicitation in usefulness, even if this triangle only approximately models my judgment. So, all in all, the elicitation project is clearly preferred.

6.1.4 Comparing proposals

What would I have gained by this exercise in quantifying usefulness? In this case, probably not very much. It would confirm my informal planning choice and remove any indecision, but it still would make no sense to spend much of the research effort on planning how to spend the rest of it. True, it could also have helped justify my choice to the research agency, but still probably not enough to bother.

On the other hand, it could be quite worthwhile to ONR, the funding agency, to grade all proposals on usefulness along these lines, to help choose among them. However, the measure of usefulness would now need to be located higher up the causal chain, and take into account more than improving one aiding tool. The measure might go as high as contribution to the quality of all military decisions. Furthermore, if ONR wanted to take other criteria into account, the measure of usefulness would need to be explicit enough to permit trade-offs.
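The "about ten times" claim in the triangle argument of Section 6.1.3 is easy to check numerically. The leg lengths below are assumed purely for illustration (the paper gives no numbers, only that the logic side is the shorter); any pair where the elicitation deficiency is roughly three times the logic deficiency reproduces the ratio.

```python
import math

# Pythagorean deficiency model: total deficiency is the hypotenuse of a
# right triangle whose legs are the logic and elicitation deficiencies.
# Leg lengths are assumed for illustration only.
logic, elicitation = 1.0, 3.0

total   = math.hypot(logic, elicitation)      # current total deficiency
after_l = math.hypot(logic / 2, elicitation)  # after halving the logic deficiency
after_e = math.hypot(logic, elicitation / 2)  # after halving the elicitation deficiency

gain_l = total - after_l   # hypotenuse shortening from the logic project
gain_e = total - after_e   # hypotenuse shortening from the elicitation project
print(round(gain_e / gain_l, 1))  # -> 11.2
```

With these assumed legs, equal-cost work on elicitation shortens the total deficiency roughly eleven times as much as work on logic, consistent with the informal judgment that elicitation research is the more cost-effective project.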

6.2 Different aiding approaches

6.2.1 Decision analysis vs. organization design

The next example shows how both the researcher and society could benefit from quantification, in the cause of convincing a research sponsor to support more useful research. The case involved research on alternative approaches to improving military tactical decisions. Navy authorities had noted that in fleet exercises, submarine commanders wait far too long to fire their torpedoes, which puts them at great risk of being fired upon first and being destroyed in a real war.

We were charged with developing a decision tool that would help sub commanders to make more rational firing decisions. Our first analyses confirmed that, indeed, the commanders did wait imprudently long to fire. However, when I talked to commanders, I found that the problem was not with rational choice, but (once again) with motivation. They got credit for pinpointing where the enemy sub was, but they were not penalized for taking unjustified risks (which could get them killed). So it was quite rational for a career-oriented officer to delay firing beyond sound military practice.

6.2.2 Informal evaluation

I urged our Navy client to switch our research assignment from decision analysis to an organizational study of their reward system. We argued that benefits to the Navy would go beyond this case and could pave the way for a fruitful new research program. However, our informal argument did not prevail, for bureaucratic reasons: our research grant was part of a larger ONR program on operational military decision aids, and this proposed change was out of scope. So we bowed out of the grant (and luckily found support for the organizational research elsewhere).

6.2.3 Quantified evaluation

It is possible that we would have prevailed over the bureaucratic constraints if we had made a quantitative case on usefulness to our client's Navy superiors. The measure of usefulness might be: reduction in the Navy's loss due to mistimed torpedo firing (adjusted for other criteria, such as seeding new research). The supporting rationale — formal or informal — would address how research might actually change the reward system and, if it did, what its effect would be on torpedo firing behavior.

6.3 High-stakes risk research

My third example presents by far the strongest case for a quantitative measure of research usefulness, indeed one supported by substantial modeling. The example dealt with an immense research program to aid a critical national choice.

I was a consultant to the US Department of Energy on how to spend literally billions of dollars on whether a proposed nuclear waste site was acceptably safe. I proposed an analytic strategy for allocating this money among various research tasks, and re-allocating it when developments indicated (Brown, 2005b). When I implemented the strategy, it indicated major reallocation of the original budget. In particular, in the light of unexpected recent evidence, it recommended more research on gas-borne radioactive release and less on water-borne release, which had dominated the research program so far.

The trouble was that this enormous budget was shared among a few large and entrenched research organizations. They jealously guarded their shares, and none of them had an interest in the gas-borne issue. They wielded enough political influence to block any reallocation. I took my informal argument in vain to an independent Technical Review Board appointed by the US President (and was promptly fired by DOE!). I suspect that a well-modeled quantitative argument presented to the US Office of Management and Budget, the final government arbiter, would have been less easily brushed aside. I might even have gone public with it and pressured Congress to intercede.

Fellow decision-aiders on the project have estimated that the Department of Energy has wasted some 5 billion dollars on this nuclear waste program over the years (Keeney, 1987). In this light, I wouldn't be surprised if the difference in usefulness between our proposed research plan and the one adopted amounted to tens of millions of dollars. Thus, a convincing measure of research usefulness might have saved the American taxpayer a great deal of money. Decision analyst Ron Howard has suggested that 2% of the stakes involved in any decision should be devoted to analyzing it. In this case, that would justify spending hundreds of thousands of dollars on comparing the usefulness of research plans — provided the results were acted upon!

7 Conclusions

7.1 Main message

In this talk, I have tried to make the following case: If grading decision research projects becomes general practice, decision research will be radically transformed, and decision aiding might at last become a major force for better decisions throughout society.

7.2 Work needed

For this to come about, two things need to happen.