An alternative approach for eliciting willingness-to-pay: A randomized Internet trial

Judgment and Decision Making, Vol. 2, No. 2, April 2007, pp. 96–106.
Laura J. Damschroder*1,3, Peter A. Ubel1,2,3,4, Jason Riis5, and Dylan M. Smith1,2,3
1HSR&D Ann Arbor Center of Excellence, Department of Veterans Affairs, Ann Arbor, MI
2Division of General Internal Medicine, University of Michigan
3The Center for Behavioral and Decision Sciences in Medicine, University of Michigan
4Department of Psychology, University of Michigan
5Department of Marketing, Stern School of Business, New York University
Open-ended methods that elicit willingness-to-pay (WTP) in terms of absolute dollars often result in high rates of questionable and highly skewed responses, show insensitivity to changes in health state, and raise an ethical issue related to their association with personal income. We conducted a 2x2 randomized trial over the Internet to test 4 WTP formats: 1) WTP in dollars; 2) WTP as a percentage of financial resources; 3) WTP in terms of monthly payments; and 4) WTP as a single lump-sum amount. WTP as a percentage of financial resources generated fewer questionable values, had better distribution properties, showed greater sensitivity to severity of health states, and was not associated with income. WTP elicited on a monthly basis also showed promise.
Keywords: health, contingent valuation, willingness-to-pay, computerized elicitation, income.
*The authors would like to thank Richard Smith for his insightful comments on earlier drafts of this paper. Also, thanks to Todd Roberts and Jennifer Heckendorn who helped administer and implement the survey. Financial disclosure: This research was supported by HSR&D Ann Arbor Center of Excellence, Department of Veterans Affairs and the National Institute on Child Health and Human Development Grant #R01HD040789. The funding agreement ensured the authors’ independence in designing the study, interpreting the data, writing and publishing the report. The following authors are employed by the VA Ann Arbor Healthcare System: Laura J. Damschroder, Dylan Smith, and Peter A. Ubel. Dylan Smith is supported by a career development award from the Department of Veterans Affairs. Direct Correspondence to: Laura J. Damschroder, University of Michigan Health System, 300 North Ingalls, Room 7C27, Ann Arbor, MI 48109–0429. Email: [email protected]

1 Introduction

Many economists elicit people’s willingness to pay (WTP) for healthcare interventions through contingent valuation surveys so that the benefits of those interventions can be valued in monetary terms (Diener, O’Brien, & Gafni, 1998; Klose, 1999; Olsen & Smith, 2001; Smith, 2003). This is despite many known biases that occur when attempting to elicit a dollar value from people for a good that is not usually directly available in the market; e.g., perfect health (Baron, 1997). Much literature focuses on developing consensus on the most valid method for eliciting WTP, putting aside any philosophical issues that question the validity of eliciting WTP through a single elicitation. Early WTP surveys elicited values using an open-ended question from a self-interest perspective to obtain personal use values; e.g., “how much would you be willing to pay to be cured?” (Smith & Richardson, 2005). These open-ended formats ask for WTP values without presenting a starting point value and without using a search routine to help respondents determine a value. Respondents are simply asked to give a dollar value. However, researchers have questioned the validity of this format because responses are prone to a high number of non-response or zero values and because responses are heavily skewed toward high values, perhaps, in part, due to strategic bias (Donaldson, Thomas, & Torgerson, 1997; O’Brien & Gafni, 1996). In response to these concerns, a U.S. Federal panel in 1993, led by Kenneth Arrow, concluded that “both experience and logic suggest that responses to open-ended questions will be erratic and biased” (Arrow et al., 1993, p. 4613).

Since then, researchers have moved away from eliciting WTP using an open-ended format and developed three types of closed-ended formats in an attempt to overcome shortcomings of the open-ended format. These “closed-ended” formats ask respondents to say yes or no to a series of questions or to select a value from a pre-specified list. All three methods have methodological issues, however. The bidding game is prone to starting-point bias (WTP changes depending on the starting value used to begin the bidding) and the payment card method is prone to range bias (WTP changes depending on the range of values presented) (Klose, 1999; Smith, 2000; Venkatachalam, 2004; Whynes, Wolstenholme, & Frew, 2004). The single-bounded discrete choice format is statistically inefficient and studies using this approach are very expensive to conduct because, all else being equal, it requires a larger sample size and more sophisticated design and analysis techniques (Smith, 2000; Venkatachalam, 2004). In addition, this format is prone to several biases including “yea-saying” where respondents have a tendency to agree with the amount presented (Yeung, Smith, Ho, Johnston, & Leung, 2006). A double-bounded choice format was derived to increase statistical efficiency. However, even responses from people who report a high level of certainty about their willingness to pay exhibit significant anomalies that increase as uncertainty increases (Watson & Ryan, 2006).

We believe the open-ended format deserves further exploration. Despite the strong statement we quoted earlier against using it, some researchers do not agree with the call to abandon the open-ended format (Smith, 2000). Although different formats produce different responses, it is not clear which format is superior (Venkatachalam, 2004). A recent study comparing alternate elicitation formats concluded, “. . . it would seem that the most informative elicitation format in the present context . . . appear[s] to be the open-ended format. . . [though this] format is nowadays distinctly unfashionable in health economics, having long since given way to supposedly-superior elicitation formats” (Whynes, Frew, & Wolstenholme, 2005, p. 384). Advantages of the open-ended format are that it does not introduce range or starting-point biases and it can be highly statistically efficient compared to discrete choice formats.

The open-ended format also has several clear disadvantages, however. This format may place a heavy cognitive demand on respondents. In fact, the other formats were developed, in part, to make the elicitation simpler and more realistic for respondents (Donaldson et al., 1997; Smith, 2000). Furthermore, asking for WTP in terms of dollars using an open-ended format requires using an unbounded response scale (a scale that starts at zero but with no defined upper end) that naturally contributes to the highly variable and skewed responses typically seen with open-ended WTP elicitations (Kahneman, Ritov, & Schkade, 1999). In addition, people may be more likely to give “strategic” values with an unbounded scale; a respondent may believe that the treatment has high intrinsic or social value and thus places a very high value not grounded in the reality of actually paying such a figure in the form of taxes or as an out-of-pocket expense (Arrow et al., 1993). Conversely, a respondent may give an artificially low response in an attempt to influence the actual price eventually charged.

It could be that a more constrained, but still essentially open-ended approach might avoid some of the problems reviewed above. Specifically, eliciting WTP as a percentage of financial resources has two potential advantages. First, a percentage measure will force the use of a bounded 0–100 response scale creating a more statistically efficient scale measure (Kahneman et al., 1999). Generally, people are unable to map their preference for a health effect using a scale consisting of dollars that starts at zero but with no clear maximum amount (an unbounded scale) (Payne, Bettman, & Schkade, 1999). Second, percentages involve smaller numbers (a 0–100 scale for the percentage formats versus 0 to an undefined maximum for the dollar formats) and people process smaller whole numbers more reliably. In one study, Thompson, Read, and Liang (1984) found that a percentage measure exhibited more significant associations with key independent variables such as the number of symptoms suffered by respondents and medications taken than did WTP expressed in dollars.

The purpose of the current study was to compare WTP values elicited as a percentage of financial resources to values elicited as dollars using open-ended formats. We predicted that the percentage method would be less prone to inconsistent responses, would be more sensitive to differences in severity across health states, and would show more desirable distributional properties. We asked for percentages based on “financial resources” rather than income because it is realistic to expect that many people would consider savings, borrowing power, and other financial resources to pay for a cure of a condition they want to avoid. Thinking about paying out amounts on a monthly basis rather than a single lump sum enables respondents to think of smaller quantities and the amounts proposed are likely to be more salient because many people budget their finances on a monthly basis. Advantages of the percentage format could be reduced or eliminated when monthly payments rather than lump sum payments are considered. Thus, we also introduced a second dimension against which to compare elicitation formats: a monthly timeframe versus a single lump sum amount.

The current study extends the studies done by Thompson and colleagues (the largest study, to date, that has elicited WTP as a percentage) in several ways. First, we introduce a within-subjects measure of sensitivity. Second, we compare the effects of using a monthly timeframe to elicit WTP to a single lump-sum amount. Third, we focus specifically on distributional properties of responses to further assess percentage formats as a more efficient measure. Finally, the current study utilizes a larger sample, and surveys the general public instead of patients.

Table 1: WTP elicitation formats.

WTP units: dollars
Time period, total: “Please type the maximum dollar amount you think you would be willing and able to pay for this treatment. $_____ (please enter only one amount)”
Time period, monthly: “Please type the maximum dollar amount you think you would be willing and able to pay per month for this treatment. $_____ per month. (please enter only one amount)”

WTP units: % financial resources
Time period, total: “Please type the maximum percentage of your financial resources you think you would be willing and able to pay for this treatment. _____ % of my financial resources. (please enter a number between 0 and 100)”
Time period, monthly: “Please type the maximum percentage of your financial resources you think you would be willing and able to pay per month for this treatment. _____% of my financial resources per month. (please enter a number between 0 and 100)”

2 Method

We elicited people’s WTP for curing two health conditions using a web-based survey over the Internet. We recruited respondents via an email sent to a sample of members in an Internet panel maintained by Survey Sampling International (SSI). This panel is made up of more than 1 million unique member households, recruited via random digit dialing, banner ads, and other opt-in techniques. Our study sample was stratified to mirror the U.S. census population based on age, gender, race, education level, and income. Upon completion of the survey, participants were entered into a drawing for cash prizes that totaled $10,000.

2.1 Health state descriptions

We presented descriptions of two health states to each respondent: 1) a below-the-knee amputation (BKA) that moderately affects physical mobility; and 2) paraplegia, which significantly affects mobility. Detailed health state descriptions are in the appendix. We counterbalanced the order of the BKA and paraplegia health states.

2.2 WTP elicitation formats

We elicited each respondent’s WTP for a medical treatment that would permanently restore full physical functioning for each of the two health states. Respondents were randomly assigned to one of four elicitation formats, using a full-factorial two-by-two experimental design. We elicited WTP using one of two different units of measure (percentage of financial resources or dollars) and one of two different timeframes (on a monthly basis or an overall total). No durations for payments were specified. We chose percentage of “financial resources” instead of income for reasons already cited. Financial resources will typically be equal to or greater than income; thus, the underlying scale could represent values greater than income. The four versions (2 WTP measures X 2 timeframes), along with the specific questions we posed, are presented in Table 1.

For each format, we first presented the description of the health state (listed in the appendix) and then asked the respondent to type in their response. The precise wording asking for a WTP amount depended on the format to which the respondent was assigned, as presented in Table 1. We then told respondents, “In answering this question, take into consideration the actual financial resources you have. We recognize that giving an exact amount may be difficult; just give the best estimate you can.” Our purpose with this instruction was to emphasize personal financial constraints before respondents gave a WTP amount. We elicited WTP for both health states from each respondent.

2.3 Outcome criteria and analysis approach

Analyses were performed using the native units and timeframe with which WTP was elicited; e.g., in terms of monthly percentage of financial resources. Our primary

study question was whether WTP expressed as a percentage of financial resources would result in higher quality responses and better distributional properties compared to WTP expressed in absolute dollars, and thus would show greater ability to detect differences between health states of different severity. We also wanted to explore whether WTP expressed on a monthly basis would improve properties of WTP responses and perhaps reduce any advantages observed of the percentage format.

We compared the four elicitation formats using five criteria.

First, we wanted to reduce the number of questionable WTP responses. Questionable WTP responses include missing values, values of zero, or WTP values that are the same for both health states. We used χ2 tests to compare differences in frequencies for these types of occurrences between the formats. Those who gave missing or zero values for both health states were excluded from the remaining analyses.

Second, we assessed normality of WTP values in terms of skewness and kurtosis. Parametric models are often used to predict WTP responses and assume that WTP values and error terms are normally distributed. Even a small misspecification of the functional form in these analyses can result in large differences in predictions (Yeung et al., 2006).

Third, we assessed internal consistency with a simple ordinal consistency check. WTP values should reflect the lower impact that BKA has on mobility compared to paraplegia. Accordingly, we expect respondents’ WTP for treatments to be lower for BKA compared to paraplegia. We excluded cases where the value was the same for both health states from this portion of the analysis and they were not included in the denominator. We used χ2 tests to compare differences in the proportion of those who were ordinally consistent between the groups.

Fourth, we tested the sensitivity of each of the WTP elicitation versions for detecting differences between the two health states by computing Cohen’s d-statistic as a measure of effect size (Cohen, 1988). Larger effect sizes indicate greater sensitivity and thus will require smaller samples to detect statistical differences between two health states.

Our final assessment was investigating the degree to which WTP values correlate with reported income for each of the four formats, using the Spearman rank correlation coefficient. Confidence intervals were computed using the bias-corrected and accelerated bootstrap estimation method (Haukoos & Lewis, 2005). Two smaller-scale studies that elicited WTP as a percentage of wealth did not find this measure to be significantly associated with personal income (Schiffner et al., 2003; Thompson, 1986). Nonetheless, it is possible that an association would still persist in our study because people with low incomes may have fewer discretionary finances available, even when expressed as a percentage (Donaldson, Birch, & Gafni, 2002). Though we did not have a prediction about whether income would be significantly associated with the WTP elicited using the two percentage formats, we did hypothesize that WTP as a percentage of wealth would have a lower association with income compared to WTP elicited as dollars.

3 Results

Compared to WTP expressed as absolute dollars, WTP expressed as a percentage of financial resources generates more usable values, greater sensitivity to differences in severity between health states, better distribution properties, and is not associated with income. Furthermore, asking WTP in terms of monthly amounts also shows promise.

3.1 Respondents

Eight percent of those invited responded by clicking onto our survey using a link from within the email invitation. Of those who clicked onto the site, 75% (n=982) completed the survey. Of those who completed the survey, 98% were included in the analyses, except where noted. Five were excluded because they were under 18 years old, 15 said they intentionally gave wrong answers, and one gave invalid values (38,117 for both health states using the monthly percentage format). The rate of exclusions was similar across the four versions of the survey (p=.22). The remaining 961 respondents gave 1,812 non-zero and non-missing WTP valuations; 55 (6%) gave missing or zero WTP values for both health states.

The 961 respondents included in the analyses were not statistically different across the experimental groups with respect to demographic factors (p-values > 0.15). Overall, 31% of respondents identified themselves as being a non-white race or Hispanic ethnicity. Self-reported mean age was 46 years (s.d.=16). Median education was some college but no degree. Overall, 59% of respondents were women. Just under half (44%) of respondents identified themselves as having “average” economic status and 47% of respondents reported an income of $40,000 or less.

3.2 Questionable Values

Fifty-five (6%) respondents gave zero or missing values for both health states. Another 39 (4%) gave a zero or missing value for one health state. The rate of zero or missing values was comparable across the four versions (Chi-square; p=.60). However, the rate of those who gave zero or missing values for both health states varied by

Table 2: Summary of outcome criteria.
Columns: % Total, % Monthly, $ Total, $ Monthly
% respondents with the same WTP for both health states
55% **
0.81 **
1.24 **
7.64 **
2.93 **
0.72 **
5.42 **
2.00 **
3.97 *
70.77 **
15.19 **
1.82 **
38.68 **
8.77 **
Spearman rank correlation coefficients for WTP and income
0.30 **
0.33 **
0.14 *
0.30 **
0.39 **
** p<0.01, * p<0.05
income (Wilcoxon rank-sum, p<.001); three-quarters of these cases had income less than the median. It is possible that these subjects did not have any discretionary financial resources with which to pay for a cure (Smith, 2005). Respondents who gave zero or missing values for both health states were dropped from the remainder of the analyses.

Another type of potentially questionable value came from respondents who gave the same non-zero, non-missing value for both health states. Table 2 shows the distribution of these cases. Participants assigned to a monthly format (dollar or percentage) gave the same WTP for both health states more often than those who were not (p=0.004). Participants assigned to a percentage format (monthly or lump sum) gave the same WTP values for both health states less often than those who were not (p=0.008). The combined effect resulted in only 35% of participants who were assigned to the total percentage format giving the same WTP for both health states while over half (55%) of participants assigned to the monthly dollar format did so (p<0.001).

3.3 WTP values

Table 3 shows mean and median WTP values for each of the elicitation formats. Respondents were willing to pay $30,276 in total or $252 per month to cure BKA when WTP was elicited as dollars. WTP in terms of percentages was 35% of financial resources as a total amount and 28% when elicited on a monthly basis. To cure paraplegia, respondents were willing to pay $73,968 in total or $325 per month; WTP, when elicited as percentages, was 53% as a total amount and 39% on a monthly basis.

3.4 Ordinal consistency of responses

On average, 88% of respondents who gave different WTP values for the 2 health states were willing to pay more to cure paraplegia than for BKA (Table 3). The rate of ordinal consistency did not vary by whether or not WTP was elicited by month (p=0.41). However, respondents assigned to a percentage format had a higher rate of ordinal consistency (91%) compared to those assigned to a dollar format (84%) (p=0.03).

3.5 Sensitivity to differences in severity

WTP means for the two health states were significantly different, regardless of the elicitation format (p-values<0.001). However, effect sizes varied considerably across the versions. The percentage format on a total basis had an effect size nearly 3 times larger than that of the corresponding dollar format. The effect size for the percentage format on a monthly basis was over 1.5 times larger compared to the effect size for dollars elicited on a monthly basis. As seen in Table 3, these differences in effect sizes translate to dramatic differences in sample sizes needed to detect differences between the two health states.
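
The effect-size comparison described above can be illustrated with a short sketch. The paper does not state which variant of Cohen's d was computed, so the pooled-standard-deviation form used here is an assumption, the function name cohens_d is illustrative, and the response values are hypothetical rather than study data.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(x, y):
    """Cohen's d for two sets of responses, using the pooled standard deviation.

    Assumption: classic two-group form; the paper does not say which variant it used.
    """
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2)
    return (mean(y) - mean(x)) / sqrt(pooled_var)


# Hypothetical WTP responses (% of financial resources), NOT data from the study.
wtp_bka = [20, 30, 40, 25, 35]
wtp_paraplegia = [40, 50, 60, 45, 65]

d = cohens_d(wtp_bka, wtp_paraplegia)
print(f"Cohen's d = {d:.2f}")
```

A larger d means the two health states are separated by more standard deviations, which is the sense in which one elicitation format is "more sensitive" than another.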

Table 3: WTP values by version
WTP elicited as: % Total, % Monthly, $ Total, $ Monthly
% respondents willing to pay more to cure paraplegia than to cure BKA (1)
Cohen’s d effect size (2)
Sample size required (3)
(1) BKA: below-the-knee amputation. Includes only respondents who gave different WTP values for the two health states.
(2) Effect size, used in power analyses, for comparing the difference in mean WTP for BKA and paraplegia for each of the elicitation versions.
(3) Sample size that would be needed to detect the difference in mean WTP with 80% power and a 5% alpha level for each of the elicitation versions.
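
The sample-size calculation in the third table note can be sketched with the standard normal-approximation formula for comparing two means, n = 2(z_{1-alpha/2} + z_{power})^2 / d^2 per group. This is an assumption: the paper does not report the formula or software it used, and a paired (within-subjects) calculation would give different numbers. The function name required_n is illustrative.

```python
from math import ceil
from statistics import NormalDist


def required_n(d, alpha=0.05, power=0.80):
    """Approximate per-group sample size to detect effect size d (two-sided test).

    Assumption: normal approximation for a two-group comparison of means.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


# Illustrative effect sizes, NOT the values from Table 3.
for d in (0.3, 0.5, 0.9, 1.0):
    print(f"d = {d}: n per group = {required_n(d)}")
```

Because n scales with 1/d^2, tripling the effect size cuts the required sample roughly ninefold, which is why modest differences in sensitivity between formats translate into large differences in required sample size.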
3.6 Normality of responses

As can be seen in Table 3, there is a wide disparity between mean and median values, especially for the dollar amount formats, indicating highly skewed distributions. Indeed, Table 2 shows that the skew statistics for the dollar value formats were 2.0 or higher, indicating a distribution that is skewed toward high positive values. The skew statistics for 3 out of 4 of aggregate values using percentage formats were less than 1.0. However, the only distribution of responses that was statistically similar to a normal distribution was that of WTP values elicited in terms of the total percentage of financial resources for curing paraplegia (p=.7). Most response distributions exhibited significant kurtosis, with kurtosis statistics as high as 71 for WTP values expressed as dollars. A normally distributed set of responses would have a statistic equal to 3.0. WTPs in terms of percent of financial resources are much closer to this target value and in fact, 2 of the 4 sets of responses are statistically similar to that expected for a normal distribution (p-values>0.2).

3.7 Correlation with income

WTP expressed as absolute dollars, in monthly and total timeframes, was significantly correlated with income for both below-the-knee amputation and paraplegia. These correlations were all significantly higher than correlations obtained by using the percentage formats (p-values<.01), except that the lump sum dollar format was only marginally higher than using the monthly percentage format when eliciting values for curing paraplegia (p=.06). WTP expressed in terms of percentage of financial resources was significantly correlated with income only for paraplegia and only if expressed on a monthly basis.

4 Discussion

Asking people to give their WTP as a percentage of financial resources instead of asking for WTP as dollars is a promising way to improve WTP measures that are typically plagued by undesirable properties. We also evaluated timeframe and found that the advantages of the percentage format persisted when a “per month” instead of a lump sum method was used. The percentage lump sum format yielded the fewest respondents who gave the same value for two different health states with clearly different levels of severity and yielded the highest rate of respondents who were ordinally consistent (WTP was higher for curing the health state with the more severe impairment [paraplegia] than for the less severe physical impairment [BKA]). The two percentage formats were substantially more sensitive to differences between health states and

thus more statistically efficient compared to WTP expressed as absolute dollars in total or on a monthly basis. This improvement in sensitivity translates to an 8-fold reduction in the sample size required to detect comparable differences in other studies when comparing the best performing format (WTP as a total percent of financial resources) to the worst performer (WTP as total dollars). Both percentage formats yielded more nearly normally distributed WTP values compared to WTP in either monthly or total dollars. The worst performer on every criterion was WTP expressed as absolute dollars; either monthly or total, depending on the criterion. The superior psychometric properties assessed in this study for WTP measured as a percent are good news considering that, though many researchers recognize the challenging distribution properties of WTP values used in CBAs (cost-benefit analyses), there has been little consensus on what to do about them (Donaldson, 1999).

On average, participants were willing to pay 28% of their financial resources on a monthly basis (35% on a total percentage basis) to cure BKA and 39% (53% on a total percentage basis) to cure paraplegia in our study. The percentage for curing BKA is higher than the 17% (Thompson, Read, & Liang, 1984) and 22% (Thompson, 1986) for relief of arthritis symptoms in the studies by Thompson. Schiffner and colleagues also elicited WTP directly as a proportion of monthly income. Pre-treatment, psoriasis patients were willing to pay 14% of their income for a cure (Schiffner et al., 2003). It is difficult to assess whether the values obtained in our study are out of line with these previous studies because of differences in severity between the health states evaluated and the myriad differences in elicitation methods among the four studies.

4.1 Distributional issues

Distributional properties of WTP expressed as absolute dollars are in line with results from other studies. Most studies, along with this one, make note of a positively skewed distribution of WTP expressed in absolute dollars and use non-parametric approaches or mathematical transformations prior to analyses to reduce undue influence of high values. Our skewness statistics for monthly WTP expressed in absolute dollars, ranging from 2.0 to 2.9, are comparable with skewness statistics from another study in which WTP was elicited using an open-ended format in an interview where participants were asked for their WTP in terms of a “weekly, fortnightly, monthly or yearly figure.” A specific timeframe was not indicated. Skew statistics in that study ranged from 1.7 to 3.0 (Smith & Richardson, 2005). Even a highly skewed measure is not necessarily invalid, but skewed measures require transformations or use of non-parametric analyses. High values may also indicate that people are giving extraordinarily high values that represent the importance of perfect health without regard for whether they can make the tradeoffs necessary to afford the treatment.

4.2 WTP correlation with income

WTP expressed in absolute dollars clearly has a stronger association with income than WTP expressed in terms of percentage of financial resources. When WTP is expressed as a percentage, the association is negligible for both health states with both percentage formats (this is a natural consequence if participants include their income in considering their financial resources). WTP expressed as absolute dollars showed moderate associations with income. In a recent study, the higher the proportion of income represented by respondents’ WTP, the less sensitive WTP was to differences in health state, because of personal budget constraints (Smith & Richardson, 2005). The extraordinarily high proportion of people giving the same value for both health states when expressing WTP in a single lump sum dollar amount may indicate that a budget ceiling comes into play more readily than with the other 3 formats; i.e., people give a WTP to cure BKA at the maximum of what they can afford and they have no discretionary wealth remaining to cure paraplegia even though they may agree they would be worse off. On the other hand, there is evidence that people are often scale insensitive when giving WTP values; these values may simply reflect the respondent’s subjective desire to be healthy without considering difference in severity (Baron & Greene, 1996).

We have shown that WTP, elicited as a percentage, has superior measurement properties. However, some may argue that we failed to measure what needs measuring (the amount people are willing to pay for various treatment options) with this approach; after all, CBAs require dollars, not percentages. We argue, however, that WTP measured as a percentage can be readily converted to dollar amounts in several ways, and thus provides more flexibility in addition to better measurement properties.

As with our study, Schiffner et al. (2003) and Thompson et al. (1984; also Thompson, 1986) found no association between income and WTP when WTP was expressed as a percentage of wealth but, as with many prior studies, we did find that WTP elicited using absolute dollars was moderately and significantly associated with income. The dissociation of WTP from income may be cause for alarm for some economists who regard the presence of this association as one criterion by which to validate the WTP values elicited (Brach et al., 2005; Donaldson, 1999; Donaldson et al., 1997). This may be good news to others, however, who point out the ethical issues that arise when WTP is associated with income, out of

fear that the “buying power” of the rich will give them a disproportionate voice in prioritization schemes (Olsen & Smith, 2001). Some researchers see merit in both concerns (Donaldson et al., 2002).
Percentages can be converted to dollars in two ways. First, for those concerned about the lack of association of income with WTP, percentages can be converted to dollars using individual income (Klose, 1999). Measurement issues aside, these dollars are the same as if elicited directly and thus association with income will be established while preserving the psychometric properties of elicited percentages. In fact, backing into dollars this way may result in WTPs that are more highly correlated with level of income than dollars elicited directly. People may be under-sensitive to their own ability to pay because of the difficulty of thinking about a dollar amount to pay for the good in question and then considering whether they can afford that amount. The percentage format allows people to think directly in terms of the proportion of what they can afford, thus simplifying the task.
Second, those concerned about association of WTP with income have the option of applying the average WTP percentage to average income of the appropriate population (or subgroup) to obtain average WTP in dollars, dissociated from income (Thompson et al., 1984), an approach the World Bank has used to incorporate equity considerations in CBAs of healthcare projects. This approach incorporates distribution weighting consistent with an inequality-averse society (Brent, 2003) for healthcare. Using raw WTP expressed as a proportion of financial resources will result in a group with one-quarter average income having a weight of four while those in an income group with four times the average would have a weight of one-quarter. However, some argue that this approach, at best, results in an “index of the strength of ‘social preferences’” with obscure meaning that makes WTP elicited as a percentage of income irrelevant from the perspective of economic theories underlying the conduct of CBAs (Smith & Richardson, 2005, page 82). Resolving these differing viewpoints and challenges is beyond the scope of this paper.
4.3 Limitations and open questions
This study has several limitations. Our scenarios did not specify a timeframe in which payments would need to be made nor how long the cure would last if payment stopped. Though many studies do not spell out specific time-periods (Smith, 2003), it is important to do so to ensure consistent interpretation of the elicitation and results. We conducted this study over the Internet and had a low initial response rate. However, once people clicked onto the survey, 75% of them completed the survey and 98% of those responses were sufficiently valid to include in our analyses. We did not intend to generalize actual WTP values obtained in this study but rather sought a diverse sample to participate in an experimental study. We were successful in recruiting a diverse sample with respect to age, race and ethnicity, education, and income group. In addition, these demographic characteristics were balanced across the experimental groups. Thus, we expect that the differences we observed in behavior with the four formats in this study will extend to other similar populations. Our results were also in line with those obtained in two pilot studies we conducted using a paper survey of a smaller convenience sample.
WTP expressed as a percentage of monthly financial resources was lower than WTP expressed as a total percentage. Purely mathematically, the percentages should be the same if the same sources of finances were considered in the two timeframes. However, there are many reasons to believe this may not be the case. People may, in fact, be drawing upon different financial resources on a monthly versus lump-sum basis. It would not be unreasonable for respondents to consider the wider range of assets that may be available to them on a one-time lump-sum basis. They may be more willing to use their borrowing power or to dip into savings to cure their health condition with a single payment. The monthly timeframe may be more salient for many people who budget on a monthly basis, and this format may focus respondents on cash flow, where income may be the primary monthly source of incoming cash. Relatively speaking, smaller amounts may be available for discretionary expenditures month-to-month, after paying for things like housing, utilities, and food. Psychologically, shorter timeframes lead to more concrete thinking and predictions (Trope & Liberman, 2003). Though WTP as a percentage of total financial resources performed well based on distributional criteria, we cannot ignore the fact that half of our respondents were willing to forego half or more of their financial resources to cure paraplegia while, on a monthly basis, the median amount was only 30%.
We did not actually convert WTP percentages into dollars for this study. If we did so, based on our data and assuming gross income as the denominator (the only financial measure we collected in this study), values would be significantly higher than dollars elicited directly (for both monthly and annual amounts). Such a comparison, however, is fraught with issues. Dollar amounts would likely be over-estimated because we would not be able to take taxes into account; most people consider after-tax income, not gross income, when considering the dollars they can afford to pay for something. However, if people really did consider more than just their income and if we were not constrained by a yearly timeframe, then the converted dollars would be under-estimates. It is clear that more study is needed to discern what respondents are
considering when giving their WTP in dollars or percentages and more elaborate measures of wealth and income are needed. The Health and Retirement Study is one example where participants are asked for information about many components that comprise their financial resources (Juster & Suzman, 1995).
The WTP values elicited in our study were for curing relatively severe disabilities with idealized treatments. Both of these factors led to relatively large, whole number percentages for most participants. But the percentage format may be difficult to use when placing value on more modest (and realistic) treatments. For example, WTP for mammography screening was as low as $12 in one study (Yasunaga, Ide, Imamura, & Ohe, 2007); it would be very difficult for people to estimate such a small percentage of annual take-home income. However, there is evidence that even when eliciting WTP in terms of dollars, low values may be less reliable than high values (Smith, 2006).
More work is needed to determine the validity of responses elicited through the Internet. Though we were concerned about the potential for a high level of protest or spurious responses, we did not see evidence of this. Another study elicited utilities for four different health conditions (including BKA and paraplegia) from this same panel of Internet users who were recruited in the same way at the same time. The large majority of responses were reasonable and valid. Participants gave responses that were highly differentiated between four different health conditions and 74% of those who gave different utilities for BKA and paraplegia (comprising 62% of respondents) gave rankings that were consistent with the corresponding utilities (Damschroder, Zikmund-Fisher, & Ubel, in press). Most of the “questionable” responses in the present study were a result of respondents giving the same non-zero WTP for both health states. The high rate of equal values is troubling, but this may partly be a function of budget constraint (Smith, 2005). The elicitation format appears to influence the rate of inconsistent responses, as is evident in the lower rate of people with the dollar formats who did not conform to our ordinal criteria compared to the rate for the percentage formats. Many researchers insist that because of the high cognitive demand of WTP elicitations, in-person interviews are necessary (e.g., Arrow et al., 1993). Our results are not much different from another recent study using face-to-face interviews in a large diverse sample in which 41% of participants gave all zeros or equal non-zero WTP values for 3 treatment programs (Olsen, Donaldson, Shackley, & EuroWill Group, 2005), a reason for some optimism for reliably eliciting WTP values using a web-based in-
Nonetheless, the larger question of whether people have consistent values for health conditions with which they are not familiar has yet to be answered definitively. Regardless of format, further work is needed to determine the appropriate “dose” of information to help people discover what their true preferences are (Watson & Ryan, 2006), whether coupled with an opportunity for people to deliberate various considerations (e.g., Abelson et al., 2003; Damschroder, Ubel, Zikmund-Fisher, Kim, & Johri, 2005; Dolan, Cookson, & Ferguson, 1999), feeding back an interpretation of respondents’ WTP so they can affirm or change their response (Watson & Ryan, 2006), or whether researchers simply need better ways to uncover already existing underlying preferences without being influenced by the method (Sugden, 2005). In addition, many psychological questions remain about what WTP elicited using these kinds of methods actually represents. Common sources of bias were described earlier but in addition, regardless of format, people tend to give the same WTP for varying levels of goods (scale insensitivity); WTP for two units valued separately is often higher than WTP for two units valued together (lack of additivity) (Baron, 1997); and WTP values are often more reflective of perceived market value or cost to produce than of people’s own personal valuation (Baron & Maxwell, 1996). Results from our study help to illuminate ways to elicit consistent and valid WTP amounts from people over the Internet, but do not solve the larger issues around WTP values, which, despite challenges, continue to be used in CBAs of healthcare pro-
References
Abelson, J., Eyles, J., McLeod, C. B., Collins, P., McMullan, C., & Forest, P. G. (2003). Does deliberation make a difference? Results from a citizens panel study of health goals priority setting. Health Policy, 66, 95–
Arrow, K., Solow, R., Portney, P., Leamer, E., Radner, R., & Schuman, H. (1993). Report of the NOAA panel on contingent valuation. Federal Register, 58, 4601–4614.
Baron, J. (1997). Biases in the quantitative measurement of values for public decisions. Psychological Bulletin, 122, 72–88.
Baron, J., & Greene, J. (1996). Determinants of insensitivity to quantity in valuation of public goods: Contribution, warm glow, budget constraints, availability, and prominence. Journal of Experimental Psychology: Applied, 2, 107–125.
Baron, J., & Maxwell, N. P. (1996). Cost of public goods affects willingness to pay for them. Journal of Behavioral Decision Making, 9, 173–183.
Brach, M., Gerstner, D., Hillert, A., Schuster, A., Sosnowsky, N., & Stucki, G. (2005). Development and evaluation of an interview instrument for the monetary
valuation of expected and perceived health effects using rehabilitation interventions as a model. Physikalische Medizin Rehabilitationsmedizin Kurortmedizin, 15, 76–82.
Brent, R. (2003). Cost-benefit analysis and health care evaluations. Cheltenham, UK: Edward Elgar.
Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Hillsdale: Lawrence Erlbaum Associates.
Damschroder, L. J., Ubel, P. A., Zikmund-Fisher, B. J., Kim, S. Y., & Johri, M. (2005). A randomized trial of a web-based deliberation exercise: improving the quality of healthcare allocation preference surveys. Paper presented at the 27th Annual Meeting of the Society for Medical Decision Making.
Damschroder, L. J., Zikmund-Fisher, B. J., & Ubel, P. A. (in press). Considering adaptation in preference elici-
Diener, A., O’Brien, B., & Gafni, A. (1998). Health care contingent valuation studies: a review and classification of the literature. Health Economics, 7, 313–326.
Dolan, P., Cookson, R., & Ferguson, B. (1999). Effect of discussion and deliberation on the public’s views of priority setting in health care: focus group study. BMJ, 318, 916–919.
Donaldson, C. (1999). Valuing the benefits of publicly-provided health care: does “ability to pay” preclude the use of “willingness to pay”? Social Science and Medicine, 49, 551–563.
Donaldson, C., Birch, S., & Gafni, A. (2002). The distribution problem in economic evaluation: income and the valuation of costs and consequences of health care programmes. Health Economics, 11, 55–70.
Donaldson, C., Thomas, R., & Torgerson, D. J. (1997). Validity of open-ended and payment scale approaches to eliciting willingness to pay. Applied Economics, 29,
Haukoos, J. S., & Lewis, R. J. (2005). Advanced statistics: bootstrapping confidence intervals for statistics with “difficult” distributions. Academic Emergency Medicine, 12, 360–365.
Juster, F., & Suzman, R. (1995). An overview of the Health and Retirement Study. Journal of Human Resources, 30, S7–S56.
Kahneman, D., Ritov, I., & Schkade, D. A. (1999). Economic preferences or attitude expressions?: An analysis of dollar responses to public issues. Journal of Risk and Uncertainty, 19, 203–235.
Klose, T. (1999). The contingent valuation method in health care. Health Policy, 47, 97–123.
O’Brien, B., & Gafni, A. (1996). When do the “dollars” make sense? Toward a conceptual framework for contingent valuation studies in health care. Medical Decision Making, 16, 288–299.
Olsen, J. A., Donaldson, C., Shackley, P., & EuroWill Group. (2005). Implicit versus explicit ranking: On inferring ordinal preferences for health care programmes based on differences in willingness-to-pay. Journal of Health Economics, 24, 990–996.
Olsen, J. A., & Smith, R. D. (2001). Theory versus practice: a review of “willingness-to-pay” in health and health care. Health Economics, 10, 39–52.
Payne, J. W., Bettman, J. R., & Schkade, D. A. (1999). Measuring constructed preferences: Towards a building code. Journal of Risk and Uncertainty, 19, 243–
Schiffner, R., Schiffner-Rohe, J., Gerstenhauer, M., Hofstadter, F., Landthaler, M., & Stolz, W. (2003). Willingness to pay and time trade-off: sensitive to changes of quality of life in psoriasis patients? British Journal of Dermatology, 148, 1153–1160.
Smith, R. D. (2000). The discrete-choice willingness-to-pay question format in health economics: Should we adopt environmental guidelines? Medical Decision Making, 20, 194–206.
Smith, R. D. (2003). Construction of the contingent valuation market in health care: a critical assessment. Health Economics, 12, 609–628.
Smith, R. D. (2005). Sensitivity to scale in contingent valuation: The importance of the budget constraint. Journal of Health Economics, 24, 515–529.
Smith, R. D. (2006). The relationship between reliability and size of willingness-to-pay values: a qualitative insight. Health Economics, 9999, n/a.
Smith, R. D., & Richardson, J. (2005). Can we estimate the “social” value of a QALY? Four core issues to resolve. Health Policy, 74, 77–84.
Sugden, R. (2005). Anomalies and Stated Preference Techniques: A Framework for a Discussion of Coping Strategies. Environmental and Resource Economics, 32, 1–12.
Thompson, M. S. (1986). Willingness to pay and accept risks to cure chronic disease. American Journal of Public Health, 76,
Thompson, M. S., Read, J. L., & Liang, M. (1984). Feasibility of willingness-to-pay measurement in chronic arthritis. Medical Decision Making, 4, 195–215.
Trope, Y., & Liberman, N. (2003). Temporal construal. Psychological Review, 110, 403–421.
Venkatachalam, L. (2004). The contingent valuation method: a review. Environmental Impact Assessment Review, 24, 89–124.
Watson, V., & Ryan, M. (2006). Exploring preference anomalies in double bounded contingent valuation. Journal of Health Economics.
Whynes, D. K., Frew, E. J., & Wolstenholme, J. L. (2005). Willingness-to-pay and demand curves: A comparison of results obtained using different elicita-