Development of Personality in Early and Middle Adulthood: Set Like Plaster or Persistent Change?

Journal of Personality and Social Psychology
Copyright 2003 by the American Psychological Association, Inc.
2003, Vol. 84, No. 5, 1041–1053
DOI: 10.1037/0022-3514.84.5.1041
Sanjay Srivastava and Oliver P. John, University of California, Berkeley
Samuel D. Gosling, University of Texas at Austin
Jeff Potter, Cambridge, Massachusetts
Different theories make different predictions about how mean levels of personality traits change in
adulthood. The biological view of the Five-factor theory proposes the plaster hypothesis: All personality
traits stop changing by age 30. In contrast, contextualist perspectives propose that changes should be
more varied and should persist throughout adulthood. This study compared these perspectives in a large
(N = 132,515) sample of adults aged 21–60 who completed a Big Five personality measure on the
Internet. Conscientiousness and Agreeableness increased throughout early and middle adulthood at
varying rates; Neuroticism declined among women but did not change among men. The variety in
patterns of change suggests that the Big Five traits are complex phenomena subject to a variety of
developmental influences.
Sanjay Srivastava and Oliver P. John, Institute of Personality and Social Research, University of California, Berkeley; Samuel D. Gosling, Department of Psychology, University of Texas at Austin; Jeff Potter, Cambridge, Massachusetts.
Sanjay Srivastava was supported by a National Science Foundation Graduate Research Fellowship; Sanjay Srivastava and Oliver P. John were supported by National Institute of Mental Health Grant MH-43948. We thank Ravenna Helson, Robert R. McCrae, and Frank J. Sulloway for their helpful comments on drafts of this article. We also thank Frank J. Sulloway for developing the algorithm to remove repeat responders from the database.
Correspondence concerning this article should be addressed to Sanjay Srivastava, who is now at the Department of Psychology, Stanford University, Jordan Hall, Building 420, Stanford, California 94305. E-mail: [email protected]

How does personality change during adulthood? Psychologists since William James (1890/1950) have struggled with the question of whether various aspects of personality, including personality traits, change in meaningful ways during adulthood, and when those changes take place. Contemporary hypotheses about the development of personality traits stem from theories about what personality traits are. McCrae and Costa’s (1996) five-factor theory asserts that personality traits arise exclusively from biological causes (i.e., genes) and that they reach full maturity in early adulthood; thus, this theory predicts little or no change on any personality dimension after early adulthood. By contrast, contextualist perspectives argue that traits are multiply determined, and that one important influence on traits is the individual’s social environment (Haan, Millsap, & Hartka, 1986; Helson, Jones, & Kwan, 2002). Contextualist perspectives thus predict plasticity: Change is complex and ongoing, owing to the many factors that can affect personality traits.

In this study, we set out to understand how personality traits change in early and middle adulthood by examining the Big Five personality trait dimensions (Goldberg, 1992; John & Srivastava, 1999; McCrae & Costa, 1999). We used a cross-sectional design to study how mean levels of personality traits differ by age and whether those age effects are moderated by gender.1 We were particularly interested in examining whether change on all of the Big Five dimensions stops or slows in middle adulthood, as predicted by the five-factor theory, or whether change is ongoing and differentiated, as predicted by contextualist theories.

1 “Change” is a broad concept that can be defined in a variety of other ways, such as rank-order change (whether people change in their ordering relative to age mates) and individual differences in change (whether different individuals change at different rates over time). These other ways of examining change address somewhat different substantive issues, and it is possible to obtain conceptually compatible but different results with the different approaches. (For a fuller discussion of different kinds of change, see Caspi & Roberts, 1999.)

Past Research on Mean-Level Change on the Big Five During Adulthood

A recent literature review summarized previous studies of mean-level change on the Big Five (Roberts, Robins, Caspi, & Trzesniewski, in press). In this review, Roberts et al. (in press) rationally categorized a wide variety of personality measures into the Big Five domains and summarized patterns of mean-level change that were consistent across studies. They concluded that, in general, Conscientiousness and Agreeableness tend to go up during adulthood, Neuroticism tends to go down, Openness shows mixed results across studies, and Extraversion shows no general pattern of change at the factor level. This basic pattern of findings
has been reported in specific studies by researchers who argue that personality traits are affected by context (e.g., Helson et al., 2002; Helson & Kwan, 2000) as well as those who favor a strictly biological interpretation of traits (e.g., McCrae et al., 1999, 2000).

Although Roberts et al.’s (in press) conclusion seems to represent some common ground among researchers, there is still considerable disagreement: The biological and contextual perspectives disagree sharply over the timing of changes within the life course and over whether there are any differences between men’s and women’s development.

Set Like Plaster: The Five-Factor Theory

According to the five-factor theory, personality traits are “insulated from the direct effects of the environment” (McCrae & Costa, 1999, p. 144) and are exclusively biological in origin. Change is addressed by Postulate 1c of the five-factor theory: “Traits develop through childhood and reach mature form in adulthood; thereafter they are stable in cognitively intact individuals” (McCrae & Costa, 1999, p. 145). More specifically, traits are said to reach maturity by age 30 (e.g., Costa & McCrae, 1994; McCrae & Costa, 1999; McCrae et al., 2000). The predicted stability is expected to last throughout middle age, though in old age personality could change again, being disrupted by cognitive decline. A commonly used metaphor for this pattern of change, based on a passage from William James (1890/1950), is that personality becomes “set like plaster” by age 30 (see Costa & McCrae, 1994); thus, we refer to Postulate 1c, in its general form, as the plaster hypothesis.

In its original formulation, the plaster hypothesis stated that changes in Big Five traits after age 30 were nonexistent or trivial (Costa & McCrae, 1994; McCrae & Costa, 1990, 1996). More recently, the authors of the five-factor theory have indicated that the plaster hypothesis is “ripe for minor revision” (McCrae & Costa, 1999, p. 145), as studies have shown changes in mean levels of personality traits after age 30 (e.g., McCrae et al., 1999, 2000; see also Roberts et al., in press). They interpret such changes as stemming from intrinsic biological maturation rather than social influences, and they still regard the plaster hypothesis as basically true: “From age 18 to age 30 there are declines in Neuroticism, Extraversion, and Openness to Experience, and increases in Agreeableness and Conscientiousness; after age 30 the same trends are found, although the rate of change seems to decrease” (McCrae et al., 2000, p. 183).

Despite this conclusion, no study that we are aware of has directly tested whether mean levels of the Big Five traits do in fact change less after age 30 than before. This may be in part because past research on adult development has compared discrete age groups, rather than treating age as a continuous variable. For example, McCrae et al.’s (1999, 2000) two recent cross-sectional studies reported means for groups of 22- to 29-year-olds and means for groups of 30- to 49-year-olds, but the studies do not report the amount of change within those critical age ranges.

We thus set out to test the plaster hypothesis by directly comparing rates of change during the relevant age periods. In translating the plaster hypothesis into formal predictions about rates of change, we specified two versions of it. We call the original formulation (as described in Costa & McCrae, 1994) the hard plaster hypothesis: Age effects after age 30 should not be reliably different from zero, and this should hold for each of the Big Five dimensions. We call the more recent “minor revision” (McCrae & Costa, 1999) the soft plaster hypothesis, because here personality is like plaster that has not fully hardened but is becoming more and more viscous: Personality traits change more slowly after age 30 than before age 30.

Contextual Perspectives on Personality and Change

In contrast to the plaster hypothesis, contextual theories predict that personality changes throughout adulthood (e.g., Haan et al., 1986; Helson, Mitchell, & Moane, 1984; Neugarten, 1972). By their very definition, contextual theories are necessarily more varied than the five-factor theory, but viewed together they predict different changes in personality during different life periods and, in some formulations, different changes for men and women (Helson, Pals, & Solomon, 1997; Wink & Helson, 1993).

Personality Changes as Person–Environment

Social roles, life events, and social environments change during the life course, and such factors have been suggested as important influences on basic personality traits (Haan et al., 1986; Hogan, 1996). A number of researchers have focused on the transactions between individuals’ personalities and experiences. In the transactional view, individuals are seen as active agents who play an important role in selecting and shaping their environments, and these environments in turn affect their personalities. Often these transactions serve to amplify or strengthen earlier dispositions (Caspi & Moffitt, 1993). For example, personality traits like Openness and ambition predicted women’s level of involvement in the women’s movement in the 1960s and 1970s; involvement in the women’s movement, in turn, led to subsequent increases in Openness and ambition (Agronick & Duncan, 1998).

Research on transactional person–environment processes generally addresses individual differences in change, but the transactional perspective can be applied to understanding mean-level change as well. Just as individual differences in personality lead individuals toward different experiences that subsequently affect their personalities, normative changes in personality help prepare people for normative adult roles, which in turn can support further personality changes. Thus, a transactional perspective on mean-level change in personality would focus on normative role transitions, that is, transitions experienced by large numbers of people.

Probably the three most important social role domains that undergo changes in early and middle adulthood are work, marriage or partnership, and parenting. These three role domains correspond to the major tasks of adulthood identified by Erikson’s (1950) theory of adult development: work is involved in the adult task of consolidating an identity; marriage/partnership in the task of intimacy; and parenting children in generativity. Although individuals differ in the exact timing of when they take on work responsibilities, form committed partnerships, and nurture children, there are normative age ranges for these roles, suggesting that they may be linked to typical mean-level personality changes.

Which personality factors are related to these role domains? Conscientiousness has been linked to working, work performance, and work commitments (Barrick & Mount, 1991; Barrick, Mount, & Judge, 2001; Roberts, 1997; Vandewater & Stewart, 1998), and
to commitment to a stable partner relationship (Neyer & Asendorpf, 2001). Agreeableness should be most closely linked to parenting and similar generativity-relevant tasks, as exemplified in nurturing and prosocial behaviors (Graziano & Eisenberg, 1997; John & Srivastava, 1999). Role transitions in work, partnership, and childrearing take place throughout early and middle adulthood: Normatively, most people enter new jobs in their early 20s and begin advancing in their careers thereafter (U.S. Census Bureau, 2000a), marry in their mid to late 20s (U.S. Census Bureau, 2000b), and raise children in their 30s (U.S. Census Bureau, 2000c). If the timing of personality changes is linked to the timing of role transitions, there should be important changes in Conscientiousness and Agreeableness, and these changes should be apparent well into the 30s.

Aside from these normative social role changes, other theories suggest possible changes in personality traits after age 30. People get better at emotion regulation as they grow older and thus tend to have fewer negative emotional experiences (Gross et al., 1997); this could translate into persistently declining levels of Neuroticism with age. Socioemotional selectivity theory (Carstensen, Isaacowitz, & Charles, 1999) predicts that as adults progress into middle and later adulthood, they are less and less interested in gathering new information and in meeting new people, implying declining Openness and Extraversion, and more interested in relationships with close others, implying increasing Agreeableness.

Gender Roles and Personality Change

Do men and women differ in their development on any of the Big Five dimensions? Women and men may develop differently because of gender-based social experiences (Helson et al., 1997; Stewart & Ostrove, 1998). In particular, there may be developmental differences on Neuroticism. Adolescent girls show higher levels of Neuroticism than boys (del Barrio, Moreno-Rosset, Lopez-Martinez, & Olmedo, 1997; Gullone & Moore, 2000; Margalit & Eysenck, 1990). Yet, studies of subsequent development during middle adulthood indicate that women’s self-confidence and coping skills improve with age (Helson & Moane, 1987; Helson et al., 1997), suggesting decreasing levels of Neuroticism primarily in women.

Few studies of both men and women have directly compared changes in adult men’s and women’s Neuroticism.2 However, a large study of Finnish twins aged 18–59 followed members of multiple cohorts longitudinally and found that in both longitudinal and cross-sectional analyses, women decreased in Neuroticism with age whereas men did not change (Viken, Rose, Kaprio, & Koskenvuo, 1994). Similarly, a longitudinal study by Wink and Helson (1993) found that women became less emotionally dependent and more competent with age; in contrast, men started adulthood less dependent and more competent than women but then remained relatively stable on these traits. Thus, we expected that the gender difference in Neuroticism found in late adolescence and college-age samples would narrow with age: women should decrease in Neuroticism during adulthood, whereas men should not change much.

2 In a meta-analysis, Feingold (1994) compared the size of the gender difference between studies that used high school, college, and adult samples. Because all adult samples were grouped into a single category, however, this analysis was not sensitive to any changes in the gender gap after the age of 21.

In summary, contextual perspectives diverge from the five-factor theory’s assertion that all of the Big Five follow just one principle (no change) starting at age 30. Rather, a variety of developmental processes may affect each Big Five dimension differently during particular life periods, possibly in different ways for men and women. Contextual perspectives, viewed together, offer a metatheoretical counterpoint to the five-factor theory: Change on the Big Five is complex and multiply determined, and remains a fact of life well beyond early adulthood. The nature of change may be different during different periods of adulthood, resulting in curvilinear age effects (Helson et al., 2002), and men and women may change in different ways, resulting in Age × Gender interactions. Thus, we decided to examine the Big Five with regression models that would test for such differences.

Design of the Present Study

Our interest in testing hypotheses about different age effects during different developmental periods, and about different age effects for men and women, raised the issue of statistical power. Testing the hard and soft versions of the plaster hypothesis requires obtaining slope estimates for limited age ranges, and these estimates would be unreliable in small samples. Furthermore, tests of interactions and curvilinear effects have considerably less power than tests of main effects and linear trends (Chaplin, 1997; McClelland & Judd, 1993). In short, we needed a large sample to test our hypotheses. This concern led us to a medium for data collection that has been available for only a few years, but offers access to large numbers of willing participants: the Internet. The Web revolution of the mid-1990s resulted in the massive interconnection of American society to the Internet, making it possible to reach large numbers of participants. Research on Internet users indicates that, although they are not perfectly representative of the general population, they are quite diverse (Lebo, 2000; Lenhart, 2000), probably at least as much as more traditional samples of undergraduate psychology students or research volunteers recruited through newspaper advertisements or word of mouth.

Thus, we used an Internet sample to examine two issues central to adult personality development. First, in a sample sufficiently large to use new analyses well-suited to this question, we directly tested the hard and soft versions of the plaster hypothesis to see whether personality does indeed become “set like plaster” after age 30. Second, to map out the patterning of change in more detail, we used regression models with curvilinear and interactive effects to test for different changes during different life periods and for gender-specific development.

Participants were part of the Gosling–Potter Internet Personality Project, a personality study of volunteers recruited and assessed over the World Wide Web. Personality and demographic data were available for 132,515 participants (54% female) between the ages of 21 and 60; the mean age of our participants was 31 years (SD = 9 years). All selected participants lived in the United States or Canada (the latter represented 9.2% of the

Respondents reported their ethnicity as one of six categories: 5,710 (4.5%) respondents were Asian, 3,893 (3%) were Black, 2,414 (2%) were Latino, 2,094 (2%) were Middle Eastern, 110,004 (86%) were White, and 3,569 (3%) indicated Other; 4% of respondents declined to report their ethnicity. We re-ran all of the regression analyses using dummy coded variables for ethnicity; these controls had very little impact on the findings, and we report the analyses without these control variables.

We added a question about social class during the survey period, so this information was available for a subset of the sample. Of these 42,578 … class,” 23,024 (54%) “middle class,” 10,718 (25%) “upper-middle class,” and 817 (2%) “upper class.” This social-class distribution (19% below middle class, 27% above) suggests that we were including participants from a broad range of backgrounds. As with ethnicity, controlling for social class did not substantially change any of the effects reported in this

The data presented here come from a noncommercial, advertisement-free Web site ( that contains personality measures as well as several games, quizzes, and questionnaires for entertainment purposes, and was publicized in a number of ways. Potential respondents could find out about the site through several channels: it could be found with major search engines under key words like personality tests; it was listed on portal sites, such as Yahoo!, under their directories of personality tests, and at one point was selected as a Yahoo! “Pick of the Week”; and individuals who had previously visited the Web site and signed up for its mailing list received notification when the Big Five survey was added. As is common on the Internet, news of the site apparently also spread quite widely through informal channels, such as e-mails or unsolicited links on other Web sites.

Computerized administration means that data entry and scoring are automated; thus, it is possible to recruit participants by appealing to their motivation to receive individualized personality feedback, for the purpose of self-insight or entertainment. To attract as broad and diverse a sample as possible, and to examine the effects of different recruiting approaches, we used two distinct Web pages. One was entitled “All About You—A Guide to Your Personality” and was said to measure “what many personality psychologists consider to be the fundamental dimensions of personality.” The second Web page was entitled “Find Your Star Wars Twin,” with feedback provided about “the characters from Star Wars with whom you are most similar (based on the Big Five personality test).” Instructions for both versions were the same and reminded participants to answer honestly to get accurate feedback. We consider later in the Results section how these two Web pages differed, both in who they attracted and in the substantive

The Big Five Inventory (BFI). To measure the Big Five personality dimensions, we used the BFI (John, Donahue, & Kentle, 1991). The 44 BFI items consist of short and easy-to-understand phrases to assess the prototypical traits defining each of the Big Five dimensions, making it ideal for a large survey where we could expect respondents to devote a limited amount of time. The BFI scales have shown substantial internal consistency, retest reliability, and clear factor structure, as well as considerable convergent and discriminant validity with longer Big Five measures (Benet-Martinez & John, 1998; John & Srivastava, 1999). The scales have also shown substantial agreement between self- and peer-reports (John & Paulhus, 2003; Rammstedt & John, 2003). BFI items are rated on a 5-point scale ranging from 1 = disagree strongly to 5 = agree strongly (the full BFI is reprinted in John & Srivastava, 1999).

For this report, we scored the BFI in an intuitive metric known as percentage of maximum possible (POMP) scores (P. Cohen, Cohen, Aiken, & West, 1999). A POMP score is a linear transformation of any raw metric into a 0 to 100 scale, where 0 represents the minimum possible score and 100 represents the maximum possible score; P. Cohen et al. (1999) recommended POMP scores as a universal metric that is more intuitive than scale scores with idiosyncratic ranges. In the case of the BFI, we transformed the 1-to-5 BFI metric into POMP scores by subtracting 1 and multiplying by 25. Sample means and standard deviations, in POMP units, were as follows: Conscientiousness M = 63.8, SD = 18.3; Agreeableness M = 66.4, SD = 18.0; Neuroticism M = 51.0, SD = 21.9; Openness M = 74.5, SD = 16.4; Extraversion M = 54.6, SD = …

Reliabilities, scale intercorrelations, and structural invariance across age groups. Reliabilities and scale intercorrelations were of special interest in this Internet sample. If there were problems with administering the BFI on the Internet, such as many random or otherwise unreliable responses, the coefficient alpha reliabilities of the five scales should be considerably lower. In contrast, attempts to self-enhance for the sake of receiving positive feedback should result in higher intercorrelations among the scales. Our results showed that neither was the case. First, the alpha reliabilities were very similar to earlier data (see John & Srivastava, 1999): .82 in the Internet sample (compared with .82) for Conscientiousness; .79 (.79) for Agreeableness; .84 (.84) for Neuroticism; .80 (.81) for Openness; and .86 (.88) for Extraversion. The scale intercorrelations were also similar to previous research; the mean of the absolute discriminant correlations among the BFI scales was .16 as compared with .20 reported by John and Srivastava (1999), and the highest correlation between any two scales was only .29, as compared with .33.

Another concern was whether the BFI structure was invariant across ages; if the pattern of factor loadings was different at different ages, that would complicate the task of comparing scale scores. To test for this, we split the sample into four age groups spanning a range of 10 years each, then conducted factor analyses within each age group, extracting five factors in each analysis. The Big Five factors clearly replicated within each age group. We then computed factor congruence coefficients between age groups; the average congruence coefficient for the same conceptual factor across age groups was .99, reflecting both a high degree of structural invariance and the unusually low sampling error in such a large sample. Furthermore, we also computed the scale reliabilities separately by age and found that they did not vary with age.

How did scores on the Big Five personality dimensions change with age? We report our analyses in two sections. First, we examined the two versions of the plaster hypothesis by (a) testing whether age slopes after age 30 are different from zero (hard plaster) and (b) comparing age slopes after age 30 to those before age 30 (soft plaster). Second, we tested models of the data that allow curvilinear age effects and gender differences in the magnitude of age effects, using regressions with polynomial age effects and gender interaction terms. We also estimated these models separately for the two Web page formats, to see whether effects generalized across recruitment strategies.

Is Personality Fixed After Age 30? Testing the Plaster

The hard plaster hypothesis asserts that there should be no age effects on any Big Five dimension after age 30; the soft plaster hypothesis asserts that age effects after age 30 should be weaker

3 Means and standard deviations broken down by age, gender, and Web page are available by request from the authors.
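
The POMP rescaling described in the measures text (a linear map of any raw metric onto a 0 to 100 scale, which for the 1-to-5 BFI metric amounts to subtracting 1 and multiplying by 25) can be sketched in a few lines of Python. This is only an illustration: the function names `pomp` and `bfi_to_pomp` are hypothetical and do not come from the article, and the example scale score is made up.

```python
def pomp(raw, minimum, maximum):
    """Rescale a raw score to a 0-100 percentage-of-maximum-possible metric."""
    if maximum <= minimum:
        raise ValueError("maximum must be greater than minimum")
    return 100.0 * (raw - minimum) / (maximum - minimum)


def bfi_to_pomp(scale_score):
    """POMP score for a BFI scale score on the 1-to-5 response metric.

    For this range the general formula reduces to (score - 1) * 25.
    """
    return pomp(scale_score, minimum=1.0, maximum=5.0)


# Hypothetical mean item rating, used for illustration only.
example_score = 3.55
print(bfi_to_pomp(example_score))   # general formula
print((example_score - 1) * 25)     # shortcut stated in the text
```

Writing the general `pomp` function first and specializing it keeps the 0 and 100 anchors explicit, so the same helper would apply to a scale with a different response range.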

Table 1
Linear Slopes (and 95% Confidence Intervals) for the Age Effect on Each of the Big Five Factors, Computed Separately by Gender and Age Period

Big Five factor  Age 21–30  Age 31–60  t test to reject hard plaster  z test to confirm soft plaster  Implication of tests
                 .48 (.06)  .26 (.03)                                                                 Change slows but does not stop after age 30
                 .46 (.06)  .31 (.04)                                                                 Change slows but does not stop after age 30
                 .10 (.06)  .28 (.03)                                                                 Change increases after age 30
                 .01 (.07)  .20 (.04)                                                                 Change increases after age 30
                 .25 (.07)  .25 (.03)                                                                 Change is similar before and after age 30
                 .06 (.08)  .03 (.04)                                                                 Little change before or after age 30
                 .04 (.06)  .04 (.03)                                                                 Change is of similar strength but in opposite directions before and after age 30
                 .04 (.05)  .15 (.03)                                                                 Change increases after age 30
                 .09 (.08)  .07 (.04)                                                                 Change is of similar strength but in opposite directions before and after age 30
                 .14 (.08)  .05 (.04)                                                                 Change slows but does not stop after age 30

Sample size for age 21–30: women, n = 41,840; men, n = 40,831. Sample size for age 31–60: women, n = 30,027; men, n = 19,817. Numbers in parentheses are 95% confidence intervals. The metric for the slopes is the percentage of maximum possible units per year. Negative values of the z test indicate that the direction of the effect was contrary to that predicted by the soft version of the plaster hypothesis.
* p  ** p  *** p
than age effects before age 30. To test these hypotheses, we computed age slopes (i.e., regression coefficients from the Big Five dimensions regressed on age) within the two theoretically important age ranges, 21–30 and 31–60; such slopes indicate how much the predicted Big Five score increased or decreased per year. The slopes and their 95% confidence intervals are presented in Table 1.

All slopes were computed separately for men and for women. Because the plaster hypothesis makes the same predictions for men and for women, results ought to replicate across gender. Statistical tests of both versions of the plaster hypothesis are reported in Table 1. The test for the hard plaster hypothesis is the standard t test of whether the slope for ages 31–60 was different from zero. To test the soft plaster hypothesis, we used a z test (Equation 3.6.11 in J. Cohen & Cohen, 1983) to compare whether the slope for ages 21–30 was stronger than the slope for ages 31–60.4

4 The hard plaster hypothesis predicts a zero effect, that is, a null hypothesis; thus, a significant result of this test would reject the hard plaster hypothesis. In contrast, the soft plaster hypothesis predicts a directional effect, that is, an alternative to a null hypothesis; thus, a significant result of this test would support the soft plaster hypothesis. The different logic of these two tests means that statistical power makes them sensitive in opposite ways. Because of the large sample size, even a very weak change after age 30 would lead to a rejection of the hard plaster hypothesis; but the large sample size also means that even a very weak soft-plaster effect would lead to a confirmation of the soft plaster hypothesis.

As can be seen in Table 1, the raw Conscientiousness slope for men during ages 31–60 was B = .31, and the slope for women was B = .26. Both of these slopes were significantly and substantially different from zero, a clear rejection of the hard plaster hypothesis. The soft plaster hypothesis predicts that the slopes from age 31 to 60 should not be as strong as the slopes from age 21 to 30. For both men and women, the earlier slope (B = .46 for women and B = .48 for men) was stronger. The z tests indicated that the differences between the slopes were significant, supporting the soft plaster hypothesis for Conscientiousness. In short, this pattern indicates that people changed less in Conscientiousness after age 30 than before age 30, but they clearly did not stop changing.

The results for Agreeableness disagreed sharply with both versions of the plaster hypothesis, as is evident in Table 1. Agreeableness increased significantly from age 31 to 60 for both men and women, contradicting the hard plaster hypothesis. Moreover, the age slopes for ages 31–60 were substantially greater than the age slopes for ages 21–30; that is, the data not only failed to support the soft plaster hypothesis, but they went significantly in the opposite direction.

Neuroticism yielded different results for men and for women. For men, the slope for ages 31–60 was not significantly different from zero, consistent with the hard plaster hypothesis; in fact, men did not show a significant age effect in either age period. Women, however, declined consistently in Neuroticism. The slope for ages 31–60 was significantly different from zero, contradicting the hard plaster hypothesis. The age 21–30 slope for women was not significantly weaker than the age 31–60 slope, a failure to support the soft plaster hypothesis.
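
As a rough illustration of the logic of the two tests, the sketch below recomputes them from one row of Table 1 (slopes of .48 and .26, with parenthesized values of .06 and .03). It rests on assumptions that the text does not state: that the parenthesized numbers are half-widths of the 95% confidence intervals, that the two age-range slopes come from independent samples, and that the z test takes the common large-sample form (difference between slopes divided by the square root of the summed squared standard errors), which may differ in detail from Equation 3.6.11 in J. Cohen and Cohen (1983). All function names are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% critical value of the standard normal


def se_from_ci_halfwidth(halfwidth, z_crit=Z_95):
    """Approximate a standard error from a 95% CI half-width (assumed)."""
    return halfwidth / z_crit


def slope_difference_z(b_early, se_early, b_late, se_late):
    """Large-sample z for the difference between two independent slopes."""
    return (b_early - b_late) / math.sqrt(se_early ** 2 + se_late ** 2)


def upper_tail_p(z):
    """One-sided (upper-tail) p value under the standard normal."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Values taken from one row of Table 1 (ages 21-30 and 31-60).
b_early, hw_early = 0.48, 0.06
b_late, hw_late = 0.26, 0.03

# Hard plaster: the later slope should be zero, so the hypothesis is
# rejected when the 95% interval around the later slope excludes zero.
hard_plaster_rejected = abs(b_late) > hw_late

# Soft plaster: the earlier slope should be stronger than the later one,
# so a large positive z supports the hypothesis.
z = slope_difference_z(
    b_early, se_from_ci_halfwidth(hw_early),
    b_late, se_from_ci_halfwidth(hw_late),
)
p = upper_tail_p(z)

print(f"hard plaster rejected: {hard_plaster_rejected}")
print(f"soft plaster z = {z:.2f}, one-sided p = {p:.2g}")
```

The asymmetry described in the footnote shows up directly here: with standard errors this small, a modest nonzero later slope is enough to reject the hard version, and a modest difference between slopes is enough to support the soft version.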

Openness slopes for both men and women were
three models, we set the criterion that a more complex model
significantly negative after age 30, contradicting the hard plaster
would be retained only if it improved fit at F > 25 (p < 10^-5) over
hypothesis. However, age effects for Openness did not differ
a simpler model. We used this stringent cutoff to select models that
significantly from zero for the age 21–30 decade. Because both
could realistically be replicated and examined further in future
men and women marginally increased in Openness up to age 30
studies with smaller samples.
and then decreased in Openness after age 30, we tested the differ-
Patterns of change in work and partnership
ence of the absolute slopes to see if the magnitude was weaker
suggested that the most pronounced increases in Conscientious-
after age 30. For men, the decline after age 30 was stronger than
ness might occur during the 20s, followed by continuing growth at
the increase up to age 30, a significant rejection of the soft plaster
a slower rate. In statistical terms, this implied that the age effect on
hypothesis. For women, the magnitude of the decline after age 30
Conscientiousness would follow a decelerating function, with a
was not significantly different from the magnitude of the increase
steeply increasing slope in early adulthood that becomes flatter at
up to age 30, a failure to support the soft plaster hypothesis.
later ages. Our decision criteria supported a quadratic model for
Extraversion decreased significantly from age 31
Conscientiousness (see Table 2). A positive linear age term indi-
to 60 for women; the increase for men from age 31 to 60 was weak
cated that people were increasing in Conscientiousness at all ages,
and barely significant ( p
.04) by conventional criteria. These
but a negative quadratic age term indicated that the rate of increase
findings contradicted the hard plaster hypothesis, though given the
was greater at younger ages than at older ages. None of the age
statistical power of the analysis, the weak result for men was not
terms interacted with gender, indicating that men and women did
a resounding rejection. For men, the soft plaster hypothesis was
not change in Conscientiousness at different rates. In Figure 1A,
confirmed for Extraversion. For women, though, the absolute
the quadratic functions for men and women are plotted along with
strength of the increase from age 21 to 30 was not different from
the observed Conscientiousness means at each age. The close
the absolute strength of the decrease from age 31 to 60, failing to
correspondence between the function and the means from the raw
support the soft plaster hypothesis.
data suggests that the quadratic function captures the normative
Overall, then, we did not find widespread support for either
trend quite well.
version of the plaster hypothesis. Out of 10 tests of each hypoth-
If Agreeableness is linked to nurturing and
esis—five dimensions tested separately for men and women—
raising children, we would expect that it would increase the most
only one fit the hard version (two if the result for men’s Extra-
during the late 20s and 30s. Because such an increase would occur
version is counted) and only four fit the soft version.
in the middle of the age range being analyzed, a cubic model
would be necessary to fit such an effect. The data provided clear
The Pattern of Big Five Development: Modeling Age and
support for a cubic model (see Table 2) under the decision criteria;
Gender Effects
the sample means and the cubic fit line are plotted in Figure 1B.
The increase in Agreeableness accelerated in the late 20s, and
Figure 1 shows the mean scores on each of the Big Five
Agreeableness continued to increase rapidly through the 30s be-
dimensions for each age, separately for men and women. We used
fore slowing down (but continuing to increase) in the 40s; thus, the
regression models to map out the relations of age and gender to
period of most rapid increase coincided with the ages at which
personality. In fitting these regression models, the large sample
people are typically giving birth and nurturing their dependent
allowed for very sensitive tests of polynomial (curvilinear) and
children. There was a significant interaction between gender and
interactive effects. However, a model selection process in which
the first-order age term, but no gender interactions with the qua-
one continues to add polynomial terms as long as they meet
dratic or cubic term (see Table 2); in other words, women in-
conventional significance criteria was likely to produce unrepli-
creased overall in Agreeableness more than men did, but the
cable, uninterpretable, and unnecessarily complex models in such
degree of curvature in the functions did not differ substantially
a large sample. Examining previous studies of curvilinear age
between men and women. Though we did not predict it, this would
effects on personality (e.g., Haan et al., 1986; Helson et al., 2002),
be consistent with women, on average, being involved in nurturing
we concluded that cubic (i.e., third-order) models with gender
roles to a greater extent than men.
interactions seemed like the most complex models that could make
Neuroticism was the strongest candidate for a
a substantive contribution, so we set this as a practical limit on
Age interaction. The decision criteria indicated that
model complexity. Thus, for each Big Five factor, we considered
Neuroticism was well described by the linear model. This model,
three possible models:
along with the observed Neuroticism means, is graphed in Figure
1C. The significant Age
Gender interaction (see Table 2)
Linear: B5
b1 AGE
b2 GEN
b3 AGE*GEN .
revealed that women declined substantially in Neuroticism
Quadratic: B5
throughout adulthood; men declined quite modestly. By late adult-
b1 AGE
b2 GEN
hood, the earlier gender difference had nearly disappeared.
b4 AGE 2
b5 AGE2*GEN .
For Openness, our model-selection criteria resulted
in a linear model (see Table 2; a quadratic model would have
Cubic: B5
b1 AGE
b2 GEN
improved the fit but only at F[2, 132509]
10.7). The linear
model was consistent with previous developmental findings— both
b4 AGE 2
b6 AGE 3
b7 AGE3*GEN .
men and women declined in Openness with age, but only slightly.
(In the above equations, B5 stands for the Big Five dimension
A small, unpredicted interaction suggested that men began adult-
being modeled, AGE represents age centered around its mean, and
hood slightly higher in Openness but then declined at a faster rate
GEN is a contrast code for gender.) To select from among these
(see Figure 1D).
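The model-selection rule described above (linear, quadratic, and cubic models with gender interactions, keeping a more complex model only when the F test for the added terms exceeds 25) can be sketched as follows. This is a minimal illustration on simulated data, not the authors' SPSS analysis: the data-generating values, the function names, and the choice to compare each model with the one a single order below it and to stop at the first failure are all assumptions.

```python
import numpy as np

def design(age, gen, order):
    """Design matrix: intercept, AGE, GEN, AGE*GEN, then AGE^k and AGE^k*GEN."""
    cols = [np.ones_like(age), age, gen, age * gen]
    for k in range(2, order + 1):
        cols += [age ** k, age ** k * gen]
    return np.column_stack(cols)

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def f_change(age, gen, y, order):
    """F statistic (with its dfs) for adding the order-k terms to the order-(k-1) model."""
    X_small, X_big = design(age, gen, order - 1), design(age, gen, order)
    rss_small, rss_big = rss(X_small, y), rss(X_big, y)
    df_num = X_big.shape[1] - X_small.shape[1]
    df_den = len(y) - X_big.shape[1]
    return ((rss_small - rss_big) / df_num) / (rss_big / df_den), df_num, df_den

def select_order(age, gen, y, cutoff=25.0):
    """Assumed stepwise rule: move up one order at a time, stop at the first failure."""
    order = 1
    for k in (2, 3):
        f_stat, _, _ = f_change(age, gen, y, k)
        if f_stat <= cutoff:
            break
        order = k
    return order

# Simulated (hypothetical) data with a decelerating age trend.
rng = np.random.default_rng(0)
n = 20000
raw_age = rng.uniform(21, 60, n)
age = raw_age - raw_age.mean()          # age centered around its mean
gen = rng.choice([-1.0, 1.0], size=n)   # contrast code for gender
y = 50 + 0.3 * age - 0.01 * age ** 2 + rng.normal(0, 10, n)

chosen = select_order(age, gen, y)
print("selected polynomial order:", chosen)
```

In this sketch the quadratic step adds two terms to a six-parameter model, so with N = 132,515 the test would have 2 and 132,509 degrees of freedom, which matches the F[2, 132509] reported for Openness.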

Figure 1. Mean Big Five scores broken down by age and gender, with fit curves from the regression models (see Table 2). Vertical lines at age 30 indicate when the plaster hypothesis predicts that personality stops changing. POMP = percentage of maximum possible.
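For readers unfamiliar with the POMP metric named in the caption, the conversion can be sketched as below. The 1-to-5 response scale used in the example is an assumption made for illustration; the function name is hypothetical.

```python
def pomp(score, scale_min, scale_max):
    """Percentage of maximum possible: rescales a score to the 0-100 range."""
    return 100.0 * (score - scale_min) / (scale_max - scale_min)

# Assuming a 1-to-5 response scale, a mean item score of 4 maps to 75.
print(pomp(4.0, 1, 5))
```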

Table 2
Regression Models of the Relations of Age and Gender to the Big Five
Regression term
Conscientiousness (R
10 22
10 22
.09/ .12
10 10
.03/ .02
Agreeableness (R
10 22
10 22
.10/ .10
10 7
.03/ .02
10 19
10 17
.11/ .05
.01/ .01
Neuroticism (R
10 22
.06/ .05
10 22
.19/ .22
10 22
Openness (R
10 22
.05/ .02
10 22
10 4
.02/ .00
Extraversion (R
.00/ .01
10 22
.10/ .11
10 6
Note. N = 132,515. Replication βs are from the All About You and Star Wars versions of the questionnaire,
respectively. Age is mean-centered at 30.6. Gender is contrast-coded: female
1, male
1. All terms are
from final models.
Extraversion was best fit with a linear model (see Table 2). An unpredicted Age × Gender interaction indicated that men increased slightly in Extraversion with age whereas women decreased slightly, resulting in a diminishment of gender differences with age (see Figure 1E).

Replication of the models across Web site formats. The p-values associated with the tests of the polynomial models are generally quite small, some so small as to exceed the computational limits of our data analysis software (SPSS 10.0.7, which ran out of decimal places at p < 10^-22). Small p-values are one indicator of the reliability of an effect; cross-validation is another. The study design allowed for an internal replication analysis between the two versions of the questionnaire.

Were the respondents to the two Web pages different enough to constitute replication samples? Although we would not want to overstate the differences, the Web pages arguably appealed to somewhat different motives of potential participants. As the name implies, All About You was described as an opportunity to learn about oneself, whereas Star Wars was designed to appeal primarily as an intriguing and fun experience. The two sites did, in fact, draw somewhat different profiles of participants. All About You drew 66% women and 34% men, whereas Star Wars drew 39% women and 61% men. Respondents to All About You were on average about 2 years older than respondents to Star Wars (equivalent to an effect size of r = .11). Controlling for gender, the greatest between-sites difference on the Big Five was for Openness; Star Wars respondents tended to be slightly more open to experience (partial r).

In spite of these differences, though, the regression models clearly replicated across the two Web formats; Table 2 presents the standardized betas from these replication analyses for comparison purposes. When we analyzed the respondents for the two Web formats separately, we found that for Conscientiousness, Agreeableness, and Neuroticism, all the same terms were significant in the same direction and general magnitude, and the shape of the plotted curves was remarkably similar. For Openness, the linear term representing the predicted decline replicated across both Web formats; the unpredicted Age × Gender term found in the full sample was not significant in the Star Wars data. For Extraversion, the main effect of gender and the Age × Gender interaction replicated across both Web formats. These replications further underscored the reliability of the effects.

In this article, we examined the relation between age, gender, and personality traits in adulthood. We took advantage of a large sample size and continuous age distribution to test hard and soft

versions of the plaster hypothesis, which assert that personality change stops (hard plaster) or slows (soft plaster) after age 30. On no Big Five dimension did we find support for the hard plaster hypothesis among both men and women, and only Conscientiousness demonstrated the soft plaster effect for both genders. To test for more complex change patterns, we then fitted curvilinear and interactive models to the data that yielded clear and robust effects. Conscientiousness increased throughout the age range studied, most strongly during the 20s; Agreeableness increased the most during the 30s; and Neuroticism declined with age for women but not much for men. Openness showed small declines with age, and Extraversion declined for women but not men.

Addressing Sampling and Cohort Concerns

In a study that uses a cross-sectional design to make inferences about developmental effects, differential sampling by age and cohort differences are both potential sources of confounds. Although no cross-sectional design can completely rule out such factors, they need not be fatal flaws if results are interpreted in relation to existing theory and research. In this section, we consider several factors that might address concerns about sampling and cohort confounds. We then compare age trends in the Internet sample with those found in recently published data to see whether the Internet results are comparable with those found with more traditional sampling methods and in other countries.

Sampling considerations. One concern was that Internet recruitment might produce age-confounded sampling biases.5 Younger Internet users might be a fairly broad group, whereas older Internet users might be a more select subset of older people in general; if so, age effects might be artifacts of who was likely to end up in our study rather than true age effects. In part, our focus on early and middle adulthood provided some assurance against such a possibility: Survey research on Internet usage and age indicates that although there is a sharp change in usage patterns around retirement age, people of preretirement age (i.e., ages 50–64) have similar access rates to younger people, and preretirement Internet users are representative of the overall Internet population in how they spend their time online (Fox et al., 2001). We were also reassured by our comparisons of the two Web page recruiting strategies. Age effects replicated across the self-learning-oriented “All About You” page and the entertainment-oriented “Find Your Star Wars Twin” page, suggesting that if there was a selection bias in our sampling, it would have to have been identical across recruiting strategies.

We also considered what age-confounded sampling bias would look like if it had happened. A possible scenario was that the Internet may be less familiar to older people, and thus older people may have to be higher in Openness in order to seek out and participate in an online study. In fact, we found a small drop in Openness with age, not the increase one would predict if older adults needed to be especially open in order to participate. If there was any Openness-related sampling bias, it was sufficiently weak that it did not overwhelm the age effect.6

Untangling developmental effects from secular trends and cohort effects. Schaie (1977) discussed three kinds of effects that developmental researchers need to consider in designing studies: (a) effects of the individual’s development, which are frequently of primary interest to developmental researchers; (b) secular trends (i.e., changes in current social climate), which can cause variation between measurements taken at different historical times; and (c) cohort effects (i.e., generational differences), which can cause variation between people born and raised in different historical periods. Most common designs cannot fully disentangle these effects: Cross-sectional designs cannot intrinsically differentiate between developmental and cohort effects; longitudinal designs cannot separate developmental and historical effects; and time-lag designs (which compare different samples measured in different years) cannot separate historical and cohort effects. However, jointly considering different designs can provide insights. If the results of a cross-sectional study agree with results from longitudinal studies, they can be interpreted as arising from development, which is the only common effect between the two designs; and if cross-sectional results agree with results from time-lag studies, they can similarly be interpreted as arising from cohort differences. (The two interpretations are not mutually exclusive.)

In this case, the Internet findings agree with the broad trends among longitudinal studies reviewed by Roberts et al. (in press): Conscientiousness and Agreeableness went up, Neuroticism went down (their review did not differentiate men and women), and Openness and Extraversion changed little. Time-lag studies of the Big Five are far rarer than cross-sectional and longitudinal studies; two important recent meta-analyses offer a comparison source for Neuroticism (Twenge, 2000) and Extraversion (Twenge, 2001). These studies examined how, across studies, publication year correlated with sample means for anxiety and Extraversion. Both traits increased over time from the middle to late 20th century. If these results were due to cohort effects (rather than secular trends), that would show up in the cross-sectional Internet data as older people being both lower in Neuroticism and lower in Extraversion than younger people. The former was true for women but not for men, and the latter did not appear to be the case. Thus, comparisons suggest that at the level of general trends, it seems most appropriate to attribute the Internet results to developmental effects, though we cannot rule out the possibility that cohort differences may contribute additionally to explaining the age effects. It is important to note that both developmental effects and cohort effects are inconsistent with the five-factor theory; cohort effects represent an environmental influence that traits are supposed to be immune to. Thus, the age effects obtained here are inconsistent

5 Another potential sampling confound, not specific to an Internet design, could be selective mortality. Individuals low in Conscientiousness tend to die earlier (Friedman et al., 1993); thus, cross-sectional increases in Conscientiousness could occur because at older ages, fewer low-Conscientiousness individuals are available as participants. To test for such effects, we examined the standard deviations computed separately at each age. If age effects were due to selective mortality, the standard deviations should get smaller with increasing age, as the less conscientious participants were selected out. In fact, age was not related to the standard deviation, suggesting that the Conscientiousness effect was not driven by selective mortality.

6 Again, the standard deviation of Openness did not decrease with age, as would be predicted by the hypothesis that the sampling procedure favored participants higher in Openness at older ages.
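The standard-deviation check described in Footnote 5 could be sketched as follows. The toy scores are hypothetical and deliberately constructed to show what the selection artifact would look like (spread shrinking with age); the study itself reports no such relation. Function names are illustrative, and the use of a Pearson correlation between age and the per-age standard deviation is an assumption about how the check might be carried out.

```python
import math
import statistics
from collections import defaultdict

def sd_by_age(ages, scores):
    """Sample standard deviation of scores, computed separately at each age."""
    groups = defaultdict(list)
    for a, s in zip(ages, scores):
        groups[a].append(s)
    return {a: statistics.stdev(v) for a, v in sorted(groups.items()) if len(v) > 1}

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy data: low scorers drop out at older ages, so the mean
# rises while the spread shrinks (the pattern selective attrition predicts).
ages = [25] * 4 + [40] * 4 + [55] * 4
scores = [40, 50, 60, 70, 50, 56, 64, 70, 60, 63, 67, 70]

sds = sd_by_age(ages, scores)
r_age_sd = pearson(list(sds.keys()), list(sds.values()))
print(sds, r_age_sd)
```

A correlation near zero between age and the per-age standard deviation, as the footnote reports for the real data, would argue against this kind of selection.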

with the five-factor theory regardless of how such effects are substantively interpreted.

Addressing sampling and cohort effects through empirical comparisons. Another way to address concerns about this new method is through comparisons with data from studies that use more traditional approaches. If age patterns in the Internet data matched those found in cross-sectional data collected by other sampling methods, such cross-method agreement would suggest that the Internet results were not a product of sampling confounds. Furthermore, some researchers have argued that cohort effects, if they exist, should be culture-specific (Yang, McCrae, & Costa, 1998); if so, then similarities to cross-cultural findings should diminish concerns about cohort confounds.

Data in two recent articles by McCrae et al. (1999, 2000) stood out as a particularly appropriate source for comparing main effects of age. These articles, which used the Revised NEO Personality Inventory (NEO PI-R) and the NEO Five Factor Inventory (NEO-FFI) measures (Costa & McCrae, 1992), reported Big Five means for several adult age groups, allowing us to make comparisons of the direction of change and, to some extent, the approximate shape of change. Big Five self-report data were reported separately for nine different countries (i.e., Britain, Croatia, Czechoslovakia, Germany, Italy, Portugal, South Korea, Spain, and Turkey). These data had been collected under a wide variety of recruitment and sampling procedures, including a twin study, a study of parents and their teenage children, and several other convenience samples. We performed secondary analyses of the results reported from these nine samples,7 aggregating the T scores across the nine countries into averages (weighted by sample size) for the three age groups used by the original authors: 22–29, 30–49, and 50+. We were thus able to derive rough age trends (based on three data points) across the nine samples; we refer to these aggregated data as the NEO-International data. Results from data aggregated across these various cultures and recruiting procedures probably reflect a conservative number of reasonably robust effects, providing a stable baseline for making comparisons.

Although the BFI and NEO questionnaires are mostly similar, the instruments do define two of the five factors, Openness and Extraversion, somewhat differently (John & Srivastava, 1999). Thus, we also wanted to compare results with a sample that used the BFI. To do this, we derived age trends from published Big Five data collected in Germany (Lang, Luedtke, & Asendorpf, 2001). Participants from the Berlin, Germany, metropolitan area were recruited by a gold-standard procedure, stratified random sampling (stratification was on age and gender). They completed a German translation of the BFI. Only two age groups (20–40 and 45–65) in the BFI-German data overlapped with the Internet sample, so we could examine only one age interval.

Results of our secondary analyses of the NEO-International data and the BFI-German data are presented in Table 3. On Conscientiousness, both comparison samples showed increases throughout adulthood that were greater in earlier than in later adulthood, consistent with the Internet results presented in this report. Agreeableness also went up in both comparison samples; the NEO-International data showed similar-sized increases during both intervals, which is consistent with the cubic trend found in the Internet data (with only three data points, it is impossible to differentiate a cubic trend from a linear trend; this can be demonstrated by placing a straight ruler on Figure 1B such that it crosses the cubic curve at three points). Neuroticism declined in the comparison samples, but because the reports on the comparison samples did not provide results separately for men and women, it is not possible to know whether the trend was driven primarily by women, as it was in our sample. Openness and Extraversion declined in the NEO-International data but did not change substantially in the BFI-German; the latter is consistent with findings in the Internet data. Thus, the general direction of effects in the Internet data was similar to that of studies collected with different sampling methods.

Table 3
Relative Magnitude of Annual Change During Two Age Intervals
Columns: Big Five dimension; Interval 1; Interval 2
Age groups that define interval: NEO-International, 22–29 vs. 30–49 (Interval 1) and 30–49 vs. 50+ (Interval 2); BFI-German, 20–40 vs. 45–65
Note. The number of arrows indicates the relative rates of change within each row. NEO-International data come from McCrae et al. (1999, 2000); BFI-German data come from Lang et al. (2001). NEO = NEO Personality Inventory, Revised; BFI = Big Five Inventory.

Nevertheless, none of the above considerations completely rule out sampling or cohort confounds, and the present study should not be taken as the last word about Big Five development in early and middle adulthood. Future longitudinal and sequential studies are needed to further address the possibility of such confounds. More importantly, we hope that such studies, which are more time-intensive and expensive than cross-sectional studies, can use the present results as a springboard to generate and test focused hypotheses about the timing and theoretical significance of changes in personality traits.

Personality Theories and Adult Development

We tested predictions about Big Five development made from two perspectives: the biologistic Five-factor theory, and approaches that allow for contextual influences on traits. What do the results say about personality theories and adult development?

7 A subset of the German data reported in McCrae et al. (1999) was also included in McCrae et al. (2000). Therefore, only the 1999 data were used for the comparison baselines.
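The aggregation step used to build the NEO-International baseline (averaging T scores across country samples, weighted by sample size) amounts to a weighted mean. The sketch below uses made-up T scores and sample sizes for three hypothetical samples within one age group; none of these numbers come from the cited articles.

```python
def weighted_mean(values, weights):
    """Mean of values weighted by the corresponding weights (e.g., sample sizes)."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total

# Hypothetical T scores and sample sizes for one trait in one age group.
t_scores = [48.0, 50.0, 53.0]
sample_sizes = [100, 300, 100]

aggregate_t = weighted_mean(t_scores, sample_sizes)
print(round(aggregate_t, 2))
```

Weighting by sample size lets larger samples contribute proportionally more to the aggregated age-group mean than small convenience samples do.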