The Black Box of Meta-Analysis: Personality Change

Psychologists treat meta-analyses as the gold standard to answer empirical questions. The idea is that meta-analyses combine all of the relevant information into a single number that reveals the answer to an empirical question. The problem with this naive interpretation of meta-analyses is that meta-analyses cannot provide more information than the original studies contained. If original studies have major limitations, a meta-analytic integration does not make these limitations disappear. Meta-analyses can only reduce random sampling error, but they cannot fix problems of original studies. However, once a meta-analysis is published, the problems are often ignored and the preliminary conclusion is treated as an ultimate truth.

In this regard, meta-analyses are like collateralized debt obligations, which were popular until problems with them triggered the financial crisis in 2008. A collateralized debt obligation (CDO) pools together cash flow-generating assets and repackages this asset pool into discrete tranches that can be sold to investors. The problem arises when a CDO is considered less risky than the debt it contains, so that investors believe they are getting high returns at low risk when the underlying debt is much riskier than they think.

In psychology, the review process and publication in a top journal give the impression that information is trustworthy and can be cited as solid evidence. However, a closer inspection of the original studies might reveal that the results of a meta-analysis rest on shaky foundations.

Roberts et al. (2006) published a highly-cited meta-analysis in the prestigious journal Psychological Bulletin. The key finding of this meta-analysis was that personality levels change with age in longitudinal studies of personality.

The strongest change was observed for conscientiousness. According to the figure, conscientiousness doesn’t change much during adolescence, when the prefrontal cortex is still developing, but increases by about half a standard deviation from age 30 to age 70 (from d ~ .4 to d ~ .9).

Like many other consumers, I bought the main finding and used the results in my Introduction to Personality lectures without carefully checking the meta-analysis. However, when I analyzed new data from longitudinal studies with large, nationally representative samples, I could not find the predicted pattern (Schimmack, 2019a, 2019b, 2019c). Thus, I decided to take a closer look at the meta-analysis.

Roberts and colleagues list all the studies that were used with information about sample sizes, personality dimensions, and the ages that were studied. Thus, it is easy to find the studies that examined conscientiousness with participants who were 30 years or older at the start of the study.

Study                            N      Weight   Start   1    Max. Interval   ES
Costa et al. (2000)              2274   0.44     41      9    9               0.00
Costa et al. (1980)              433    0.08     36      6    44              0.00
Costa & McCrae (1988)            398    0.08     35      6    46              NA
Labouvie-Vief & Jain (2002)      300    0.06     39      6    39              NA
Branje et al. (2004)             285    0.06     42      2    4               NA
Small et al. (2003)              223    0.04     68      6    6               NA
P. Martin (2002)                 179    0.03     65      5    46              0.10
Costa & McCrae (1992)            175    0.03     53      7    7               0.06
Cramer (2003)                    155    0.03     33      14   14              NA
Haan, Millsap, & Hartka (1986)   118    0.02     33      10   10              NA
Helson & Kwan (2000)             106    0.02     33      42   47              NA
Helson & Wink (1992)             101    0.02     43      9    9               0.20
Grigoriadis & Fekken (1992)      89     0.02     30      3    3
Roberts et al. (2002)            78     0.02     43      9    9
Dudek & Hall (1991)              70     0.01     49      25   25
Mclamed et al. (1974)            62     0.01     36      3    3
Cartwright & Wink (1994)         40     0.01     31      15   15
Weinryb et al. (1992)            37     0.01     39      2    2
Wink & Helson (1993)             21     0.00     31      25   25
Total N / Average                5144   1.00     41      11   19

There are 19 studies with a total sample size of N = 5,144 participants. However, sample sizes vary dramatically across studies, from a low of N = 21 to a high of N = 2,274. Table 1 shows the proportion of participants that would be used to weight effect sizes according to sample sizes. By far the largest study found no significant increase in conscientiousness. I tried to find information about effect sizes from the other studies, but the published articles didn’t contain means or the information was from an unpublished source. I did not bother to obtain information from samples with fewer than 100 participants, because they contribute only 8% to the total sample size. Even big effects would be washed out by the larger samples.
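The weights and the 8% figure can be checked with a few lines of Python. This is only a minimal sketch: the sample sizes are copied from the table, and the function name sample_size_weights is a hypothetical helper, not something taken from the meta-analysis.

```python
# Sample sizes (N) copied from the table, largest to smallest.
sample_sizes = [2274, 433, 398, 300, 285, 223, 179, 175, 155, 118,
                106, 101, 89, 78, 70, 62, 40, 37, 21]


def sample_size_weights(ns):
    """Return each study's share of the total sample (hypothetical helper)."""
    total = sum(ns)
    return [n / total for n in ns]


weights = sample_size_weights(sample_sizes)
total_n = sum(sample_sizes)

# Share of all participants who come from studies with fewer than 100 people.
small_share = sum(n for n in sample_sizes if n < 100) / total_n

print("Total N:", total_n)
print("Weight of largest study:", round(weights[0], 2))
print("Share from studies with N < 100:", round(small_share, 2))
```

The largest study alone carries 44% of the weight, which is why its null result dominates any sample-size-weighted average.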

The main conclusion that can be drawn from this information is that there is no reliable information to make claims about personality change throughout adulthood. If we assume that conscientiousness changes by half a standard deviation over a 40-year period, the average effect size for a decade is d = .12. For studies with even shorter retest intervals, the predicted effect size is even weaker. It is therefore highly speculative to extrapolate from this patchwork of data and make claims about personality change during adulthood.
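The arithmetic behind the per-decade estimate is simple enough to sketch. The snippet assumes that change is spread evenly (linearly) over the 40 years, which is what an average effect size per decade implies; the function name effect_per_interval and the 3-year example interval are illustrative assumptions.

```python
def effect_per_interval(total_d, total_years, interval_years):
    """Expected standardized change over a shorter interval,
    assuming the total change accumulates linearly (an assumption)."""
    return total_d * interval_years / total_years


# Half a standard deviation over 40 years, expressed per decade.
per_decade = effect_per_interval(0.5, 40, 10)

# The same assumption applied to a short retest interval of 3 years.
per_three_years = effect_per_interval(0.5, 40, 3)

print(per_decade)
print(per_three_years)
```

Under this assumption the per-decade value is 0.125, in line with the d = .12 quoted above, and a 3-year study would be looking for an effect of less than .04.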

Fortunately, much better information is now available from longitudinal panels with over a thousand participants who have been followed for 12 (SOEP) or 20 (MIDUS) years with three or four retests. Theories of personality stability and change need to be revisited in the light of this new evidence. Updating theories in the face of new data is the basis of science. Citing an outdated meta-analysis as if it provided a timeless answer to a question is not.

Open-SOEP: No Significant Personality Change over 12 Years

Studying personality stability and change is easy and hard. It is easy because the method is straightforward. Administer a valid measure of personality to a group of participants and repeat the measurement several times. Describing the method takes a sentence or two compared to pages that describe an intricate laboratory experiment with an elaborate deception. It is hard because it requires time and participants may drop out of a study. Meanwhile there is nothing to publish while a researcher is waiting for the next retest. In our fast-paced world of academic publishing, where researchers are expected to publish 5 or more articles a year, there is no place for slow research. As a result, evidence on personality change is scarce. The best evidence so far comes from a meta-analysis that patched together small studies with different measures and populations. Although this meta-analysis is the best evidence available, it cannot be trusted because the evidence is inconclusive.

Psychologists have to thank economists and sociologists, who are used to collaborating on big data collections. One of these collaborations is the German Socio-Economic Panel (SOEP). The SOEP is an ongoing longitudinal study with a representative sample that started in 1984. In 2005, the SOEP included the BFI-S, a 15-item personality measure that assesses the Big Five. Since then, the BFI-S has been administered in four-year intervals in 2009, 2013, and 2017. Thus, we now have longitudinal data spanning 12 years with four waves of data. This makes it possible to revisit the question of personality stability with much better data than a meta-analysis of heterogeneous studies can provide. Admittedly, the results are based on a German sample, but there is little evidence that personality development varies across cultures.


One drawback of the SOEP is that each personality dimension is measured with just three items. This makes scale scores unreliable and scale scores can be contaminated with method variance (e.g., evaluative bias, acquiescence bias). To avoid these problems and to examine measurement invariance, it is better to analyze the data with a measurement model that examines personality change at the level of latent variables that correct for measurement error. I developed a measurement model for the SOEP (Schimmack, 2019a) and I already demonstrated invariance across the first three waves of the SOEP (Schimmack, 2019b). Here I added the fourth wave of data from 2017 to the dataset to produce even better information about long-term changes in personality.

To analyze the data, I first fitted the measurement model for the BFI-S to the data from each wave and imposed equality constraints to ensure measurement invariance. The longitudinal stability of personality was examined using the latent-trait-state (LTS) model that decomposes stability over time into two components: (a) a stable trait component that never changes and (b) a changing state component. The changing state component allows for factors that influence personality to change over time and, in turn, to change personality. These changing factors may produce changes that last a long time or changes that are more temporary. The time course of changes in personality is modeled with an autoregressive parameter that reflects how much of the change present at time 1 is still present at time 2.

The LTS model is typically fitted without modeling mean-level changes. However, the model can also be used to model the mean structure in the data. In latent variable models, changes in personality are assumed to occur at the level of the latent traits, while item means (intercepts) are assumed to be constant over time. As the latent trait is stable, it cannot be used to model mean-level changes. Thus, one option is to free the means of the state factors. However, the influence of the state factors decreases over time, which is inconsistent with the idea of lasting changes in personality. Thus, a better option is to let the means of the occasion-specific factors vary freely, even if the occasion-specific variance is zero. Although this model may lack realism, it shows the pattern of mean-level changes in the data without imposing a particular form (e.g., a linear trend) on the data.

The model specification and the complete results can be found on OSF (https://…). The overall model fit was acceptable, CFI = .971, RMSEA = .019, SRMR = .031.

Rank-order Stability and Change

A study of the first three waves in the SOEP replicated earlier findings of high retest stability in personality with stabilities over .9 over a one-year period (Conley, 1984; Schimmack, 2019c). However, three waves are insufficient to separate trait variance from state variance, and few studies with four waves of personality are available. Anusic and Schimmack (2016) used a meta-analytic approach to do so on the basis of smaller studies. Their model suggested that about 70% of the reliable variance in personality is trait variance and that the remaining 30% state variance is rather unstable, with a low annual stability of .3. This would suggest that any changes in personality do not last long and individuals quickly revert back to their trait level of personality.

Table 1 shows the results for the SOEP data.


The results show a similar split between trait and state variance as the meta-analysis, with about two-thirds of the variance being trait variance and one-third being state variance. A new finding is that the halo factor, an evaluative bias in personality ratings, also has 60% trait variance. Thus, this response style can also be considered a stable trait. In contrast, acquiescence bias has less trait variance and seems to be more influenced by momentary factors that are inconsistent over time.

The results for the stability of the state variance are different from the meta-analysis. The SOEP data suggest that changes in personality are more persistent than the meta-analysis suggested. The annual stability estimates are around .7. Thus, any changes that are evident at time 2 would still be evident over the next years. The stability over 4 years is around .3. These results are more encouraging for researchers who are interested in personality change than the meta-analytic results (Anusic & Schimmack, 2016). Nevertheless, the relatively small amount of state variance and the high stability of the state variance imply that it takes time to find even small changes in personality. Not surprisingly, it has been difficult to uncover predictors of personality change even in large samples like the SOEP (Specht et al., 2011).
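To see how an annual stability translates into stability over longer intervals, here is a minimal sketch of the arithmetic. It assumes a first-order autoregressive state component with a constant annual parameter and constant variances, so that stability compounds multiplicatively over years; the function names and the rounded input values are illustrative assumptions, not output from the fitted model.

```python
def state_stability(annual_stability, years):
    """Stability of the state component over several years, assuming
    a first-order autoregressive process with a constant annual parameter."""
    return annual_stability ** years


def implied_retest_correlation(trait_share, annual_stability, years):
    """Retest correlation of the reliable variance implied by a trait-state
    model: the trait part is perfectly stable, the state part decays."""
    state_share = 1.0 - trait_share
    return trait_share + state_share * state_stability(annual_stability, years)


# Rounded values from the text: annual state stability of about .7 (SOEP)
# versus .3 (meta-analysis), and roughly two-thirds trait variance.
soep_four_year = state_stability(0.7, 4)
meta_four_year = state_stability(0.3, 4)
soep_retest = implied_retest_correlation(2 / 3, 0.7, 4)

print(round(soep_four_year, 2))
print(round(meta_four_year, 3))
print(round(soep_retest, 2))
```

With an annual value of exactly .7 the four-year state stability comes out near .24, and an annual value of about .74 would give .30, so the rounded figures in the text are consistent with each other. With the meta-analytic value of .3, almost nothing of a change would remain after four years.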

In sum, the results confirm that personality ratings are highly stable over extended periods of time and that a large portion of this stability is caused by stable factors that ensure persistent individual differences in personality over the life span.

Mean Levels

Table 2 shows the results for the mean levels. Means in the first year, 2005, are used as the reference group. The results provide little evidence for personality change in adulthood. None of the Big Five dimensions shows a consistent trend over time. The results for conscientiousness are most important because a meta-analysis suggested that conscientiousness increases substantially throughout adulthood. There is no evidence for such a trend in the SOEP.


The general pattern of decreases for all five dimensions suggests that acquiescence bias might have changed over time. Thus, I also fitted a model with free means for acquiescence bias but the results did not change. Thus, it does not account for the small decrease in the Big Five. Adding means for the halo factor, instead, reduces changes for most scales, but would suggest a stronger decrease in neuroticism. However, the pattern is never a gradual change, but a drop from time 1 to time 2 with no major changes afterwards. This suggests that some panel effect or period effects have small effects on personality ratings, but there is no evidence to support the claim that personality systematically changes throughout adulthood.


Personality research was attacked by situationists who claimed that personality is a mere social construction. In the 1980s, personality researchers presented evidence that personality traits are real and stable using twin studies, multi-rater studies, and longitudinal studies. However, two meta-analyses by Roberts and colleagues suggested that personality exists but is less stable than personality psychologists assumed. These meta-analyses had a strong influence on personality psychology in the 2000s. They are featured in personality textbooks and often cited as evidence that personality still develops throughout adulthood. However, more recent evidence is more consistent with the view of personality as mostly stable throughout adulthood. Costa and McCrae famously compared personality to plaster. While it can be shaped and molded early on, it finally sets into a shape that cannot be altered. Yes, there may be cracks here and there, but the overall shape is set. While this image may be too rigid, it is consistent with the evidence that even major life events that occur during adulthood seem to have very little influence on personality (Specht et al., 2011).

The idea of personality change is often coupled with the notion that personality develops and that there can be personal growth in adulthood. The problem with these notions is that they imply that there is a normative or desirable direction of personality change. For example, an increase in conscientiousness is seen as evidence of growing maturity. However, the measurement model that I used distinguishes between the denotative and connotative aspects of personality. Lazy is both descriptive and evaluative. However, evaluations are rooted in cultural norms and values. Why is it good to work as much as possible, to avoid mistakes at any cost? Should education and policies try to increase conscientiousness levels? Is there an optimal level? These are all very difficult questions that go well beyond the existing science of personality. Once we focus on the denotative aspect of personality, we see that some people work harder than others or that some people are more creative than others, and that these differences are fairly stable, without any evidence of what causes this stability. Just like people differ in personality, they differ in other characteristics that have received more attention. Current culture aims towards greater acceptance of differences in sexual orientation, gender identity, body types, religion, etc. Maybe we should also include personality traits there and let introverts be proud introverts and disagreeable people be proud disagreeable people. Maybe personality differences only exist because they were not a problem during human evolution, or diversity is even an advantage that allows humans as a group to adapt to different circumstances. Thus, the strong evidence of personality stability is not necessarily a problem that needs to be solved, because there is no normal personality. There is only normal variation in personality.

Open-SOEP: Cohort vs. Age Effects on Personality

The German Socio-Economic Panel (SOEP) is a unique and amazing project. Since 1984, representative samples of German families have been surveyed annually. This project has produced a massive amount of data and hundreds of publications. The traditional journal publications make it difficult to keep track of developments and to find related articles. A better way to make use of these data may be open science where researchers can quickly share information.

In 2005, the SOEP included a brief, 15-item measure of the Big Five personality traits. These data were used for cross-sectional studies that related personality to other variables measured in the SOEP such as well-being (Rammstedt, 2007). In 2009, the SOEP repeated the measurement of the Big Five. This provided longitudinal data for analyses of stability and change of personality. Researchers rushed to analyze the data and to report their findings. JPSP published two independent articles based on the same data (Lucas & Donnellan, 2011; Specht, Egloff, & Schmukle, 2011). Both articles examined age differences across birth cohorts and over time. Ideally, age effects would show up in both analyses and produce similar trends in the data. Both articles also paid little attention to cohort differences in personality (i.e., Germans born in 1920 who grew up during Nazi times might differ from Germans born in 1950 who grew up during the revolutionary 60s).

In 2013, the Big Five questions were administered again, which makes it easier to spot age trends and to distinguish age effects from cohort effects. Recently, the first article based on the three waves of data was published in JPSP (Wagner, Lüdtke, & Robitzsch, 2019). The article focused on retest correlations (consistency of individual differences over time), and did not examine mean levels of personality. The article does not mention cohort effects.

Cohort/Culture Effects

Like many Western countries, German culture has changed tremendously during the 20th century. In addition, German culture has been shaped by unique historical events such as the rise and fall of Hitler, the Second World War, followed by the Wirtschaftswunder, the division of the country into a democratic and a socialist country, and the unification of Germany after the fall of the Berlin Wall. The SOEP data provide a unique opportunity to examine whether personality is shaped by culture.

So far, studies of cultural influences on personality have mostly relied on cross-cultural comparisons of Western cultures with non-Western cultures. The main finding of these studies is that citizens of modern, individualistic nations tend to be more extraverted and open to experiences than citizens in traditional, collectivistic cultures.

Based on these findings, one might expect higher levels of extraversion and openness in younger generations of Germans who grew up in a more individualistic culture than their parents and grandparents.


The data are the Big Five ratings for the three waves in the SOEP (vp, zp, & bdp). Data were prepared and analyzed using R (see OSF for R-code). The three items for each of the Big Five scales were summed and analyzed as a function of 7 cohorts, each spanning 10 years (from those born 1978 to 1988, aged 17 to 27 in 2005, to those born 1918 to 1928, aged 77 to 87), and three waves (2005, 2009, 2013). The overall mean was subtracted from each of the 21 means and the mean differences were divided by the pooled standard deviation. This way, mean differences in the figures are standardized mean differences to ease interpretation of effect sizes.
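The standardization step can be sketched as follows. The original analysis was done in R (see OSF); this Python version is only an illustration under stated assumptions: the overall mean is taken as the N-weighted mean of the cell means, the pooled standard deviation is the usual degrees-of-freedom-weighted pooling of the cell standard deviations, and the three example cells are made-up numbers, not SOEP results.

```python
from math import sqrt


def standardize_cell_means(cells):
    """Turn cell means into standardized differences from the overall mean.

    cells: list of (mean, sd, n) tuples, one per cohort-by-wave cell.
    Assumes an N-weighted overall mean and a pooled within-cell SD.
    """
    total_n = sum(n for _, _, n in cells)
    grand_mean = sum(mean * n for mean, _, n in cells) / total_n
    pooled_var = (sum((n - 1) * sd ** 2 for _, sd, n in cells)
                  / sum(n - 1 for _, _, n in cells))
    pooled_sd = sqrt(pooled_var)
    return [(mean - grand_mean) / pooled_sd for mean, _, _ in cells]


# Hypothetical example with three cells instead of the 21 used in the post.
example_cells = [(14.0, 3.0, 100), (15.0, 3.0, 100), (16.0, 3.0, 100)]
d_values = standardize_cell_means(example_cells)

for d in d_values:
    print(round(d, 2))
```

In this made-up example the overall mean is 15 and the pooled standard deviation is 3, so the three cells sit at about -0.33, 0, and 0.33 standard deviations from the overall mean.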


Openness to Experience

Openness to experience showed a clear cohort effect (Figure 1) with the lowest scores for the oldest cohort (1918 to 1928) and the highest scores for the youngest cohort (1978 to 1988). The difference between the youngest and oldest cohorts is d = .72, which is considered a large effect size. In comparison, there is no clear age trend in Figure 1. While scores decrease from t1 to t2, they increase from t2 to t3. All differences between t1 and t2 are small, |d| < .2.


Extraversion also shows a cohort effect in the predicted direction, but the effect size is smaller, d = .34.

In contrast, there are no age effects and the overall difference between 2005 and 2013 is d = -0.01.


I next examined conscientiousness because studies of age effects tend to show the largest age effects for this Big Five dimension. Regarding cohort effects, one might expect a decrease because older generations worked very hard to rebuild post-war Germany.

Consistent with the developmental literature, the youngest age cohort shows an increase in conscientiousness from 2005 to 2013, although the effect size is small (d = .21). The other age cohorts show very small decreases in conscientiousness, except for the oldest age cohort, which shows a small decrease, d = -.22. Regarding cohort effects, there is no general trend, but the youngest cohort shows very low levels of conscientiousness even in 2013 when they are 25 to 35 years old.


Developmental studies suggest that agreeableness increases as people get older. However, the SOEP data do not confirm this trend.

Within each cohort, agreeableness scores decrease, although the effect sizes are very small. The overall decrease from 2005 to 2013 is d = -.09. In contrast, there is a clear cohort effect with agreeableness being the highest in the oldest generation. The decrease tends to level off for the last three generations. The effect size is moderate, d = -.38.


The main result for neuroticism is that there is neither a pronounced cohort effect, d = -.09, nor age effect, d = -.13.


Previous analyses of personality data in the SOEP have focused on age effects and interpreted cross-sectional differences between older and younger Germans as age effects. However, these analyses were based on only two waves of data, which makes it difficult to interpret changes in personality scores over time. The third wave shows that some of the trends did not continue and suggests that there are no notable effects of aging in the SOEP data. The only age effect consistent with the literature is an increase in conscientiousness in the youngest cohort of 17- to 27-year-olds.

However, the data are consistent with cohort effects that are consistent with cross-cultural studies. The more individualistic a culture becomes, the more open and extraverted individuals become. Deeper analysis might help to elucidate which factors contribute to these changes (e.g., education level). The results also suggested that agreeableness decreased which might be another consequence of increasing individualism.

Overall, the results suggest that personality is influenced by cultural factors during adolescence and early adulthood, but that personality remains fairly stable throughout adulthood. This conclusion is also supported by other longitudinal studies (e.g., MIDUS) that show little changes in Big Five scores over time. Maybe Costa and McCrae were not entirely wrong when they compared personality to plaster that can be shaped while it is setting, but remains stable after it is dried.

Should Governments Shape Personality

Dear Wiebke, Patrick, Mitja, Jaap, Marie, Christian, Richard, Maike, Ulrich, Jenny, Cornelia, Johannes, and Brent.

You suggested that personality traits are actionable targets for public policy (Bleidorn et al., 2019).  I was surprised and actually shocked by this proposal.  I have taught personality psychology for over a decade and I always emphasize that individual differences are normal and should be celebrated like we celebrate other aspects of human diversity in culture and in sports.  Therefore I don’t think personality interventions are needed or desirable. Maybe there is some fundamental misunderstanding, but reading your article suggests that you are really proposing that public policy should target personality traits.

This idea is not new.  Socialistic governments and fascist governments had ideals of the model citizen and aimed to fit their citizens into this mold.

In marked contrast, democracies and market economies are built on the idea that citizens’ well-being is maximized by personal choice. The role of governments is mainly to protect the safety of citizens and to regulate conflict when individual preferences are in conflict. Well-being surveys consistently show that free and individualistic societies produce higher well-being than societies that impose ideological or religious norms on their citizens.

The history of psychology also casts a shadow on attempts to shape individuals’ personality. When homosexuality was a taboo, the Diagnostic and Statistical Manual of Mental Disorders included homosexuality as a mental illness. Today most psychologists consider it a success that homosexuality is considered an expression of personal preferences and conversion therapy to cure homosexuals from some invented illness is considered unethical. More generally, mental illness has been defined in terms of patients’ suffering and concerns about patients’ well-being rather than in terms of cultural norms of acceptable or unacceptable characteristics.

New insights into biological influences on many illnesses (e.g., cancer) have given rise to personalized medicine which is based on the idea that the same treatment can have different effects for different individuals.  Rather than trying to fit patients to treatments, personalized medicine aims to fit treatments to patients.

Given these general trends one could argue that modern societies need personality psychology because a better understanding of individual differences is needed to create policies that respect individual freedom and create opportunities for individuals to pursue their own well-being and to maximize their own potential. The call to shape personality, however, seems to suggest the opposite. In fact, the call for governments to regulate personality development seems so absurd that it seems improbable that a group of modern, presumably liberal-leaning, psychologists would argue for it. Does this mean I misunderstood your article? I hope so, but reading it didn’t help me to understand your position.

We agree that personality traits are enduring factors (a.k.a. causes, dispositions) within an individual that influence their thoughts, feelings, and behaviors. You propose that governments should influence personality traits because personality traits influence life outcomes. For example, personality traits influence divorce. If governments want to reduce the divorce rates, they could target the personality traits that lead to divorce. Another advantage of changing personality traits is that they are broad dispositions that influence a range of diverse behaviors. For example, conscientiousness influences class attendance, health behaviors, and making your bed every morning. Instead of having different interventions for each behavior, making individuals more conscientious would influence all three behaviors.

Most of the article discusses empirical research on whether it is actually possible to change personality traits. I am not going to quibble with you about the evidence here because it is irrelevant to the main question that your article brings up: if it were possible to change personality, should governments roll out interventions that shape personality? As the article focused on the Big Five traits, the question is whether governments should make citizens more or less neurotic, extraverted, agreeable, conscientious, or open to experience.

“Our most general assertion is that personality traits are both stable and changeable, which makes personality trait change a powerful and hitherto relatively underused resource for policy makers”

You appear to be so convinced that government interventions that target personality are desirable that you ask only when to intervene, what intervention to use, who to target, and how to intervene. You never stop to wonder whether interventions are a good idea in the first place.

For example, you suggest that increasing conscientiousness in adolescence is a desirable policy goal because “it could elicit a cascade of positive outcomes” (p. 19).  And decreasing neuroticism is good because it “could significantly reduce one’s likelihood of experiencing negative life events” (p. 19).

In passing you mention the main problem of your proposal to regulate personality: “This is not to say that there are optimal trait levels that should be universally promoted in all people.” However, you do not reconcile this observation with your call for personality policies. If there are no optimal levels, then what should be the target of personality policies? And are the previous examples not evidence that you consider higher conscientiousness and lower neuroticism as optimal? If they are not considered more optimal, why should governments develop interventions to increase conscientiousness and to reduce neuroticism?

You end with the conclusion that “personality traits are ideal targets for interventions designed to improve life success,” which once more begs the question what the goal of personality interventions should be.  What is life success?  We know the answer is 42 (h/t Hitchhiker’s Guide to the Galaxy), but we don’t really understand the question very well.  

To end on a more positive note, I do think that governments can play a role in helping individuals to have better lives with higher well-being, and national rankings of quality of life and well-being show that some governments are doing a better job than others.  One main indicator of a good life is a healthy and long life, and health care is both a major contributor to GDP and a major policy agenda. Good health includes physical health and mental health.  Prevention and treatment of mental health problems such as anxiety, depression, or addiction are important. Unlike personality, health can be defined in terms of optimal functioning and we can evaluate policies in terms of their effectiveness to maximize optimal functioning. Addressing those concerns is an important policy agenda and psychologists can play an important role in addressing these issues. But I prefer to leave normal variation in personality alone. As you noted yourself, there are no optimal personality traits. The best personality policy is to create free societies that let individuals pursue their own happiness in the way they want to pursue it.

Your disagreeable colleague,