Prefix
This blog post was inspired by my experience of having a replication manuscript rejected. We replicated Diener et al.’s (1995) JPSP article on the personality structure of affect. For the most part, it was a successful replication and a generalization to non-student samples. The manuscript was desk rejected because the replication study was not close enough in terms of the items and methods that we used. I was shocked that JPSP would reject replication studies, which made me wonder what the criteria for acceptance are.
Abstract
In 2015, JPSP started to publish online-only replication articles. I examined what these articles have revealed about the replicability of articles published in JPSP before 2015. There were only 21 articles published between 2015 and 2020. Only 7 of these articles reported replications of JPSP articles; one of them included replications of 3 articles. Out of these 9 replications, 6 were successful and 3 were failures. This finding shows once more that psychologists do everything in their power to appear trustworthy without doing the things that are required to gain or regain trust. While fabulous review articles tout the major reforms that have been made (Nelson et al., 2019), the reality is often much less glamorous. It remains unclear which articles in JPSP can be trusted, and selection for significance may undermine the value of self-replications in JPSP.
Introduction
The past decade has revealed a replication crisis in social psychology. First, Bargh’s famous elderly walking study did not replicate. Then, only 14 out of 55 (25%) significant results could be replicated in an investigation of the replicability of social psychology (Open Science Collaboration, 2015). While some social psychologists tried to dismiss this finding, additional evidence further confirmed that social psychology has a replication crisis (Motyl et al., 2017). A statistical method that corrects for publication bias and other questionable research practices estimates a replication rate of 43% (Schimmack, 2020). This estimate was replicated with a larger dataset of the most cited articles by eminent social psychologists (49%; Schimmack, 2021). However, the statistical estimates assume that it is possible to replicate studies exactly, whereas most replication studies are conceptual replications that vary in some attributes. Most often, the populations of the original and replication studies differ. Due to regression to the mean, effect sizes in replication studies are likely to be weaker. Thus, the statistical estimates are likely to overestimate the success rate of actual replication studies (Bartos & Schimmack, 2021). Accordingly, 49% is an upper limit, and we can currently conclude that the actual replication rate is somewhere between 25% and 50%. This is also consistent with analyses of statistical power in social psychology (Cohen, 1961; Sedlmeier & Gigerenzer, 1989).
There are two explanations for the emergence of replication failures in the past decade. One explanation is that social psychologists simply did not realize the importance of replication studies and forgot to replicate their findings. They only learned about the need to replicate findings in 2011, and when they started conducting replication studies, they realized that many of their findings are not replicable. Consistent with this explanation, Nelson, Simmons, and Simonsohn (2019) report that out of over 1,000 curated replication attempts, 96% have been conducted since 2011. The problem with this explanation is that it is not true. Psychologists have conducted replication studies since the beginning of their science. Since the late 1990s, many articles in social psychology have reported at least two and sometimes many more conceptual replication studies. Bargh reported two close replications of his elderly priming study in an article with four studies (Bargh et al., 1996).
The real reason for the replication crisis is that social psychologists selected studies for significance (Motyl et al., 2017; Schimmack, 2021; Sterling, 1959; Sterling et al., 1995). As a result, only replication studies with significant results were published. What changed in 2011 is that researchers suddenly were able to circumvent censorship at traditional journals and to publish replication failures in new journals that were less selective, which, in this case, was a good thing (Doyen, Klein, Pichon, & Cleeremans, 2012; Ritchie, Wiseman, & French, 2012). The problem with this explanation is that it is true, but it makes psychological scientists look bad. Even undergraduate students with little formal training in philosophy of science realize that selective publishing of successful studies is inconsistent with the goal of searching for the truth (Ritchie, 2021). However, euphemistic descriptions of the research culture before 2011 avoid mentioning questionable research practices (Weir, 2015) or describe these practices as honest (Nelson et al., 2019). Even suggestions that these practices were at best honest mistakes are often met with hostility (Fiske, 2016). Rather than cleaning up the mess that has been created by selection for significance, social psychologists avoid discussion of their practices to hide replication failures. As a result, not much progress has been made in vetting the credibility of thousands of published articles that provide no empirical support for their claims because most of these results might not replicate.
In short, social psychology suffers from a credibility crisis. The question is what social psychologists can do to restore credibility and to regain trust in their published results. For new studies this can be achieved by avoiding the pitfalls of the past. For example, studies can be pre-registered and journals may accept articles before the results are known. But what should researchers, teachers, students, and the general public do with the thousands of published results?
One solution to this problem is to conduct replication studies of published findings and to publish the results of these studies whether they are positive or negative. In their fantastic (i.e., imaginative or fanciful; remote from reality) review article, Nelson et al. (2019) proclaim that “top journals [are] routinely publishing replication attempts, both failures and successes” (p. 512). That would be wonderful, if it were true, but top journals are considered top journals because they are highly selective in what they publish and they have limited journal space. So, every replication study competes with an article that has an intriguing, exciting, and groundbreaking new discovery. Editors would need superhuman strength to resist the temptation to publish the sexy new finding and instead to publish a replication of an article from 1977 or 1995. Surely, there are specialized journals for this laudable effort, which makes an important contribution to science but unfortunately does not meet the high threshold of a top journal that has to maintain its status as a top journal.
The Journal of Personality and Social Psychology found an ingenious solution to this problem. To avoid competition with groundbreaking new research, replication studies can be published in the journal, but only online. Thus, these extra articles do not count towards the limited page numbers that are needed to ensure high profit margins for predatory (i.e., for-profit) publishers. Here, I examine which articles JPSP has published as online-only e-replications.
Data
The first e-only replication study was published in 2015. Over the past five years, JPSP has published 21 articles as e-replications (not counting 2021).

From 1965 to 2014, JPSP published 9,428 articles. Thus, the 21 replication articles provide new, credible evidence for at most 21/9428 = 0.22% of the articles that were published before 2015, when selection bias undermined the credibility of the evidence in these articles. Despite the small sample size, it is interesting to examine the nature and the outcome of the studies reported in these 21 articles.
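The coverage figure can be checked with a couple of lines. This is only a sketch of the arithmetic; the variable names are illustrative, and the two counts are the ones given in the text.

```python
# Counts taken from the text: e-replication articles (2015-2020)
# and JPSP articles published from 1965 to 2014.
replication_articles = 21
articles_1965_2014 = 9428

# Share of pre-2015 articles that could, at best, be covered
# if every e-replication addressed a different JPSP article.
share = replication_articles / articles_1965_2014
print(f"{share:.2%}")
```

Because many of the 21 articles turn out not to replicate a JPSP article at all, this share is best read as an upper bound.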
1. SUCCESS
Eschleman, K. J., Bowling, N. A., & Judge, T. A. (2015). The dispositional basis of attitudes: A replication and extension of Hepler and Albarracín (2013). Journal of Personality and Social Psychology, 108(5), e1–e15. https://doi.org/10.1037/pspp0000017
Hepler, J., & Albarracín, D. (2013). Attitudes without objects: Evidence for a dispositional attitude, its measurement, and its consequences. Journal of Personality and Social Psychology, 104(6), 1060–1076. https://doi.org/10.1037/a0032282
The original authors introduced a new measure called the Dispositional Attitude Measure (DAM). The replication study was designed to examine whether the DAM shows discriminant validity compared to an existing measure, the Neutral Objects Satisfaction Questionnaire (NOSQ). The replication studies replicated the previous findings, but also suggested that DAM and NOSQ are overlapping measures of the same construct. If we focus narrowly on replicability, this replication study is a success.
2. FAILURE
Van Dessel, P., De Houwer, J., Roets, A., & Gast, A. (2016). Failures to change stimulus evaluations by means of subliminal approach and avoidance training. Journal of Personality and Social Psychology, 110(1), e1–e15. https://doi.org/10.1037/pspa0000039
This article failed to replicate the finding reported by Kawakami et al. (2007) that approach-avoidance training with subliminal stimuli changes evaluations. Thus, this article counts as a failure.
Kawakami, K., Phills, C. E., Steele, J. R., & Dovidio, J. F. (2007). (Close) distance makes the heart grow fonder: Improving implicit racial attitudes and interracial interactions through approach behaviors. Journal of Personality and Social Psychology, 92, 957–971. http://dx.doi.org/10.1037/0022-3514.92.6.957
Citation counts suggest that the replication failure has reduced citations, although 4 articles already cited it in 2021.

Most worrisome, an Annual Review of Psychology chapter (editor Susan Fiske) perpetuates the idea that subliminal stimuli could reduce prejudice. “Interventions seeking to automate more positive responses to outgroup members may train people to have an “approach” response to Black faces (e.g., by pulling a joystick toward themselves when Black faces appear on a screen; see Kawakami et al. 2007)” (Paluck, Porat, Clark, & Green, 2021, p. 543). The chapter does not cite the replication failure.
3. SUCCESS
Rieger, S., Göllner, R., Trautwein, U., & Roberts, B. W. (2016). Low self-esteem prospectively predicts depression in the transition to young adulthood: A replication of Orth, Robins, and Roberts (2008). Journal of Personality and Social Psychology, 110(1), e16–e22. https://doi.org/10.1037/pspp0000037
The original article used a cross-lagged panel model to claim that low self-esteem causes depression (rather than depression causing low self-esteem).
Orth, U., Robins, R. W., & Roberts, B. W. (2008). Low self-esteem prospectively predicts depression in adolescence and young adulthood. Journal of Personality and Social Psychology, 95(3), 695–708. https://doi.org/10.1037/0022-3514.95.3.695
The replication study showed the same results. In this narrow sense it is a success.
The same year, JPSP also published an “original” article that showed the same results.
Orth, U., Robins, R. W., Meier, L. L., & Conger, R. D. (2016). Refining the vulnerability model of low self-esteem and depression: Disentangling the effects of genuine self-esteem and narcissism. Journal of Personality and Social Psychology, 110(1), 133–149. https://doi.org/10.1037/pspp0000038
Last year, the authors published a meta-analysis of 10 studies that all consistently show the main result.
Orth, U., Clark, D. A., Donnellan, M. B., & Robins, R. W. (2021). Testing prospective effects in longitudinal research: Comparing seven competing cross-lagged models. Journal of Personality and Social Psychology, 120(4), 1013-1034. http://dx.doi.org/10.1037/pspp0000358
The high replicability of the key finding in these articles is not surprising because it is a statistical artifact (Schimmack, 2020). The authors also knew about this because I told them as a reviewer when their first manuscript was under review at JPSP, but neither the authors nor the editor seemed to care about it. In short, statistical artifacts are highly replicable.
4. SUCCESS
Davis, D. E., Rice, K., Van Tongeren, D. R., Hook, J. N., DeBlaere, C., Worthington, E. L., Jr., & Choe, E. (2016). The moral foundations hypothesis does not replicate well in Black samples. Journal of Personality and Social Psychology, 110(4), e23–e30. https://doi.org/10.1037/pspp0000056
The main focus of this “replication” article was to test the generalizability of the key finding in Graham, Haidt, and Nosek’s (2009) original article to African Americans. They also examined whether the results replicate in White samples.
Graham, J., Haidt, J., & Nosek, B. A. (2009). Liberals and conservatives rely on different sets of moral foundations. Journal of Personality and Social Psychology, 96(5), 1029–1046. https://doi.org/10.1037/a0015141
Study 1 found weak evidence that the relationship between political conservatism and authority differs across racial groups, beta = .25, beta = .47, χ²(1) = 3.92, p = .048. Study 2 replicated this finding, beta = .43, beta = .00, χ²(1) = 7.04, but the p-value was still above .005. While stronger evidence for the moderator effect of race is needed, the study counts as a successful replication of the relationship among White or predominantly White samples.
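The p-values implied by the two test statistics can be recomputed from the reported values alone. The sketch below assumes the statistics are ordinary chi-square tests with 1 degree of freedom, as the notation suggests; the function name is hypothetical and only the Python standard library is used.

```python
import math


def chi2_df1_p_value(chi2: float) -> float:
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    # With df = 1, the statistic is the square of a standard normal z, so
    # P(X > chi2) = P(|Z| > sqrt(chi2)) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


p_study1 = chi2_df1_p_value(3.92)  # Study 1 statistic from the text
p_study2 = chi2_df1_p_value(7.04)  # Study 2 statistic from the text

print(f"Study 1: p = {p_study1:.3f}")
print(f"Study 2: p = {p_study2:.3f}")
```

Under that assumption, the Study 1 value matches the reported p = .048, and the Study 2 value falls between .005 and .01, in line with the statement that it is still above .005.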
5. EXCLUDED
Crawford, J. T., Brandt, M. J., Inbar, Y., & Mallinas, S. R. (2016). Right-wing authoritarianism predicts prejudice equally toward “gay men and lesbians” and “homosexuals”. Journal of Personality and Social Psychology, 111(2), e31–e45. https://doi.org/10.1037/pspp0000070
This article reports replication studies, but the original studies were not published in JPSP. Thus, the results provide no information about the replicability of JPSP.
Rios, K. (2013). Right-wing authoritarianism predicts prejudice against “homosexuals” but not “gay men and lesbians.” Journal of Experimental Social Psychology, 49, 1177–1183. http://dx.doi.org/10.1016/j.jesp.2013.05.013
6. EXCLUDED
Panero, M. E., Weisberg, D. S., Black, J., Goldstein, T. R., Barnes, J. L., Brownell, H., & Winner, E. (2016). Does reading a single passage of literary fiction really improve theory of mind? An attempt at replication. Journal of Personality and Social Psychology, 111(5), e46–e54. https://doi.org/10.1037/pspa0000064
I excluded this article because it did not replicate a JPSP article. The original article was published in Science. Thus, the outcome of this replication study tells us nothing about the replicability of research published in JPSP.
7. EXCLUDED
Twenge, J. M., Carter, N. T., & Campbell, W. K. (2017). Age, time period, and birth cohort differences in self-esteem: Reexamining a cohort-sequential longitudinal study. Journal of Personality and Social Psychology, 112(5), e9–e17. https://doi.org/10.1037/pspp0000122
This article challenges the conclusions of the original article and presents new analyses using the same data. Thus, it is not a replication study.
8. EXCLUDED
Gebauer, J. E., Sedikides, C., Schönbrodt, F. D., Bleidorn, W., Rentfrow, P. J., Potter, J., & Gosling, S. D. (2017). The religiosity as social value hypothesis: A multi-method replication and extension across 65 countries and three levels of spatial aggregation. Journal of Personality and Social Psychology, 113(3), e18–e39. https://doi.org/10.1037/pspp0000104
This article is a successful self-replication of an article by the first two authors. The original article was published in Psychological Science. Thus, it does not provide evidence about the replicability of JPSP articles.
Gebauer, J. E., Sedikides, C., & Neberich, W. (2012). Religiosity, social self-esteem, and psychological adjustment: On the cross-cultural specificity of the psychological benefits of religiosity. Psychological Science, 23, 158–160. http://dx.doi.org/10.1177/0956797611427045
9. EXCLUDED
Siddaway, A. P., Taylor, P. J., & Wood, A. M. (2018). Reconceptualizing Anxiety as a Continuum That Ranges From High Calmness to High Anxiety: The Joint Importance of Reducing Distress and Increasing Well-Being. Journal of Personality and Social Psychology, 114(2), e1–e11. https://doi.org/10.1037/pspp0000128
This article replicates an original study published in Psychological Assessment. Thus, it does not tell us anything about the replicability of research in JPSP.
Vautier, S., & Pohl, S. (2009). Do balanced scales assess bipolar constructs? The case of the STAI scales. Psychological Assessment, 21, 187–193. http://dx.doi.org/10.1037/a0015312
10. EXCLUDED
Hounkpatin, H. O., Boyce, C. J., Dunn, G., & Wood, A. M. (2018). Modeling bivariate change in individual differences: Prospective associations between personality and life satisfaction. Journal of Personality and Social Psychology, 115(6), e12-e29. http://dx.doi.org/10.1037/pspp0000161
This article is a method article. The word replication does not appear once in it.
11. EXCLUDED
Burns, S. M., Barnes, L. N., McCulloh, I. A., Dagher, M. M., Falk, E. B., Storey, J. D., & Lieberman, M. D. (2019). Making social neuroscience less WEIRD: Using fNIRS to measure neural signatures of persuasive influence in a Middle East participant sample. Journal of Personality and Social Psychology, 116(3), e1–e11. https://doi.org/10.1037/pspa0000144
“In this study, we demonstrate one approach to addressing the imbalance by using portable neuroscience equipment in a study of persuasion conducted in Jordan with an Arabic-speaking sample. Participants were shown persuasive videos on various health and safety topics while their brain activity was measured using functional near infrared spectroscopy (fNIRS). Self-reported persuasiveness ratings for each video were then recorded. Consistent with previous research conducted with American subjects, this work found that activity in the dorsomedial and ventromedial prefrontal cortex predicted how persuasive participants found the videos and how much they intended to engage in the messages’ endorsed behaviors.”
This article reports a conceptual replication study. It uses a different population (US vs. Jordan) and a different methodology. As a key finding did replicate, it might be considered a successful replication, but a failure could have been attributed to the difference in population and methodology. It is also not clear that a failure would have been reported. The study should have been conducted as a registered report.
12. FAILURE
Wilmot, M. P., Haslam, N., Tian, J., & Ones, D. S. (2019). Direct and conceptual replications of the taxometric analysis of type a behavior. Journal of Personality and Social Psychology, 116(3), e12–e26. https://doi.org/10.1037/pspp0000195
This article fails to replicate the claim that Type A and Type B are distinct types rather than extremes on a continuum. Thus, this article counts as a failure.
Strube, M. J. (1989). Evidence for the Type in Type A behavior: A taxometric analysis. Journal of Personality and Social Psychology, 56(6), 972–987. https://doi.org/10.1037/0022-3514.56.6.972

It is difficult to evaluate the impact of this replication failure because the replication study was just published and the original article has received hardly any citations in recent years. Overall, it has 68 citations since 1989.
13. EXCLUDED
Kim, J., Schlegel, R. J., Seto, E., & Hicks, J. A. (2019). Thinking about a new decade in life increases personal self-reflection: A replication and reinterpretation of Alter and Hershfield’s (2014) findings. Journal of Personality and Social Psychology, 117(2), e27–e34. https://doi.org/10.1037/pspp0000199
This article replicated an original article published in PNAS. It therefore cannot be used to examine the replicability of articles published in JPSP.
Alter, A. L., & Hershfield, H. E. (2014). People search for meaning when they approach a new decade in chronological age. Proceedings of the National Academy of Sciences of the United States of America, 111, 17066–17070. http://dx.doi.org/10.1073/pnas.1415086111
14. EXCLUDED
Mõttus, R., Sinick, J., Terracciano, A., Hřebíčková, M., Kandler, C., Ando, J., Mortensen, E. L., Colodro-Conde, L., & Jang, K. L. (2019). Personality characteristics below facets: A replication and meta-analysis of cross-rater agreement, rank-order stability, heritability, and utility of personality nuances. Journal of Personality and Social Psychology, 117(4), e35–e50. https://doi.org/10.1037/pspp0000202
This article replicates a previous study of personality structure with the same items and methods in a different sample. The results are a close replication. Thus, it is a success, but the study is excluded because the original study was published in 2017 and therefore does not shed light on the replicability of articles published in JPSP before 2015.
Mõttus, R., Kandler, C., Bleidorn, W., Riemann, R., & McCrae, R. R. (2017). Personality traits below facets: The consensual validity, longitudinal stability, heritability, and utility of personality nuances. Journal of Personality and Social Psychology, 112, 474–490. http://dx.doi.org/10.1037/pspp0000100
15. EXCLUDED
van Scheppingen, M. A., Chopik, W. J., Bleidorn, W., & Denissen, J. J. A. (2019). Longitudinal actor, partner, and similarity effects of personality on well-being. Journal of Personality and Social Psychology, 117(4), e51–e70. https://doi.org/10.1037/pspp0000211
This study is a replication and extension of a previous study that examined the influence of personality on well-being in couples. A key finding was that personality similarity explained very little variance in well-being. While evidence for the lack of an effect is important, the replication crisis is about the reporting of too many significant results. A concern could be that the prior article reported a false negative result, but the studies were based on large samples with high power to detect even small effects.
Dyrenforth, P. S., Kashy, D. A., Donnellan, M. B., & Lucas, R. E. (2010). Predicting relationship and life satisfaction from personality in nationally representative samples from three countries: The relative importance of actor, partner, and similarity effects. Journal of Personality and Social Psychology, 99, 690–702. http://dx.doi.org/10.1037/a0020385
16. EXCLUDED
Buttrick, N., Choi, H., Wilson, T. D., Oishi, S., Boker, S. M., Gilbert, D. T., Alper, S., Aveyard, M., Cheong, W., Čolić, M. V., Dalgar, I., Doğulu, C., Karabati, S., Kim, E., Knežević, G., Komiya, A., Laclé, C. O., Ambrosio Lage, C., Lazarević, L. B., . . . Wilks, D. C. (2019). Cross-cultural consistency and relativity in the enjoyment of thinking versus doing. Journal of Personality and Social Psychology, 117(5), e71–e83. https://doi.org/10.1037/pspp0000198
This article mainly aims to examine the cross-cultural generality of a previous study by Wilson et al. (2014). Moreover, the original study was published in Science. Thus, it does not help to examine the replicability of research published in JPSP before 2015.
Wilson, T. D., Reinhard, D. A., Westgate, E. C., Gilbert, D. T., Ellerbeck, N., Hahn, C., . . . Shaked, A. (2014). Just think: The challenges of the disengaged mind. Science, 345, 75–77. http://dx.doi.org/10.1126/science.1250830
17. MIXED
Yeager, D. S., Krosnick, J. A., Visser, P. S., Holbrook, A. L., & Tahk, A. M. (2019). Moderation of classic social psychological effects by demographics in the U.S. adult population: New opportunities for theoretical advancement. Journal of Personality and Social Psychology, 117(6), e84-e99. http://dx.doi.org/10.1037/pspa0000171
This article reports replications of seven original studies. It also examined whether results are moderated by age / student status.
17a. EXCLUDED
Conformity to a simply presented descriptive norm (Asch, 1952; Cialdini, 2003; Sherif, 1936). [None of these references are from JPSP]
17b. SUCCESS
The effect of a content-laden persuasive message on attitudes as moderated by argument quality and need for cognition (e.g., Cacioppo, Petty, & Morris, 1983).
Cacioppo, J. T., Petty, R. E., & Morris, K. J. (1983). Effects of need for cognition on message evaluation, recall, and persuasion. Journal of Personality and Social Psychology, 45, 805–818. http://dx.doi.org/10.1037/0022-3514.45.4.805
17c. EXCLUDED
Base-rate underutilization (using the “lawyer/engineer” problem; Kahneman & Tversky, 1973). [not in JPSP]
17d. EXCLUDED
The conjunction fallacy (using the “Linda” problem; Tversky & Kahneman, 1983). [not in JPSP]
17e. EXCLUDED
Underappreciation of the law of large numbers (using the “hospital” problem; Tversky & Kahneman, 1974). [Not in JPSP]
17f. SUCCESS
The false consensus effect (e.g., Ross, Greene, & House, 1977).
Ross, L., Greene, D., & House, P. (1977). The “false consensus effect”: An egocentric bias in social perception and attribution processes. Journal of Experimental Social Psychology, 13, 279–301. http://dx.doi.org/10.1016/0022-1031(77)90049-X
17g. FAILURE
The effect of “ease of retrieval” on self-perceptions (e.g., Schwarz et al., 1991).
Schwarz, N., Bless, H., Strack, F., Klumpp, G., Rittenauer-Schatka, H., & Simons, A. (1991). Ease of retrieval as information: Another look at the availability heuristic. Journal of Personality and Social Psychology, 61, 195–202. http://dx.doi.org/10.1037/0022-3514.61.2.195
Because the replication failure was just published, it is not possible to examine whether it had any effect on citations.
In sum, the article successfully replicated 2 JPSP articles and failed to replicate 1.
18. SUCCESS
Van Dessel, P., De Houwer, J., Gast, A., Roets, A., & Smith, C. T. (2020). On the effectiveness of approach-avoidance instructions and training for changing evaluations of social groups. Journal of Personality and Social Psychology, 119(2), e1–e14. https://doi.org/10.1037/pspa0000189
This is another replication of the Kawakami et al. (2007) article, but it focusses on Experiment 1, which did not use subliminal stimuli. This article reports a successful replication in Study 1, t(61) = 1.72, p = .045 (one-tailed), and in Study 3, t(981) = 2.19, p = .029, and t(362) = 2.76, p = .003. Thus, this article counts as a success. It should be noted, however, that these effects disappear in studies with a delay between the training and testing sessions (Lai et al., 2016).
Kawakami, K., Phills, C. E., Steele, J. R., & Dovidio, J. F. (2007). (Close) distance makes the heart grow fonder: Improving implicit racial attitudes and interracial interactions through approach behaviors. Journal of Personality and Social Psychology, 92, 957–971. http://dx.doi.org/10.1037/0022-3514.92.6.957
19. EXCLUDED
Aknin, L. B., Dunn, E. W., Proulx, J., Lok, I., & Norton, M. I. (2020). Does spending money on others promote happiness?: A registered replication report. Journal of Personality and Social Psychology, 119(2), e15–e26. https://doi.org/10.1037/pspa0000191
This article replicated a study that was published in Science. It therefore does not tell us anything about the replicability of articles published in JPSP.
Dunn, E. W., Aknin, L. B., & Norton, M. I. (2008). Spending money on others promotes happiness. Science, 319, 1687–1688. http://dx.doi.org/10.1126/science.1150952
20. EXCLUDED
Calderon, S., Mac Giolla, E., Ask, K., & Granhag, P. A. (2020). Subjective likelihood and the construal level of future events: A replication study of Wakslak, Trope, Liberman, and Alony (2006). Journal of Personality and Social Psychology, 119(5), e27–e37. https://doi.org/10.1037/pspa0000214
Although this article reports two replication studies (both failures), the original studies were published in a different journal. Thus, the results do not provide information about the replicability of research published in JPSP.
Wakslak, C. J., Trope, Y., Liberman, N., & Alony, R. (2006). Seeing the forest when entry is unlikely: Probability and the mental representation of events. Journal of Experimental Psychology: General, 135, 641–653. http://dx.doi.org/10.1037/0096-3445.135.4.641
21. EXCLUDED
Burnham, B. R. (2020). Are liberals really dirty? Two failures to replicate Helzer and Pizarro’s (2011) study 1, with meta-analysis. Journal of Personality and Social Psychology, 119(6), e38–e42. https://doi.org/10.1037/pspa0000238
Although this article reports two replication studies (both failures), the original studies were published in a different journal. Thus, the results do not provide information about the replicability of research published in JPSP.
Helzer, E. G., & Pizarro, D. A. (2011). Dirty liberals! Reminders of physical cleanliness influence moral and political attitudes. Psychological Science, 22, 517–522. http://dx.doi.org/10.1177/0956797611402514
Results
Out of the 21 articles published under the e-replication format, only 7 articles report replications of studies published in JPSP before 2015. One of these articles reports replications of three articles, and two of them report replications of different studies from the same original article (one failure, one success; Kawakami et al., 2007). Thus, there are a total of 9 replications, with 6 successes and 3 failures. This is a success rate of 67%, 95%CI = 31% to 98%.
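The tally and the uncertainty around it can be reproduced with a short script. This is a rough sketch, not the calculation used for the interval reported above: the post does not say how that interval was obtained, so the Wilson score interval used here is an assumption, and its bounds will differ somewhat from the reported ones. The labels in the list are shorthand for the entries counted in the Data section.

```python
import math

# Outcomes of the 9 replications counted in the Data section
# (labels are shorthand; the classifications are taken from the text).
outcomes = [
    ("Hepler & Albarracin (2013)", "success"),
    ("Kawakami et al. (2007), subliminal training", "failure"),
    ("Orth et al. (2008)", "success"),
    ("Graham et al. (2009)", "success"),
    ("Strube (1989)", "failure"),
    ("Cacioppo et al. (1983)", "success"),
    ("Ross et al. (1977)", "success"),
    ("Schwarz et al. (1991)", "failure"),
    ("Kawakami et al. (2007), Experiment 1", "success"),
]

successes = sum(1 for _, result in outcomes if result == "success")
failures = sum(1 for _, result in outcomes if result == "failure")
n = len(outcomes)
rate = successes / n


def wilson_interval(k: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (95% by default)."""
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


lower, upper = wilson_interval(successes, n)
print(f"{successes} successes, {failures} failures, rate = {rate:.0%}")
print(f"Wilson 95% interval: {lower:.0%} to {upper:.0%}")
```

Whatever method is used, an interval based on nine observations is wide, which is the substantive point: the observed success rate is compatible with both a low and a high true replication rate.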
The first observation is that the number of replication studies of studies published in JPSP is abysmally low. It is not clear why this is the case. Either researchers are not interested in conducting replication studies or JPSP is not accepting all submissions of replication studies for publication. Only the editors of JPSP know.
The second observation is that JPSP published more successful than failed replications. This is inconsistent with the results of the Open Science Collaboration project and with predictions based on statistical analyses of the p-values in JPSP articles (Open Science Collaboration, 2015; Schimmack, 2020, 2021). Although this difference may simply be sampling error because the sample of replication studies in JPSP is so small, it is also possible that this high success rate reflects systematic factors that select for significance.
First, researchers may be less motivated to conduct studies with a low probability of success, especially in research areas that have been tarnished by the replication crisis. Who still wants to do priming studies in 2021? Thus, bad research that was published before 2015 may simply die out. The problem with this slow death model of scientific self-correction is that old studies continue to be cited as evidence. Thus, JPSP should solicit replication studies of prominent articles with high citations even if these replication studies may produce failures.
Second, it is unfortunately possible that editors at JPSP prefer to publish studies that report successful outcomes rather than replication failures. To reassure consumers of JPSP, editors should make it transparent whether replication studies get rejected and why they get rejected. Given the e-only format, it is not clear why any replication studies would be rejected.
Conclusion
Unfortunately, the results of this meta-analysis show once more that psychologists do everything in their power to appear trustworthy without doing the things that are required to gain or regain trust. While fabulous review articles tout the major reforms that have been made (Nelson et al., 2019), the reality is often much less glamorous. Trying to get social psychologists to openly admit that they made (honest) mistakes in the past and to correct themselves is akin to getting Trump to admit that he lost the 2020 election. Most of the energy is wasted on protecting the collective self-esteem of card-carrying social psychologists in the face of objective, scientific criticism of their past and current practices. It remains unclear which results in JPSP are replicable and provide solid foundations for a science of human behavior and which results are nothing but figments of social psychologists’ imagination. Thus, I can only warn consumers of social psychological research to be as careful as they would be when buying a used car. Often the sales pitch is better than the product (Ritchie, 2020; Singal, 2021).