A major replication project of 100 studies published in 2008 showed that only 37% of published significant results could be replicated (i.e., the replication study also produced a significant result). This finding has raised concerns about the replicability of psychological science (OSC, 2015).

Conducting actual replication studies to examine the credibility of published results is effortful and costly. Therefore, my colleagues and I developed a cheaper and faster method to estimate the replicability of published results on the basis of published test statistics (Schimmack, 2022). The method produces two estimates of replicability that represent the best possible and the worst possible scenario. The expected replication rate (ERR) is the best-case estimate because it assumes that it is possible to replicate studies exactly. When this is not the case, selection for significance and regression to the mean will lead to lower success rates in actual replication studies. The expected discovery rate (EDR) is the worst-case estimate. It estimates the percentage of all statistical tests that researchers conduct in their laboratories that produce a significant result. If selection for significance is ineffective in reducing the risk of false positive results or in selecting studies with more power, replication studies are expected to be no more successful than original studies (Brunner & Schimmack, 2020). In the absence of any further information, I am using the average of the EDR and ERR as the best prediction of the outcome of actual replication studies. I call this index the Actual Replicability Prediction (ARP). Whereas previous rankings relied exclusively on the ERR, the 2021 rankings use the ARP to rank the replicability of journals.
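To make the arithmetic concrete, here is a minimal sketch of the ARP as the simple average of the EDR and ERR. The function name and the input values are hypothetical and chosen only for illustration; they are not estimates for any journal, and the sketch does not show how the EDR and ERR themselves are estimated from test statistics.

```python
def actual_replicability_prediction(edr: float, err: float) -> float:
    """Return the ARP, the unweighted average of the EDR and the ERR.

    Both inputs are proportions between 0 and 1 (hypothetical helper,
    not part of any published software).
    """
    for name, value in (("edr", edr), ("err", err)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a proportion between 0 and 1")
    return (edr + err) / 2


# Made-up inputs: a worst-case estimate of 25% and a best-case estimate of 75%.
example_arp = actual_replicability_prediction(edr=0.25, err=0.75)
print(f"ARP = {example_arp:.0%}")  # prints "ARP = 50%"
```

Because the ARP is an unweighted mean, it always falls halfway between the worst-case and best-case estimates.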

Figure 1 shows the average ARP, ERR, and EDR for a broad range of psychology journals. Given the large number of test statistics in each year (k > 100,000), the estimates are very precise. The dashed lines show the 95% CI around the linear trend line. The results show that the ARP has increased from around 50% to slightly under 60%. This finding suggests that results published in psychological journals have become a bit more replicable, although this prediction needs to be verified with actual replication studies.

However, the increase is not uniform across journals. Whereas some journals in social psychology show big increases, other journals show no changes. The big increases in social psychology are in part due to very low replication rates in this field before 2015 (OSC, 2015). For readers of journals, changes are less important than actual replication rates. Table 1 shows the rankings of journals. Predicted replication rates range from an astonishing 97% in the Journal of Individual Differences to a disappointing 37% for Annals of Behavioural Medicine. Of course, results for 2021 are influenced by sampling error. More detailed information about previous years and trends can be found by clicking on the Journal Name.

For now, you can compare these results to previous results using prior rankings from 2020, 2019, or 2018 (these posts only report the ERR).

If you liked this post, you might also be interested in “Estimating the False Discovery Risk in Psychological Science.”