The hallmark of science is that it is self-correcting. While initial results may be false, replication studies can reveal that these findings cannot be reproduced. Unfortunately, psychologists disabled the self-correction mechanism of science when they decided to publish only statistically significant results (Sterling, 1959). As a result, replication failures occurred, but remained unpublished and could not correct false claims. This has created large bubbles of topics in psychology that are based on illusory evidence.
In the past decade, some areas of research have been scrutinized using registered replication reports. These projects combine the results of many labs to provide a powerful test of a hypothesis, and the results are published regardless of the outcome. Ego depletion has failed in two registered replication reports. Thus, it is the most severely tested theory in psychology.
However, a look into Web of Science shows that researchers continue to publish articles on ego-depletion and that citations are still increasing despite evidence that the published studies are not credible. This shows that psychology is not a science because self-correction is a necessary feature of science.
To examine the credibility of the empirical findings in ego-depletion articles, I conducted a z-curve analysis. I looked for articles that were published in 121 psychology journals, including all leading social psychology journals (Schimmack, 2022). This search retrieved 166 matching articles. A search for test statistics in these articles produced 1,818 results of hypothesis tests that were converted into absolute z-scores as a measure of the strength of evidence against the null hypothesis. Figure 2 shows the results of a z-curve analysis.
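The conversion step can be illustrated with a minimal sketch. It assumes each extracted test result has already been reduced to a two-tailed p-value (the original extraction works from reported test statistics, and that step is not shown here). The function names and the example p-values are hypothetical and are not taken from the analysis above.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def p_to_abs_z(p_two_tailed: float) -> float:
    """Convert a two-tailed p-value into an absolute z-score."""
    if not 0.0 < p_two_tailed < 1.0:
        raise ValueError("p-value must lie strictly between 0 and 1")
    # Half of the p-value sits in each tail; the lower-tail quantile is
    # negative, so flip its sign to obtain the absolute z-score.
    return -_STD_NORMAL.inv_cdf(p_two_tailed / 2.0)


def observed_discovery_rate(z_scores, alpha: float = 0.05) -> float:
    """Share of results that reach significance at the given alpha."""
    critical = p_to_abs_z(alpha)  # about 1.96 for alpha = .05
    significant = sum(1 for z in z_scores if abs(z) >= critical)
    return significant / len(z_scores)


# Hypothetical p-values, for illustration only.
example_p = [0.04, 0.20, 0.001, 0.049, 0.60]
example_z = [p_to_abs_z(p) for p in example_p]
example_odr = observed_discovery_rate(example_z)
```

The observed discovery rate is simply the proportion of absolute z-scores above the significance criterion. Estimating the expected discovery rate requires fitting the z-curve model itself, which this sketch does not attempt.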
Visual inspection of the plot (i.e., the histogram of z-scores) shows that the most common results are just significant (z = 1.96 corresponds to p = .05, two-tailed). This is not a natural phenomenon. The observed discovery rate of 69% significant results is inconsistent with the expected discovery rate (EDR) of 13% based on the distribution of statistically significant z-scores. As a result, published effect sizes are dramatically inflated. Moreover, a low EDR of 13% implies that up to 34% of significant results could be false positive results that were produced without a real effect. The 95% confidence interval ranges from 18% all the way up to 84%. Thus, it is unclear how many published results are false positives and, more importantly, it is unclear which results are false positives. Not surprisingly, even key proponents of ego-depletion theory are unable to identify conditions that produce the effect (Vohs et al., 2021).
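The step from a discovery rate to a maximum false positive rate can be sketched as follows. The sketch assumes the bound is Soric's (1989) formula, which z-curve uses for this purpose; the text above does not name the formula, so treat that as an assumption. The function name is hypothetical.

```python
def max_false_discovery_risk(edr: float, alpha: float = 0.05) -> float:
    """Upper bound on the share of significant results that are false positives.

    Assumes Soric's bound: FDR_max = (1 / EDR - 1) * alpha / (1 - alpha).
    """
    if not 0.0 < edr <= 1.0:
        raise ValueError("EDR must lie in the interval (0, 1]")
    bound = (1.0 / edr - 1.0) * alpha / (1.0 - alpha)
    # An EDR below alpha pushes the formula above 1, so cap it at 100%.
    return min(1.0, bound)


# Rounded EDR from the text; the result is roughly 0.35.
risk_at_reported_edr = max_false_discovery_risk(0.13)
```

Plugging in the rounded EDR of 13% gives roughly 35%, close to the 34% reported above; the small gap is presumably due to rounding of the EDR. The bound rises steeply as the EDR approaches alpha, which is why a wide confidence interval around a low EDR translates into a very wide interval for the false positive risk.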
It is instructive to compare the results with those for literatures that have already been discredited, such as research on variations in a single candidate gene like the serotonin transporter gene (Schimmack, 2022). This literature shows a decline in publications and citations after it became apparent that key findings could not be replicated. The evidence for ego-depletion is just as weak, but so far literature reviews fail to take into account that convincing evidence for ego-depletion effects is lacking.