Ioannidis is Wrong Again

In 2005, Ioannidis wrote an influential article with the title “Why most published research findings are false.” This article has been widely cited by scientists and in the popular media as evidence that we cannot trust scientific results (The Atlantic).

It is often overlooked that Ioannidis's big claim was not supported by empirical evidence. It rested entirely on hypothetical examples. The problem with big claims that are based on intuition rather than empirical observations is that they can induce confirmation bias. Just like original researchers with their pet theories, Ioannidis was no longer an objective meta-scientist who could explore how often science is wrong. He had to go out and find evidence to support his claim. And that is what he did.

In 2017, Denes Szucs and John P. A. Ioannidis published an article that examined the risk of false positive results in cognitive neuroscience and psychology. The abstract suggests that the empirical results support Ioannidis's claim that most published results are false positives.

“We conclude that more than 50% of published findings deemed to be statistically significant are likely to be false.”

The authors shared their data, which made it possible for me to verify this conclusion with my own statistical method for estimating the maximum false positive rate (Bartos & Schimmack, 2020; Brunner & Schimmack, 2020). I first used the information about t-values and their degrees of freedom to compute absolute z-scores. Z-scores have the advantage that they all have the same sampling distribution, so the values provide standardized information about the strength of evidence against the null hypothesis. The distribution of the absolute z-scores was then analyzed using z-curve 2.0 (Bartos & Schimmack, 2020).
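To illustrate this conversion step, here is a minimal sketch (not the z-curve code itself) that turns t-values and their degrees of freedom into absolute z-scores with the same two-sided p-value, using SciPy; the function name t_to_z is just a label for this example.

```python
import numpy as np
from scipy.stats import t as t_dist, norm

def t_to_z(t_values, dfs):
    """Convert t-values to absolute z-scores with the same two-sided p-value."""
    p = 2 * t_dist.sf(np.abs(t_values), dfs)  # two-sided p-value of each t-test
    return norm.isf(p / 2)                    # absolute z-score with the same p-value

# Example: t(28) = 2.30 maps to an |z| slightly below 2.30 because of the heavier t-tails
print(t_to_z(np.array([2.30, 1.80]), np.array([28, 45])))
```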

Figure 1 shows the results with the assumption that there is no publication bias. As a result, both non-significant and significant results are fitted. Visual inspection shows some evidence that there are too many significant results, especially those that just reached significance (z > 1.96 corresponds to p = .05, two-tailed). There are also too few results that just missed significance and are sometimes considered marginally significant (p < .10, z > 1.65). This pattern suggests that researchers used questionable research practices to present marginally significant results as significant. However, in the big picture of all tests, this bias is relatively small. The observed discovery rate of 64% is only slightly higher than the expected discovery rate of 60%. This is a small amount of inflation, and even with this large sample size the deviation is not statistically significant (i.e., 64% is within the 95% CI of the EDR from 55% to 66%).

Szucs and Ioannidis also create a scenario without researcher bias and still conclude that most published results are false.

For example, if we consider the recent estimate of 13:1 H0:H1 odds [30], then FRP exceeds 50% even in the absence of bias.

Figure 1 shows that this assumption is totally incompatible with the data. A model that assumes no bias yields a discovery rate of 60%, and a discovery rate of 60% implies that no more than 3% of significant results can be false positives (Soric, 1989). Even the upper limit of the 95% CI is only 4% false discoveries. Thus, the empirical data clearly falsify Szucs and Ioannidis's wild guess that psychologists test only 7% true hypotheses (13:1 H0:H1 odds imply that only 1 out of 14 tested hypotheses, roughly 7%, is true). Even actual replication studies have produced 37% significant results, which puts the rate of true hypotheses at a minimum of 37% (OSC, 2015). Thus, the conclusion in the abstract is based on false assumptions and not on an unbiased examination of the data.
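For readers who want to check this number, Soric's upper bound follows directly from the discovery rate (DR) and the significance criterion, here α = .05:

$$
\text{FDR}_{\max} \;=\; \left(\frac{1}{\text{DR}} - 1\right)\cdot\frac{\alpha}{1-\alpha}
\;=\; \left(\frac{1}{.60} - 1\right)\cdot\frac{.05}{.95} \;\approx\; .035
$$

In other words, only a few percent of significant results can be false positives when 60% of all tests are significant, in line with the 3% reported above (small differences reflect rounding of the discovery-rate estimate).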

Despite the small amount of bias in Figure 1, it is likely that some researcher bias is present. It is therefore reasonable to see what happens when a model allows for researcher bias. To do so, z-curve can be fitted only to the distribution of significant results, correcting for the selection for significance. These results are shown in Figure 2.
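To give a flavor of how this selection correction works, the following sketch fits a deliberately simplified, single-component version of the idea: absolute z-scores above the significance criterion are modeled as a folded normal truncated at 1.96, and the estimated mean power equals the expected discovery rate before selection. The actual z-curve 2.0 model uses a finite mixture of such components, so this is only an illustration, not the published algorithm.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

CRIT = 1.96  # two-sided significance criterion for alpha = .05

def neg_loglik(mu, z_sig):
    """Negative log-likelihood of significant |z|-scores for a single
    noncentrality mu, truncated at the significance criterion."""
    dens = norm.pdf(z_sig - mu) + norm.pdf(z_sig + mu)   # folded-normal density
    power = norm.cdf(mu - CRIT) + norm.cdf(-mu - CRIT)   # P(|Z| > CRIT)
    return -np.sum(np.log(dens) - np.log(power))

def fit_edr(z_sig):
    """Estimate mean power, which equals the expected discovery rate before selection."""
    res = minimize_scalar(neg_loglik, bounds=(0.01, 6), args=(z_sig,), method="bounded")
    return norm.cdf(res.x - CRIT) + norm.cdf(-res.x - CRIT)

# Check with simulated studies that all have 50% power (mu = 1.96):
rng = np.random.default_rng(1)
z_all = np.abs(rng.normal(1.96, 1, size=10_000))
print(fit_edr(z_all[z_all > CRIT]))  # recovers roughly 0.50 from significant results alone
```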

This model shows clearer evidence of selection for significance. The expected discovery rate is 42%, and the 95% CI, 24% to 52%, does not include the observed discovery rate of 64%. It is therefore safe to assume that publication bias inflates the observed discovery rate. However, even with a discovery rate of 42%, the maximum false discovery rate is only 7%, and even if we use the lower bound of the 95% CI of the EDR, 24%, the false discovery rate is only 17%, which is still well below the 50% level needed to support Ioannidis's famous claim that most published results are false.
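The arithmetic behind these upper bounds is again Soric's formula and can be checked in a few lines (the helper name soric_max_fdr is just a label for this sketch, not part of any package):

```python
def soric_max_fdr(discovery_rate, alpha=0.05):
    """Soric's (1989) upper bound on the false discovery rate."""
    return (1 / discovery_rate - 1) * alpha / (1 - alpha)

for edr in (0.42, 0.24):
    print(f"EDR = {edr:.0%}: maximum FDR = {soric_max_fdr(edr):.1%}")
# roughly 7% and 17%, matching the values reported above
```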

In short, an objective assessment of Ioannidis’s own data falsifies his claim that most published results are false positives. So, how did he end up concluding that the data support his claim?

To make any claims about the false discovery rate, the authors had to make several assumptions because their model did not estimate the actual power of studies and did not measure the actual amount of bias. Thus, all Ioannidis had to do was adjust the assumptions to fit his claim rather than the data. As in 2005, Ioannidis then presents these speculations as if they were empirical facts.

Non-scientists may be surprised that somebody can get away with such big claims that are not supported by evidence. After all, scientific articles are peer-reviewed. However, insiders are well aware that peer review is an imperfect method of quality control. Still, it is amazing that Ioannidis has been getting away with his bold claim, which undermines trust in science, for so long. Science is not perfect, and Ioannidis is a perfect example of the staying power of false claims, but science is still the best way to search for truth and solutions. Fortunately, Ioannidis was wrong about science. Science needs improvement, but it has produced many important and robust findings, such as the discovery of highly effective vaccines against Covid-19. We should not blindly trust science. Instead, we need to examine the data and the assumptions underlying scientific claims, including meta-scientific ones. When we do this, it turns out that Ioannidis's fight against researcher bias is based on a biased assessment of bias.
