The word incredible has two meanings. One meaning is that something wonderful, spectacular, and remarkable occurred. The other is that something is difficult to believe.
For several decades, experimental social psychologists reported spectacular findings about human behavior and cognition that culminated in the discovery of time-reversed, subliminal, erotic priming (Bem, 2011). The last decade has revealed that many of these incredible findings cannot be replicated (Schimmack, 2020). The reason is that social psychologists used a number of statistical tricks to inflate their effect sizes in order to produce statistically significant results (John et al., 2012). This has produced a crisis of confidence about published results in social psychology journals. What evidence can textbook writers and lecturers trust?
A shocking finding was that only 25% of published results in social psychology could be replicated and the percentage for classic experiments with random assignment to groups was even lower (OSC, 2015). Eminent social psychologists have responded in two ways. They either ignored these embarrassing results and pretended that everything is fine or they argued that the replication studies were poorly designed and carried out, maybe even with the intention to produce replication failures. Neither response is satisfactory. It is telling that eminent social psychologists have resisted calls to self-replicate their famous findings (cf. Schimmack, 2020).
Meanwhile, authors of the reproducibility project have responded to criticism by replicating their replication studies. Moreover, they improved statistical power to produce significant results by collaborating across labs. The results of this replication of replications project have just been published under the title “Many Labs 5” (ML5).
The project focused on 10 original studies that failed to replicate in the OSC Reproducibility Project, but where there were some concerns about the quality of the replication studies. The success rate for ML5 was 20% (2 out of 10). However, none of the studies would have produced a significant result with the original sample size. These results reinforce the claim that experimental social psychology suffers from a replication crisis that casts a shadow of doubt over decades of research and the empirical foundations of social psychology.
One important question for the future of social psychology is whether any published findings provide credible evidence and how credible findings can be separated from incredible findings without the need for costly actual replications. One attempt to identify credible evidence relies on prediction markets. The idea is that the wisdom of crowds makes it possible to identify credible findings. For the 10 studies in ML5, the average estimated success rate was about 30%, which is relatively close to the actual success rate of 20%. Thus, market participants were well calibrated to the low replicability of social psychological findings. However, they were not able to predict which of the 10 studies would replicate. The reason is that even the studies that replicated had very small effect sizes that were not statistically distinguishable from those of studies that did not replicate. Thus, none of the 10 studies was particularly credible or instilled confidence among market participants.
An alternative approach to predict replicability relies on statistical information about the strength of evidence against the null-hypothesis in original studies (Bartos & Schimmack, 2020; Brunner & Schimmack, 2020; Schimmack, 2012). Results with p-values that are just significant (.01 < p < .05) provide weak evidence against the null-hypothesis when questionable research practices are used because it is relatively easy to get these results. In contrast, very small p-values are difficult to obtain with QRPs. Thus, studies with high test-statistics should be more likely to replicate.
After examining the results from the OSC-reproducibility project for social psychology, I proposed that we should distrust all findings with a z-score less than 4 (Schimmack, 2015). Importantly, this rule is specific to social psychology and other disciplines (e.g., cognitive psychology) or sciences may require different rules (Schimmack, 2015).
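The mapping between two-sided p-values and z-scores that underlies the 4-sigma rule can be sketched in a few lines. This is a minimal illustration; the p-values used below are hypothetical examples, not results from ML5 or the reproducibility project.

```python
# Sketch of the 4-sigma rule: convert a two-sided p-value to an absolute
# z-score and distrust results below z = 4 (the threshold proposed in the
# text for social psychology). Uses only the Python standard library.
from statistics import NormalDist

def p_to_z(p: float) -> float:
    """Absolute z-score corresponding to a two-sided p-value."""
    return NormalDist().inv_cdf(1 - p / 2)

def passes_4_sigma(p: float, threshold: float = 4.0) -> bool:
    """True if the result meets the 4-sigma credibility criterion."""
    return p_to_z(p) > threshold

# Hypothetical examples (not actual study results):
print(round(p_to_z(0.05), 2))   # ~1.96: a just-significant result
print(passes_4_sigma(0.04))     # False: weak evidence, distrust
print(passes_4_sigma(0.00001))  # True: z > 4 requires roughly p < .00006
```

As the last example shows, the 4-sigma criterion is far stricter than the conventional .05 threshold: a two-sided p-value must be below roughly .00006 to clear it.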
How does the 4-sigma rule fare with the ML5 results? Only 1 out of the 10 studies has a z-score greater than 4. This study deserves special mention because it was published by Jens Forster, who left academia under suspicions of research fraud (Retraction Watch). Fraud does not require actual data that are massaged to produce significant results, and no statistical method can correct for fabricated data. Thus, it is reasonable to set this study aside and apply the 4-sigma rule to the remaining studies. Consistent with the 4-sigma rule, none of the 9 remaining studies would have produced a significant result with the original sample size. Thus, the ML5 results support the application of this rule in social psychology.
The problem for social psychology is that most test-statistics that are published are below this criterion. Figure 1 (from Schimmack, 2020) shows the distribution of published test-statistics in a representative sample of studies from social psychology collected by Motyl and colleagues.
The graph shows clear evidence of QRPs because journals hardly ever report a non-significant result, despite low power (the expected discovery rate is only 19%), which has been the case since the beginning of experimental social psychology (Cohen, 1962; Sterling, 1959). Moreover, we see that most published test-statistics are between 2 and 4 sigma. The results from the OSC-RPP and ML5 suggest that most of these results are difficult to replicate even with larger samples. Moreover, these results suggest that the replicability estimates provided by z-curve (43%) are overly optimistic because the model does not account for fraud and other extremely questionable practices that can produce significant results without actual effects.
In conclusion, experimental social psychology is the poster child of pseudo-science, where researchers ape real sciences to sell incredible stories with false evidence. Social psychologists have shied away from this reality, just like Trump is trying to hold on to his lie that he won the 2020 election. It is time to throw out this junk science and to usher in a new era of credible, honest, and responsible social psychology that addresses real-world problems with real scientific evidence, and to hold charlatans accountable for their actions and denial. It is problematic that textbooks still peddle research and theories that rest on incredible evidence that was obtained with questionable research practices.