It is well known that many psychology articles report too many significant results because researchers selectively publish results that support their predictions (Francis, 2014; Sterling, 1959; Sterling et al., 1995; Schimmack, 2021). This often leads to replication failures (Open Science Collaboration, 2015).
One way to examine whether a set of studies reported too many significant results is to compare the success rate (i.e., the percentage of significant results) with the mean observed power in studies (Schimmack, 2012). In this video, I illustrate this bias detection method using Vohs et al.’s (2006) Science article “The Psychological Consequences of Money.”
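To make the comparison concrete, here is a minimal sketch in Python. It assumes each study reports a two-sided p-value, which is converted to a z-score and then to observed power with a normal approximation. The function names (`observed_power`, `bias_summary`) and the example p-values are hypothetical and are not taken from Vohs et al. (2006); the actual coding sheet may compute observed power differently.

```python
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def observed_power(p, alpha=0.05):
    """Observed power implied by a two-sided p-value (normal approximation)."""
    z_obs = STD_NORMAL.inv_cdf(1 - p / 2)        # z-score matching the p-value
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)   # critical value, about 1.96 for alpha = .05
    # Probability that a replication with true effect z_obs lands in either rejection region
    return (1 - STD_NORMAL.cdf(z_crit - z_obs)) + STD_NORMAL.cdf(-z_crit - z_obs)


def bias_summary(p_values, alpha=0.05):
    """Compare the success rate with the mean observed power of a set of studies."""
    success_rate = mean(1.0 if p < alpha else 0.0 for p in p_values)
    mean_power = mean(observed_power(p, alpha) for p in p_values)
    return {
        "success_rate": success_rate,
        "mean_observed_power": mean_power,
        "inflation": success_rate - mean_power,  # positive values suggest too many successes
    }


# Hypothetical p-values for illustration only (NOT the values from the article)
example_p_values = [0.04, 0.03, 0.02, 0.045, 0.01]
print(bias_summary(example_p_values))
```

When every study is significant but the p-values cluster just below .05, the success rate is 100% while the mean observed power is much lower. That gap is what the bias test picks up.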
I use this article to train students because it reports 9 studies, and a reasonably large number of studies is needed to have good power to detect selection bias. The article is also short and the results are straightforward, so students have no problem filling out the coding sheet that is needed to compute observed power (Coding Sheet).
The results show clear evidence of selection bias, which undermines the credibility of the reported results (see also TIVA). Although bias tests are available, few researchers use them to protect themselves from junk science, and articles like this one continue to be cited at high rates (683 total, 67 in 2019). A simple way to protect yourself from junk science is to lower the alpha level to .005, because many questionable research practices produce p-values that are just below .05. For example, the lowest p-value in these 9 studies was p = .006. Thus, not a single study was statistically significant with alpha = .005.
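The effect of the stricter alpha level can be checked with a few lines of Python. This is only a sketch: the helper `count_significant` is a hypothetical name, and the p-values are placeholders. Only the minimum of .006 comes from the text above; the other values are invented for illustration.

```python
def count_significant(p_values, alpha):
    """Count the p-values that fall strictly below the chosen alpha level."""
    return sum(1 for p in p_values if p < alpha)


# Placeholder p-values, NOT the values reported in the article;
# only the smallest value (.006) is taken from the text.
placeholder_p = [0.006, 0.02, 0.03, 0.04, 0.045]

print(count_significant(placeholder_p, alpha=0.05))   # all placeholders pass at .05
print(count_significant(placeholder_p, alpha=0.005))  # none pass at .005
```

Because the smallest p-value is above .005, no study in a set like this would count as significant under the stricter criterion.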