The core feature of science is that it is self-correcting. Any mature science has a history of older theories, findings, or measures that were replaced with better ones. Unfortunately, psychology lacks clear evidence of progress marked by a graveyard of discarded concepts that failed to find empirical support. The reason is that psychologists have used a biased statistical tool, called null-hypothesis significance testing, to test theoretical predictions. This tool only allows researchers to confirm theoretical predictions when the null hypothesis is rejected; it does not allow them to falsify predictions when the null hypothesis is not rejected. Due to this asymmetry, psychology journals only publish results that support a theory and fail to report results when a prediction was not confirmed.
While this problem has been known for decades (Sterling, 1959), only in the past decade have some psychologists started to do something about it. The so-called open science movement has started to publish studies even when they fail to support existing theories, and new statistical methods have been developed to correct for publication bias and to examine whether a literature is credible.
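The asymmetry described above can be illustrated with a minimal simulation. The numbers and the simplified t-test approximation below are illustrative assumptions, not features of any real literature: even when the true effect is exactly zero, selecting only the "significant" studies produces a published record full of sizable effects.

```python
import random
import statistics

random.seed(1)

def run_study(n_per_group=50, true_effect=0.0):
    """Simulate one two-group study; return the observed standardized
    effect size (Cohen's d) and whether it reached p < .05 (two-tailed)."""
    a = [random.gauss(true_effect, 1) for _ in range(n_per_group)]
    b = [random.gauss(0.0, 1) for _ in range(n_per_group)]
    # Rough pooled SD; fine here because the true effect is zero.
    d = (statistics.mean(a) - statistics.mean(b)) / statistics.pstdev(a + b)
    t = d * (n_per_group / 2) ** 0.5
    return d, abs(t) > 1.98  # approximate critical t for df = 98

studies = [run_study() for _ in range(10_000)]
published = [d for d, significant in studies if significant]

# Roughly 5% of studies are significant by chance, and only those are
# "published" -- each showing a nontrivial effect despite a true effect of 0.
print(f"share of studies reaching p < .05: {len(published) / len(studies):.1%}")
print(f"mean |d| among published studies: {statistics.mean(abs(d) for d in published):.2f}")
```

Because only the significant studies reach the journal, the published literature shows a sizable average effect size even though the true effect is exactly zero, which is why bias-correcting methods are needed to judge a literature's credibility.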
Another problem is that even replicable findings can be misleading when measures are invalid or biased. Psychologists have also neglected to carefully validate their measures, and many claims in the literature are distorted by measurement error.
Unfortunately, scientific self-correction is a slow process, and motivated biases prevent researchers from correcting themselves. This glossary serves as a warning for consumers of psychological research (e.g., undergraduate students) that some of the information they may encounter in books, articles, or lectures may not be based on scientific evidence. Of course, it is also possible that the information provided here is misleading and will be corrected in the future. However, the evidence presented here can alert readers to the fact that published results in a particular literature may be less credible than they appear and that unscientific practices were used to produce a literature that looks much stronger than the underlying evidence actually is.
Behavioral Priming

Behavioral priming is the most common term for a specific form of priming. The key difference between cognitive priming and behavioral priming is that cognitive priming examines the influence of stimuli that are in the focus of attention on related cognitions. This is typically done in studies in which stimuli are presented in close temporal sequence and responses to the second stimulus are recorded. For example, showing the word “hospital” speeds up identification of the word “doctor” as a word. In contrast, behavioral priming assumes that stimuli that are no longer in the focus of attention continue to influence subsequent behaviors. A classic finding was that showing words related to the elderly made participants walk more slowly from one room to another. A failure to replicate this study triggered the replication crisis in social psychology. During the replication crisis, it became apparent that behavioral priming researchers used unscientific practices to provide false evidence for behavioral priming effects (Schimmack, 2017a; Schimmack, 2017b). Nobel Laureate Daniel Kahneman featured behavioral priming research in his popular book “Thinking, Fast and Slow,” but he distanced himself from this research after behavioral priming researchers were unwilling or unable to replicate their own findings (Kahneman, 2012, 2017). Most recently, Kahneman declared that “behavioral priming research is effectively dead. Although the researchers never conceded, everyone now knows that it’s not a wise move for a graduate student to bet their job prospects on a priming study. The fact that social psychologists didn’t change their minds is immaterial” (Kahneman, 2022). You may hear about priming studies in social psychology with various primes (e.g., elderly priming, flag priming, goal priming, professor priming, religious priming, money priming, etc.).
Although it is impossible to say that none of these findings are real, only results from pre-registered studies with large samples should be trusted. Even if behavioral priming effects can be demonstrated under controlled laboratory conditions, it is unlikely that residual activation of stimuli has a strong influence on behavior outside our awareness. This does not mean that our behavior is not influenced by previous situations. For example, slick advertising can influence our behavior, but it is much more likely that it does so with awareness (I want an iPhone because I think it is cooler) than without awareness (an iPhone ad makes you want to buy one without knowing why you prefer an iPhone over another smartphone).
Construal Level Theory

The key assumption of construal level theory is that individuals think about psychologically distant events differently than about psychologically close events, and that these differences can influence their decisions, emotions, and behaviors. Self-serving meta-analyses that do not correct for publication bias suggest that hundreds of studies provide clear evidence for construal-level theory (Soderberg et al., 2022). However, meta-analyses that correct for bias show no clear evidence of construal-level effects (Maier, 2022). This finding is consistent with my own statistical analysis of the construal level literature (Schimmack, 2022). The literature shows strong evidence that unscientific practices were used to publish only results that support the theory, while hiding findings that failed to support predictions. Taking this bias into account, the published results have a high false-positive risk, and it is currently unclear which findings, if any, are replicable. New evidence will emerge from a large replication project, but the results will not be known until 2024 (https://climr.org/).
Ego Depletion

The main hypothesis of ego-depletion theory is that exerting mental effort requires energy and that engaging in one task that requires mental energy reduces the ability to exert mental energy on a second task. Hundreds of studies have examined ego-depletion effects with simple tasks like crossing out letters or measures of handgrip strength. Ten years after the theory was introduced, it was also proposed that blood glucose levels track the energy required for mental effort. A string of replication failures showed that the evidence for blood glucose effects is not robust, and statistical analyses showed clear evidence that unscientific methods were used to produce the initial evidence for glucose effects; the lead author even admitted to using these practices (Schimmack, 2014). Even the proponents of ego-depletion effects no longer link them to glucose. More important, even the basic ego-depletion effect is not replicable. Two large registered replication reports, one led by key proponents of the theory, failed to produce the effect (Vohs et al., 2021). This is not surprising because statistical analyses of the published studies show that unscientific practices were used to present only significant results in support of the theory (Schimmack, 2022).
Implicit Bias

The main theoretical assumption in the implicit bias literature is that individuals can hold two attitudes that may conflict with each other (the dual-attitude model). One attitude is consciously accessible and can be measured with (honest) self-reports. The other attitude is not consciously accessible and can only be measured indirectly, typically with computerized tasks like the Implicit Association Test (IAT). The key evidence for the notion of implicit bias is that self-ratings of some attitudes are only weakly correlated with scores on implicit measures like the IAT. The key problem with this evidence is that measurement error alone can produce low correlations between two measures. In studies that correct for random and systematic measurement error, the valid variance in self-ratings and implicit measures is often highly correlated. This suggests that discrepancies between self-ratings and implicit measures are mostly due to measurement error (Schimmack, 2019).
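How measurement error depresses an observed correlation can be made concrete with Spearman's classic correction for attenuation. The observed correlation and the reliability values below are hypothetical numbers chosen for illustration, not estimates from this literature:

```python
import math

def disattenuate(r_observed, rel_x, rel_y):
    """Spearman's correction for attenuation: estimate the correlation
    between true scores from the observed correlation and the
    reliabilities of the two measures."""
    return r_observed / math.sqrt(rel_x * rel_y)

# Hypothetical numbers for illustration only:
r_obs = 0.25     # observed self-report/implicit-measure correlation (assumed)
rel_self = 0.80  # reliability of the self-report measure (assumed)
rel_iat = 0.25   # low reliability of the implicit measure (assumed)

print(round(disattenuate(r_obs, rel_self, rel_iat), 2))  # prints 0.56
```

With these illustrative inputs, a weak observed correlation of .25 implies a true-score correlation above .5, showing why a low raw correlation between self-reports and implicit measures is not, by itself, evidence for two distinct attitudes.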
Implicit Self-Esteem

The concept of implicit self-esteem is based on theories that assume individuals have two types of self-esteem. Explicit self-esteem is consciously accessible and can be measured with honest self-reports. Implicit self-esteem is not consciously accessible and can only be measured indirectly. The most widely used indirect measure of implicit self-esteem is the self-esteem Implicit Association Test. Evidence for the distinction between implicit and explicit self-esteem rests entirely on the fact that self-ratings and IAT scores have very low correlations. However, several studies have shown that the main reason for this low correlation is that most of the variance in self-esteem IAT scores is measurement error (Schimmack, 2019). It is therefore surprising that a large literature of studies with this invalid measure has produced statistically significant results that seem to support predictions based on implicit self-esteem theory. The reason for this seemingly robust evidence is that researchers used unscientific practices to hide findings that are not consistent with predictions. This can be seen in a statistical analysis of the published studies (Schimmack, 2022). At present, credible evidence for an unconscious form of self-esteem that is hidden from honest self-reflection is lacking.
Serotonin Transporter Gene

The serotonin transporter gene theory (also 5-HTTLPR, serotonin transporter polymorphism) postulated that genetic variation in the serotonin reuptake mechanism is linked to personality traits like neuroticism that are a risk factor for mood disorders. When it became possible to measure this variation in human DNA, many studies used this biological marker as a predictor of personality measures and measures of depression and anxiety. After an initial period of euphoria, replication failures showed that many of the first results could not be reproduced even in studies with much larger samples. It became apparent that variation in a single gene has much smaller effects on complex traits than initial studies suggested, and research on this topic decreased. This healthy self-correction of science is visible in decreasing publications and citations of the older, discredited studies. A statistical analysis of the published studies further confirms that significant results were obtained with unscientific methods that led to the selection of significant results (Schimmack, 2022). After correcting for this bias, there is little evidence that genetic variation in the serotonin reuptake gene makes a practically significant contribution to variation in personality. The field has moved on to predicting personality from patterns of genetic variation across a large number of genes (genome-wide association studies). This correction is one of the few examples of scientific progress in psychology, reflected in a body of false claims that have been discarded.
Stereotype Threat

The stereotype-threat literature is based on the main hypothesis that stereotypes about performance (e.g., White people can’t dance) can be elicited in situations in which individuals are under pressure to perform well (e.g., a White man on a date with a Black woman), and that activation of the stereotype impairs performance. Initially, stereotype-threat researchers focused on African Americans’ performance in academic testing situations. Later, the focus shifted to women’s performance on math and STEM-related tests. Stereotype-threat effects have often been invoked to counter biological theories of performance differences by showing that performance can also be influenced by environmental factors. The focus on the testing situation is partially explained by psychologists’ preference for experimental studies: it is easier to manipulate the testing situation experimentally than to study actual environmental influences on performance (e.g., discrimination by teachers or lack of funding in poor neighborhoods). Meta-analyses of this literature show that more recent studies with large samples have much smaller effect sizes than the initial studies with small samples (Flore & Wicherts, 2014; Shewach et al., 2019). These meta-analyses also found evidence of publication bias. A z-curve analysis confirms these findings (Schimmack, 2022). A large replication study found no evidence of stereotype-threat effects for women and math (Flore et al., 2018). It is possible that stereotype-threat effects occur for some groups under some conditions, but at present there are no robust findings to show these effects. These results suggest that situational factors in testing situations are unlikely to explain performance differences in high-stakes testing situations.
Terror Management Theory
The basic idea of terror management theory is that humans are aware of their own mortality and that thoughts about one’s own death elicit fear. To cope with this fear or anxiety, humans engage in various behaviors that reduce death anxiety. To study these effects, participants in experimental studies are asked to think either about death or about some other unpleasant event (e.g., dental visits). Numerous studies show statistically significant effects of these manipulations on a variety of measures (Wikipedia). However, there is strong evidence that these results were produced with unscientific practices that prevented disconfirming evidence from being published. After correcting for this bias, the published studies lack credible evidence for terror management effects (Schimmack, 2022).
Unconscious Thought Theory
Unconscious thought theory assumed that unconscious processes are better at solving complex decision problems than conscious thought. Publications supporting the theory increased from 2004 to 2011, but output has decreased since then (Schimmack, 2022). A meta-analysis and a failed replication study in 2017 suggested that evidence for unconscious thought theory was inconsistent and often weak, especially in larger samples (Nieuwenstein et al., 2017). A direct examination of publication bias shows strong evidence that unscientific practices were used to publish evidence for rather than against the theory (Schimmack, 2022). At present, strong evidence from pre-registered studies with positive results is lacking. Thus, the theory lacks empirical support.