The concept of implicit bias has become widely accepted by the general public (e.g., Starbucks’ closure in 2018). The idea that individuals could be prejudiced without awareness originated in social psychology, when social psychologists used cognitive paradigms to study social cognition.
Patricia Devine’s (1989) article continues to be cited as empirical evidence for the existence of unconscious prejudice.
Devine’s article also influenced the authors of the Implicit Association Test that is now widely used to measure “implicit racial bias”.
Experiment 3 was motivated by several previous demonstrations of automatic expressions of race-related stereotypes and attitudes that are consciously disavowed by the subjects who display them (Crosby, Bromley, & Saxe, 1980; Devine, 1989; Fazio et al., 1995; Gaertner & McLaughlin, 1983; Greenwald & Banaji, 1995; Wittenbrink, Judd, & Park, 1997)
(Greenwald, McGhee, & Schwartz, 1998, p. 1473).
Not surprisingly, it is also featured in social psychology textbooks.
Gilovich, Keltner, Chen, and Nisbett (2019) write “automatic and controlled processes can result in quite different attitudes in the same person toward members of outgroups (Devine, 1989a, 1989b; Devine, Monteith, Zuwerink, & Elliot, 1991; Devine, Forscher, Austin, & Cox, 2012; Devine, Plant, Amodio, Harmon-Jones, & Vance, 2002)” (p. 15).
Myers and Twenge (2018) write “Patricia Devine and her colleagues (1989, 2012; Forscher et al., 2015) report that people low and high in prejudice sometimes have similar automatic (unintentional) prejudicial responses. The result: Unwanted (dissonant) thoughts and feelings often persist. Breaking the prejudice habit is not easy” (p. 258).
The idea that individuals can have different conscious and unconscious aspects of personality goes back to old psychoanalytic theories. However, social psychologists claim to have scientific evidence for it.
“A great many studies have shown that stimuli presented outside of awareness can prime a schema sufficiently to influence subsequent information processing (Bargh, 1996; Debner & Jacoby, 1994; Devine, 1989b; Draine & Greenwald, 1998; Ferguson, 2008; Ferguson, Bargh, & Nayak, 2005; Greenwald, Klinger, & Liu, 1989; Klinger, Burton, & Pitts, 2000; Lepore & Brown, 1997; Welsh & Ordonez, 2014)” (Gilovich et al., 2019, p. 122).
Gilovich et al. (2019) give a detailed description of Devine’s study on pages 391 and 392.
Patricia Devine (1989b) examined the joint operation of these automatic and controlled processes by investigating the schism that exists for many people between their knowledge of racial stereotypes and their own beliefs and attitudes toward those same groups. More specifically, Devine sought to demonstrate that what separates prejudiced and nonprejudiced people is not their knowledge of derogatory stereotypes, but whether they resist those stereotypes. To carry out her investigation, Devine relied on the distinction between controlled processes, which we direct more consciously, and automatic processes, which we do not consciously control. The activation of stereotypes is typically an automatic process; thus, stereotypes can be triggered even if we don’t want them to be. Even a nonprejudiced person will, under the right circumstances, access an association between say, Muslims and fanaticism, blacks and criminality, and WASPs and emotional repression, because those associations are present in our culture. Whereas a bigot will endorse or employ such stereotypes, a non-prejudiced person will employ more controlled cognitive processes to discard or suppress them – or at least try to.
To test these ideas, Devine selected groups of high- and low-prejudiced participants on the basis of their scores on the Modern Racism Scale (Devine, 1989b). To show that these two groups don’t differ in their automatic processing of stereotypical information – that is, that the same stereotypes are triggered in both high-prejudiced and low-prejudiced people – she presented each participant with a set of words, one at a time, so briefly that the words could not be consciously identified. Some of them saw neutral words (number, plant, remember) and others saw words stereotypically associated with blacks (welfare, jazz, busing). Devine hypothesized that although the stereotypical words were presented too briefly to be consciously recognized, they would nonetheless prime the participants’ stereotypes of blacks. To test this hypothesis, she presented the participants with a written description of an individual who acted in an ambiguously hostile manner (a feature of the African-American stereotype). In one incident, for example, the person refused to pay his rent until his apartment was repaired. Was he being needlessly belligerent or appropriately assertive?
The textbook describes the results as follows.
The results indicated that he was seen as more hostile – and more negative overall – by participants who had earlier been primed by words designed to activate stereotypes of blacks (words such as jazz, it’s important to note, that are not otherwise connected to the concept of hostility). Most important, this result was found equally for prejudiced and non-prejudiced participants.
The sample consisted of 78 White subjects in the judgment condition (p. 11).
The description of the priming words in the textbook leaves out that derogatory, racist terms were included in the list of primes.
Replication 1 primes included the following: nigger, poor, afro, jazz, slavery, musical, Harlem, busing, minority, oppressed, athletic, and prejudice. Replication 2 primes included the following: Negroes, lazy, Blacks, blues, rhythm, Africa, stereotype, ghetto, welfare, basketball, unemployed, and plantation (p. 10).
The textbook also does not describe the conditions accurately. Rather than comparing lists with 100% and 0% words related to African Americans, the study used lists with 80% or 20% stereotypic and racist stimuli. Thus, even participants in the control condition were primed, but less often.
The mean ratings were submitted to a mixed-model ANOVA, with prejudice level (high vs. low), priming (20% vs. 80%), and replication (1 vs. 2) as between-subjects variables and scale (hostility related vs. hostility unrelated) as a within-subjects variable. The analysis revealed that the Priming X Scale interaction was significant, F(1, 70) = 5.04, p < .03 (p. 11).
The description of the results makes it impossible to compute a standardized effect size (standard deviations are not reported). The p-value is just significant, and published results with p-values close to .05 often do not replicate (Open Science Collaboration, 2015).
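As a rough check on how close this result is to the .05 threshold, the reported F-value can be converted into an exact p-value. The sketch below is not part of Devine’s article; it assumes the standard identity that an F-test with one numerator degree of freedom is a squared t-test, and the function names are made up for illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def p_from_f_1df(f_value, df_error, steps=2000):
    """Two-sided p-value for F(1, df_error), using F = t^2.

    Integrates the t density from 0 to sqrt(F) with Simpson's rule
    (steps must be even).
    """
    t = math.sqrt(f_value)
    h = t / steps
    total = t_pdf(0.0, df_error) + t_pdf(t, df_error)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df_error)
    area = total * h / 3          # P(0 < T < t)
    return 1.0 - 2.0 * area       # both tails

# Reported Priming x Scale interaction: F(1, 70) = 5.04
p = p_from_f_1df(5.04, 70)
print(round(p, 3))
```

For F(1, 70) = 5.04 this gives a p-value of roughly .028, in line with the reported p < .03.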
Moreover, the results do not show that high-prejudice and low-prejudice participants independently show the effect. In fact, it is unlikely that follow-up tests would be significant because the overall effect is just significant, and power to get significant results decreases when each group is tested individually.
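The loss of power can be illustrated with a back-of-the-envelope calculation. The sketch below rests on strong assumptions that are mine, not Devine’s: the observed Priming x Scale effect, F(1, 70) = 5.04, is treated as the true effect, the two prejudice groups are equal in size and show the same effect, and a normal approximation replaces the t distribution.

```python
import math
from statistics import NormalDist

def two_sided_power(expected_z, alpha=0.05):
    """Power of a two-sided z-test when the test statistic has mean expected_z."""
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    return (1 - z.cdf(crit - expected_z)) + z.cdf(-crit - expected_z)

# Assumption: the observed statistic equals the true effect.
z_full = math.sqrt(5.04)         # F(1, 70) = 5.04 corresponds to |t| of about 2.24
z_half = z_full / math.sqrt(2)   # half the sample: expected z shrinks by sqrt(2)

power_full = two_sided_power(z_full)
power_half = two_sided_power(z_half)
print(f"full sample: {power_full:.2f}, one group alone: {power_half:.2f}")
```

Under these assumptions, power is about 60% for the full sample and about 35% for one group tested alone.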
The analysis on hostility-related scales revealed only a significant priming main effect, F(1, 70) = 7.59, p < .008. The Prejudice Level x Priming interaction was nonsignificant, F(1, 70) = 1.19, p = .28.
Devine also makes the mistake of interpreting a non-significant result as evidence for the absence of an effect. That is, the interaction between prejudice level and priming was not significant, p = .28. This finding is used to support the claim that both groups show the same priming effect. However, an alternative explanation is that there is a difference between the groups, but the statistical test failed to show it (a false negative result or type II error). Again, to demonstrate that low-prejudice subjects were influenced by the priming manipulation, it would have been better to test the priming effect in the low-prejudice group alone. This was not done. To make matters worse, means are not reported separately for each group, so it is impossible to test this hypothesis post hoc. As a result, the article provides no empirical evidence for the claim that low-prejudice individuals’ responses were influenced by subliminal activation of stereotypes.
The lack of empirical evidence in this seminal study would not be a problem if replication studies had provided better evidence for the claims by Devine that are featured in the textbook. However, follow-up studies have produced different results. These follow-up studies are not mentioned on pages 391-393, although Lepore and Brown (1997) were mentioned earlier on page 122. The reason for the omission on pages 391-393 is that Lepore and Brown’s (1997) findings contradict Devine’s claim that unconscious bias is the same for high- and low-prejudice individuals.
Lepore and Brown
The article by Lepore and Brown (1997) is cited much less frequently than Devine’s (1989) article.
Studies 2 and 3 of their article are conceptual replication studies of Devine’s study. The results seem to show that subliminal stereotype activation is possible, but they also contradict Devine’s claim that the effect is the same for individuals who score high or low on a prejudice measure.
Study 2 differed from Devine’s study in the type of priming stimuli that were used.
In the prime condition, 13 words evocative of the category Black people were used. They were category labels themselves and neutral associates of the category, based on free responses in pretesting. The words used were as follows: Blacks, Afro-Caribbean, West Indians, colored, afro, dreadlocks, Rastafarian, reggae, ethnic, Brixton, Notting Hill, rap, and culture.
The sample size was small with 51 participants who were not selected from a screening task. Groups were formed by a median split. Thus, the groups differed much less in prejudice levels than those in Devine’s study.
The statistical analysis showed a 3-way interaction, F(1,47) = 6.07, p < .02, that was again just significant.
High-prejudice participants in the prime condition rated the target person more extremely on the negative construct (Ms = 6.76 vs. 5.88), t(46) = 3.43, p < .005, and less extremely on the positive construct (Ms = 6.31 vs. 6.88), t(46) = 2.22, p < .025. Low-prejudice participants increased their ratings on the positive scales (Ms = 6.98 vs. 6.54), t(46) = 1.69, p < .05, but showed no difference on the negative ones (Ms = 5.65 vs. 5.73).
These results are inconsistent with Devine, who claimed equal effects of primes for low and high prejudice participants (without showing evidence for it).
Study 3: A Conceptual Replication of Devine (1989)
The experiment was designed with 13 priming words. Three were category labels (i.e., Blacks, West Indians, and Afro-Caribbean), six were negative (i.e., nigger, rude, dirty, crime, unemployed, and drugs), and the remaining four were evocative of the category (i.e., dreadlocks, reggae, Brixton, and ethnic).
The sample size for this study was small (N = 45) and a median split was used to define groups of high and low prejudice.
The means show the pattern predicted by Devine that both groups increased negative ratings after priming with racist primes.
High-prejudice participants in fact significantly increased their ratings on the negative scales comparing the prime and no-prime conditions, t(40) = 2.62, p < .01.
The same comparison was not significant in the low-prejudice group, t(40) = 1.30, p < .10 [one-tailed].
Thus, even this conceptual replication study failed to provide evidence that low-prejudice individuals are affected by subliminal priming with racist primes.
Moreover, all of these published results are just significant. This is an unlikely outcome because statistical results are highly variable and should produce some non-significant and some highly significant results. When all p-values are clustered into the region of just significant results, it suggests that the published studies were selected from a larger set of studies that failed to produce significant results. Thus, it is unclear how robust these findings really are.
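To illustrate why a cluster of just-significant results is unexpected, the sketch below computes how often a single test, reported without selection, would produce a two-sided p-value between .01 and .05. The band, the assumed power levels, and the number of tests k are illustrative assumptions, not values taken from the articles, and a normal approximation is used.

```python
from statistics import NormalDist

Z = NormalDist()

def prob_p_in_band(expected_z, p_low=0.01, p_high=0.05):
    """Probability that a two-sided p-value lands between p_low and p_high."""
    z_hi = Z.inv_cdf(1 - p_low / 2)    # |z| that yields p = p_low
    z_lo = Z.inv_cdf(1 - p_high / 2)   # |z| that yields p = p_high
    upper = Z.cdf(z_hi - expected_z) - Z.cdf(z_lo - expected_z)
    lower = Z.cdf(-z_lo - expected_z) - Z.cdf(-z_hi - expected_z)
    return upper + lower

def expected_z_for_power(power, alpha=0.05):
    """Expected z that gives the requested power (far tail ignored)."""
    return Z.inv_cdf(1 - alpha / 2) + Z.inv_cdf(power)

k = 4  # hypothetical number of independent tests, for illustration only
for power in (0.5, 0.8):
    one = prob_p_in_band(expected_z_for_power(power))
    print(f"power {power:.0%}: one test {one:.2f}, all {k} tests {one ** k:.4f}")
```

With 50% or 80% power, a single test lands in this band less than a quarter of the time, so a whole series of independent tests landing there is improbable without selection.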
Although Devine’s study had a huge influence on social psychology and the notion of implicit racial bias, there are no credible, unbiased replications of it. Moreover, subliminal priming in general may not be a robust and replicable phenomenon. However, social psychology textbooks hide these problems from students and present unconscious bias as a scientifically proven reality. This blog post shows that the scientific evidence is much less consistent and robust than textbooks imply.