No Justice, No Peace: A History of Slavery Predicts Violence Today

Some human behaviors attract more attention than others. Homicides are rare, but very salient human behaviors. Governments investigate and keep records of homicides and social scientists have developed theories of homicides.

In the 1960s, social scientists suggested that inequality can lead to more violence. One simple reason is that the rewards for poor people to commit violent crimes increase with greater inequality in wealth (Becker, 1968).

Cross-national studies confirm that societies with more income inequality have higher homicide rates (Avison & Loring, 1986; Blau & Blau, 1982; Chamlin & Cochran, 2006; Corcoran & Stark, 2020; Fajnzylber, Lederman & Loayza, 2002; Krahn et al., 1986; Pratt & Godsey, 2003; Pridemore, 2008).

A recent article in Psychological Science replicated this finding (Clark, Winegard, Beardslee, Baumeister, & Shariff, 2020). However, the main focus of the article was on personality attributes as predictors of violence. The authors' main claim was that religious people are less likely to commit crimes and that, among non-religious individuals, those with lower intelligence are more likely to commit homicides.

A fundamental problem with this article is that the authors relied on an article by a known White-supremacist, Richard Lynn, to measure national differences in intelligence (Lynn & Meisenberg, 2010). This article with the title “National IQs calculated and validated for 108 nations” claims that the values used by Clark et al. (2020) do reflect actual differences in intelligence. The problem is that the article contains no evidence to support this claim. In fact, the authors reveal their racist ideology when they claim that a correlation between their scores and skin color of r = -.9 validates their measure as a measure of intelligence. This is not how scientific validation works. This is how racists abuse science to justify their racist ideology.

The article also makes the common mistake of imposing a preferred causal interpretation on a correlation. Lynn and Meisenberg (2010) find that their scores correlate nearly perfectly with educational attainment. They interpret this as evidence that intelligence causes educational attainment and totally ignore the plausible alternative explanation that education influences performance on logical problems. This has important implications for Clark et al.’s (2020) article because the authors buy into Lynn and Meisenberg’s racist interpretation of the correlation between performance on logic problems and educational attainment. An alternative interpretation of their finding would be that religion interacts with education. In nations with low levels of formal education, religion provides a moral code that prevents homicides. In countries with more education, other forms of ethics can take the place of religion. High levels of homicides would be observed in countries where neither religion nor education teaches a moral code.

Aside from this fundamental flaw in Clark et al.’s (2020) article, closer inspection of their data shows that they overlooked confounding factors and that their critical interaction is no longer significant when these factors are included in the regression model. In fact, financial and racial inequality are much better predictors of national differences in violence than religion and the questionable measure of intelligence. Below I present the statistical results that support this conclusion and invalidate Clark et al.’s (2020) racist conclusions.

Statistical Analysis

Distribution Problems

Not long ago, religion was a part of life in most countries. Only over the past century have some countries become more secular. Even today, most countries are very religious. Figure 1 shows the distribution of religiosity based on the Relig_ARDA variable in Clark et al.’s dataset. This skewed distribution can create problems when a variable is used in a regression model, especially if the variable is multiplied with another variable to test interaction effects.

It is common practice to transform variables to create a more desirable distribution for the purpose of statistical analysis. To do so, I reversed the item to measure atheism and then log-transformed the variable. To include countries that scored 100% on religiosity, I added 0.001 to all atheism scores before I carried out the log transformation. The distribution of log-atheism is less skewed.

The distribution of homicides (rates per 100,000 inhabitants) is also skewed.

Because homicide rates are right-skewed, a direct log-transformation can be applied to get a more desirable distribution. To include nations with a value of 0, for which the logarithm is undefined, I added a value of 1 before the log-transformation. The resulting distribution for log-homicides is more desirable.
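The two transformations described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, religiosity is assumed to be stored as a percentage from 0 to 100, atheism is expressed as a proportion, and the natural logarithm is used.

```python
import math

def log_atheism(religiosity_pct, offset=0.001):
    """Reverse religiosity into atheism, then log-transform.

    Assumption: religiosity is a percentage (0-100), so atheism is the
    remaining proportion. The small offset keeps fully religious countries
    (atheism = 0) in the data, because log(0) is undefined.
    """
    atheism = 1.0 - religiosity_pct / 100.0
    return math.log(atheism + offset)

def log_homicide(rate_per_100k):
    """Log-transform a homicide rate after adding 1, so a rate of 0 maps to 0."""
    return math.log(rate_per_100k + 1.0)

# A fully religious country stays in the data instead of producing an error.
print(log_atheism(100.0))
# A country with no recorded homicides maps to exactly 0.
print(log_homicide(0.0))
```

With this scaling, values of log-atheism close to 0 indicate a nearly fully atheistic country, and increasingly negative values indicate more religious countries.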

The controversial IQ variable did not require a transformation.

Bivariate Relationships

The next figure shows a plot of homicides as a function of the questionable intelligence measure (QIM). There is a visible negative correlation. However, the plot also highlights countries in Latin America and the United States. These countries have in common that they were established by decimating the indigenous population and bringing slaves from Africa to work for the European colonialists. It is notable that nations with a history of slavery have higher homicide rates than other nations. Thus, aside from economic inequality, racial inequality may be another factor that contributes to violence. Slavery ended over 100 years ago, but racial inequality persists to this day. Former slave countries also tend to score lower on the QIM measure. Thus, slavery may partially account for the correlation between QIM and homicide rates.

The next plot shows homicide rates as a function of atheism. A value of 0 would mean the country is totally atheistic, while more negative values show increasing levels of religion. There is no strong relationship between religion and homicide rates. This replicates the results in the original article by Clark et al. Remember that their key finding was an interaction between QIM and religion. However, the plot also shows a clear distinction between less religious countries. Former slave countries are low in religion and have high homicide rates, while other countries (mainly in Europe) are low in religion and have low homicide rates.

Regression Models

To examine the unique contribution of different variables to the prediction of homicide rates, I conducted several regression analyses. I started with the QIM x religion interaction to see whether the interaction is robust to transformations of the predictor variables. The results clearly show the interaction and main effects for QIM and religion (t-values > 2 are significant at p < .05).

Next I added slavery as a predictor variable.

The interaction is no longer significant. This shows that the interaction emerged because former slave countries tend to score low on QIM and religion.

I then added the GINI coefficient, the most widely used measure of income inequality, to the model. Income inequality was an additional predictor. The QIM x religion interaction remained non-significant.
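For readers who want to see the mechanics of this model sequence, here is a minimal sketch of a regression with an interaction term, refit after adding a control variable. It does not use Clark et al.’s dataset. The data are synthetic and generated only so the code runs, all variable names are hypothetical, and the centering of predictors before forming the product term is my choice for the sketch, not something reported above.

```python
import numpy as np

def fit_ols(predictors, outcome):
    """Ordinary least squares with an intercept; returns coefficients and t-values."""
    X = np.column_stack([np.ones(len(outcome)), predictors])
    beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    resid = outcome - X @ beta
    dof = X.shape[0] - X.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(X.T @ X)
    return beta, beta / np.sqrt(np.diag(cov))

# Synthetic illustration only: these are NOT the real country-level data.
rng = np.random.default_rng(0)
n = 120
slavery = (rng.random(n) < 0.25).astype(float)
qim = rng.normal(size=n) - 0.8 * slavery
log_atheism = rng.normal(size=n)
log_homicide = 1.0 - 0.3 * qim + 1.2 * slavery + rng.normal(scale=0.5, size=n)

# Center the predictors before multiplying them (an assumption of this sketch).
qim_c = qim - qim.mean()
ath_c = log_atheism - log_atheism.mean()
interaction = qim_c * ath_c

# Model 1: QIM, atheism, and their interaction.
base = np.column_stack([qim_c, ath_c, interaction])
b1, t1 = fit_ols(base, log_homicide)

# Model 2: the same predictors plus the slavery indicator.
with_slavery = np.column_stack([base, slavery])
b2, t2 = fit_ols(with_slavery, log_homicide)

print("interaction t-value without control:", round(float(t1[3]), 2))
print("interaction t-value with control:   ", round(float(t2[3]), 2))
print("slavery t-value:                    ", round(float(t2[4]), 2))
```

Further controls such as GINI or log-GDP would be added as extra columns in the same way. The rule of thumb used in this post applies: absolute t-values above 2 are significant at p < .05.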

I then added GDP to the model. A country's wealth is strongly related to many positive indicators. Given the skewed distribution, I used log-GDP as a predictor, which is also the most common way economists use GDP.

GDP is another significant predictor, while the QIM x religion interaction remains non-significant. Meanwhile, the strong relationship between QIM and homicide rates has decreased from b = -.71 without controls to b = -.25 with controls. However, it is still significant. As noted earlier, QIM may reflect education and Clark et al. (2020) included a measure of educational attainment in their dataset. It correlates r = .68 with QIM. I therefore substituted QIM with education.

However, education did not predict homicide rates. Thus, QIM scores capture something about nations that the education measure does not capture.

We can compare the social justice variables (slavery, GDP, GINI) with the personal-attribute (atheist, QIM) variables. A model with the social justice variables explains 62% of the variation in homicide rates across nations.

The personal-attribute model explains only 40% of the variance.

As these predictors overlap, the personal-attributes add only 3% additional variance to the variance that is explained by slavery, income inequality, and wealth.
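The comparison of explained variance can be made concrete with a small helper that computes R-squared for two nested models and takes the difference. This is a sketch with hypothetical function names and a tiny made-up example, not the analysis script behind the numbers above.

```python
import numpy as np

def r_squared(predictors, outcome):
    """Proportion of variance in the outcome explained by an OLS model with intercept."""
    outcome = np.asarray(outcome, dtype=float)
    X = np.column_stack([np.ones(len(outcome)), predictors])
    beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    resid = outcome - X @ beta
    total = outcome - outcome.mean()
    return 1.0 - (resid @ resid) / (total @ total)

def incremental_r2(base, added, outcome):
    """Additional variance explained when `added` joins the `base` predictors."""
    full = np.column_stack([base, added])
    return r_squared(full, outcome) - r_squared(base, outcome)

# Made-up numbers, only to show the call pattern.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
extra = np.array([1.0, 0.0, 1.0, 0.0, 2.0])
y = np.array([1.0, 2.0, 2.0, 4.0, 3.0])
print(round(r_squared(x, y), 4))
print(round(incremental_r2(x, extra, y), 4))
```

In the terms of this post, `base` would hold slavery, GINI, and log-GDP, and `added` would hold atheism and QIM. Because the models are nested, the increment can never be negative.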

Replicating Slavery’s Effect in the United States

The United States provides another opportunity to test the hypothesis that a legacy of slavery and racial inequality is associated with higher levels of homicides. I downloaded statistics about homicides (homicide stats). In addition, I used a measure of urbanization to predict homicides (urbanization). I also added a measure of income inequality (GINI). I classified states that fought for the Confederacy as slave states (civil war facts). Results were similar for different years in which homicide rates were available from 1996 to 2018, so I used the latest data.

In a model with all predictor variables, slavery was the only significant predictor. Income inequality showed a trend, and urbanization was not a unique predictor. When urbanization was removed from the model, the effect of income inequality was a bit stronger.

Overall, these results are consistent with the cross-national data and suggest that a history of slavery and persistent racial inequality create social conditions that lead to more violence and homicides. These results are consistent with recent concerns that systemic racism contributes to killing of civilians by civilians and police officers who historically had the role to enforce racial inequality.

Meta-Science Reflections

Clark et al.’s (2020) article is flawed in numerous ways. Ideally, the authors would have the decency to retract it. The main flaw is the uncritical use of a measure with questionable validity. This flaw is not unique to this article. It is a fundamental flaw that has also led to a large literature on implicit bias based on an invalid measure. The uncritical use of measures has to stop. A science without valid measures is not a science, and statistical results that are obtained with invalid measures are not scientific results.

A second flaw stems from the fact that psychologists are trained to conduct randomized laboratory experiments. Random assignment makes it easy to interpret statistically significant results. Unless something went really wrong or sampling error produced a false result, a statistically significant result means that the experimental manipulation influenced the dependent variable. Causality is built into the design. However, things are very different when we look at naturally occurring covariation because everything is correlated with everything. Observed relationships may not be causal, and they can be produced by variables that were not measured. The only way to deal with this uncertainty is to carefully test competing theories. It is also necessary to be careful in the interpretation of results. Clark et al. (2020) failed to do so and made overly strong statements based on their correlational findings.

Many scholars have argued that religion reduces violent behavior within human social groups. Here, we tested whether intelligence moderates this relationship. We hypothesized that religion would have greater utility for regulating violent behavior among societies with relatively lower average IQs than among societies with relatively more cognitively gifted citizens. Two studies supported this hypothesis

This statement would be fine if they had conducted an experiment, but of course, it is impossible to conduct an experiment to examine this question. This also means it is no longer possible to use evidence as support for a hypothesis. Correlational evidence simply cannot verify a hypothesis. It can only falsify wrong theories. Clark et al. (2020) failed to acknowledge competing theories of homicides and to test their theory against competing theories.

The last meta-scientific observation is that all conclusions in science rest on a combination of data and assumptions. When the same data lead to different conclusions, like they did here, we get insights into researchers’ assumptions. Clark et al.’s (2020) assumptions were that (a) there are notable differences in intelligence between nations, (b) these differences are measured with high validity by Lynn and Meisenberg’s (2010) questionable IQ scores, and (c) homicides are caused by internal dispositions like being an atheist with low intelligence. Given Lynn and Meisenberg’s finding that their questionable measure correlates highly with skin tone, they also implicitly share the racist assumption that dark-skinned people are more violent because they are less intelligent. The present blog post shows that an entirely different story fits the data. Homicides are caused by injustice such as unfair distributions of wealth and discrimination and prejudice based on skin color. I am not saying that my interpretation of the data is correct because I am aware that alternative explanations are possible. However, I would rather have a liberal/egalitarian bias than a racist bias.

Implicit Racism, Starbucks, and the Failure of Experimental Social Psychology

Implicit racism is in the news again (CNN). A manager of a Starbucks in Philadelphia called 911 to ask police to remove two Black men from the coffee store because they had not purchased anything. The problem is that many White customers frequent Starbucks without purchasing things and the police are not called. The incident caused widespread protests and Starbucks announced that it would close all of its stores for “implicit bias training.”

Derrick Johnson, president and CEO of the NAACP, explains the need for store-wide training in this quote.

“The Starbucks situation provides dangerous insight regarding the failure of our nation to take implicit bias seriously,” said the group’s president and CEO Derrick Johnson in a statement. “We refuse to believe that our unconscious bias –the racism we are often unaware of—can and does make its way into our actions and policies.”

But was it implicit bias? It does not matter. CEO Derrick Johnson could have talked about racism without changing what happened or the need for training.

“The Starbucks situation provides dangerous insight regarding the failure of our nation to take racism seriously,” said the group’s president and CEO Derrick Johnson in a statement. “We refuse to believe that we are racists and that racism can and does make its way into our actions and policies.”

We have not heard from the store manager why she called the police. This post is not about a single incident at Starbucks because psychological science can rarely provide satisfactory answers to single events. However, the call for training of thousands of Starbucks’ employees is not a single event. It implies that social psychologists have developed scientific ways to measure “implicit bias” and developed ways to change it. This is the topic of this post.

What is implicit bias and what can be done to reduce it?

The term “implicit” has a long history in psychology, but it rose to prominence in the early 1990s when computers became more widely used in psychological research.  Computers made it possible to present stimuli on screens rather than on paper and to measure reaction times rather than self-ratings.  Computerized tasks were first used in cognitive psychology to demonstrate that people have associations that can influence their behaviors.  For example, participants are faster to determine that “doctor” is a word if the word is presented after a related word like “hospital” or “nurse.”

The term implicit is used for effects like this because the effect occurs without participants’ intention, conscious reflection, or deliberation. They do not want to respond this way, but they do, whether they want to or not.  Implicit effects can occur with or without awareness, but they are generally uncontrollable.

After a while, social psychologists started to use computerized tasks that were developed by cognitive psychologists to study social topics like prejudice.  Most studies used White participants to demonstrate prejudice with implicit tasks. For example, the association task described above can be easily modified by showing traditionally White or Black names (in the beginning computers could not present pictures) or faces.

Given the widespread prevalence of stereotypes about African Americans, many of these studies demonstrated that White participants respond differently to Black or White stimuli.  Nobody doubts these effects.  However, there remain two unanswered questions about these effects.

What (the fuck) is Implicit Racial Bias?

First, do responses in this implicit task with racial stimuli measure a specific form of prejudice?  That is, do implicit tasks measure plain old prejudice with a new measure or do they actually measure a new form of prejudice?  The main problem is that psychologists are not very good at distinguishing constructs and measures.  This goes back to the days when psychologists equated measures and constructs.  For example, to answer the difficult question whether IQ tests measure intelligence, it was simply postulated that intelligence is what IQ tests measure.  Similarly, there is no clear definition of implicit racial bias.  In social psychology implicit racism is essentially whatever leads to different responses to Black and White stimuli in an implicit task.

The main problem with this definition is that different implicit tasks show low convergent validity. Somebody can take two different “implicit tests” (for example, the popular Implicit Association Test, IAT, and the Affect Misattribution Procedure, AMP) and get different results. The correlations between two different tests range from 0 to .3, which means that the tests disagree with each other more than they agree.

Twenty years after the first implicit tasks were used to study prejudice, we still do not know whether implicit bias even exists or how it could be measured, despite the fact that these tests are made available to the public to “test their racial bias.” These tests do not meet the standards of real psychological tests and nobody should take their test scores too seriously. A brief moment of self-reflection is likely to provide better evidence about your own feelings towards different social groups. How would you feel if somebody from this group moved in next door? How would you feel if somebody from this group married your son or daughter? Responses to questions like this have been used for over 100 years and they still show that most people have a preference for their own group over most other groups. The main concern is that respondents may not answer these survey questions honestly. But if you do so in private for yourself and you are honest with yourself, you will know better how prejudiced you are towards different groups than by taking an implicit test.

What was the Starbucks’ manager thinking or feeling when she called 911? The answer to this question would be more informative than giving her an implicit bias test.

Is it possible to Reduce Implicit Bias?

Any scientific answer to this question requires measuring implicit bias. The ideal study to examine the effectiveness of any intervention is a randomized controlled trial. In this case it is easy to do so because many White Americans who are prejudiced do not want to be prejudiced. They learned to be prejudiced through parents, friends, school, or media. Racism has been part of American culture for a long time and even individuals who do not want to be prejudiced respond differently to White and African Americans. So, there is no ethical problem in subjecting participants to an anti-racism training program. It is like asking smokers who want to quit smoking to participate in a test of a new treatment for nicotine addiction.

Unfortunately, social psychologists are not trained in running well-controlled intervention studies. They are mainly trained to do experiments that examine the immediate effects of an experimental manipulation on some measure of interest. Another problem is that published articles typically report only successful experiments. This publication bias leads to the wrong impression that it may be easy to change implicit bias.

For example, one of the leading social psychologists on implicit bias published an article with the title “On the Malleability of Automatic Attitudes: Combating Automatic Prejudice With Images of Admired and Disliked Individuals” (Dasgupta & Greenwald, 2001). The title makes two (implicit) claims: implicit attitudes can change (they are malleable), and this article introduces a method that successfully reduced them (combating them). This article was published 17 years ago and it has been cited 537 times so far.


Study 1

The first experiment relied on a small sample of university students (N = 48).  The study had three experimental conditions with n = 18, 15, and 15 for each condition.  It is now recognized that studies with fewer than n = 20 participants per condition are questionable (Simmons et al., 2011).

The key finding in this study was that scores on the Implicit Association Test (IAT) were lower when participants were exposed to positive examples of African Americans (e.g., Denzel Washington) and negative examples of European Americans (e.g., Jeffrey Dahmer, a serial killer) than in the control condition, F(1, 31) = 5.23, p = .023.

The observed mean difference is d = .80. This is considered a large effect. For an intervention to increase IQ it would imply an increase by 80% of a standard deviation or 12 IQ points. However, in small samples, these estimates of effect size vary a lot. To get an impression of the range of variability it is useful to compute the 95%CI around the observed effect size. It ranges from d = .10 to 1.49. This means that the actual effect size could be just 10% of a standard deviation, which in the IQ analogy would imply an increase by just 1.5 points. Essentially, the results merely suggest that there is a positive effect, but they do not provide any information about the size of the effect. It could be very small or it could be very large.
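The arithmetic behind the effect size and the IQ analogy can be sketched as follows. The assumptions are mine: the comparison is taken to involve the groups with n = 18 and n = 15 (which matches 31 degrees of freedom), and the interval uses a common large-sample normal approximation, so its bounds differ slightly from the .10 to 1.49 reported above, which were presumably obtained with a more exact method.

```python
import math

def d_from_f(f_value, n1, n2):
    """Cohen's d for a two-group comparison, from F(1, df) = t squared."""
    return math.sqrt(f_value) * math.sqrt(1.0 / n1 + 1.0 / n2)

def approx_ci(d, n1, n2, z=1.96):
    """Approximate 95% CI for d using a large-sample standard error."""
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2.0 * (n1 + n2)))
    return d - z * se, d + z * se

def as_iq_points(d, sd=15.0):
    """Express a standardized effect on the IQ scale (SD = 15)."""
    return d * sd

d = d_from_f(5.23, 18, 15)      # group sizes are an assumption, see above
low, high = approx_ci(d, 18, 15)
print(round(d, 2), round(low, 2), round(high, 2))
print(round(as_iq_points(0.80), 1), round(as_iq_points(0.10), 1))
```

The point estimate reproduces d = .80, and the IQ conversion reproduces the 12 and 1.5 points used in the analogy. The width of the interval, not its exact bounds, is what matters for the argument.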

Unusual for social psychology experiments, the authors brought participants back 24 hours after the manipulation to see whether the brief exposure to positive examples had a lasting effect on IAT scores.  As the results were published, we already know that it did. The only question is how strong the evidence was.

The result remained just significant, F(1, 31) = 4.16, p = .04999. A p-value greater than .05 would be non-significant, meaning the study provided insufficient evidence for a lasting change.  More troublesome is that the 95%CI around the observed mean difference of d = .73 ranged from d = .01 to 1.45.  This means it is possible that the actual effect size is just 1% of a standard deviation or 0.15 IQ points.  The small sample size simply makes it impossible to say how large the effect really is.

Study 2

Study 1 provided encouraging results in a small sample. A logical extension for Study 2 would be to replicate the results of Study 1 with a larger sample in order to get a better sense of the size of the effect. Another possible extension could be to see whether repeated presentations of positive examples over a longer time period can have effects that last longer than 24 hours. However, multiple-study articles in social psychology are rarely programmatic in this way (Schimmack, 2012). Instead, they are more a colorful mosaic of studies that were selected to support a good story like “it is possible to combat implicit bias.”

The sample size in Study 2 was reduced from 48 to 26 participants.  This is a terrible decision because the results in Study 1 were barely significant and reducing sample sizes increases the risk of a false negative result (the intervention actually works, but the study fails to show it).

The purpose of Study 2 was to generalize the results of racial bias to aging bias.  Instead of African and European Americans, participants were exposed to positive and negative examples of young and old people and performed an age-IAT (old vs. young).

The statistical analysis showed again a significant mean difference, F(1, 24) = 5.13, p = .033.  However, the 95%CI again showed a wide range of possible effect sizes from d = .11 to 1.74.  Thus, the study provides no reliable information about the size of the effect.

Moreover, it has to be noted that Study 2 did not report whether a 24-hour follow-up was conducted or not. Thus, there is no replication of the finding in Study 1 that a small intervention can have an effect that lasts 24 hours.

Publication Bias: Another Form of Implicit Bias [the bias researchers do not want to talk about in public]

Significance tests are only valid if the data are based on a representative sample of possible observations. However, it is well known that most journals, including social psychology journals, publish only successful studies (p < .05) and that researchers use questionable research practices to meet this requirement. Even two studies are sufficient to examine whether the results are representative or not.

The Test of Insufficient Variance examines whether reported p-values are more similar than we would expect based on a representative sample of data. Selection for significance reduces variability in p-values because p-values greater than .05 are missing.

This article reported a p-value of .023 in Study 1 and .033 in Study 2. These p-values were converted into z-values: 2.27 and 2.13, respectively. The variance for these two z-scores is 0.01. Given the small sample sizes, it was necessary to run simulations to estimate the expected variance for two independent p-values in studies with 24 and 31 degrees of freedom. The expected variance is 0.875. The probability of observing a variance of 0.01 or less with an expected variance of 0.875 is p = .085. This finding raises concerns about the assumption that the reported results were based on a representative sample of observations.
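The steps of this calculation can be traced in a short sketch. It is restricted to the two-study case, where the chi-square distribution has one degree of freedom and can be evaluated with the standard normal distribution from Python's standard library. The function names are hypothetical, and the expected variance of 0.875 is taken from the simulations described above rather than recomputed here.

```python
from statistics import NormalDist, variance

STD_NORMAL = NormalDist()

def p_to_z(p_two_tailed):
    """Convert a two-tailed p-value into an absolute z-score."""
    return STD_NORMAL.inv_cdf(1.0 - p_two_tailed / 2.0)

def tiva_two_studies(p_values, expected_variance=1.0):
    """Test of Insufficient Variance for exactly two p-values.

    With k = 2 studies, observed / expected variance times (k - 1) is compared
    to a chi-square distribution with 1 degree of freedom, which is the
    distribution of a squared standard normal variable.
    """
    if len(p_values) != 2:
        raise ValueError("this sketch only handles two studies")
    z_scores = [p_to_z(p) for p in p_values]
    observed = variance(z_scores)  # sample variance (n - 1 in the denominator)
    chi2 = observed / expected_variance
    # Left tail: probability of a variance this small or smaller.
    prob = 2.0 * STD_NORMAL.cdf(chi2 ** 0.5) - 1.0
    return z_scores, observed, prob

z_scores, observed_var, prob = tiva_two_studies([0.023, 0.033], expected_variance=0.875)
print([round(z, 2) for z in z_scores], round(observed_var, 3), round(prob, 3))
```

For more than two studies the same logic applies with k - 1 degrees of freedom, but the chi-square probability then needs a statistics library or an incomplete gamma function.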

In conclusion, the widely cited article with the promising title that scores on implicit bias measures are malleable and that it is possible to combat implicit bias provided very preliminary results that by no means provide conclusive evidence that merely presenting a few positive examples of African Americans reduces prejudice.

A Large-Scale Replication Study 

Nine years later, Joy-Gaba and Nosek (2010) examined whether the results reported by Dasgupta and Greenwald could be replicated.  The title of the article “The Surprisingly Limited Malleability of Implicit Racial Evaluations” foreshadows the results.

“Implicit preferences for Whites compared to Blacks can be reduced via exposure to admired Black and disliked White individuals (Dasgupta & Greenwald, 2001). In four studies (total N = 4,628), while attempting to clarify the mechanism, we found that implicit preferences for Whites were weaker in the “positive Blacks” exposure condition compared to a control condition (weighted average d = .08). This effect was substantially smaller than the original demonstration (Dasgupta & Greenwald, 2001; d = .82).”

On the one hand, the results can be interpreted as a successful replication because the study with 4,628 participants again rejected the null-hypothesis that the intervention has absolutely no effect. However, the mean difference in the replication study is only d = .08, which corresponds to an effect size estimate of 1.2 IQ points if the study had tried to raise IQ. Moreover, it is clear that the original study was only able to report a significant result because the observed mean difference in this study was inflated by a factor of 10.

Study 1

Participants in Study 1 were Canadian students (N = 1,403). The study differed in that it separated exposure to positive Black examples and negative White examples.  Ideally, real-world training programs would aim to increase liking of African Americans rather than make people think about White people as serial killers.  So, the use of only positive examples of African Americans makes an additional contribution by examining a positive intervention without negative examples of Whites.  The study also included age to replicate Study 2.

Like US Americans, Canadian students also showed a preference for Whites over Blacks on the Implicit Association Test. So failures to replicate the intervention effect are not due to a lack of racism in Canada.

A focused analysis of the race condition showed no effect of exposure to positive Black examples, t(670) = .09, p = .93.  The 95%CI of the mean difference in this study ranged from -.15 to .16.  This means that with a maximum error probability of 5%, it is possible to rule out effect sizes greater than .16.  This finding is not entirely inconsistent with the original article because the original study was inconclusive about effect sizes.

The replication study is able to provide a more precise estimate of the effect size and the results show that the effect size could be 0, but it could not be d = .2, which is typically used as a reference point for a small effect.

Study 2a

Study 2a reintroduced the original manipulation that exposed participants to positive examples of African Americans and negative examples of European Americans.  This study showed a significant difference between the intervention condition and a control condition that exposed participants to flowers and insects, t(589) = 2.08, p = .038.  The 95%CI for the effect size estimate ranged from d = .02 to .35.

It is difficult to interpret this result in combination with the result from Study 1.  First, the results of the two studies are not significantly different from each other.  It is therefore not possible to conclude that manipulations with negative examples of Whites are more effective than those that just show positive examples of Blacks.  In combination, the results of Study 1 and 2a are not significant, meaning it is not clear whether the intervention has any effect at all.  Nevertheless, the significant result in Study 2a suggests that presenting negative examples of Whites may influence responses on the race IAT.
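One way to check both statements is to convert the reported p-values into z-scores and then combine or compare them. The sketch below uses Stouffer's unweighted method for the combination and a simple z-difference for the comparison. This is my choice of method for illustration and not necessarily how the conclusions above were reached. It also assumes that both effects point in the same direction and it ignores the difference in sample sizes.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def p_to_z(p_two_tailed):
    """Convert a two-tailed p-value into an absolute z-score."""
    return STD_NORMAL.inv_cdf(1.0 - p_two_tailed / 2.0)

def two_tailed_p(z):
    """Two-tailed p-value for a z-score."""
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))

def stouffer(z_scores):
    """Unweighted Stouffer combination of independent z-scores."""
    return sum(z_scores) / len(z_scores) ** 0.5

def z_difference(z_a, z_b):
    """z-test for the difference between two independent z-scores."""
    return (z_a - z_b) / 2.0 ** 0.5

z_study1 = p_to_z(0.93)    # Study 1: t(670) = .09, p = .93
z_study2a = p_to_z(0.038)  # Study 2a: t(589) = 2.08, p = .038

combined_p = two_tailed_p(stouffer([z_study1, z_study2a]))
difference_p = two_tailed_p(z_difference(z_study2a, z_study1))
print(round(combined_p, 3), round(difference_p, 3))
```

Under these assumptions both p-values are above .05, which agrees with the two statements in the paragraph above: the studies do not differ significantly from each other, and their combination is not significant.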

Study 2b

Study 2b is an exact replication of Study 2a.  It also replicated a significant mean difference between participants exposed to positive Black and negative White examples and the control condition, t(788) = 1.99, p = .047 (reported as p = .05). The 95%CI ranges  from d = .002 to d = .28.

The problem is that now three studies produced significant results with exposure to positive Black and negative White examples (Original Study 1; replication Study 2a & 2b) and all three studies had just significant p-values (p = .023, p = .038, p = .047). This is unlikely without selection of data to attain significance.

Study 3

The main purpose of Study 3 was to compare an online sample, an online student sample, and a lab student sample. None of the three samples showed a significant mean difference.

Online sample: t(999) = .96, p = .34

Online student sample: t(93) = 0.51, p = .61

Lab student sample: t(75) = 0.70, p = .48

The non-significant results for the student samples are not surprising because sample sizes are too small to detect small effects. The non-significant result for the large online sample is more interesting. It confirms that the two p-values in Studies 2a and 2b were too similar. Study 3 produces the greater variability in p-values that is expected, and given the small effect size, variability was increased by a non-significant result rather than a highly significant one.


In conclusion, there is no reliable evidence that merely presenting a few positive Black examples alters responses on the Implicit Association Test.   There is some suggestive evidence that presenting negative White examples may reduce prejudice presumably by decreasing favorable responses to Whites, but even this effect is very weak and may not last more than a few minutes or hours.

The large replication study shows that the highly cited original article provided misleading evidence that responses on implicit bias measures can be easily and dramatically changed by presenting positive examples of African Americans. If it were this easy to reduce prejudice, racism wouldn’t be the problem that it still is.

Newest Evidence

In a major effort, Lai et al. (2016) examined several interventions that might be used to combat racism. The first problem with the article is that the literature review fails to mention Joy-Gaba and Nosek’s finding that interventions were rather ineffective, or the evidence that implicit racism measures show little natural variation over time (Cunningham et al., 2001). Instead, they suggest that the “dominant view has changed over the past 15 years to one of implicit malleability” [by which they mean malleability of responses on implicit tasks with racial stimuli]. While this may accurately reflect changes in social psychologists’ opinions, it ignores that there is no credible evidence to suggest that implicit attitude measures are malleable.

More important, the study also failed to find evidence that a brief manipulation could change performance on the IAT a day or more later, despite a sample large enough to detect even small lasting effects. However, some manipulations produced immediate effects on IAT scores. The strongest effect was observed for a manipulation that required vivid imagination.

Vivid counterstereotypic scenario.

Participants in this intervention read a vivid second-person story in which they are the protagonist. The participant imagines walking down a street late at night after drinking at a bar. Suddenly, a White man in his forties assaults the participant, throws him/her into the trunk of his car, and drives away. After some time, the White man opens the trunk and assaults the participant again. A young Black man notices the second assault and knocks out the White assailant, saving the day. After reading the story, participants are told the next task (i.e., the race IAT) was supposed to affirm the associations: White = Bad, Black = Good. Participants were instructed to keep the story in mind during the IAT.

When given this instruction, the pro-White bias in the IAT was reduced.  However, one day later (Study 2) or two or three days later (Study 1) IAT performance was not significantly different from a control condition.

In conclusion, social psychologists have found out something that most people already know. Attitudes, including prejudice, are stable and difficult to change, even when participants want to change them. A simple, 5-minute manipulation is not an intervention, and it will not produce lasting changes in attitudes.

General Discussion

Social psychology has failed Black people who would like to be treated with the same respect as White people and White people who do not want to be racist.

Since Martin Luther King gave his “I Have a Dream” speech, America has made progress towards a goal of racial equality without the help of social psychologists. Nevertheless, racial bias remains a problem, but social psychologists are too busy with sterile experiments that have no application to the real world (No! Starbucks’ employees should not imagine being abducted by White sociopaths to avoid calling 911 on Black patrons of their stores). Performance on an implicit bias test is only relevant if it predicts behavior, and it doesn’t do that very well.

The whole notion of implicit bias is a creation of social psychologists without scientific foundations, but 911 calls that kill Black people are real. Maybe Starbucks could fund some real racism research at Howard University, because the mostly White professors at elite universities seem to be unable to develop and test real interventions that can influence real behavior.

And last but not least, don’t listen to self-proclaimed White experts.



Social psychologists who have failed to validate measures and failed to conduct real intervention studies that might actually work are not experts.  It doesn’t take a Ph.D. to figure out some simple things that can be taught in a one-day workshop for Starbucks’ employees.  After all, the goal is just to get employees to treat all customers equally, which doesn’t even require a change in attitudes.

Here is one simple rule. If you are ready to call 911 to remove somebody from your coffee shop and the person is Black, ask yourself before you dial whether you would do the same if the person were White and looked like you or your brother or sister. If so, go ahead. If not, don’t touch that dial. Let them sit at a table like you let dozens of other people sit at their tables, because you make most of your money from people on the go anyway. Or buy them a coffee, or do something, but think twice or three times before you call the police.

And so what if it is just a PR campaign? It is a good one. I am sure there are a few people who would celebrate a nation-wide racism training day for police (maybe without shutting down all police stations).

Real change comes from real people who protest. Don’t wait for the academics to figure out how to combat automatic prejudice. They are more interested in citations and further research than in providing real solutions to real problems. Trust me, I know. I am (was?) a White social psychologist myself.