The Implicit Association Test (IAT) was introduced in the late 1990s as a measure of implicit associations, cognitions, and attitudes that might not be fully captured by self-report measures. The prospect of assessing socially sensitive constructs such as prejudice using a brief reaction-time task made the method widely appealing. Early publications emphasized that implicit and explicit measures could diverge and that the IAT might detect evaluative processes that are difficult to access through introspection.
However, the IAT spread rapidly within social psychology before receiving the kind of psychometric validation typically applied to psychological tests. As a result, modest correlations between IAT scores and self-ratings were often interpreted as evidence that the IAT measures a distinct, implicit construct, rather than as possible indicators of measurement error. Theoretical discussions also suggested that implicit attitudes could reflect introspectively inaccessible processes.
An alternative interpretation of the low implicit–explicit correlations is that IAT scores contain substantial method-specific and error variance. This view is supported by three patterns in the literature.
First, different implicit measures—such as the evaluative priming task and the affect misattribution paradigm—tend to show low convergent validity with the IAT.
Second, meta-analyses consistently find that IAT scores predict behavior only weakly, often less strongly than explicit measures, and provide limited incremental validity over self-reports.
Third, latent variable analyses show that once measurement error and method variance are modeled explicitly, a single-factor model often fits as well as, or better than, two-factor models that assume distinct implicit and explicit constructs.
Given this background, an adversarial collaboration on the validity of implicit measures provided an important opportunity to evaluate competing optimistic and pessimistic interpretations of the evidence in a joint project (Axt et al., 2025). However, this project did not include a psychometrically trained critic of IAT research and did not respond to the challenges raised by Schimmack (2020). This omission may help explain why the published model rests on strong identification assumptions that Schimmack (2020) had already challenged and that strongly influence the results.
Figure 1 of the article specifies a model in which implicit and explicit attitudes are represented by separate latent variables with a correlation of r = .41. Implicit attitudes are treated as a factor identified by four tasks: the standard IAT, the single-category IAT, the evaluative priming task (EPT), and the affect misattribution paradigm (AMP). However, the factor loadings reveal that two of the measures contribute little to the factor. The loading of the EPT is approximately .08, implying that less than 1% of its variance reflects the common factor. The AMP loading of .25 similarly implies that most of its variance is unique. This leaves the two IAT variants as the primary indicators of the latent implicit factor, making it impossible to distinguish construct variance that may reflect implicit racial biases from shared method variance between two nearly identical tasks (one IAT with Black and White pictures, one with only Black pictures).
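As an arithmetic check, the proportion of an indicator's variance explained by a factor is the square of its standardized loading. A minimal sketch using the loadings reported above:

```python
# Shared variance with the factor = squared standardized loading.
loadings = {"EPT": 0.08, "AMP": 0.25}

for task, lam in loadings.items():
    print(f"{task}: loading = {lam:.2f}, shared variance = {lam**2:.1%}")

# EPT: loading = 0.08, shared variance = 0.6%  -> less than 1% common variance
# AMP: loading = 0.25, shared variance = 6.2%  -> roughly 94% unique variance
```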
The published model constrains the residuals of the two IATs to be uncorrelated, thereby assuming no shared method variance. This assumption is questionable because both tasks use nearly identical procedures, differ only in stimuli, and are known to correlate strongly for method-related reasons. Previous psychometric work has shown that IATs commonly exhibit substantial method variance, and omitting such variance can lead to inflated estimates of discriminant validity (Schimmack, 2020).
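To see why this constraint matters, consider a small simulation (with illustrative parameter values, not estimates from the article) in which two IAT-like scores share both construct variance and method variance. The observed correlation between the two tasks then exceeds what the construct alone produces, and a model that forbids a residual correlation has no choice but to absorb the surplus into the "implicit" factor:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Illustrative standardized components (values assumed, not estimated).
attitude = rng.standard_normal(n)  # construct both IATs are meant to measure
method = rng.standard_normal(n)    # variance shared only because the tasks are nearly identical

b, m = 0.5, 0.5                    # assumed construct and method loadings
e = np.sqrt(1 - b**2 - m**2)       # residual SD so each score has unit variance

iat1 = b * attitude + m * method + e * rng.standard_normal(n)
iat2 = b * attitude + m * method + e * rng.standard_normal(n)

print(np.corrcoef(iat1, iat2)[0, 1])  # ~ .50 = b**2 + m**2 (construct + method)
print(b**2)                           # .25 = what the construct alone would produce
```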

To evaluate this empirically, alternative models can be fitted to the data. A one-factor model in which all 12 measures load on a single latent attitude factor fails to meet conventional fit criteria (χ²(54) = 331.43, CFI = .914, RMSEA = .065). Inspection of modification indices identifies three large residual correlations: (a) between the two explicit self-report measures (SRS and ANES), (b) between the two IATs, and (c) a smaller correlation between two behavioral tasks. Adding these three theoretically plausible residual correlations yields excellent model fit (χ²(51) = 69.51, CFI = .994, RMSEA = .017). Under this model, the data no longer support a clear distinction between implicit and explicit latent factors.
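A sketch of this model comparison, assuming participant-level scores were available as a pandas data frame with one column per measure (all column and file names below are hypothetical), can be written with the semopy package, which accepts lavaan-style syntax in which `=~` defines a factor and `~~` adds a residual correlation:

```python
import pandas as pd
import semopy

# Hypothetical data frame: one row per participant, one column per measure.
df = pd.read_csv("measures.csv")

# One-factor model: all 12 measures load on a single attitude factor.
one_factor = """
Attitude =~ SRS + ANES + IAT + scIAT + EPT + AMP + beh1 + beh2 + beh3 + beh4 + beh5 + beh6
"""

# Same model plus the three theoretically plausible residual correlations:
# shared self-report method (SRS-ANES), shared IAT method (IAT-scIAT),
# and a smaller residual between two behavioral tasks.
one_factor_resid = one_factor + """
SRS ~~ ANES
IAT ~~ scIAT
beh1 ~~ beh2
"""

for desc in (one_factor, one_factor_resid):
    model = semopy.Model(desc)
    model.fit(df)
    stats = semopy.calc_stats(model)  # data frame with chi2, DoF, CFI, RMSEA, ...
    print(stats[["DoF", "chi2", "CFI", "RMSEA"]])
```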

In contrast, the published model includes the residual correlation between the two self-report measures but not between the two IATs. Modification indices still suggest a substantial residual correlation between the two IATs (MI ≈ 43), but adding this parameter leads to identification problems under the authors’ original specification. This indicates that conclusions about discriminant validity depend heavily on the assumption that the two IATs do not share method variance—a strong assumption that may not be justified.
Alternative specifications are possible. For example, assuming equal validity for the IAT and AMP (given their similar correlations with the behavioral tasks) yields an estimated implicit–explicit latent correlation of approximately .74 (95% CI: .56–.92). Under this model, the evidence for distinct implicit and explicit constructs becomes substantially weaker, although a perfect correlation (r = 1) can still be rejected at the conventional 5% type-I error criterion because the confidence interval excludes 1.
Finally, it is important to distinguish between the predictive validity of latent variables and that of observed scores. The latent implicit factor in the authors’ model predicts the latent behavioral disposition with a standardized effect of about .16, which is small and consistent with prior meta-analytic estimates. However, because IAT scores include considerable error and method variance, the predictive validity of raw IAT scores is smaller still.
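The size of this attenuation follows from classical test theory: the correlation between an observed score and a latent criterion equals the latent effect multiplied by the score's standardized loading on its factor. A worked sketch, using an assumed IAT loading of .70 for illustration (the article's actual loadings would sharpen the numbers):

```python
latent_effect = 0.16  # standardized effect of the latent implicit factor (reported above)
iat_loading = 0.70    # assumed standardized loading of a single IAT score; illustrative only

observed_effect = latent_effect * iat_loading
print(f"implied validity of raw IAT scores: r = {observed_effect:.2f}")  # ~ .11
print(f"variance explained: {observed_effect**2:.1%}")                   # ~ 1.3%
```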
In summary, the conclusions of the adversarial collaboration depend largely on an identification constraint that prevents modeling residual covariance between two highly similar IAT tasks. When this assumption is relaxed, alternative models fit the data well and yield substantially higher correlations between implicit and explicit attitudes. This suggests that the data do not provide strong evidence for discriminant validity between implicit and explicit constructs once method variance is taken into account.
Optimism Is Not a Scientific Interpretation
A central limitation of the article is that it frames its findings partly in terms of “optimistic” versus “pessimistic” interpretations. Scientific evaluation should not depend on an emotional framing; it should rest on empirical evidence and rigorous measurement.
For example, the authors write that their results “could be viewed as encouraging for the predictive validity and utility of indirect measures,” noting that the study demonstrates that implicit attitudes “can reliably correlate with socially important behavioral outcomes” and explain variance beyond self-reports (Kang, 2024). However, the effect sizes reported in the study are consistent with 20 years of prior research in which implicit measures—whether IAT, evaluative priming, or AMP—showed small predictive validity and minimal incremental prediction beyond explicit measures. The present findings do not change this conclusion, especially when the shared method variance between IATs is taken into account.
In addition, the behavioral outcomes examined in the article are laboratory-style tasks that are only loosely connected to the real-world consequences of prejudice. To evaluate the societal relevance of implicit attitudes, research would need to examine behaviors that matter directly for marginalized groups—such as treatment in healthcare settings, hiring decisions, or the use of force in policing. Even then, decades of social psychology show that self-report attitudes (and attitude-related latent variables) have limited ability to predict specific behaviors, especially behaviors that are rare, highly constrained, or context-dependent. Thus, asking whether the IAT predicts such outcomes may be the wrong question.
A more informative scientific approach may be to study meaningful social behaviors directly, and to investigate the situational and structural conditions that shape them, rather than relying on reaction-time–based measures of attitudes. Put differently, the field may gain more by studying social behavior without attitudes than by continuing to study attitudes without social behavior.




