Comparing Replicability Estimated by Z-Curve to Real Large-Scale Replication Attempts

Reference

Sotola, L. (2023). How Can I Study from Below, that which Is Above? : Comparing Replicability Estimated by Z-Curve to Real Large-Scale Replication Attempts. Meta-Psychology, 7. https://doi.org/10.15626/MP.2022.3299


Scientific Contribution Evaluation

Strengths of the Contribution

Sotola (2023) makes a distinctive and meaningful scientific contribution because it provides the first and only empirical validation of z-curve estimates against real replication outcomes across multiple large-scale replication projects. Simulation-based validations existed before this paper, but no study had tested whether z-curve’s expected replication rate (ERR), expected discovery rate (EDR), and midpoint estimates matched actual replication success rates. This fills an important gap, because reviewers repeatedly ask for evidence that z-curve corresponds to real-world outcomes rather than only to theoretical or simulation-derived expectations.

The study is also transparent, reproducible, and conducted with evident methodological care. It shows convincingly that z-curve’s midpoint estimate closely tracks real replicability: it comes within about two percentage points of the observed replication rate, which is an unusually strong and practically important result.
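The comparison behind that result can be sketched in a few lines. This is a minimal illustration, assuming the midpoint is the simple average of ERR and EDR; the function names and all numeric inputs are hypothetical and are not taken from Sotola (2023).

```python
def midpoint(err: float, edr: float) -> float:
    """Simple average of ERR and EDR (assumed definition of the midpoint)."""
    return (err + edr) / 2.0


def gap_in_points(estimate: float, observed: float) -> float:
    """Absolute difference between two proportions, in percentage points."""
    return abs(estimate - observed) * 100.0


# Hypothetical inputs for illustration only; not values from the paper.
err, edr, observed_rate = 0.60, 0.30, 0.44

mid = midpoint(err, edr)
print(f"midpoint = {mid:.2f}")
print(f"gap to observed rate = {gap_in_points(mid, observed_rate):.1f} points")
```

Expressing the gap in percentage points rather than as a ratio matches how the agreement is described above.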

Limitations That Temper the Rating

The scientific contribution is not perfect. The largest methodological flaw is the recoding of marginally significant p-values (those between .05 and .10) as .049999, which introduces avoidable bias whose size was not quantified. The article also does not provide domain-specific robustness analyses or alternative extraction procedures. Nonetheless, these are weaknesses of execution rather than concept, and they do not undermine the article’s primary contribution.
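Why the recoding matters is easier to see on the z-score scale, where z-curve operates. The sketch below is illustrative only: it assumes two-sided p-values, treats the marginal range as the half-open interval [.05, .10), and uses hypothetical function names; it is not the extraction code used in the paper.

```python
from statistics import NormalDist


def p_to_z(p: float) -> float:
    """Convert a two-sided p-value to an absolute z-score."""
    return NormalDist().inv_cdf(1.0 - p / 2.0)


def recode_marginal(p: float, replacement: float = 0.049999) -> float:
    """Replace p-values in [.05, .10) with a value just under .05.

    The interval boundaries are an assumption made for this sketch.
    """
    return replacement if 0.05 <= p < 0.10 else p


# Hypothetical p-values: one significant, two marginally significant.
for p in (0.03, 0.06, 0.09):
    z_before = p_to_z(p)
    z_after = p_to_z(recode_marginal(p))
    print(f"p = {p:.2f}: z before = {z_before:.3f}, z after = {z_after:.3f}")
```

Under these assumptions, every marginal result moves from below the significance threshold of roughly z = 1.96 to just above it, so the recoded values pile up at the threshold instead of being excluded as non-significant.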

Overall Assessment

As a scientific contribution, the article:

  • provides new empirical validation that did not previously exist,
  • improves confidence in the use of z-curve across journals and subfields,
  • directly addresses common reviewer objections about the lack of empirical testing,
  • and demonstrates transparency and intellectual honesty typical of the Meta-Psychology format.

Overall Rating: 8.5 / 10

This reflects:

  • 10/10 for contribution originality and relevance,
  • 9/10 for empirical importance,
  • 7/10 for methodological execution,
  • yielding a balanced 8.5 as an overall score.

With the marginal-significance recoding issue resolved, the paper would approach 9.0–9.5.
