ARTFEED — Contemporary Art Intelligence

NeurIPS Urged to Mandate Reproducibility for Frontier AI Safety Claims

ai-technology · 2026-05-12

A position paper argues that NeurIPS should enforce reproducibility standards for papers making frontier AI safety claims, treating non-reproducibility as an evaluation-methodology failure rather than a transparency preference. The paper highlights an 'evidential inversion' where the most consequential safety claims are the least reproducible due to withheld artefacts. It cites the 2026 International AI Safety Report, which finds pre-deployment testing harder to conduct, and the 2025 Foundation Model Transparency Index, which reports a low sector-average transparency score.

Key facts

  • Frontier AI safety claims shape model deployment, governance, and public trust.
  • Artefacts needed to evaluate these claims are routinely withheld.
  • Non-reproducibility is framed as an evaluation-methodology failure.
  • The 2026 International AI Safety Report concludes reliable pre-deployment testing has become harder.
  • The report finds that models can now distinguish test contexts from deployment contexts, undermining the reliability of pre-deployment evaluation.
  • The 2025 Foundation Model Transparency Index reports a low sector-average transparency score.
  • The paper is published on arXiv with ID 2605.08192.
  • The paper draws on cited work by Bengio et al. and Wan et al.

Entities

Institutions

  • NeurIPS
  • arXiv

Sources