Internal AI Model Risk Reporting Framework Proposed
A new guide posted to arXiv (2604.24966) addresses risks from frontier AI companies' internal use of advanced models before public release. For example, Anthropic's Mythos Preview, a model with advanced cyberoffense-relevant capabilities, was used internally for at least six weeks before its public announcement. Legal and regulatory frameworks, including California's SB 53, New York's RAISE Act, and the EU's General-Purpose AI Code of Practice, require developers to manage internal-use risks and to produce reports describing safeguards and residual risks.
Key facts
- Frontier AI companies deploy advanced models internally for weeks or months before public release.
- Anthropic developed Mythos Preview with advanced cyberoffense-relevant capabilities.
- Mythos Preview was available internally for at least six weeks before public announcement.
- Internal use creates risks not addressed by external deployment frameworks.
- California's Transparency in Frontier Artificial Intelligence Act (SB 53) discusses internal AI risks.
- New York's Responsible AI Safety And Education (RAISE) Act addresses internal AI use risks.
- EU's General-Purpose AI Code of Practice covers risks from internal AI use.
- These legal frameworks require internal-use risk reports describing safeguards and residual risks (a minimal illustrative sketch follows this list).
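To make the reporting requirement concrete, the sketch below shows one way such a report could be represented as a data structure. This is a hypothetical illustration only: none of the cited frameworks prescribe these class or field names, and the example model is invented.

```python
from dataclasses import dataclass

@dataclass
class Safeguard:
    # One control applied during internal deployment (hypothetical schema).
    name: str
    description: str

@dataclass
class InternalUseRiskReport:
    # Illustrative shape of an internal-use risk report. Field names are
    # assumptions, not the statutory format of SB 53, the RAISE Act, or
    # the EU's General-Purpose AI Code of Practice.
    model_name: str
    internal_deployment_start: str   # ISO 8601 date
    capabilities_of_concern: list[str]
    safeguards: list[Safeguard]
    residual_risks: list[str]

# Example report for a hypothetical internally deployed model.
report = InternalUseRiskReport(
    model_name="example-frontier-model",
    internal_deployment_start="2026-01-15",
    capabilities_of_concern=["advanced cyberoffense"],
    safeguards=[Safeguard("access control", "use restricted to vetted staff")],
    residual_risks=["insider misuse despite access controls"],
)
print(f"{report.model_name}: {len(report.safeguards)} safeguard(s) documented")
```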
Entities
Institutions
- Anthropic
- arXiv
- California State Legislature
- New York State Legislature
- European Union
Locations
- California
- New York
- European Union