Coding Agent with World Model Achieves 28% Solve Rate on ARC-AGI-3

ai-technology · 2026-05-07

A recent arXiv preprint (2605.05138) evaluates a coding-agent system for ARC-AGI-3 that maintains an executable Python world model. The agent verifies the model against observations from the environment and refactors it toward simpler code, which acts as a simplicity bias in the spirit of minimum description length (MDL). It plans inside the model before acting in the game. The system consists of a scripted controller, predefined world-model interfaces, verifier programs, and a plan executor, with no hand-coded game-specific logic. The evaluation covers the 25 public ARC-AGI-3 games, and each playthrough uses a fresh agent instance with no access to files or conversations from earlier playthroughs. The agent fully solved 7 of the 25 games (a 28% solve rate) and exceeded 75% Relative Human Action Efficiency on 6 games; the paper also reports a mean per-game score. For games with multiple independent playthroughs, results varied from run to run. The approach is deliberately simple, relying on explicit verification and refactoring rather than learned components.
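The verify-then-plan loop described above can be illustrated with a toy sketch. This is not the paper's code: the names `verify`, `plan`, and `toy_model`, the 1-D track environment, and the use of breadth-first search are all assumptions made for illustration, and the refactoring step is not shown.

```python
from collections import deque
from typing import Callable, Hashable, Iterable, Optional

State = Hashable
Action = str
Transition = tuple[State, Action, State]  # (state, action, observed next state)


def verify(model: Callable[[State, Action], State],
           observed: Iterable[Transition]) -> list[Transition]:
    """Return every recorded transition that the model predicts incorrectly."""
    return [(s, a, s2) for s, a, s2 in observed if model(s, a) != s2]


def plan(model: Callable[[State, Action], State],
         start: State,
         is_goal: Callable[[State], bool],
         actions: list[Action],
         max_depth: int = 50) -> Optional[list[Action]]:
    """Breadth-first search inside the model; the real game is never called."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, path = frontier.popleft()
        if is_goal(state):
            return path
        if len(path) >= max_depth:
            continue
        for action in actions:
            nxt = model(state, action)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [action]))
    return None


# Hypothetical toy world: a position on a 1-D track with walls at 0 and 4.
def toy_model(pos: int, action: str) -> int:
    step = {"left": -1, "right": 1}[action]
    return min(4, max(0, pos + step))


# Transitions that would have been recorded from the real environment.
observed = [(0, "right", 1), (1, "right", 2), (0, "left", 0)]

mismatches = verify(toy_model, observed)  # empty list: model agrees with the log
route = plan(toy_model, 0, lambda p: p == 3, ["left", "right"])
print(mismatches, route)
```

In this sketch, a non-empty result from `verify` would be the signal to revise the model before trusting any plan computed from it.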

Key facts

arXiv:2605.05138 evaluates a coding-agent system for ARC-AGI-3
Agent maintains an executable Python world model
System uses scripted controller, predefined interfaces, verifier programs, plan executor
No hand-coded game-specific logic
Tested on 25 public ARC-AGI-3 games
Each playthrough uses a fresh agent instance
Agent fully solved 7 games (28% solve rate)
Relative Human Action Efficiency >75% on 6 games
Multiple playthroughs for some games show run-to-run variability
