Distilling Game Code World Model Generation into Lightweight LLMs
Researchers have introduced a method to distill the ability to generate Game Code World Models (GameCWMs) from large frontier models into smaller, more accessible LLMs. A GameCWM is a Python implementation of a game's rules, covering legal actions, state transitions, observations, and rewards, so that AI agents can apply solvers such as Monte Carlo Tree Search to it. Existing approaches generate GameCWMs with large frontier models and iterative inference-time refinement, which limits scalability. The team curated a dataset of 30 games spanning perfect and imperfect information, and used post-training to transfer the generation capability to smaller models. The work aims to democratize automated environment construction for AI agents.
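To make the four components concrete, here is a minimal sketch of what a GameCWM-style implementation might look like for a very simple perfect-information game, single-pile Nim (take 1 to 3 stones; whoever takes the last stone wins). The names `NimState`, `NimCWM`, and `random_rollout`, along with the method signatures, are illustrative assumptions and not the interface used by the researchers.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class NimState:
    """Hypothetical state: stones left in the pile and whose turn it is."""
    stones: int
    current_player: int  # 0 or 1


class NimCWM:
    """Illustrative code world model for single-pile Nim (assumed interface)."""

    num_players = 2

    def initial_state(self, stones=7):
        return NimState(stones=stones, current_player=0)

    def is_terminal(self, state):
        return state.stones == 0

    def legal_actions(self, state):
        # An action is the number of stones to remove (1, 2, or 3).
        if self.is_terminal(state):
            return []
        return [n for n in (1, 2, 3) if n <= state.stones]

    def apply_action(self, state, action):
        # State transition: remove stones and pass the turn.
        if action not in self.legal_actions(state):
            raise ValueError(f"illegal action {action} in {state}")
        return NimState(state.stones - action, 1 - state.current_player)

    def observation(self, state, player):
        # Perfect information: every player sees the full state.
        return {"stones": state.stones, "to_move": state.current_player}

    def rewards(self, state):
        # Zero until the game ends; then +1 for the winner, -1 for the loser.
        if not self.is_terminal(state):
            return [0.0, 0.0]
        winner = 1 - state.current_player  # the player who took the last stone
        return [1.0 if p == winner else -1.0 for p in range(self.num_players)]


def random_rollout(game, state, rng):
    """Play uniformly random legal moves to the end and return the rewards."""
    while not game.is_terminal(state):
        state = game.apply_action(state, rng.choice(game.legal_actions(state)))
    return game.rewards(state)
```

A solver only needs these methods to simulate play; for example, `random_rollout(NimCWM(), NimCWM().initial_state(), random.Random(0))` plays one random game to completion, which is the basic simulation step that Monte Carlo Tree Search repeats many times. An imperfect-information game would differ mainly in `observation`, which would hide part of the state from each player.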
Key facts
- Large Language Models can generate executable code from natural language.
- Code World Models translate game rules into Python for AI solvers.
- GameCWMs implement rules, actions, state transitions, observations, and rewards.
- Current approaches rely on frontier models and inference-time refinement.
- This work distills GameCWM generation into smaller models via post-training.
- A curated dataset of 30 games was introduced.
- Games span perfect and imperfect information.
- The goal is to improve the accessibility and scalability of environment generation.