ARTFEED — Contemporary Art Intelligence

KL-Constrained Adversarial Curriculum Improves World Model Learning

ai-technology · 2026-05-20

Researchers propose PROWL, a method that improves world model learning by actively eliciting failures. A policy is trained to find high-error trajectories for a diffusion-based world model, and the world model is then fine-tuned on those trajectories. This KL-constrained adversarial loop converts rare failures into stable training signals without drifting out of distribution. It targets a weakness of passively collected data, which under-samples critical transitions.

Key facts

  • Modern video world models achieve short-horizon realism but fail on rare transitions.
  • Passive data under-samples high-impact regimes.
  • PROWL uses a KL-constrained adversarial curriculum.
  • A policy exposes high-error trajectories of a diffusion-based world model.
  • The world model is fine-tuned on adversarially discovered trajectories.
  • The method avoids out-of-distribution exploitation.
  • It converts rare failures into near-distribution training signals.
  • The approach maintains pressure on unresolved weaknesses.
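
The loop described in the key facts can be illustrated with a toy sketch. This is not PROWL's implementation: the diffusion-based world model is replaced by a four-entry lookup table, trajectories by single actions, and the KL constraint by a KL penalty with weight beta, for which the maximising policy has the standard closed form (reference probabilities times exp(error / beta), normalised). All names and numbers here (kl_regularized_adversary, run_toy_curriculum, true_effect, ref_probs) are hypothetical and chosen only for illustration.

```python
import numpy as np


def kl_regularized_adversary(errors, ref_probs, beta):
    """Policy maximising E_pi[error] - beta * KL(pi || ref).

    Closed form: pi is proportional to ref * exp(error / beta).
    """
    logits = np.log(ref_probs) + errors / beta
    logits = logits - logits.max()  # stabilise the softmax
    probs = np.exp(logits)
    return probs / probs.sum()


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions, skipping zero-mass entries of p."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def run_toy_curriculum(adversarial, rounds=50, beta=0.5, lr=0.1):
    # Assumed toy environment: each action has a fixed true effect.
    # Action 3 is the rare, high-impact transition.
    true_effect = np.array([1.0, -1.0, 0.5, 4.0])
    # Assumed passive data distribution: action 3 is seen 1% of the time.
    ref_probs = np.array([0.45, 0.35, 0.19, 0.01])
    model = np.zeros(4)  # toy "world model": predicted effect per action
    kl_trace = []
    for _ in range(rounds):
        errors = (model - true_effect) ** 2
        if adversarial:
            # The adversary shifts mass toward high-error actions,
            # but the KL penalty ties it to the passive distribution.
            weights = kl_regularized_adversary(errors, ref_probs, beta)
        else:
            weights = ref_probs  # passive baseline
        kl_trace.append(kl_divergence(weights, ref_probs))
        # Expected gradient step on squared error under the sampling weights.
        model = model - lr * 2.0 * (model - true_effect) * weights
    final_errors = (model - true_effect) ** 2
    return final_errors, kl_trace


if __name__ == "__main__":
    passive_errors, _ = run_toy_curriculum(adversarial=False)
    adv_errors, adv_kl = run_toy_curriculum(adversarial=True)
    print("passive error on rare action:    ", passive_errors[3])
    print("adversarial error on rare action:", adv_errors[3])
    print("largest KL from passive data:    ", max(adv_kl))
```

In this toy setting, the passive baseline barely updates the rare action, while the adversary concentrates training on it for as long as its error stays high and relaxes back toward the passive distribution once the error shrinks. A larger beta keeps the policy closer to the passive data at the cost of weaker pressure on the rare case.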

Entities

Institutions

  • arXiv

Sources