ARTFEED — Contemporary Art Intelligence

RAW-Dream: Task-Agnostic World Models for VLA Reinforcement Learning

ai-technology · 2026-05-13

A new arXiv preprint (2605.12334) introduces RAW-Dream (Reinforcing VLAs in task-Agnostic World Dreams), a paradigm for training Vision-Language-Action (VLA) models via reinforcement learning inside world models. The method addresses a scalability limitation of existing approaches, which require task-specific data to fine-tune both the world model and the reward model. RAW-Dream disentangles world-model learning from downstream tasks: a world model pre-trained on diverse, task-free behaviors handles trajectory prediction, while an off-the-shelf Vision-Language Model (VLM) generates rewards. This enables zero-shot inference on unseen tasks and reduces reliance on costly real-world interactions.
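The training loop this describes can be sketched in a few lines. Everything below is a minimal illustrative mock-up, not the paper's implementation: `world_model_step`, `vlm_reward`, `dream_rollout`, and the toy linear dynamics are hypothetical placeholders standing in for the pre-trained world model, the off-the-shelf VLM scorer, and the VLA policy.

```python
def world_model_step(state, action):
    """Stand-in for the task-agnostic world model pre-trained on
    task-free behaviors: predicts the next state from (state, action).
    Toy dynamics for illustration only."""
    return [s + a for s, a in zip(state, action)]

def vlm_reward(state, task_prompt):
    """Stand-in for the off-the-shelf VLM reward generator: scores how
    well a predicted state satisfies the language task description.
    Here we pretend the VLM grounded the prompt to a fixed goal point."""
    goal = [1.0, 1.0]  # hypothetical grounding of task_prompt
    return -sum((s - g) ** 2 for s, g in zip(state, goal))

def dream_rollout(policy, init_state, task_prompt, horizon=5):
    """Roll out a VLA policy entirely inside the world model (the
    'dream'), collecting VLM-generated rewards -- no real-world
    interaction and no task-specific fine-tuning data needed."""
    state, trajectory = init_state, []
    for _ in range(horizon):
        action = policy(state, task_prompt)
        state = world_model_step(state, action)
        trajectory.append((action, state, vlm_reward(state, task_prompt)))
    return trajectory

# Toy policy: always steps a fixed amount toward the goal.
toy_policy = lambda state, prompt: [0.2, 0.2]
traj = dream_rollout(toy_policy, [0.0, 0.0], "move the gripper to the target")
```

The resulting `trajectory` of (action, predicted state, reward) tuples is what an RL algorithm would consume to update the policy; the key point is that both the transition model and the reward signal come from pre-trained, task-agnostic components.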

Key facts

  • arXiv preprint 2605.12334 proposes RAW-Dream.
  • RAW-Dream stands for Reinforcing VLAs in task-Agnostic World Dreams.
  • It uses a world model pre-trained on task-free behaviors.
  • Reward generation employs an off-the-shelf VLM.
  • Aims to enable zero-shot inference on unseen tasks.
  • Reduces sample complexity of policy training.
  • Disentangles world model learning from downstream task dependencies.
  • Addresses scalability limitations of existing VLA fine-tuning methods.

Entities

Institutions

  • arXiv

Sources