ARTFEED — Contemporary Art Intelligence

Lifted World Models Enable High-Level Planning for Embodied Agents

ai-technology · 2026-04-30

The framework introduced in arXiv preprint 2604.26182 proposes lifted world models, which translate high-level actions into sequences of low-level joint actions to enable efficient planning for complex embodiments such as humanoid agents. Conventional world models predict future observations conditioned on actions, but high-dimensional action spaces (e.g., controlling every joint of a humanoid) scale poorly with search-based planners such as the cross-entropy method (CEM). The approach trains a lightweight policy that composes with a frozen world model, yielding a lifted model that predicts the future observation resulting from a single high-level action. For humanoid embodiments, the high-level action space is a small set of 2D waypoints marked on the current observation frame, each specifying a near-term target position for a leaf joint (pelvis), which simplifies planning and improves the scalability of control.
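The composition described above can be sketched in a few lines. This is an illustrative assumption of the interfaces, not the paper's API: `world_model_step` stands in for a frozen low-level world model, `policy` for the lightweight trained policy, and the fixed `horizon` for the number of low-level steps one high-level waypoint covers.

```python
import numpy as np

class LiftedWorldModel:
    """Sketch of a lifted world model: a lightweight policy translates one
    high-level action (a 2D waypoint) into low-level joint actions, which
    are rolled out through a frozen low-level world model. All names and
    shapes here are illustrative assumptions."""

    def __init__(self, world_model_step, policy, horizon=8):
        self.world_model_step = world_model_step  # frozen: (obs, joint_action) -> next_obs
        self.policy = policy                      # trained: (obs, waypoint) -> joint_action
        self.horizon = horizon                    # low-level steps per high-level action

    def step(self, obs, waypoint):
        """Predict the observation after executing one high-level waypoint."""
        for _ in range(self.horizon):
            joint_action = self.policy(obs, waypoint)
            obs = self.world_model_step(obs, joint_action)
        return obs
```

A planner can then treat `step` as a single-transition model over the small waypoint space, never touching the full joint-action space directly.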

Key facts

  • arXiv preprint 2604.26182 introduces lifted world models for planning and control.
  • World models predict future observations conditioned on agent actions.
  • High-dimensional action spaces (e.g., human joint control) make planning expensive.
  • A lightweight policy maps high-level actions to sequences of low-level joint actions.
  • The policy composes with a frozen world model to create a lifted world model.
  • High-level actions are defined as 2D waypoints on the current observation frame.
  • Each waypoint specifies a near-term goal position for a leaf joint (pelvis).
  • The framework aims to improve scalability of search-based planning methods like CEM.
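To make the scalability point concrete, the facts above can be illustrated with a minimal cross-entropy method (CEM) planner searching over sequences of 2D waypoints rather than per-joint commands. Everything here (`lifted_step`, `cost_fn`, population sizes) is a hedged, generic CEM sketch, not the paper's implementation:

```python
import numpy as np

def cem_plan(lifted_step, obs, cost_fn, plan_len=4, pop=64, elites=8, iters=5):
    """Generic CEM over a sequence of 2D waypoints (high-level actions).

    lifted_step: (obs, waypoint) -> predicted next obs (the lifted model)
    cost_fn:     obs -> scalar cost to minimize
    Returns the first waypoint of the refined mean plan. Illustrative only.
    """
    mu = np.zeros((plan_len, 2))     # mean waypoint sequence
    sigma = np.ones((plan_len, 2))   # per-waypoint sampling std
    for _ in range(iters):
        # Sample candidate waypoint sequences around the current mean.
        samples = mu + sigma * np.random.randn(pop, plan_len, 2)
        costs = []
        for plan in samples:
            o, c = obs, 0.0
            for wp in plan:           # one lifted transition per waypoint
                o = lifted_step(o, wp)
                c += cost_fn(o)
            costs.append(c)
        # Refit the distribution to the lowest-cost (elite) plans.
        elite = samples[np.argsort(costs)[:elites]]
        mu, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mu[0]
```

Because each sample is a short sequence of 2D points instead of a long sequence of full joint vectors, the search distribution stays low-dimensional, which is the scalability gain the framework targets.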

Entities

Institutions

  • arXiv

Sources