Dream-MPC: Gradient-Based Model Predictive Control with Latent Imagination
A new reinforcement learning approach, Dream-MPC, combines gradient-based trajectory optimization with model predictive control (MPC). The method generates a small number of candidate trajectories from a policy prior and refines them via gradient ascent through a learned world model, using uncertainty regularization and amortization. This addresses the computational expense of gradient-free population-based methods, which have previously outperformed gradient-based alternatives on high-dimensional control tasks. The work is published on arXiv under identifier 2605.04568.
Key facts
- Dream-MPC is a novel approach combining gradient-based optimization with MPC.
- It generates a small number of candidate trajectories by rolling out a policy prior.
- Each trajectory is optimized by gradient ascent using a learned world model.
- The method includes uncertainty regularization and amortization.
- It aims to reduce computational cost compared to gradient-free population-based methods.
- Gradient-free methods have empirically outperformed gradient-based ones in prior work.
- The paper is available on arXiv with ID 2605.04568.
- The approach targets high-dimensional control tasks.
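The propose-then-refine loop described above can be sketched in a few lines. Everything below is an illustrative stand-in, not the paper's actual method: a linear world model with a quadratic reward replaces the learned world model so that the gradient of the imagined return can be computed exactly with an adjoint (backward) pass, and the "policy prior" is just a handful of random action sequences. Uncertainty regularization and amortization are omitted.

```python
import numpy as np

def imagined_return(s0, actions, A, B, c=0.1):
    """Roll the (toy, linear) world model forward and sum rewards
    r_t = -||s_{t+1}||^2 - c*||a_t||^2."""
    s, total = s0, 0.0
    for a in actions:
        s = A @ s + B @ a
        total += -(s @ s) - c * (a @ a)
    return total

def return_gradient(s0, actions, A, B, c=0.1):
    """Exact gradient of the imagined return w.r.t. each action,
    computed by a reverse (adjoint) pass through the rollout."""
    H = len(actions)
    states = [s0]
    for a in actions:
        states.append(A @ states[-1] + B @ a)
    grads = np.zeros_like(actions)
    g_s = np.zeros_like(s0)                    # dJ/ds beyond the horizon is zero
    for t in reversed(range(H)):
        g_next = -2.0 * states[t + 1] + g_s    # dJ/ds_{t+1}
        grads[t] = B.T @ g_next - 2.0 * c * actions[t]
        g_s = A.T @ g_next                     # propagate adjoint one step back
    return grads

def refine(s0, actions, A, B, steps=50, lr=0.05):
    """Gradient-ascent refinement of one candidate action sequence."""
    actions = actions.copy()
    for _ in range(steps):
        actions += lr * return_gradient(s0, actions, A, B)
    return actions

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])   # toy double-integrator dynamics
B = np.array([[0.0], [0.1]])
s0 = np.array([1.0, 0.0])

# "Policy prior": a few candidate action sequences over a short horizon.
candidates = [rng.normal(size=(10, 1)) for _ in range(4)]
refined = [refine(s0, a, A, B) for a in candidates]
best = max(refined, key=lambda a: imagined_return(s0, a, A, B))
```

Because the toy objective is concave and the step size is small, each gradient step increases the imagined return, so every refined candidate scores at least as well as its initial proposal; the best refined sequence would then be executed MPC-style.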
Entities
Institutions
- arXiv