V-VLAPS: Value-Guided Planning Improves VLA Models for Robotics

ai-technology · 2026-05-25

Researchers introduce V-VLAPS (Value-Guided Vision-Language-Action Planning and Search), a method that enhances vision-language-action (VLA) models for robotic manipulation by adding a lightweight value head trained on offline rollouts. The value head predicts Monte Carlo returns, and its predictions guide Monte Carlo Tree Search (MCTS) toward higher-value branches. The approach targets failures of reactive VLA policies under distribution shift and on long-horizon tasks. Prior planning methods relied on policy priors and visit-count exploration, with no learned value signal to correct poor policy actions. The work builds on findings that VLA representations encode rollout success and failure, which makes value estimation during planning possible.
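To make "predicts Monte Carlo returns" concrete, the sketch below shows one common way to turn an offline rollout into regression targets for a value head. It is illustrative only: the function name `monte_carlo_returns`, the discount factor `gamma`, and the sparse success reward in the usage lines are assumptions, not details taken from the paper.

```python
from typing import List


def monte_carlo_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Return the discounted return-to-go G_t = r_t + gamma * G_{t+1} per step.

    Hypothetical helper: the summary does not state the reward definition
    or discount used by V-VLAPS.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk the rollout backwards so each step reuses the return of its successor.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Assumed sparse reward: 1.0 only at the final step of a successful rollout.
targets = monte_carlo_returns([0.0, 0.0, 1.0], gamma=0.5)
print(targets)  # [0.25, 0.5, 1.0]
```

Under these assumptions, each state visited in a rollout would be paired with its return-to-go, and the value head would be fit to those pairs by ordinary regression.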

Key facts

V-VLAPS augments VLA-guided planning with a lightweight value head
Value head is trained on offline VLA rollouts to predict Monte Carlo returns
Predictions guide Monte Carlo Tree Search toward higher-value branches
Reactive VLA policies fail under distribution shift and on long-horizon tasks
Prior planning methods lacked learned value signals to correct poor policy actions
VLA representations encode rollout success and failure information
Method aims to improve robotic manipulation execution
Paper available on arXiv with ID 2601.00969
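
As a rough illustration of how a learned value signal can steer Monte Carlo Tree Search away from a poor policy prior, the sketch below uses an AlphaZero-style PUCT selection rule. This is a stand-in, not the paper's confirmed formula: the `Child` class, `puct_score`, `select`, and the constant `c_puct` are hypothetical names chosen for this example.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Child:
    prior: float             # action probability from the VLA policy (assumed normalized)
    visits: int = 0          # how often search has descended through this child
    value_sum: float = 0.0   # sum of value estimates backed up through this child

    def mean_value(self) -> float:
        # Unvisited children default to 0.0 (an assumption of this sketch).
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(child: Child, parent_visits: int, c_puct: float = 1.0) -> float:
    # Exploration bonus driven by the policy prior and visit counts.
    explore = c_puct * child.prior * math.sqrt(parent_visits) / (1 + child.visits)
    # The mean value term is where a learned value estimate enters.
    return child.mean_value() + explore


def select(children: List[Child], c_puct: float = 1.0) -> int:
    parent_visits = sum(c.visits for c in children)
    return max(
        range(len(children)),
        key=lambda i: puct_score(children[i], parent_visits, c_puct),
    )


# Without value information, the higher prior wins.
no_value = [Child(prior=0.7, visits=4), Child(prior=0.3, visits=4)]
# With value estimates, the low-prior but high-value branch is preferred.
with_value = [
    Child(prior=0.7, visits=4, value_sum=0.4),
    Child(prior=0.3, visits=4, value_sum=3.6),
]
print(select(no_value), select(with_value))  # 0 1
```

The two calls differ only in the backed-up values, which is the point of the example: with prior and visit counts alone the search keeps following the policy's preferred action, while a value term lets it move to the branch that is expected to succeed.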
