WarmPrior Improves Robotic Manipulation with Temporal Priors
A new method called WarmPrior enhances generative policies for robotic control by replacing the standard Gaussian source distribution with a temporally grounded prior constructed from recent action history. This approach consistently improves success rates on manipulation tasks by straightening probability paths, similar to optimal-transport couplings in Rectified Flow. WarmPrior also reshapes exploration in prior-space reinforcement learning, boosting sample efficiency and final performance. The research identifies the source distribution as a key underexplored design axis in generative robot control.
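The summary does not spell out how the prior is built, so the following is only a minimal sketch of the general idea under stated assumptions: the temporally grounded source is taken to be a Gaussian centered on the most recently executed action, held constant across the prediction horizon. The function names (`gaussian_source`, `warm_source`, `mean_displacement`), the `sigma` parameter, and the toy sinusoidal trajectory are all hypothetical and are not taken from the WarmPrior work. The sketch compares how far each source sample sits from a smooth target action chunk, which is a rough proxy for how much transport a generative policy has to perform and not a measurement of path straightness itself.

```python
import numpy as np

def gaussian_source(n_samples, horizon, action_dim, rng):
    """Standard source: isotropic Gaussian noise, independent of any history."""
    return rng.standard_normal((n_samples, horizon, action_dim))

def warm_source(recent_actions, n_samples, horizon, sigma, rng):
    """Hypothetical temporally grounded source (an assumption, not the paper's recipe):
    Gaussian noise of scale sigma centered on the last executed action."""
    recent_actions = np.asarray(recent_actions, dtype=float)
    last = recent_actions[-1]                       # shape: (action_dim,)
    mean = np.tile(last, (horizon, 1))              # hold the last action over the horizon
    noise = rng.standard_normal((n_samples, horizon, last.shape[0]))
    return mean + sigma * noise

def mean_displacement(source, target):
    """Average Euclidean distance between source samples and the target chunk."""
    return float(np.mean(np.linalg.norm(source - target, axis=-1)))

# Toy smooth 2-D action trajectory: first half is history, second half is the target chunk.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 16)
trajectory = np.stack([np.sin(t), np.cos(t)], axis=1)
history, target = trajectory[:8], trajectory[8:]

gauss = gaussian_source(256, 8, 2, rng)
warm = warm_source(history, 256, 8, sigma=0.1, rng=rng)

print("Gaussian source displacement:", mean_displacement(gauss, target))
print("Warm source displacement:    ", mean_displacement(warm, target))
```

On smooth trajectories like this toy one, samples drawn around recent actions start closer to the next action chunk than zero-mean noise does, which is the intuition behind grounding the source in action history. How much of that carries over to a real policy depends on the actual construction used in the work.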
Key facts
- WarmPrior is a temporally grounded prior for generative policies
- It replaces the standard Gaussian source distribution
- Constructed from readily available recent action history
- Consistently improves success rates on robotic manipulation tasks
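- Straightens probability paths, echoing optimal-transport couplings in Rectified Flow
- Also reshapes the exploration distribution in prior-space reinforcement learning (RL)
- Improves both sample efficiency and final performance in that RL setting
- Identifies the source distribution as a key underexplored design axis in generative robot control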
Entities
Institutions
- arXiv