ExpertGen Framework Automates Expert Policy Learning for Robotics via Sim-to-Real Transfer
The ExpertGen framework automates expert policy learning in simulation so that policies can transfer from simulation to real robots at scale. It targets a data bottleneck in robotics: large, high-quality datasets are hard to collect because human demonstrations gathered through teleoperation are too expensive to scale in the real world. ExpertGen starts from a behavior prior, a diffusion policy trained on imperfect demonstrations that can be synthesized by large language models or provided by humans. Reinforcement learning then steers this prior toward higher task success by optimizing the diffusion model's initial noise while the original policy stays frozen. Keeping the pretrained diffusion policy frozen regularizes exploration so that it stays within safe, human-like behavior manifolds. According to the summary, this allows robust, generalizable behavior cloning policies to be developed efficiently. The framework is described in the paper "ExpertGen: Scalable Sim-to-Real Expert Policy Learning from Imperfect Behavior Priors," arXiv identifier 2603.15956v2. The announcement type is replace-cross, which marks a revised version of a paper that is cross-listed on the arXiv preprint server.
Key facts
- ExpertGen automates expert policy learning in simulation for scalable sim-to-real transfer.
- Human demonstrations via teleoperation are expensive to acquire at scale in the real world.
- The framework uses a diffusion policy trained on imperfect demonstrations as a behavior prior.
- Imperfect demonstrations can be synthesized by large language models or provided by humans.
- Reinforcement learning optimizes the diffusion model's initial noise to steer toward high task success.
- The original diffusion policy remains frozen during reinforcement learning.
- Freezing the policy regularizes exploration within safe, human-like behavior manifolds.
- The paper is available on arXiv with identifier 2603.15956v2 and announcement type replace-cross.
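As a rough illustration of the noise-steering idea in the facts above, the toy Python sketch below optimizes only the distribution of the initial noise while the noise-to-action map stays fixed. It is not the ExpertGen implementation. The frozen policy is a hypothetical fixed tanh map standing in for a pretrained diffusion policy with deterministic sampling, the target and reward are invented for the example, and a plain REINFORCE update on the mean of a Gaussian over the noise stands in for whichever reinforcement learning algorithm the paper uses.

```python
import numpy as np

# Hypothetical frozen behavior prior: a fixed map from initial noise to an action.
# It stands in for a pretrained diffusion policy sampled deterministically.
W = np.array([[0.8, -0.3], [0.2, 0.6]])
B = np.array([0.1, -0.2])
W.setflags(write=False)  # make "frozen" literal: any write raises an error
B.setflags(write=False)

TARGET = np.array([0.6, 0.4])  # invented task goal in action space


def frozen_policy(noise):
    """Decode initial noise into a bounded action; W and B are never updated."""
    return np.tanh(noise @ W.T + B)


def reward(action):
    """Toy dense reward: negative squared distance to the invented goal."""
    return -np.sum((action - TARGET) ** 2, axis=-1)


def steer_noise(rng, iterations=300, batch=64, sigma=0.5, lr=0.1):
    """Learn the mean of a Gaussian over the initial noise with REINFORCE."""
    mu = np.zeros(2)
    for _ in range(iterations):
        # Sample candidate initial noises around the current mean.
        z = mu + sigma * rng.standard_normal((batch, 2))
        r = reward(frozen_policy(z))
        # Subtract the batch mean as a baseline to reduce variance.
        advantage = r - r.mean()
        # Score-function gradient of log N(z; mu, sigma^2 I) w.r.t. mu.
        grad = (advantage[:, None] * (z - mu)).mean(axis=0) / sigma**2
        mu += lr * grad
    return mu


rng = np.random.default_rng(0)
mu = steer_noise(rng)
print("reward with unsteered noise mean:", reward(frozen_policy(np.zeros(2))))
print("reward with steered noise mean:  ", reward(frozen_policy(mu)))
```

Because only `mu` is updated, every action the steered policy produces is still an output of the frozen map. This is the sense in which the frozen prior constrains exploration in this toy setting.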
Entities
Institutions
- arXiv