Probabilistic Latent Embeddings for Sim-to-Real RL Transfer
A reinforcement learning framework combines probabilistic latent embeddings with dynamic policy adaptation to enable safe and efficient policy transfer from simulation to the real world. It targets the Sim2Real gap in cyber-physical systems such as autonomous vehicles, where zero-shot techniques often sacrifice performance or leave residual safety risks. The framework models a family of Constrained Markov Decision Processes (CMDPs), one per environment context, and uses latent context variables in meta-RL to infer the current context so that the policy can be adjusted dynamically.
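To make "probabilistic latent embedding" concrete, here is a minimal sketch of one common way to build such an embedding in meta-RL: each observed transition contributes an independent Gaussian factor over the latent context, and the factors are multiplied into a single Gaussian posterior. Whether the paper uses this product-of-Gaussians construction is an assumption; the function names `gaussian_product` and `sample_context` are hypothetical, and the per-transition means and variances would normally come from a learned encoder that is not shown here.

```python
import numpy as np

def gaussian_product(mus, sigmas_sq):
    """Fuse independent Gaussian factors N(mu_i, sigma_i^2) into one Gaussian.

    mus, sigmas_sq: arrays of shape (num_transitions, latent_dim).
    Returns the mean and variance of the (normalized) product.
    Assumption: factors are diagonal Gaussians, one per observed transition.
    """
    mus = np.asarray(mus, dtype=float)
    sigmas_sq = np.asarray(sigmas_sq, dtype=float)
    precision = np.sum(1.0 / sigmas_sq, axis=0)      # precisions add
    var = 1.0 / precision
    mu = var * np.sum(mus / sigmas_sq, axis=0)       # precision-weighted mean
    return mu, var

def sample_context(mu, var, rng):
    """Draw one latent context z ~ N(mu, var) to condition the policy on."""
    return mu + np.sqrt(var) * rng.standard_normal(mu.shape)

# Two transitions, one latent dimension (illustrative numbers only).
mu, var = gaussian_product([[0.0], [2.0]], [[1.0], [1.0]])
z = sample_context(mu, var, np.random.default_rng(0))
```

The posterior variance shrinks as more transitions are observed, which is one way an agent can start cautious in an unfamiliar environment and adapt as evidence about the context accumulates.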
Key facts
- Deep RL agents for cyber-physical systems are first trained in simulators due to limited resources and safety concerns.
- The Sim2Real gap causes performance degradation or safety violations in real-world deployment.
- Existing zero-shot approaches like robust safe RL and domain randomization mitigate the issue but at the cost of degraded performance or residual safety risks.
- The proposed framework uses probabilistic latent embeddings and dynamic policy adaptation.
- It considers a family of Constrained Markov Decision Processes (CMDPs) under different environment contexts.
- The framework leverages latent context variables in meta-RL to infer environment contexts.
- The paper is available on arXiv as arXiv:2605.27659v1.
- The research focuses on safe and efficient policy transfer.
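The family of CMDPs mentioned above can be pictured with a small sketch. A CMDP asks for maximal expected discounted return subject to an upper limit on expected discounted cost; here each environment context gets its own instance. The class `ContextCMDP`, its fields, and the `evaluate` helper are hypothetical names chosen for illustration, and the example context parameters are assumptions, not details from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ContextCMDP:
    context: tuple       # hypothetical environment parameters, e.g. (friction, mass)
    gamma: float         # discount factor
    cost_budget: float   # upper limit on expected discounted cost

def discounted_sum(values, gamma):
    """Return sum_t gamma^t * values[t], computed backwards for stability."""
    total = 0.0
    for v in reversed(values):
        total = v + gamma * total
    return total

def evaluate(cmdp, episodes):
    """Estimate return and cost from sampled episodes and check the constraint.

    episodes: list of (rewards, costs) pairs, one pair of lists per episode.
    Returns (mean discounted return, mean discounted cost, constraint satisfied).
    """
    n = len(episodes)
    mean_return = sum(discounted_sum(r, cmdp.gamma) for r, _ in episodes) / n
    mean_cost = sum(discounted_sum(c, cmdp.gamma) for _, c in episodes) / n
    return mean_return, mean_cost, mean_cost <= cmdp.cost_budget

# One member of the family, with illustrative numbers only.
icy_road = ContextCMDP(context=(0.2, 1500.0), gamma=0.5, cost_budget=1.0)
ret, cost, safe = evaluate(icy_road, [([1.0, 1.0], [0.0, 2.0])])
```

A policy that satisfies the cost budget in one context may violate it in another, which is why inferring the context before acting matters for safe transfer.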
Entities
Institutions
- arXiv