Probabilistic Latent Embeddings for Sim-to-Real RL Transfer

other · 2026-05-28

A new reinforcement learning framework combines probabilistic latent embeddings with dynamic policy adaptation to enable safe and efficient policy transfer from simulation to the real world. It targets the Sim2Real gap in cyber-physical systems such as autonomous vehicles, where zero-shot techniques often degrade performance or leave residual safety risks. The framework models a family of Constrained Markov Decision Processes (CMDPs) across different environment contexts and uses latent context variables from meta-RL to infer the current context, so the policy can be adjusted dynamically.
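
For orientation, a context-indexed CMDP family of this kind is commonly written as below. This is the textbook constrained-RL form, not notation taken from the paper: the symbols c (environment context), P_c (context-dependent dynamics), r (reward), k (safety cost), d (cost budget), and z (latent context variable inferred from the interaction history h_t) are all assumptions made for illustration.

```latex
% Assumed notation: each context c induces one CMDP with its own dynamics P_c.
\mathcal{M}_c = (\mathcal{S}, \mathcal{A}, P_c, r, k, d, \gamma)

% Maximize expected return over contexts while keeping the expected
% discounted safety cost within the budget d in every context.
\max_{\pi} \;
  \mathbb{E}_{c \sim p(c)} \,
  \mathbb{E}_{\tau \sim (\pi, P_c)}
  \Big[ \sum_{t=0}^{\infty} \gamma^{t} \, r(s_t, a_t) \Big]
\quad \text{s.t.} \quad
  \mathbb{E}_{\tau \sim (\pi, P_c)}
  \Big[ \sum_{t=0}^{\infty} \gamma^{t} \, k(s_t, a_t) \Big] \le d
  \quad \forall c

% The policy does not observe c directly; it conditions on a latent z
% drawn from a learned posterior over contexts.
a_t \sim \pi(\cdot \mid s_t, z), \qquad z \sim q(z \mid h_t)
```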

Key facts

Deep RL agents for cyber-physical systems are first trained in simulators due to limited resources and safety concerns.
The Sim2Real gap causes performance degradation or safety violations in real-world deployment.
Existing zero-shot approaches like robust safe RL and domain randomization mitigate the issue but at the cost of degraded performance or residual safety risks.
The proposed framework uses probabilistic latent embeddings and dynamic policy adaptation.
It considers a family of Constrained Markov Decision Processes (CMDPs) under different environment contexts.
The framework leverages latent context variables in meta-RL to infer environment contexts.
The paper is available on arXiv as arXiv:2605.27659v1.
The research focuses on safe and efficient policy transfer.
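
The summary does not describe how the latent context is inferred or how the policy adapts, so the following is only a minimal sketch of one common meta-RL pattern, not the paper's method. It assumes a one-dimensional Gaussian belief over the context, fuses per-transition evidence with a product of Gaussians, and shrinks the action when a pessimistic context estimate is high. The names `Gaussian`, `encode_transition`, `product_of_gaussians`, and `adapt_action`, the fixed encoder variance, and the "slip" signal are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Gaussian:
    """Gaussian belief over a 1-D latent context variable z (assumed form)."""
    mean: float
    var: float


def encode_transition(observed_slip: float) -> Gaussian:
    """Hypothetical stand-in for a learned encoder.

    Maps one observed transition feature to a Gaussian factor over z.
    A real encoder would be a neural network; the variance here is fixed.
    """
    return Gaussian(mean=observed_slip, var=0.5)


def product_of_gaussians(prior: Gaussian, factors: Sequence[Gaussian]) -> Gaussian:
    """Fuse the prior with independent Gaussian factors.

    Precisions (1 / var) add up, and the fused mean is the
    precision-weighted average of the individual means.
    """
    precision = 1.0 / prior.var
    weighted_mean = prior.mean / prior.var
    for factor in factors:
        precision += 1.0 / factor.var
        weighted_mean += factor.mean / factor.var
    var = 1.0 / precision
    return Gaussian(mean=weighted_mean * var, var=var)


def adapt_action(nominal_action: float, belief: Gaussian, caution: float = 1.0) -> float:
    """Scale the nominal action down as the pessimistic context estimate grows.

    The pessimistic estimate is mean + caution * std, so both a harsher
    inferred context and a more uncertain belief lead to a smaller action.
    """
    pessimistic_z = belief.mean + caution * math.sqrt(belief.var)
    return nominal_action / (1.0 + max(pessimistic_z, 0.0))


if __name__ == "__main__":
    prior = Gaussian(mean=0.0, var=1.0)
    # Three transitions observed after deployment (made-up values).
    factors = [encode_transition(s) for s in (0.8, 1.0, 1.2)]
    posterior = product_of_gaussians(prior, factors)
    print("posterior mean:", posterior.mean, "variance:", posterior.var)
    print("adapted action:", adapt_action(1.0, posterior))
```

Each new transition adds precision, so the belief tightens as real-world data arrives, and the action scaling reacts to both the inferred context and the remaining uncertainty. A constrained policy would typically take z as an input rather than rescaling a fixed action; the rescaling here only keeps the sketch short.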
