Target-Aligned Generation Boosts Cross-Domain Offline RL

other · 2026-05-14

Researchers have introduced Target-aligned Coverage Expansion (TCE), a framework that addresses distributional mismatch in cross-domain offline reinforcement learning when the source and target environments are not aligned. TCE uses a dual score-based generative model to produce transitions that are consistent with the target domain, and it chooses between directly incorporating target-near transitions and expanding coverage through generation. The reported experiments show TCE outperforming state-of-the-art baselines across a range of cross-domain settings.

Key facts

Cross-domain offline RL adapts a policy from a source domain to a target domain using pre-collected datasets.
Environment dynamics may differ between the source and target domains.
The key challenge is reducing distributional mismatch when target data is limited.
The TCE framework uses a dual score-based generative model for target-aligned generation.
TCE decides between directly incorporating target-near transitions and expanding coverage through generation.
TCE consistently outperforms state-of-the-art cross-domain offline RL baselines.
Extensive experiments across diverse cross-domain environments were conducted.
The approach is guided by theoretical analysis.
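
The choice between direct incorporation and generation can be pictured with a toy routing rule. The sketch below is only an illustration under loud assumptions: the Euclidean nearest-neighbor distance, the `threshold` parameter, and the names `route_transitions` and `toy_generate` are all hypothetical and are not taken from the TCE work. In particular, `toy_generate` is a deterministic stand-in for the dual score-based generative model, whose details are not described here.

```python
import math


def nearest_target(x, target):
    """Return the target transition closest to x (Euclidean distance)."""
    return min(target, key=lambda t: math.dist(x, t))


def toy_generate(x, target):
    """Placeholder for a generative model (assumption, not TCE's model):
    move x halfway toward its nearest target transition."""
    t = nearest_target(x, target)
    return tuple(0.5 * (a + b) for a, b in zip(x, t))


def route_transitions(source, target, threshold, generate):
    """Hypothetical routing rule: keep source transitions that already lie
    near the target data, and replace the rest with generated ones."""
    kept, generated = [], []
    for x in source:
        if math.dist(x, nearest_target(x, target)) <= threshold:
            kept.append(x)  # target-near: incorporate directly
        else:
            generated.append(generate(x, target))  # far: expand coverage
    return list(target) + kept + generated, len(kept), len(generated)


# Transitions are flattened to plain tuples of floats for this toy example.
target = [(0.0, 0.0), (1.0, 1.0)]
source = [(0.1, 0.0), (5.0, 5.0)]
data, n_kept, n_generated = route_transitions(source, target, 0.5, toy_generate)
print(n_kept, n_generated, data)
```

In this toy run the first source transition is kept as is because it lies within the threshold of a target point, while the second is replaced by a generated one. How TCE actually measures closeness and makes this decision is determined by its theoretical analysis, which this summary does not detail.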
