Path-Coupled Bellman Flows for Distributional RL
Path-Coupled Bellman Flows (PCBF) is a flow-based distributional RL method designed to address two problems in existing flow-based approaches: boundary mismatch and high-variance bootstrapping. PCBF builds source-consistent, Bellman-coupled paths: the current path starts from the base prior at t=0, reaches the Bellman target at t=1, and remains pathwise affine in the successor flow at intermediate times. The current and successor return flows are coupled through shared base noise, and the method uses a λ-parameterized control-variate target. Notably, the construction does not require the time-t marginals to satisfy a distributional Bellman fixed point for every t. The paper is available on arXiv as 2605.08253.
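The path construction described above can be made concrete with a small numerical sketch. It assumes a straight-line (linear interpolation) path between base noise and target, which is a common flow-matching choice but is not confirmed by the source as PCBF's exact path; the function names, the stand-in successor sample `g_next`, and the values of `reward` and `gamma` are all hypothetical. Under that assumption, the current path starts at the base noise, ends at the Bellman target, and is an affine function of the successor path at every intermediate t.

```python
import numpy as np

def successor_path(z, g_next, t):
    """Straight-line path from base noise z (t=0) to a successor return sample g_next (t=1)."""
    return (1.0 - t) * z + t * g_next

def bellman_coupled_path(z, g_next, reward, gamma, t):
    """Straight-line path from the SAME base noise z (t=0) to the Bellman target r + gamma * g_next (t=1)."""
    return (1.0 - t) * z + t * (reward + gamma * g_next)

rng = np.random.default_rng(0)
z = rng.standard_normal(5)      # shared base noise for both paths
g_next = 2.0 + 0.5 * z          # hypothetical stand-in for a successor flow output driven by z
reward, gamma = 1.0, 0.9        # illustrative values only

# Boundary conditions: base prior at t=0, Bellman target at t=1.
x0 = bellman_coupled_path(z, g_next, reward, gamma, 0.0)
x1 = bellman_coupled_path(z, g_next, reward, gamma, 1.0)

# Intermediate time: the current path is affine in the successor path y_t.
t = 0.3
x_t = bellman_coupled_path(z, g_next, reward, gamma, t)
y_t = successor_path(z, g_next, t)
affine = gamma * y_t + t * reward + (1.0 - gamma) * (1.0 - t) * z
```

In this sketch `x0` equals `z`, `x1` equals `reward + gamma * g_next`, and `x_t` matches `affine`, which follows from expanding both sides of the linear interpolation. The affine identity depends on reusing the same `z` for both paths; with independent noise it would not hold samplewise.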
Key facts
- PCBF is a continuous-time distributional RL (DRL) method
- Addresses boundary mismatch and high-variance bootstrapping
- Uses source-consistent Bellman-coupled paths
- Current path starts from base prior at t=0
- Reaches Bellman target at t=1
- Maintains affine relation to successor flow at intermediate times
- Couples flows through shared base noise
- Uses λ-parameterized control-variate target
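The last two facts, shared base noise and a λ-parameterized control variate, can be illustrated with the textbook control-variate estimator. The source does not spell out the form of PCBF's target, so the following is only a generic sketch of the idea, not the paper's construction: `cv_estimate`, the toy quantities `f_vals` and `c_vals`, and the coefficient values are assumptions made for illustration. A control with a known mean is subtracted with weight λ; because both quantities are driven by the same noise, they are correlated and the variance drops while the mean is unchanged.

```python
import numpy as np

def cv_estimate(f_vals, c_vals, c_mean, lam):
    """Per-sample control-variate estimate: f - lam * (c - E[c])."""
    return f_vals - lam * (c_vals - c_mean)

rng = np.random.default_rng(0)
n = 100_000
z = rng.standard_normal(n)          # shared noise driving both quantities
eps = rng.standard_normal(n)        # independent residual noise

f_vals = 1.0 + 0.9 * z + 0.1 * eps  # noisy quantity of interest, true mean 1.0 (toy example)
c_vals = z                          # control with known mean 0, correlated with f through z

plain = cv_estimate(f_vals, c_vals, 0.0, lam=0.0)      # lam = 0: no correction
corrected = cv_estimate(f_vals, c_vals, 0.0, lam=0.9)  # lam = 0.9: cancels the shared-noise term

plain_var = plain.var()
corrected_var = corrected.var()
```

Both estimators target the same mean, but `corrected_var` is far smaller than `plain_var` because the shared-noise component cancels. Setting λ between 0 and the variance-optimal value trades off between the uncorrected estimate and the fully corrected one.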
Entities
Institutions
- arXiv