Goal Generalisation in Sequential Reinforcement Learning Studied
Researchers have investigated how reinforcement learning agents generalise goals to novel environments after sequential training. The study analyzed over 100 sequential training pipelines and evaluated behavior across more than 250 out-of-distribution environments. Two key findings emerge: salient features drive generalisation, and goals learned early in training can persist and influence later learning. To explain these phenomena, the authors introduce latent policy gradients, a method that predicts out-of-distribution behavior by simulating how low-dimensional latent variables evolve during training as the agent learns to achieve high reward. The research addresses the lack of a principled understanding of unintended goal-directed behavior outside the training distribution.
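The idea of simulating latent variables under reward-driven updates can be illustrated with a toy sketch. The Python below is not the authors' method: the two latents, the salience weights, the sigmoid reward model, and every function name are assumptions chosen purely for illustration. Each latent stands for how strongly the policy relies on one feature, and gradient ascent on expected reward moves the latents during each training stage.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def expected_reward(z, salience, predictive):
    # Assumed toy reward model: the chance of picking the rewarded action is a
    # sigmoid of the salience-weighted latents of features that predict reward.
    logit = sum(s * zi * p for zi, s, p in zip(z, salience, predictive))
    return sigmoid(logit)

def simulate_stage(z, salience, predictive, steps=200, lr=0.5):
    """Run gradient ascent on expected reward with respect to the latents z."""
    z = list(z)
    for _ in range(steps):
        r = expected_reward(z, salience, predictive)
        # d sigmoid(logit) / d z_i = r * (1 - r) * salience_i * predictive_i
        grad = [r * (1.0 - r) * s * p for s, p in zip(salience, predictive)]
        z = [zi + lr * g for zi, g in zip(z, grad)]
    return z

def ood_preference(z, salience):
    """Index of the feature the toy policy follows when features conflict."""
    strengths = [s * zi for zi, s in zip(z, salience)]
    return max(range(len(z)), key=lambda i: strengths[i])

# Hypothetical setup: feature 0 is more salient than feature 1.
salience = [1.0, 0.5]

# Single stage in which both features predict reward.
z_joint = simulate_stage([0.0, 0.0], salience, predictive=[1, 1])

# Sequential pipeline: only feature 1 predicts reward at first,
# then both features do.
z_seq = simulate_stage([0.0, 0.0], salience, predictive=[0, 1])
z_seq = simulate_stage(z_seq, salience, predictive=[1, 1])

print("joint:", z_joint, "follows feature", ood_preference(z_joint, salience))
print("sequential:", z_seq, "follows feature", ood_preference(z_seq, salience))
```

In this toy setup the single-stage run ends up relying on the more salient feature, whereas the sequential run keeps relying on the feature it learned first, because reward is already nearly saturated when the second stage begins. This is only a loose analogue of the two findings described above.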
Key facts
- arXiv:2605.23565v1
- Over 100 sequential training pipelines studied
- Behavior evaluated across over 250 out-of-distribution environments
- Salient features drive generalisation
- Early learned goals persist and influence later goals
- Latent policy gradients method introduced
- Method simulates evolution of low-dimensional latent variables
- Addresses lack of principled understanding of out-of-distribution generalisation