Goal Generalisation in Sequential Reinforcement Learning Studied

other · 2026-05-25

Researchers have investigated how reinforcement learning agents generalise goals to novel environments after sequential training. The study analysed over 100 sequential training pipelines and evaluated behaviour in more than 250 out-of-distribution environments. Two findings stand out: salient features drive generalisation, and goals learned early in training can persist and influence later learning. To explain these phenomena, the authors introduce latent policy gradients, a method that predicts out-of-distribution behaviour by simulating how low-dimensional latent variables evolve during training in pursuit of high reward. The work addresses a gap in the principled understanding of unintended goal-directed behaviour outside the training distribution.
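
The summary does not give the paper's actual formulation, so the following is only a toy sketch of the general idea, not the authors' method. It assumes a two-dimensional latent variable holding logits over two hypothetical candidate goals (A and B) and follows the exact gradient of expected reward through two training stages. All names, reward values, and hyperparameters are illustrative assumptions.

```python
import math

def softmax(logits):
    """Convert latent logits into probabilities of pursuing each goal."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_reward_gradient(logits, rewards):
    """Gradient of J(z) = sum_i p_i(z) * r_i with respect to the logits z."""
    probs = softmax(logits)
    j = sum(p * r for p, r in zip(probs, rewards))
    return [p * (r - j) for p, r in zip(probs, rewards)]

def simulate(stages, logits=(0.0, 0.0), lr=0.5):
    """Run gradient ascent on the latent logits through each stage in order.

    Each stage is (rewards, steps); rewards[i] is the assumed reward for
    pursuing goal i in that stage's training environments.
    """
    z = list(logits)
    for rewards, steps in stages:
        for _ in range(steps):
            grad = expected_reward_gradient(z, rewards)
            z = [zi + lr * gi for zi, gi in zip(z, grad)]
    return softmax(z)

# Assumed stage 1: only goal A is rewarded.
stage_a_only = ((1.0, 0.0), 200)
# Assumed stage 2: A and B coincide in training, so both earn full reward.
stage_ambiguous = ((1.0, 1.0), 200)

# Predicted goal probabilities in a novel environment where A and B separate.
sequential = simulate([stage_a_only, stage_ambiguous])
ambiguous_only = simulate([stage_ambiguous])

print("sequential training:", sequential)
print("ambiguous stage only:", ambiguous_only)
```

In this toy setting the second stage gives no gradient signal to distinguish the two goals, so the preference for goal A acquired in the first stage carries over unchanged, while an agent trained only on the ambiguous stage remains indifferent. That mirrors, in a very simplified way, the reported finding that goals learned early can persist.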

Key facts

arXiv:2605.23565v1
Over 100 sequential training pipelines studied
Behaviour evaluated in more than 250 out-of-distribution environments
Salient features drive generalisation
Early-learned goals can persist and influence later goals
Latent policy gradients method introduced
Method simulates evolution of low-dimensional latent variables
Addresses lack of principled understanding of out-of-distribution generalisation

Sources

arXiv cs.AI — 2026-05-25