Linear Recurrent Memory Theory for Reinforcement Learning
A theoretical paper on arXiv (2605.31261) explains why linear recurrent neural networks work well as memory units in partially observable reinforcement learning. The authors construct two linear filters: one reproduces belief vector logits in hidden Markov models (HMMs) with deterministic transitions, serving as a sufficient statistic for optimal policy learning; the other achieves vanishing state-decoding error under nearly deterministic transitions. Results extend to action-controlled HMMs with time-varying dynamics. Numerical experiments confirm the filters' effectiveness as feature extractors.
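To make the first construction concrete, here is a minimal sketch of why belief logits can evolve linearly. It is not the paper's construction. It assumes the deterministic transition is a permutation of the states (so no two states ever merge) and that every emission probability is strictly positive. Under those assumptions the unnormalized log-belief obeys l_{t+1} = P l_t + log e(o_{t+1}), an update that is linear in the previous logits and in a one-hot encoding of the observation, and a softmax of l_t recovers the belief. The names perm, emis, obs, and prior, and all numeric values, are illustrative only.

```python
import math


def exact_belief_filter(perm, emis, obs, prior):
    """Standard (nonlinear) HMM forward filter: predict, reweight, normalize."""
    n = len(prior)
    belief = list(prior)
    for o in obs:
        predicted = [0.0] * n
        for s in range(n):
            predicted[perm[s]] += belief[s]  # deterministic move s -> perm[s]
        posterior = [predicted[s] * emis[s][o] for s in range(n)]
        total = sum(posterior)
        belief = [p / total for p in posterior]
    return belief


def linear_logit_filter(perm, emis, obs, prior):
    """Linear recurrence on unnormalized log-beliefs: l <- P l + log e(o)."""
    n = len(prior)
    logits = [math.log(p) for p in prior]
    for o in obs:
        moved = [0.0] * n
        for s in range(n):
            moved[perm[s]] = logits[s]  # permutation only relocates entries
        # Additive, observation-dependent input; no normalization step here.
        logits = [moved[s] + math.log(emis[s][o]) for s in range(n)]
    return logits


def softmax(logits):
    """Numerically stable softmax used only at readout time."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Assumed toy model: 3 states on a cycle, 2 observation symbols.
perm = [1, 2, 0]                              # state s moves to perm[s]
emis = [[0.7, 0.3], [0.2, 0.8], [0.5, 0.5]]   # emis[s][o] = P(o | s), all > 0
obs = [0, 1, 1, 0, 1]
prior = [0.5, 0.3, 0.2]

exact = exact_belief_filter(perm, emis, obs, prior)
linear = softmax(linear_logit_filter(perm, emis, obs, prior))
print(exact)
print(linear)
```

Normalization is deferred to a single softmax at readout, which is what keeps the recurrence itself linear. With a non-invertible deterministic transition, several states can map to the same successor and their probabilities add, so this particular sketch no longer applies as written.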
Key facts
- Paper on arXiv: 2605.31261
- Studies linear recurrent neural networks as memory units in partially observable reinforcement learning
- Constructs two linear filters for HMMs
- First filter reproduces belief vector logits under deterministic transitions
- Second filter achieves vanishing state-decoding error under nearly deterministic transitions
- Results extend to action-controlled HMMs with time-varying dynamics
- Numerical experiments support the theoretical results
- The experiments confirm that the filters are effective feature extractors
Entities
Institutions
- arXiv