Linear Recurrent Memory Theory for Reinforcement Learning

other · 2026-06-01

A theoretical paper on arXiv (2605.31261) explains why linear recurrent neural networks work well as memory units in partially observable reinforcement learning. The authors construct two linear filters: one reproduces belief vector logits in hidden Markov models (HMMs) with deterministic transitions, serving as a sufficient statistic for optimal policy learning; the other achieves vanishing state-decoding error under nearly deterministic transitions. Results extend to action-controlled HMMs with time-varying dynamics. Numerical experiments confirm the filters' effectiveness as feature extractors.

Key facts

Paper on arXiv: 2605.31261
Studies linear recurrent neural networks as memory units in partially observable reinforcement learning
Constructs two linear filters for HMMs
First filter reproduces belief vector logits under deterministic transitions
Second filter achieves vanishing state-decoding error under nearly deterministic transitions
Results extend to action-controlled HMMs with time-varying dynamics
Numerical experiments validate findings
Filters serve as strong feature extractors
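
The first result can be illustrated with a small sketch. This is not the paper's construction, only a textbook special case under stated assumptions: the deterministic transition is a permutation of the states, every emission probability is strictly positive, and each observation is emitted from the state reached after the transition. Under those assumptions the unnormalized log-belief follows a recurrence that is linear in the previous logits and in the one-hot observation. All names and numbers below (perm, emission, observations) are hypothetical.

```python
import math


def normalize(values):
    """Scale a list of non-negative numbers so it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


def softmax(logits):
    """Map logits to probabilities, subtracting the max for stability."""
    peak = max(logits)
    return normalize([math.exp(x - peak) for x in logits])


def bayes_filter_step(belief, perm, emission, obs):
    """Exact belief update: move mass along s -> perm[s], then reweight
    each state by the likelihood of the new observation."""
    predicted = [0.0] * len(belief)
    for s, mass in enumerate(belief):
        predicted[perm[s]] += mass
    return normalize([p * emission[s][obs] for s, p in enumerate(predicted)])


def linear_logit_step(logits, perm, log_emission, obs):
    """Linear recurrence on unnormalized log-beliefs: permute the logits
    and add the log-likelihood column selected by the observation."""
    updated = [0.0] * len(logits)
    for s, value in enumerate(logits):
        updated[perm[s]] = value + log_emission[perm[s]][obs]
    return updated


# Hypothetical 3-state, 2-observation HMM (assumption, not from the paper).
perm = [1, 2, 0]                                  # state s moves to perm[s]
emission = [[0.7, 0.3], [0.4, 0.6], [0.1, 0.9]]   # emission[s][o] = P(o | s)
log_emission = [[math.log(p) for p in row] for row in emission]
observations = [0, 1, 1, 0, 1]

belief = [1.0 / 3.0] * 3
logits = [math.log(b) for b in belief]
for obs in observations:
    belief = bayes_filter_step(belief, perm, emission, obs)
    logits = linear_logit_step(logits, perm, log_emission, obs)

# Largest disagreement between the exact belief and softmax of the logits.
max_gap = max(abs(a - b) for a, b in zip(belief, softmax(logits)))
print(max_gap)
```

The only nonlinearity is the final softmax, which a downstream policy can apply or ignore. If the transition merges two states into one, the exact update needs a log-sum-exp and this simple sketch no longer applies.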
