StateX enhances RNN recall by expanding recurrent state post-training
A new method called StateX improves the recall ability of recurrent neural networks (RNNs) by expanding their recurrent state size after training. RNNs, including linear attention and state-space models, are popular for processing long contexts because of their constant per-token complexity, but they struggle with tasks that require accurate recall because the entire context is compressed into a fixed-size state. Prior work shows that recall correlates with state size, yet training RNNs with larger states is costly. StateX is a post-training framework that modifies both classes of architecture to scale up the state size with a negligible increase in parameters. Experiments on models with up to 7 billion parameters demonstrate improved recall on long-context tasks. The paper is available on arXiv under identifier 2509.22630.
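To make the state-size bottleneck concrete, below is a minimal sketch of the (unnormalized) linear-attention recurrence that such models build on. The dimensions, the single-head setup, and the expansion arithmetic in the comments are illustrative assumptions for this sketch, not StateX's actual architecture, which the paper defines for both linear attention and state-space models.

```python
import torch

def linear_attention_step(S, q, k, v):
    """One recurrent step of (simplified, unnormalized) linear attention.

    The entire context is compressed into the state S, whose d_k x d_v
    size is fixed regardless of sequence length -- this is the state
    that bounds recall.
    """
    S = S + torch.outer(k, v)   # update: S_t = S_{t-1} + k_t v_t^T
    o = S.T @ q                 # read-out: o_t = S_t^T q_t
    return S, o

# Illustrative dimensions (assumptions, not the paper's configuration).
d_model, d_k, d_v = 512, 64, 64
S = torch.zeros(d_k, d_v)       # recurrent state: 64 * 64 = 4,096 entries

x = torch.randn(d_model)
W_q = torch.randn(d_model, d_k) * d_model**-0.5
W_k = torch.randn(d_model, d_k) * d_model**-0.5
W_v = torch.randn(d_model, d_v) * d_model**-0.5
S, o = linear_attention_step(S, x @ W_q, x @ W_k, x @ W_v)

# Expanding the key dimension, e.g. from 64 to 256, grows the state 4x
# (256 * 64 = 16,384 entries), while the extra projection weights needed
# to produce the longer keys (d_model * 192 here) are negligible next to
# the full model's parameter count.
```

The sketch illustrates why post-training expansion is attractive: the recurrent state, not the parameter count, is what bounds recall, so enlarging the state dimensions grows memory capacity far faster than it grows the weight matrices.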
Key facts
- StateX is a post-training framework for expanding RNN states.
- It targets linear attention and state-space models.
- State expansion improves recall ability without significant parameter increase.
- Experiments were conducted on models with up to 7 billion parameters.
- The paper is available on arXiv: 2509.22630.