Looped Transformers: Fixed-Point Framework for Test-Time Scaling
This paper develops a theoretical framework for the stability and generalization of looped transformer architectures, which promise test-time compute scaling by spending more iterations on harder problems. The analysis is a fixed-point study along three axes: reachability, input-dependence, and geometry. It shows that looped networks without recall have only countably many fixed points and cannot achieve strong input-dependence in any spectral regime. In contrast, combining recall with outer normalization yields a well-behaved regime in which fixed points are reachable, locally smooth in the input, and supported by stable backpropagation. The theory is complemented by experiments training single-layer looped transformers on chess, sudoku, and prefix-sums tasks. The paper is available on arXiv under ID 2604.15259.
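As a concrete, highly simplified illustration of the recall-plus-outer-normalization recipe, the sketch below iterates a single nonlinear block to an approximate fixed point; the `looped_step` map, the tanh block, and the RMS-style `outer_norm` are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def outer_norm(z, eps=1e-6):
    # RMS-style normalization applied outside the looped block; keeps the
    # iterates on a bounded shell so the loop cannot diverge.
    return z / (np.sqrt(np.mean(z ** 2)) + eps)

def looped_step(z, x, W, U):
    # One iteration of a looped block *with recall*: the original input x is
    # re-injected at every step, so the fixed point can depend on x.
    return outer_norm(np.tanh(W @ z + U @ x))

def run_to_fixed_point(x, W, U, max_iters=256, tol=1e-5):
    # Test-time scaling knob: harder inputs simply get more iterations.
    z = np.zeros_like(x)
    for t in range(max_iters):
        z_next = looped_step(z, x, W, U)
        if np.linalg.norm(z_next - z) < tol:  # approximately a fixed point
            return z_next, t + 1
        z = z_next
    return z, max_iters

rng = np.random.default_rng(0)
d = 16
W = rng.normal(scale=0.5 / np.sqrt(d), size=(d, d))  # small weights, contraction-like regime
U = rng.normal(scale=1.0 / np.sqrt(d), size=(d, d))
z_star, iters = run_to_fixed_point(rng.normal(size=d), W, U)
print(f"settled after {iters} iterations")
```

Dropping the `U @ x` term recovers the no-recall variant: the iteration then ignores the input entirely after initialization, which is the failure mode the analysis formalizes.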
Key facts
- Looped transformers promise test-time compute scaling by spending more iterations on harder problems.
- A fixed-point-based framework analyzes looped architectures along three axes: reachability, input-dependence, and geometry.
- Looped networks without recall have only countably many fixed points and cannot achieve strong input-dependence in any spectral regime (see the worked equations after this list).
- Recall combined with outer normalization produces a regime with reachable, locally smooth fixed points and stable backpropagation.
- Empirical training of single-layer looped transformers was performed on chess, sudoku, and prefix-sums tasks (a toy prefix-sums setup is sketched at the end of this note).
- The paper is titled 'Stability and Generalization in Looped Transformers'.
- The paper is available on arXiv under ID 2604.15259.
- The study addresses whether looped architectures can extrapolate to harder problems at test time rather than memorize training-specific solutions.
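To make the recall dichotomy in the bullets above concrete, here is a schematic fixed-point argument in generic notation ($f$, $z$, $x$); this is an informal reconstruction, not the paper's exact statements:

```latex
% Without recall, the loop never sees the input again after step 0:
\[
  z_{t+1} = f(z_t)
  \quad\Longrightarrow\quad
  z^\star = f(z^\star),
\]
% an equation in which $x$ does not appear, so the candidate fixed points form
% a fixed (countable) set and the limit cannot vary smoothly with the input.
% With recall, the input is re-injected at every iteration:
\[
  z_{t+1} = f(z_t, x)
  \quad\Longrightarrow\quad
  z^\star(x) = f(z^\star(x), x),
\]
% and if $\lVert \partial_z f \rVert \le \rho < 1$ (a contraction in $z$), the
% implicit function theorem gives a locally smooth map $x \mapsto z^\star(x)$,
% i.e. genuine input-dependence of the computed answer.
```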
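For the empirical side, the prefix-sums task is simple enough to sketch as toy data (a hypothetical setup; the paper's exact task format, tokenization, and training protocol are not specified here):

```python
import numpy as np

def make_prefix_sum_batch(batch_size, length, rng):
    # Inputs are random bit strings; the target at position i is the running
    # sum of the bits up to i (reduced mod 2 here, a common binary variant).
    # Sequence length is the natural difficulty knob, matching the test-time
    # scaling story: longer inputs should warrant more loop iterations.
    x = rng.integers(0, 2, size=(batch_size, length))
    y = np.cumsum(x, axis=1) % 2
    return x, y

rng = np.random.default_rng(0)
x, y = make_prefix_sum_batch(batch_size=4, length=8, rng=rng)
print(x[0], "->", y[0])
```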