Absorber LLM: Causal Synchronization for Efficient Long-Context Inference
Researchers propose Absorber LLM, a method that formulates long-context retention as self-supervised causal synchronization. The approach absorbs historical context into the model's parameters so that a contextless model reproduces the full-context model's future generations. This addresses the growing memory and compute cost of self-attention in transformers and avoids the limitations of existing constant-memory alternatives: RNNs and SSMs lose long-tail dependencies, while Test-Time Training (TTT) overfits token-level projection and fails to preserve the context's causal effect on future outputs. Experiments on long-context tasks demonstrate the method's effectiveness. The paper is available on arXiv.
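The summary does not state the objective formally; a minimal sketch of what the causal-synchronization condition could look like, with the full-context model as teacher and the contextless, absorbing model as student (all notation here is assumed, not taken from the paper):

```latex
% Assumed notation: c = historical context, x_{1:T} = future tokens,
% p_theta = the original model conditioned on the full context,
% q_phi   = the contextless model whose parameters phi absorb c.
\phi^{\star}(c) = \arg\min_{\phi}\;
  \mathbb{E}_{x_{1:T} \sim p_{\theta}(\cdot \mid c)}
  \left[ \sum_{t=1}^{T}
    D_{\mathrm{KL}}\big( p_{\theta}(\cdot \mid c,\, x_{<t}) \,\|\, q_{\phi}(\cdot \mid x_{<t}) \big)
  \right]
```

Read this way, the objective is self-supervised (the targets are the teacher's own sampled continuations, not labels), and once phi is fitted the historical context c can be dropped at inference time.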
Key facts
- Absorber LLM uses causal synchronization for test-time training (see the sketch after this list).
- It addresses the high memory consumption of self-attention in transformers.
- Constant-memory alternatives like RNNs and SSMs lose long-tail dependencies.
- TTT methods overfit to token-level projection and fail to preserve the context's causal effect on future generations.
- The method absorbs historical contexts into parameters.
- A contextless model matches the original model with full context on future generations.
- Experiments show effectiveness on long-context tasks.
- The paper is on arXiv with ID 2604.20915.
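The key facts above say that absorption happens via test-time training but give no algorithm. The sketch below is a hypothetical PyTorch rendering of such a loop under those assumptions: a toy model stands in for the LLM, and all names (`TinyLM`, `sample_continuation`, `absorb`) and hyperparameters are illustrative, not taken from the paper.

```python
# Minimal sketch (assumptions, not the paper's implementation): a toy causal LM
# stands in for both the full-context "teacher" and the contextless "student".
# The student's parameters are updated at test time so that its next-token
# distributions on teacher-sampled continuations match the teacher's, i.e. the
# historical context is absorbed into the student's weights.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, DIM = 256, 64

class TinyLM(nn.Module):
    """Toy stand-in for a causal LM: logits depend only on the current token."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM)
        self.head = nn.Linear(DIM, VOCAB)

    def forward(self, tokens):                  # tokens: (batch, length)
        return self.head(self.emb(tokens))     # logits: (batch, length, vocab)

@torch.no_grad()
def sample_continuation(model, context, steps):
    """Sample `steps` future tokens from the model, conditioned on `context`."""
    seq = context.clone()
    for _ in range(steps):
        logits = model(seq)[:, -1]
        nxt = torch.multinomial(F.softmax(logits, dim=-1), 1)
        seq = torch.cat([seq, nxt], dim=1)
    return seq[:, context.shape[1]:]            # only the generated future tokens

def absorb(teacher, student, context, rollouts=8, steps=32, lr=1e-3):
    """Test-time training: make the contextless student match the teacher's
    future generations (a KL-style causal-synchronization loss, assumed here)."""
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    for _ in range(rollouts):
        future = sample_continuation(teacher, context, steps)
        with torch.no_grad():                   # teacher sees context + future
            full = torch.cat([context, future], dim=1)
            t_logits = teacher(full)[:, context.shape[1]:]
        s_logits = student(future)              # student sees only the future tokens
        loss = F.kl_div(F.log_softmax(s_logits, dim=-1),
                        F.softmax(t_logits, dim=-1),
                        reduction="batchmean")
        opt.zero_grad()
        loss.backward()
        opt.step()
    return student

teacher = TinyLM()
student = copy.deepcopy(teacher)                # start from the same weights
context = torch.randint(0, VOCAB, (1, 512))     # long history to absorb
student = absorb(teacher, student, context)
# After absorption the student generates with no context held in memory.
print(sample_continuation(student, torch.randint(0, VOCAB, (1, 1)), 16))
```

The design point worth noting, on the summary's description, is that the loss is computed on the teacher's future generations rather than on the context tokens themselves; this is what would distinguish causal synchronization from the token-level TTT objectives the paper criticizes.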
Entities
Institutions
- arXiv