ARTFEED — Contemporary Art Intelligence

Invariant Properties Discovered in Softmax Attention Mechanisms

other · 2026-05-07

A new arXiv preprint (2605.02907) reveals invariant properties in softmax attention, a core component of transformer models. The authors define the 'energy field' as the row-centered attention logits and demonstrate two classes of invariants. Mechanism-level invariants arise from the algebraic structure of softmax attention, including a per-row zero-sum constraint, a rank bound determined by head dimension, and spectral signatures. Model-level regularities, not required by the mechanism, hold across all tested autoregressive language models from various architecture families. The energy field's variance spreads across key positions rather than concentrating on a few, a property traced to 'key incoherence' in the key matrix. The authors argue these properties have practical consequences for analyzing and improving attention-based models.
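The mechanism-level invariants follow from the algebra of scaled dot-product attention, so they can be checked with a few lines of linear algebra. The sketch below is a minimal illustration, assuming 'energy field' means the scaled logits QK^T/sqrt(d) with each row's mean subtracted (the preprint's exact definition may differ): row-centering forces every row to sum to zero, and the factorization of the logits through the head dimension bounds their rank.

    # Minimal sketch, not the paper's reference implementation.
    # Assumes the "energy field" is the row-centered scaled dot-product logits.
    import numpy as np

    rng = np.random.default_rng(0)
    seq_len, d_head = 16, 8                      # sequence length, head dimension

    Q = rng.standard_normal((seq_len, d_head))   # query vectors
    K = rng.standard_normal((seq_len, d_head))   # key vectors

    logits = Q @ K.T / np.sqrt(d_head)           # attention logits, (seq_len, seq_len)
    energy = logits - logits.mean(axis=1, keepdims=True)   # row-centered energy field

    # Invariant 1: every row of the energy field sums to (numerically) zero.
    assert np.allclose(energy.sum(axis=1), 0.0)

    # Invariant 2: logits factor through (seq_len x d_head) matrices, so the
    # rank of the energy field is bounded by the head dimension.
    rank = np.linalg.matrix_rank(energy)
    print(f"rank(energy) = {rank} <= d_head = {d_head}")
    assert rank <= d_head

    # Row-centering leaves the attention weights unchanged, since softmax is
    # invariant to adding a constant within each row.
    softmax = lambda x: np.exp(x) / np.exp(x).sum(axis=1, keepdims=True)
    assert np.allclose(softmax(logits), softmax(energy))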

Key facts

  • arXiv preprint 2605.02907
  • Softmax attention maps query-key interactions into probability distributions
  • Energy field defined as the row-centered attention logits
  • Two classes of invariants: mechanism-level and model-level
  • Mechanism-level invariants include per-row zero-sum constraint, rank bound, spectral signatures
  • Model-level regularities hold across all tested autoregressive language models
  • Energy field variance delocalizes over key positions
  • Delocalization traced to key incoherence in the key matrix (see the sketch after this list)
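
Below is a hedged sketch of how the delocalization claim might be quantified. The per-key variance shares, the participation ratio, and the maximum key-key cosine similarity are illustrative metrics chosen here, not necessarily the preprint's, and random Gaussian keys simply stand in for a model's actual key matrix.

    # Illustrative metrics only; the preprint may define delocalization and
    # key incoherence differently.
    import numpy as np

    rng = np.random.default_rng(1)
    seq_len, d_head = 64, 16

    Q = rng.standard_normal((seq_len, d_head))
    K = rng.standard_normal((seq_len, d_head))

    logits = Q @ K.T / np.sqrt(d_head)
    energy = logits - logits.mean(axis=1, keepdims=True)

    # Share of the energy field's total variance carried by each key position.
    col_power = (energy ** 2).sum(axis=0)
    share = col_power / col_power.sum()

    # Participation ratio: ~1 if one key position dominates, ~seq_len if the
    # variance is spread evenly ("delocalized") across key positions.
    participation = 1.0 / (share ** 2).sum()
    print(f"participation ratio: {participation:.1f} of {seq_len} key positions")

    # Key incoherence: maximum pairwise cosine similarity between key vectors.
    # Near-orthogonal (incoherent) keys keep any single key position from
    # absorbing most of the variance.
    K_unit = K / np.linalg.norm(K, axis=1, keepdims=True)
    cos = K_unit @ K_unit.T
    np.fill_diagonal(cos, 0.0)
    print(f"max key-key cosine similarity: {np.abs(cos).max():.3f}")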

Entities

Institutions

  • arXiv

Sources