ARTFEED — Contemporary Art Intelligence

Transformers with Average Attention Match Arithmetic Circuits

ai-technology · 2026-05-07

A recent study posted on arXiv (2605.04683) investigates the computational power of transformer encoders viewed as sequence-to-sequence maps over vectors. The authors show that transformers with average hard attention can simulate arithmetic circuit families of constant depth built from unbounded fan-in addition, binary multiplication, and sign gates. In this setting, the transformers use arithmetic circuits in place of feed-forward networks. Conversely, the functions computed by such transformers with ordinary average attention can themselves be computed by the same class of circuit families, so the two models match in power. The results hold for transformers over the reals, the rationals, and any ring in between. The paper is classified under Computer Science > Computational Complexity.
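To make the attention mechanism concrete, here is a minimal sketch of average hard attention as it is usually defined in this line of work: instead of a softmax, attention weight is split uniformly over the positions that attain the maximal score. The function name and example values are illustrative, not taken from the paper.

```python
import numpy as np

def average_hard_attention(scores, values):
    """Average hard attention: attend uniformly to the positions
    achieving the maximal attention score, instead of using softmax."""
    scores = np.asarray(scores, dtype=float)
    values = np.asarray(values, dtype=float)
    mask = scores == scores.max()   # positions tied for the maximum score
    weights = mask / mask.sum()     # uniform weights over the argmax set
    return weights @ values         # average of the selected value vectors

# Example: positions 1 and 2 tie for the maximum score,
# so their values 2.0 and 4.0 are averaged.
out = average_hard_attention([2.0, 5.0, 5.0], [[1.0], [2.0], [4.0]])
# out == [3.0]
```

The hard-maximum selection is what lets the constructions in such papers route exact values between positions, rather than the smoothed mixtures a softmax produces.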

Key facts

  • arXiv paper ID 2605.04683
  • Title: Average Attention Transformers and Arithmetic Circuits
  • Analyzes computational power of transformer encoders
  • Average hard attention can simulate arithmetic circuits
  • Simulated circuits have constant depth
  • Circuits use unbounded fan-in addition, binary multiplication, and sign gates
  • Transformers use arithmetic circuits instead of feed-forward networks
  • Results hold for reals, rationals, and intermediate rings
  • Classified under Computer Science > Computational Complexity
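The gate types listed above can be sketched directly; the following hypothetical example is not from the paper, but shows what a constant-depth circuit over these gates looks like: addition takes any number of inputs, multiplication exactly two, and the sign gate maps to -1, 0, or +1.

```python
def add_gate(*xs):
    """Unbounded fan-in addition gate."""
    return sum(xs)

def mul_gate(x, y):
    """Binary (fan-in-2) multiplication gate."""
    return x * y

def sign_gate(x):
    """Sign gate: -1 for negative, 0 for zero, +1 for positive."""
    return (x > 0) - (x < 0)

# A depth-3 example circuit over the rationals: sign(x1*x2 + x3 + x4)
def example_circuit(x1, x2, x3, x4):
    return sign_gate(add_gate(mul_gate(x1, x2), x3, x4))

example_circuit(2, -3, 1, 1)  # sign(-6 + 1 + 1) = sign(-4) = -1
```

Constant depth means the number of gate layers is fixed regardless of input length, which is why the unbounded fan-in of the addition gate matters: wide sums cannot be rebuilt from binary additions without growing the depth.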

Entities

Institutions

  • arXiv

Sources