ARTFEED — Contemporary Art Intelligence

Nearly Optimal Attention Coresets Achieved

ai-technology · 2026-05-09

A new computer science paper proves the existence of nearly optimal coresets for approximating the Attention mechanism in small space. For any set of unit-norm keys and values in ℝ^d, it shows there is a subset of size O(√d e^{ρ+o(ρ)}/ε) that approximates the attention output to within error ε for every query of norm at most ρ, improving on the best previously known results. The paper also gives an improved lower bound of Ω(√d e^ρ/ε), so the upper and lower bounds match up to a factor of e^{o(ρ)}.
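For reference, the guarantee can be written out as below. This is a sketch that assumes the standard softmax attention definition and Euclidean error. The symbols n, k_i, v_i and S, and the choice of norm, are assumptions not given in the summary, and the paper's exact formulation (for example, whether the selected subset is reweighted) may differ.

```latex
\[
\mathrm{Attn}_{K,V}(q)
  = \frac{\sum_{i=1}^{n} e^{\langle q, k_i\rangle}\, v_i}
         {\sum_{j=1}^{n} e^{\langle q, k_j\rangle}},
\qquad \|k_i\| = \|v_i\| = 1 .
\]
\[
\exists\, S \subseteq \{1,\dots,n\},\quad
|S| = O\!\left(\sqrt{d}\, e^{\rho + o(\rho)} / \varepsilon\right),
\quad\text{such that}\quad
\left\| \mathrm{Attn}_{K,V}(q) - \mathrm{Attn}_{K_S,V_S}(q) \right\| \le \varepsilon
\quad \text{for all } \|q\| \le \rho .
\]
```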

Key facts

  • Paper titled 'Nearly Optimal Attention Coresets'
  • Proves existence of coresets for Attention mechanism
  • Coreset size: O(√d e^{ρ+o(ρ)}/ε)
  • Works for unit-norm keys and values in ℝ^d
  • Approximation error ≤ ε for all queries with norm ≤ ρ
  • Improves on the best previously known results
  • Improved lower bound: Ω(√d e^ρ/ε)
  • Submitted to arXiv (2605.05602)
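
To make the quantities in the list above concrete, here is a minimal NumPy sketch of what "approximation error for all queries with norm ≤ ρ" measures. It assumes standard softmax attention and Euclidean error, and it uses a uniformly random subset purely as a stand-in: this is not the paper's construction, and a random subset carries no ε guarantee. The helper names (attention, unit_rows) and all numeric values are illustrative.

```python
import numpy as np

def attention(q, K, V):
    """Softmax attention output for a single query q over keys K and values V."""
    scores = K @ q                            # inner products <q, k_i>
    weights = np.exp(scores - scores.max())   # shift by the max for numerical stability
    return (weights / weights.sum()) @ V

def unit_rows(rng, n, d):
    """Return n random vectors in R^d, each scaled to unit norm."""
    X = rng.standard_normal((n, d))
    return X / np.linalg.norm(X, axis=1, keepdims=True)

rng = np.random.default_rng(0)
n, d, rho, m = 2000, 16, 1.0, 200             # illustrative values only
K = unit_rows(rng, n, d)                      # unit-norm keys
V = unit_rows(rng, n, d)                      # unit-norm values

# Stand-in subset: chosen uniformly at random, NOT the paper's coreset.
idx = rng.choice(n, size=m, replace=False)

# Sample queries of norm rho and record the worst observed error.
# A true coreset guarantee covers every query with norm <= rho, not just samples.
Q = rho * unit_rows(rng, 100, d)
worst_error = max(
    np.linalg.norm(attention(q, K, V) - attention(q, K[idx], V[idx])) for q in Q
)
print(f"worst observed error over sampled queries: {worst_error:.4f}")
```

Because each attention output is a convex combination of unit-norm values, the error between any two such outputs is at most 2, so ε is only meaningful well below that scale.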

Entities

Institutions

  • arXiv
