ARTFEED — Contemporary Art Intelligence

Thinking as Compression: LLMs Naturally Shorten Context Without Special Training

ai-technology · 2026-05-28

A new research paper posted to arXiv reports that reasoning models can compress long contexts simply by generating thinking traces, with no dedicated compression module. The study introduces Thinking as Compression (TaC), a paradigm in which the thinking model is prompted directly and its reasoning trace serves as the shortened context. Without compression-specific training, TaC outperforms most representative compression methods. To control the output budget and avoid shortcut behaviors, the authors propose TaC-C, which uses a simple reward mechanism to constrain the thinking output. The findings suggest that LLMs have intrinsic compression capabilities that remain underexplored.
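The TaC flow described above can be pictured as a two-step pipeline: ask a thinking model to reason over the long context, then answer from the trace alone. The sketch below is illustrative only. The prompt wording, the function names, and the stub models are assumptions made for this example, not details taken from the paper.

```python
from typing import Callable

# Hypothetical prompt wording; the summary does not give the paper's prompts.
THINK_PROMPT = (
    "Think through what in the context matters for the question.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\n"
)
ANSWER_PROMPT = "Notes:\n{trace}\n\nQuestion: {question}\nAnswer:"


def compress_by_thinking(
    context: str, question: str, think: Callable[[str], str]
) -> str:
    """Return the thinking trace, used afterwards in place of the long context."""
    return think(THINK_PROMPT.format(context=context, question=question))


def answer_from_trace(
    trace: str, question: str, answer: Callable[[str], str]
) -> str:
    """Answer using only the trace; the original context is not passed along."""
    return answer(ANSWER_PROMPT.format(trace=trace, question=question))


# Stand-in models so the sketch runs without any LLM backend (assumptions).
def stub_think(prompt: str) -> str:
    return "The opening is on 12 June."


def stub_answer(prompt: str) -> str:
    # Echo the notes section, mimicking an answer read off the trace.
    return prompt.split("Notes:\n")[1].split("\n\n")[0]


long_context = "Gallery notes. " * 200 + "The opening is on 12 June."
question = "When is the opening?"
trace = compress_by_thinking(long_context, question, stub_think)
reply = answer_from_trace(trace, question, stub_answer)
```

In a real setup, `think` and `answer` would wrap calls to an actual reasoning model; the point of the sketch is only that the second call sees the short trace rather than the full input.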

Key facts

  • Paper ID: arXiv:2605.28713v1
  • Context compression aims to shorten long inputs for LLM inference acceleration
  • Existing methods rely on complex compression modules or compression-specific training
  • TaC directly prompts the thinking model to generate thinking traces as shortened context
  • TaC outperforms most representative compression methods
  • TaC-C introduces a simple reward mechanism for budget control and to avoid shortcut behaviors
  • The research was published on arXiv
  • The arXiv listing marks the paper as a new announcement
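The summary does not spell out how TaC-C's reward is defined. As a rough illustration of what a budget-aware reward could look like, the sketch below subtracts a penalty proportional to how far a trace overshoots its token budget. The formula, the function name `budget_reward`, and the `penalty` parameter are all assumptions for this example, and the sketch does not attempt to model the shortcut behaviors the paper mentions.

```python
def budget_reward(
    task_correct: bool, trace_tokens: int, budget: int, penalty: float = 1.0
) -> float:
    """Hypothetical reward: task score minus a penalty for exceeding the budget.

    This is not the paper's formula, only one simple way to express
    "reward correct answers, discourage traces longer than the budget".
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    score = 1.0 if task_correct else 0.0
    # Fraction by which the trace exceeds the budget; zero when within budget.
    overshoot = max(0, trace_tokens - budget) / budget
    return score - penalty * overshoot


within = budget_reward(True, trace_tokens=80, budget=100)
over = budget_reward(True, trace_tokens=150, budget=100)
wrong_and_long = budget_reward(False, trace_tokens=200, budget=100)
```

Under these assumptions, a correct answer within budget keeps its full score, while a correct answer with a trace 50% over budget loses half of it.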

Entities

Institutions

  • arXiv

Sources