Compressed Reasoning Data in LLM Post-Training: A Taxonomy and Empirical Study

ai-technology · 2026-05-28

A new paper on arXiv (2605.28008) investigates how compressed reasoning data behaves in supervised fine-tuning (SFT) of large language models (LLMs). The authors propose a three-level taxonomy of chain-of-thought (CoT) reasoning: Explicit CoT, which writes out every operation; Composed CoT, which merges multiple operations into a single step; and Implicit CoT, which omits intermediate operations. They construct a synthetic compositional reasoning task that lets them control difficulty, compression granularity, and data size, and they run experiments across several model families and sizes. The key finding is that coarser CoT needs more training data to match the performance of finer CoT, but it can reduce token cost. The study aims to clarify when and how compressed reasoning data works in post-training.
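To make the three levels concrete, here is a minimal sketch of how one compositional problem might be written at each granularity. The operation names (add3, double, sub1, square), the step format, and the whitespace token count are all illustrative assumptions, not the paper's actual task or tokenizer.

```python
from typing import Callable, Dict, List

# Hypothetical primitive operations, chosen only for illustration.
OPS: Dict[str, Callable[[int], int]] = {
    "add3": lambda x: x + 3,
    "double": lambda x: x * 2,
    "sub1": lambda x: x - 1,
    "square": lambda x: x * x,
}


def run_chain(x: int, ops: List[str]) -> List[int]:
    """Return the input followed by the value after each operation."""
    values = [x]
    for name in ops:
        values.append(OPS[name](values[-1]))
    return values


def render_cot(x: int, ops: List[str], group: int) -> List[str]:
    """Write the reasoning trace with `group` operations merged per step.

    group == 1            -> Explicit CoT (every operation is shown)
    1 < group < len(ops)  -> Composed CoT (several operations per step)
    group >= len(ops)     -> Implicit CoT (only the final answer)
    """
    if group < 1:
        raise ValueError("group must be at least 1")
    values = run_chain(x, ops)
    steps: List[str] = []
    if group < len(ops):
        for start in range(0, len(ops), group):
            end = min(start + group, len(ops))
            label = "+".join(ops[start:end])
            steps.append(f"{label}({values[start]}) = {values[end]}")
    steps.append(f"answer: {values[-1]}")
    return steps


def count_tokens(steps: List[str]) -> int:
    """Crude whitespace token count; a real tokenizer would differ."""
    return sum(len(step.split()) for step in steps)


chain = ["add3", "double", "sub1", "square"]
explicit = render_cot(2, chain, group=1)
composed = render_cot(2, chain, group=2)
implicit = render_cot(2, chain, group=len(chain))

for name, trace in [("explicit", explicit), ("composed", composed), ("implicit", implicit)]:
    print(name, count_tokens(trace), trace)
```

In this toy setup the explicit trace is the longest and the implicit trace the shortest, which is the token saving the summary refers to; whether a model can learn from the shorter traces is the empirical question the paper studies.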

Key facts

Paper on arXiv: 2605.28008
Taxonomy of CoT: Explicit, Composed, Implicit
Synthetic compositional reasoning task used
Experiments across multiple model families and sizes
Coarser CoT requires more data to match finer CoT performance
Compressed reasoning can reduce token cost
Focus on supervised fine-tuning (SFT)
Controlled variation of difficulty, compression granularity, data size
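
The trade-off between data size and token cost can be illustrated with simple arithmetic: total training tokens are the number of examples times the tokens per example, so a coarser format stays cheaper only while its extra examples do not outweigh its shorter traces. The sketch below uses made-up numbers (1000 examples, 14 and 8 tokens per example); none of these figures come from the paper.

```python
def total_tokens(num_examples: int, tokens_per_example: int) -> int:
    """Total training tokens for a dataset with uniform example length."""
    return num_examples * tokens_per_example


def break_even_examples(
    fine_examples: int,
    fine_tokens_per_example: int,
    coarse_tokens_per_example: int,
) -> int:
    """Largest number of coarse examples that fits within the fine-CoT token budget."""
    budget = total_tokens(fine_examples, fine_tokens_per_example)
    return budget // coarse_tokens_per_example


# Hypothetical figures for illustration only.
fine_budget = total_tokens(1000, 14)
max_coarse = break_even_examples(1000, 14, 8)
print(fine_budget, max_coarse)
```

Under these assumed numbers, the coarse format remains cheaper in total tokens as long as it needs no more than 1.75 times as many examples as the fine format to reach the same performance.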
