ARTFEED — Contemporary Art Intelligence

Reasoning Distillation Fails to Transmit Cognitive Structure in LLMs

ai-technology · 2026-04-25

A recent arXiv study indicates that large language models (LLMs) fail to transmit the cognitive structure of reasoning through reasoning distillation. The researchers tested the "Hán Dān Xué Bù" (Superficial Mimicry) hypothesis, named after the Chinese idiom about imitating another's gait and forgetting one's own, across 14 models. They found that teacher models trained with reinforcement learning align closely with human cognitive costs (correlation r=0.64), whereas distilled student models trained with Supervised Fine-Tuning (SFT) suffer a "Functional Alignment Collapse" (r=0.34) and frequently fall below their pre-distillation baselines, an outcome the authors call "Negative Transfer." The findings suggest that SFT produces a "Cargo Cult" effect: students copy the surface form of reasoning without acquiring the teacher's adaptive resource-allocation strategy. The full paper is available at arXiv:2601.05019.
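
To make the headline correlations concrete, here is a minimal Python sketch of how such a functional-alignment score could be computed: a Pearson correlation between a per-item human cognitive cost (e.g., response time) and a model's per-item reasoning cost (e.g., chain-of-thought token count). The cost proxies, the synthetic data, and the noise levels below are illustrative assumptions, not the paper's actual protocol.

    import numpy as np

    # Synthetic illustration only: real values would come from human studies
    # and model traces. Cost proxies here are assumptions, not the paper's.
    rng = np.random.default_rng(0)
    n_items = 500

    # Hypothetical human effort per benchmark item (arbitrary units).
    human_cost = rng.gamma(shape=2.0, scale=3.0, size=n_items)

    # Teacher effort tracks item difficulty; student effort is noisier,
    # mimicking the reported Functional Alignment Collapse.
    teacher_cost = human_cost + rng.normal(0.0, 5.0, size=n_items)
    student_cost = human_cost + rng.normal(0.0, 12.0, size=n_items)

    r_teacher = np.corrcoef(human_cost, teacher_cost)[0, 1]
    r_student = np.corrcoef(human_cost, student_cost)[0, 1]
    print(f"teacher alignment r = {r_teacher:.2f}")  # lands near 0.64
    print(f"student alignment r = {r_student:.2f}")  # lands near 0.34

The noise levels are chosen so the two correlations land near the paper's reported 0.64 and 0.34 purely by construction; the point is the measurement shape, not the numbers.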

Key facts

  • Study tests the Hán Dān Xué Bù (Superficial Mimicry) hypothesis across 14 models
  • Teacher models, trained via reinforcement learning, align with human cognitive costs (r=0.64)
  • Student models, distilled via Supervised Fine-Tuning (SFT), suffer Functional Alignment Collapse (r=0.34)
  • Distilled students often underperform their pre-distillation baselines (Negative Transfer)
  • SFT induces a Cargo Cult effect: surface mimicry without the teacher's resource-allocation strategy
  • Paper published on arXiv as 2601.05019

Entities

Institutions

  • arXiv
