Chain-of-Thought Reasoning Traces Found to Be Performative in LLMs
A recent investigation published on arXiv (2605.11746) challenges the assumption that chain-of-thought (CoT) reasoning traces in large language models consistently reflect the internal computation that produces the answer. The researchers built a step-level Detect-Classify-Compare framework around an answer-commitment proxy, cross-validated with Patchscopes, tuned-lens probes, and causal direction ablation, and applied it to nine models across seven reasoning benchmarks. On average, latent commitment and explicit answer arrival coincided on only 61.9% of steps. The dominant mismatch pattern, confabulated continuation, accounted for 58.0% of mismatch events: the answer-commitment proxy stays stable while the trace keeps generating deliberative text without altering the committed answer. The study also compares architecture-matched Qwen2.5 and DeepSeek-R1-Distill models.
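The following is a minimal sketch of the step-level Detect-Classify-Compare idea, not the paper's implementation: here `latent_answer` stands in for the answer-commitment proxy (which the paper validates with Patchscopes, tuned-lens probes, and causal direction ablation), and the step segmentation, the substring check for explicit arrival, and the `late_commitment` label are all illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Step:
    text: str           # visible trace text emitted at this reasoning step
    latent_answer: str  # answer decoded from hidden states by some probe
                        # (stand-in for the paper's answer-commitment proxy)

def first_commitment_step(steps: list[Step]) -> int | None:
    """Detect: earliest step from which the probed latent answer never changes."""
    if not steps:
        return None
    final = steps[-1].latent_answer
    for i in range(len(steps)):
        if all(s.latent_answer == final for s in steps[i:]):
            return i
    return None

def explicit_arrival_step(steps: list[Step], final_answer: str) -> int | None:
    """Detect: first step whose visible text states the final answer.
    A bare substring check; a real pipeline would need more careful matching."""
    for i, s in enumerate(steps):
        if final_answer in s.text:
            return i
    return None

def classify(steps: list[Step], final_answer: str) -> str:
    """Classify-Compare: relate latent commitment to explicit answer arrival."""
    latent = first_commitment_step(steps)
    explicit = explicit_arrival_step(steps, final_answer)
    if latent is None or explicit is None:
        return "undetected"
    if latent == explicit:
        return "aligned"
    if latent < explicit:
        # Proxy already stable while the trace keeps emitting deliberative
        # text: the pattern the paper calls confabulated continuation.
        return "confabulated_continuation"
    return "late_commitment"  # placeholder label, not from the paper

# Toy usage: the probe commits to "B" at step 0, but the trace only states
# it at step 2, so the trajectory classifies as confabulated continuation.
steps = [
    Step("Let me consider the options...", latent_answer="B"),
    Step("Option A seems plausible at first, yet...", latent_answer="B"),
    Step("So the answer is B.", latent_answer="B"),
]
print(classify(steps, final_answer="B"))  # confabulated_continuation
```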
Key facts
- Chain-of-thought traces are used to improve model capability and audit behavior.
- The study tests the assumption that the visible trace is synchronized with the answer-determining computation.
- A step-level Detect-Classify-Compare framework was built.
- The answer-commitment proxy was cross-validated with Patchscopes, tuned-lens probes, and causal direction ablation.
- Nine models and seven reasoning benchmarks were tested.
- Latent commitment and explicit answer arrival align on only 61.9% of steps on average.
- Confabulated continuation is the dominant mismatch pattern at 58.0% of mismatch events (see the arithmetic sketch after this list).
- The committed answer does not change during confabulated continuation steps.
- Architecture-matched Qwen2.5 and DeepSeek-R1-Distill models were included.
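Putting the two headline numbers together: if 61.9% of steps align on average, then 38.1% mismatch, and, assuming each mismatch event corresponds to one mismatched step (an assumption; the paper reports 58.0% as a share of mismatch events, not of steps), confabulated continuation would account for roughly 22% of all steps:

```python
align_rate = 0.619                  # steps where latent and explicit arrival coincide
mismatch_rate = 1.0 - align_rate    # 0.381 of steps, on average
confab_share = 0.580                # reported share of mismatch *events*

# Assumption: one mismatch event per mismatched step, so shares compose
# multiplicatively across the two reported figures.
confab_rate = mismatch_rate * confab_share
print(f"~{confab_rate:.1%} of all steps")  # ~22.1%
```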