Hidden States Reveal Task-Relevant Information in Chain-of-Thought Reasoning
A recent arXiv paper (2604.23351) examines whether chain-of-thought (CoT) tokens carry task-relevant information beyond serving as mere explanations. Using activation patching on GSM8K, the researchers transferred token-level hidden states from a CoT generation into a direct-answer run on the same query. Patching yields higher accuracy than both direct-answer prompting and the original CoT trace, suggesting that individual CoT tokens carry enough information to recover the correct answer even when the original trace is flawed. Task-relevant information is more prevalent in correct CoT runs, is unevenly distributed across tokens, concentrates in mid-to-late layers, and appears earlier in the reasoning trace. The study also patches language tokens such as verbs.
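The patching procedure described above can be sketched in miniature: cache a hidden state at one (layer, token) position during a donor run, then overwrite the same position during a second run. The toy per-token MLP stack, dimensions, and layer/token indices below are illustrative assumptions, not the paper's actual model or setup.

```python
# Minimal sketch of token-level activation patching, assuming a toy
# per-token MLP stack (no attention); purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
D, L, T = 8, 4, 6                       # hidden dim, layers, sequence length
Ws = [rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(L)]

def forward(x, patch=None):
    """Run the stack; optionally overwrite one (layer, token) hidden state.

    patch = (layer_idx, token_idx, cached_vector) or None.
    Returns (output, cache) where cache[l] is the layer-l activation.
    """
    cache = []
    for l, W in enumerate(Ws):
        x = np.maximum(x @ W, 0.0)      # per-token layer with ReLU
        if patch is not None and patch[0] == l:
            x = x.copy()
            x[patch[1]] = patch[2]      # splice in the donor activation
        cache.append(x)
    return x, cache

layer, token = 2, 3                     # position to patch (arbitrary choice)

# Donor run (stands in for the CoT generation): record a hidden state.
cot_input = rng.standard_normal((T, D))
_, cot_cache = forward(cot_input)
donor = cot_cache[layer][token]

# Recipient run (stands in for the direct-answer run): patch it in.
direct_input = rng.standard_normal((T, D))
clean_out, _ = forward(direct_input)
patched_out, _ = forward(direct_input, patch=(layer, token, donor))

# In this toy model there is no cross-token mixing, so only the patched
# token's downstream computation changes.
assert not np.allclose(patched_out[token], clean_out[token])
assert np.allclose(np.delete(patched_out, token, axis=0),
                   np.delete(clean_out, token, axis=0))
```

In a real transformer the same pattern is usually implemented with forward hooks on a specific layer, and the patch propagates to other positions through attention; the study's accuracy comparisons come from decoding answers after such patched runs.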
Key facts
- Study uses activation patching on GSM8K
- Token-level hidden states transferred from CoT to direct-answer run
- Patching yields higher accuracy than direct-answer prompting and original CoT trace
- Task-relevant information more prevalent in correct CoT runs
- Information concentrates in mid-to-late layers
- Information appears earlier in reasoning trace
- Language tokens such as verbs are patched
- Published on arXiv with ID 2604.23351
Entities
Institutions
- arXiv