Structured Reasoning Signals, Not Code, Improve Mathematical Reasoning in LMs

ai-technology · 2026-05-20

A recent study posted to arXiv questions the belief that adding code to pretraining data improves general reasoning in language models. In controlled pretraining experiments on a 10-trillion-token corpus with fine-grained domain separation, the researchers found that code restricted to standalone executable programs does not act as a general reasoning enhancer; instead, it competes with knowledge-intensive tasks, especially complex mathematical reasoning. The reasoning gains usually attributed to code are better explained by structured reasoning traces that cut across domains, such as code-text and math-text mixtures. Increasing the density of structured math-domain samples within a fixed math budget produced substantial gains on difficult mathematical reasoning tasks.

Key facts

Study on arXiv (2605.19762) revisits claim that code improves reasoning.
Controlled pretraining experiments on a 10T-token corpus with fine-grained domain separation.
Code restricted to standalone executable programs does not act as general reasoning enhancer.
Code competes with knowledge-intensive tasks, especially complex mathematical reasoning.
Reasoning gains attributed to code are better explained by cross-domain structured reasoning traces (code-text, math-text mixtures).
Increasing density of structured math-domain samples within fixed math budget yields substantial gains on difficult mathematical reasoning.
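
As a rough illustration of the last point, the sketch below holds a math token budget fixed and varies only the share of it given to structured samples. The function name, the bucket labels, and all numbers are hypothetical and are not taken from the study, which this summary does not describe at that level of detail.

```python
def split_math_budget(math_budget_tokens: int, structured_fraction: float) -> dict[str, int]:
    """Split a fixed math token budget between structured and other math samples.

    Hypothetical helper for illustration only; not the study's actual pipeline.
    """
    if not 0.0 <= structured_fraction <= 1.0:
        raise ValueError("structured_fraction must be in [0, 1]")
    structured = round(math_budget_tokens * structured_fraction)
    # The remainder goes to non-structured math data, so the total never changes.
    return {
        "structured_math": structured,
        "other_math": math_budget_tokens - structured,
    }


# Assumed budget and fractions, chosen only to show the mechanics.
MATH_BUDGET = 1_000_000_000
for fraction in (0.1, 0.3, 0.5):
    mix = split_math_budget(MATH_BUDGET, fraction)
    assert sum(mix.values()) == MATH_BUDGET
    print(fraction, mix)
```

The point of the design is that the total math budget stays constant across settings, so any change in downstream results would come from the density of structured samples rather than from simply adding more math data.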
