Transformers Scale Implicit Deductive Reasoning with Depth
A new study on arXiv investigates how Transformer models scale implicit deductive reasoning over Horn clauses, i.e., deriving conclusions in a single forward pass rather than writing out intermediate steps. The researchers find that sufficiently deep models with a bidirectional prefix mask can approach the performance of explicit chain-of-thought (CoT) reasoning across various graph topologies and problem widths, though CoT remains necessary for extrapolating to greater proof depths than those seen in training. To isolate genuine deduction, the work systematically decorrelates provability from spurious features and enforces algorithmic alignment.
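To make "deductive reasoning over Horn clauses" concrete, here is a minimal, illustrative sketch of forward chaining over propositional Horn clauses, where each iteration corresponds to one step of proof depth. This is a standalone example for the reader, not the authors' code or setup.

```python
# Illustrative only: forward chaining over propositional Horn clauses.
# Each rule is (body, head): every atom in `body` must be derived
# before `head` can be derived. Each outer iteration adds one
# "depth" of deduction.

def forward_chain(facts, rules, max_depth):
    """Return all atoms provable from `facts` within `max_depth` steps."""
    derived = set(facts)
    for _ in range(max_depth):
        new = {head for body, head in rules
               if head not in derived and all(b in derived for b in body)}
        if not new:          # fixed point reached: nothing more provable
            break
        derived |= new
    return derived

# Example: a & b -> c, c -> d (a two-step proof of d)
rules = [(("a", "b"), "c"), (("c",), "d")]
print(sorted(forward_chain({"a", "b"}, rules, max_depth=3)))
# -> ['a', 'b', 'c', 'd']
```

The paper's question, in these terms, is whether a Transformer of bounded layer depth can perform such multi-step derivations implicitly, without emitting the intermediate atoms as text.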
Key facts
- Study investigates scaling properties of implicit deductive reasoning in Transformers
- Focuses on reasoning over Horn clauses in depth-bounded Transformers
- Systematically decorrelates provability from spurious features
- Enforces algorithmic alignment
- Sufficiently deep models with a bidirectional prefix mask approach explicit chain-of-thought (CoT) performance
- CoT remains necessary for extrapolating to proof depths beyond those seen in training
- Results hold across graph topologies and problem widths
- Published on arXiv under Computer Science > Artificial Intelligence
Entities
Institutions
- arXiv