Transformers Fail to Verify During Backtracking Search

other · 2026-05-23

A new arXiv paper (2605.22221) reports that decoder-only transformers trained on cumulative solver traces fail to predict correctly whether to continue or backtrack during search. The optimal predictor depends only on the current search state, because two trajectories that reach the same state admit the same viable continuations. The model nevertheless suffers from two problems: scattered retrieval (the features of the state are spread across many positions in the trace) and history entanglement (the prediction is conditioned on the trajectory taken rather than on the state alone). The authors propose a localization fix that rewrites each decision block to expose state features locally.
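
The state-only property can be illustrated with a toy example. The sketch below is not taken from the paper: it assumes a 4-queens puzzle, a hypothetical trace encoding (an integer places a queen in that column of the next row, `None` backtracks one row), and hypothetical helper names `replay`, `viable` and `decide`. It shows that two different histories ending in the same state get the same continue-or-backtrack label.

```python
from typing import List, Optional, Tuple

N = 4  # board size of the toy N-Queens instance (assumption for illustration)

# Hypothetical trace encoding: int = place a queen in that column, None = backtrack.
Trace = List[Optional[int]]


def replay(trace: Trace) -> Tuple[int, ...]:
    """Fold a cumulative trace into the current search state."""
    cols: List[int] = []
    for step in trace:
        if step is None:
            cols.pop()          # undo the most recent placement
        else:
            cols.append(step)   # place a queen in the next row
    return tuple(cols)


def consistent(cols: Tuple[int, ...]) -> bool:
    """True if no two placed queens share a column or a diagonal."""
    for r1 in range(len(cols)):
        for r2 in range(r1 + 1, len(cols)):
            if cols[r1] == cols[r2] or abs(cols[r1] - cols[r2]) == r2 - r1:
                return False
    return True


def viable(cols: Tuple[int, ...]) -> bool:
    """True if the partial placement can be extended to a full solution."""
    if not consistent(cols):
        return False
    if len(cols) == N:
        return True
    return any(viable(cols + (c,)) for c in range(N))


def decide(trace: Trace) -> str:
    """Label that depends on the trace only through the replayed state."""
    return "continue" if viable(replay(trace)) else "backtrack"


trace_a: Trace = [1, 3]                           # direct route
trace_b: Trace = [0, 2, None, 3, None, None, 1, 3]  # same state after dead ends
print(replay(trace_a), decide(trace_a))
print(replay(trace_b), decide(trace_b))
```

Both traces replay to the same state, so `decide` returns the same label for each, even though `trace_b` is four times as long. A model that reads the raw trace has to reconstruct that state from tokens scattered across the whole sequence.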

Key facts

arXiv:2605.22221v1
Announce Type: cross
Backtracking search underlies classical constraint solvers, planners, and theorem provers
Transformer-based reasoning systems explore search trees over intermediate steps
Training uses autoregressive next-token loss on offline solver traces
Model input is cumulative trace of all prior decisions
Optimal continue-or-backtrack predictor depends only on current search state
Two trajectories reaching same state admit same viable continuations
Decoder-only transformers fail due to scattered retrieval and history entanglement
Localization fix rewrites each decision block to expose state features locally
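
One way to picture the localization idea is sketched below. The paper's actual block format is not given in this summary, so the encoding here is purely an assumption: the `action | state ...` layout, the 4-queens style trace (an integer places a queen, `None` backtracks), and the function names `plain_blocks` and `localized_blocks` are all hypothetical. The point is only that each rewritten block restates the current state, so the features needed for the decision sit next to the decision itself.

```python
from typing import List, Optional

# Hypothetical trace encoding: int = place a queen in that column, None = backtrack.
Trace = List[Optional[int]]


def plain_blocks(trace: Trace) -> List[str]:
    """Cumulative trace as-is: each block records only the action taken."""
    return ["backtrack" if step is None else f"place {step}" for step in trace]


def localized_blocks(trace: Trace) -> List[str]:
    """Rewritten trace: each block also restates the state after the action."""
    state: List[int] = []
    blocks: List[str] = []
    for step in trace:
        if step is None:
            state.pop()
            action = "backtrack"
        else:
            state.append(step)
            action = f"place {step}"
        snapshot = ",".join(map(str, state)) or "empty"
        blocks.append(f"{action} | state {snapshot}")
    return blocks


direct: Trace = [1, 3]
detour: Trace = [0, 2, None, 3, None, None, 1, 3]

print(plain_blocks(detour))
print(localized_blocks(direct)[-1])
print(localized_blocks(detour)[-1])
```

In this toy encoding the two plain traces differ, but their final localized blocks are identical, which is the property a state-only predictor can exploit without looking back through the history.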
