Transformers Fail to Verify During Backtracking Search
A new paper on arXiv (2605.22221) reports that decoder-only transformers trained on cumulative solver traces fail to predict correctly whether to continue or backtrack during search. The optimal predictor depends only on the current search state, yet the model runs into two problems: scattered retrieval (the features of that state are spread across many positions in the trace) and history entanglement (the prediction is conditioned on the trajectory that led to the state rather than on the state itself). The authors propose a localization fix that rewrites each decision block so that the state features are exposed locally.
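To make the state-only claim concrete, here is a minimal Python sketch on a toy 4-queens search. It is an illustration under stated assumptions, not the paper's setup: the event format and the names `replay`, `viable`, and `should_continue` are hypothetical. It shows that two different cumulative traces which end in the same search state receive the same continue-or-backtrack label.

```python
from typing import List, Tuple

N = 4  # toy board size (assumption: 4-queens stands in for a generic solver)

# Hypothetical trace event: ("place", column) or ("backtrack", -1).
Event = Tuple[str, int]
State = Tuple[int, ...]  # queen column for each filled row, top to bottom


def replay(trace: List[Event]) -> State:
    """Collapse a cumulative trace into the current search state."""
    stack: List[int] = []
    for kind, col in trace:
        if kind == "place":
            stack.append(col)
        else:  # "backtrack" undoes the most recent placement
            stack.pop()
    return tuple(stack)


def consistent(state: State) -> bool:
    """No two queens share a column or a diagonal."""
    for r1 in range(len(state)):
        for r2 in range(r1 + 1, len(state)):
            if state[r1] == state[r2] or abs(state[r1] - state[r2]) == r2 - r1:
                return False
    return True


def viable(state: State) -> bool:
    """True if the state can be extended to a full solution."""
    if not consistent(state):
        return False
    if len(state) == N:
        return True
    return any(viable(state + (c,)) for c in range(N))


def should_continue(trace: List[Event]) -> bool:
    """Ground-truth label: a function of the replayed state only."""
    return viable(replay(trace))


# Two trajectories that reach the same state (a queen in column 1 of row 0).
direct = [("place", 1)]
detour = [("place", 0), ("place", 2), ("backtrack", -1),
          ("backtrack", -1), ("place", 1)]
```

Here `direct` and `detour` differ in length and content, yet `replay` maps both to the same state, so `should_continue` cannot tell them apart. A model that reads the raw trace has to perform the equivalent of `replay` internally, which is where scattered retrieval and history entanglement can arise.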
Key facts
- arXiv:2605.22221v1
- Backtracking search underlies classical constraint solvers, planners, and theorem provers
- Transformer-based reasoning systems explore search trees over intermediate steps
- Training uses autoregressive next-token loss on offline solver traces
- Model input is cumulative trace of all prior decisions
- Optimal continue-or-backtrack predictor depends only on current search state
- Two trajectories reaching same state admit same viable continuations
- Decoder-only transformers fail due to scattered retrieval and history entanglement
- Localization fix rewrites each decision block to expose state features locally
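The localization idea in the last bullet can be sketched as a trace rewrite. The Python below is a hedged illustration only: the block layout, the `localize` and `render` helpers, and the text format are assumptions made for this example and are not taken from the paper. Each decision is rewritten into a block that also carries a snapshot of the current state, so the features needed for the continue-or-backtrack decision sit next to the decision itself.

```python
from typing import Dict, List, Tuple

# Hypothetical trace event: ("place", column) or ("backtrack", -1).
Event = Tuple[str, int]


def localize(trace: List[Event]) -> List[Dict[str, tuple]]:
    """Rewrite each decision into a block that includes the state after it."""
    stack: List[int] = []
    blocks: List[Dict[str, tuple]] = []
    for kind, col in trace:
        if kind == "place":
            stack.append(col)
        else:  # "backtrack" undoes the most recent placement
            stack.pop()
        # The state snapshot is stored inside the block (assumed format).
        blocks.append({"decision": (kind, col), "state": tuple(stack)})
    return blocks


def render(blocks: List[Dict[str, tuple]]) -> str:
    """Serialize blocks to text; "-" marks an empty state."""
    parts = []
    for block in blocks:
        kind, col = block["decision"]
        state_str = ",".join(str(c) for c in block["state"]) or "-"
        parts.append(f"{kind}:{col}|state={state_str}")
    return " ".join(parts)


direct = [("place", 1)]
detour = [("place", 0), ("place", 2), ("backtrack", -1),
          ("backtrack", -1), ("place", 1)]
```

With this rewrite, the final block of `detour` and the final block of `direct` carry the same state snapshot, even though the traces that precede them differ. A predictor that attends only to the last block therefore sees identical inputs for identical states.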
Entities
Institutions
- arXiv