Reasoning Length Increases Position Bias in AI Models
A recent arXiv study (2605.06672) reports that longer reasoning trajectories amplify position bias in multiple-choice question answering. Across thirteen configurations spanning DeepSeek-R1 (671B), R1-distilled 7-8B models, and base models with chain-of-thought (CoT) prompting, evaluated on MMLU, ARC-Challenge, and GPQA, twelve showed a positive partial correlation (0.11 to 0.41, p<0.05) between trajectory length and Position Bias Score (PBS), even after controlling for accuracy. All twelve open-weight reasoning-mode setups also showed PBS rising consistently across length quartiles, and a truncation intervention, which cuts trajectories short before later reasoning points, provides causal evidence that the bias emerges late in the trajectory.
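The "partial correlation after controlling for accuracy" can be illustrated with a minimal sketch: residualize both trajectory length and PBS on accuracy, then correlate the residuals. The data below is synthetic and purely illustrative; it is not from the paper, and the paper's exact estimator may differ.

```python
import numpy as np

def partial_corr(x, y, z):
    """Pearson correlation between x and y after removing the
    linear influence of z (here: accuracy) from both."""
    zd = np.column_stack([np.ones_like(z), z])  # design matrix with intercept
    rx = x - zd @ np.linalg.lstsq(zd, x, rcond=None)[0]  # residualize x on z
    ry = y - zd @ np.linalg.lstsq(zd, y, rcond=None)[0]  # residualize y on z
    return np.corrcoef(rx, ry)[0, 1]

# Synthetic illustration (hypothetical numbers, not the study's data):
rng = np.random.default_rng(0)
acc = rng.uniform(0.4, 0.9, 200)                            # per-run accuracy
length = 500 + 2000 * (1 - acc) + rng.normal(0, 100, 200)   # harder -> longer
pbs = 0.05 + 1e-4 * length + rng.normal(0, 0.02, 200)       # bias grows with length
print(round(partial_corr(length, pbs, acc), 2))             # positive, accuracy held fixed
```

A positive value here means length and PBS co-vary even among runs with similar accuracy, which is the pattern the study reports.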
Key facts
- Study examines position bias in reasoning models
- Uses DeepSeek-R1, R1-distilled 7-8B models, and base models with CoT
- Tested on MMLU, ARC-Challenge, and GPQA datasets
- 12 out of 13 configurations show positive correlation between trajectory length and PBS
- Correlation ranges from 0.11 to 0.41 (all p<0.05)
- All 12 open-weight configurations show monotonically increasing PBS across length quartiles
- Truncation intervention provides causal evidence
- Paper published on arXiv (2605.06672)
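The summary above does not give the paper's exact PBS formula, but one common way to quantify position bias in multiple-choice settings is to measure how far the model's chosen-option distribution deviates from uniform across answer slots. The sketch below uses total variation distance as an assumed stand-in for the paper's metric:

```python
from collections import Counter

def position_bias_score(choices, n_options=4):
    """Total variation distance between the empirical frequency of each
    answer slot and the uniform baseline. 0.0 means no positional
    preference; higher means the model favors particular slots.
    `choices` is a list of selected option indices (0-based).
    NOTE: assumed illustrative metric, not necessarily the paper's PBS."""
    counts = Counter(choices)
    n = len(choices)
    return 0.5 * sum(abs(counts.get(i, 0) / n - 1 / n_options)
                     for i in range(n_options))

print(position_bias_score([0, 0, 0, 0]))  # all mass on slot A -> 0.75
print(position_bias_score([0, 1, 2, 3]))  # perfectly uniform  -> 0.0
```

Under a metric like this, the study's finding is that scores computed from answers after long reasoning trajectories sit higher than those after short ones.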