ARTFEED — Contemporary Art Intelligence

Spectral Orthogonal Exploration Improves LLM Reasoning

publication · 2026-04-30

A new study on arXiv introduces Spectral Orthogonal Exploration (SOE), a framework aimed at 'Reasoning Collapse' in large language models (LLMs) on difficult mathematical problems. The researchers observe that failed reasoning traces tend to concentrate in a low-rank bias manifold within the model's hidden-state geometry, which limits the search for correct answers. SOE adopts a 'Student Guides Teacher' paradigm: a weaker auxiliary agent acts as an orthogonal probe, injecting heterogeneous reasoning signals into the orthogonal complement of the teacher's dominant subspace. This pushes the teacher to explore a broader range of reasoning paths than standard random sampling allows. Experiments on mathematical reasoning datasets show improved exploration over standard sampling.
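The core geometric idea, projecting an exploration signal onto the orthogonal complement of a dominant subspace, can be sketched in a few lines of numpy. This is an illustrative toy, not the paper's implementation: the matrix shapes, the helper `orthogonal_probe`, and the rank cutoff `r` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: H holds hidden states of a reasoning trace,
# one row per token (T x d). Shapes and names are assumptions.
T, d, r = 64, 32, 4
H = rng.normal(size=(T, d))

# SVD of the hidden-state matrix; the top-r right singular vectors span
# the dominant low-rank subspace where collapsed reasoning concentrates.
_, _, Vt = np.linalg.svd(H, full_matrices=False)
V_r = Vt[:r].T                      # d x r orthonormal basis of that subspace

def orthogonal_probe(signal, basis):
    """Project a student signal onto the orthogonal complement of `basis`,
    so the injected direction cannot reinforce the dominant subspace."""
    return signal - basis @ (basis.T @ signal)

s = rng.normal(size=d)              # raw signal from the weak student
s_perp = orthogonal_probe(s, V_r)   # component outside the dominant subspace

# The probe has (numerically) zero overlap with the dominant directions.
print(np.max(np.abs(V_r.T @ s_perp)) < 1e-10)  # True
```

In this sketch, adding `s_perp` to a teacher hidden state perturbs it only along directions outside the bias manifold, which is the mechanism the paper credits for broader exploration than random sampling.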

Key facts

  • Paper titled 'Student Guides Teacher: Weak-to-Strong Inference via Spectral Orthogonal Exploration'
  • arXiv identifier: 2601.06160v2
  • Addresses 'Reasoning Collapse' in LLMs on mathematical reasoning tasks
  • Failed reasoning traces associated with low-rank bias manifold in hidden-state geometry
  • Proposes Spectral Orthogonal Exploration (SOE) framework
  • Uses 'Student Guides Teacher' paradigm with weak auxiliary agent as orthogonal probe
  • Injects heterogeneous reasoning signals into teacher's orthogonal complement of dominant subspace
  • Experiments show improved exploration over standard sampling

Entities

Institutions

  • arXiv

Sources