GeoFaith Framework Improves Faithful Chain-of-Thought Reasoning in LLMs
Researchers have introduced GeoFaith, a spatio-temporal framework designed to diagnose and enforce faithful reasoning in large language models (LLMs). Because models are typically trained with outcome-based supervision, Chain-of-Thought (CoT) reasoning often suffers from post-hoc rationalization, yielding reasoning chains that are plausible but unfaithful. GeoFaith addresses this by leveraging latent geometric structure and entropy dynamics. The team developed a scalable bootstrapping pipeline that expands step-level annotations from 1,000 to 20,000 samples across four domains. They also trained an 8B faithfulness detector that outperforms GPT-5 on standard benchmarks. In addition, they designed a faithfulness-aware reinforcement learning framework that jointly optimizes outcome correctness, process faithfulness, and trajectory consistency. According to the authors, experiments show improved performance on both faithfulness detection and downstream reasoning tasks, with shorter and more faithful reasoning chains.
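To make the idea of jointly optimizing three signals concrete, the sketch below combines an outcome term, a process-faithfulness term, and a trajectory-consistency term into one scalar reward. This is only an illustration: the weighted-sum form, the weights, and the names `RewardWeights` and `combined_reward` are assumptions, since the summary does not say how GeoFaith actually combines these objectives.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the source does not state how the terms are balanced.
    outcome: float = 1.0
    faithfulness: float = 0.5
    consistency: float = 0.5


def combined_reward(
    answer_correct: bool,
    step_faithfulness: Sequence[float],
    consistency: float,
    weights: Optional[RewardWeights] = None,
) -> float:
    """Weighted sum of outcome, process-faithfulness, and consistency terms.

    step_faithfulness holds one score in [0, 1] per reasoning step, e.g. as
    produced by some faithfulness detector (assumed, not specified here).
    consistency is a single score in [0, 1] for the whole trajectory.
    """
    if weights is None:
        weights = RewardWeights()
    outcome = 1.0 if answer_correct else 0.0
    # Average the per-step scores; an empty chain earns no faithfulness credit.
    if step_faithfulness:
        faithfulness = sum(step_faithfulness) / len(step_faithfulness)
    else:
        faithfulness = 0.0
    return (
        weights.outcome * outcome
        + weights.faithfulness * faithfulness
        + weights.consistency * consistency
    )
```

With a reward of this shape, a correct answer reached through poorly scored steps earns less than the same answer reached through well-scored steps, which is the general motivation for supervising the process rather than the outcome alone.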
Key facts
- GeoFaith is a spatio-temporal framework for faithful CoT reasoning.
- It uses latent geometric structure and entropy dynamics.
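- A bootstrapping pipeline expands step-level annotations from 1,000 to 20,000 samples across four domains.
- An 8B faithfulness detector outperforms GPT-5 on standard benchmarks.
- A faithfulness-aware reinforcement learning framework jointly optimizes outcome correctness, process faithfulness, and trajectory consistency.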
- Experiments show superior performance on faithfulness detection and reasoning.
- The proposed method produces shorter and more faithful reasoning chains.
- It addresses post-hoc rationalization in LLM reasoning.
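As a rough illustration of what "entropy dynamics" can mean in practice, the sketch below computes the Shannon entropy of each next-token distribution and averages it per reasoning step, giving one value per step along the chain. This is a generic construction, not GeoFaith's method: the summary does not describe how entropy is measured or used, and the function names and input layout are assumptions.

```python
import math
from typing import List, Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token probability distribution."""
    # Zero-probability entries contribute nothing and are skipped to avoid log(0).
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def step_entropy_trajectory(
    steps: Sequence[Sequence[Sequence[float]]],
) -> List[float]:
    """Mean token entropy for each reasoning step.

    steps[i][j] is the probability distribution over the vocabulary for the
    j-th token of the i-th reasoning step (hypothetical input layout).
    """
    trajectory = []
    for step in steps:
        entropies = [token_entropy(dist) for dist in step]
        # An empty step is recorded as zero entropy.
        trajectory.append(sum(entropies) / len(entropies) if entropies else 0.0)
    return trajectory
```

A trajectory like this could then be inspected for how uncertainty rises or falls from step to step, though which patterns indicate unfaithful reasoning is a question the summary leaves open.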
Entities
Institutions
- arXiv