Counterfactual Routing Reduces Hallucinations in MoE Models
Counterfactual Routing (CoR) is a training-free inference framework designed to reduce hallucinations in Sparse Mixture-of-Experts (MoE) models, particularly on queries involving long-tail knowledge. The study (arXiv 2604.14246) finds that static Top-k routing biases routers toward high-frequency patterns at the expense of rarer factual associations, leaving causally important "specialist experts" dormant. CoR combines layer-wise perturbation analysis with a Counterfactual Expert Impact (CEI) metric to dynamically shift computational resources from syntax-dominant to knowledge-intensive layers while keeping the total activation count unchanged, thereby activating dormant experts and reducing hallucinations without any additional training.
Key facts
- Counterfactual Routing (CoR) is a training-free inference framework
- Addresses hallucinations in Sparse Mixture-of-Experts (MoE) models
- Published on arXiv with identifier 2604.14246
- Static Top-k routing causes routers to favor high-frequency patterns
- Specialist experts with long-tail knowledge remain dormant
- CoR uses layer-wise perturbation analysis and Counterfactual Expert Impact (CEI) metric
- Dynamically shifts computational resources from syntax-dominant to knowledge-intensive layers
- Maintains constant total activation count
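The reallocation idea in the facts above can be sketched as a budget-preserving redistribution: layers with higher counterfactual impact receive more expert slots, funded by slots taken from low-impact layers, so the total stays fixed. The sketch below is illustrative only; the function name, the CEI values, and the proportional scheme are assumptions, since the paper's actual perturbation analysis and routing mechanics are not specified here.

```python
# Hypothetical sketch of CoR-style budget reallocation: shift Top-k expert
# slots from low-CEI (syntax-dominant) layers to high-CEI (knowledge-intensive)
# layers while preserving the total activation count. The CEI scores are
# illustrative inputs; the paper derives them via layer-wise perturbation.

def reallocate_budget(cei_scores, k_per_layer, min_k=1):
    """Redistribute a constant total activation budget across layers.

    cei_scores: per-layer Counterfactual Expert Impact scores (illustrative).
    k_per_layer: baseline number of experts activated per layer (uniform Top-k).
    Returns a per-layer allocation summing to the same total as the baseline.
    """
    n_layers = len(cei_scores)
    total = k_per_layer * n_layers          # budget to preserve
    spendable = total - min_k * n_layers    # slots beyond the per-layer floor
    score_sum = sum(cei_scores)

    # Proportional allocation, floored, with a guaranteed minimum per layer.
    alloc = [min_k + int(spendable * s / score_sum) for s in cei_scores]

    # Hand leftover slots (lost to integer flooring) to the highest-CEI layers.
    remainder = total - sum(alloc)
    ranked = sorted(range(n_layers), key=lambda i: cei_scores[i], reverse=True)
    for i in ranked[:remainder]:
        alloc[i] += 1
    return alloc

# Example: 4 layers, baseline Top-2 everywhere (total budget = 8 activations).
cei = [0.05, 0.10, 0.60, 0.25]   # layer 2 looks knowledge-intensive
new_k = reallocate_budget(cei, k_per_layer=2)
print(new_k, sum(new_k))         # total remains 8
```

Note that the per-layer floor (`min_k`) keeps every layer minimally active, mirroring the constraint that the total activation count, and hence inference cost, does not change.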
Entities
Institutions
- arXiv