Counterfactual Routing Reduces Hallucinations in MoE Models
Counterfactual Routing (CoR) is a training-free inference framework designed to reduce hallucinations in Sparse Mixture-of-Experts (MoE) models, particularly on queries involving long-tail knowledge. The study (arXiv 2604.14246) finds that static Top-k routing biases routers toward high-frequency patterns at the expense of rarer factual associations, leaving causally important "specialist experts" dormant. CoR combines layer-wise perturbation analysis with a Counterfactual Expert Impact (CEI) metric to dynamically shift computational resources from syntax-dominant to knowledge-intensive layers while keeping the total activation count unchanged, thereby activating dormant experts and reducing hallucinations without any additional training.
Key facts
- Counterfactual Routing (CoR) is a training-free inference framework
- Addresses hallucinations in Sparse Mixture-of-Experts (MoE) models
- Published on arXiv with identifier 2604.14246
- Static Top-k routing causes routers to favor high-frequency patterns
- Specialist experts with long-tail knowledge remain dormant
- CoR uses layer-wise perturbation analysis and Counterfactual Expert Impact (CEI) metric
- Dynamically shifts computational resources from syntax-dominant to knowledge-intensive layers
- Maintains constant total activation count
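The reallocation idea in the facts above can be sketched as a budget-preserving redistribution: layers with higher counterfactual impact receive more expert slots, funded by slots taken from low-impact layers, so the total stays fixed. The sketch below is illustrative only; the function name, the CEI values, and the proportional scheme are assumptions, since the paper's actual perturbation analysis and routing mechanics are not specified here.

```python
# Hypothetical sketch of CoR-style budget reallocation: shift Top-k expert
# slots from low-CEI (syntax-dominant) layers to high-CEI (knowledge-intensive)
# layers while preserving the total activation count. The CEI scores are
# illustrative inputs; the paper derives them via layer-wise perturbation.

def reallocate_budget(cei_scores, k_per_layer, min_k=1):
    """Redistribute a constant total activation budget across layers.

    cei_scores: per-layer Counterfactual Expert Impact scores (illustrative).
    k_per_layer: baseline number of experts activated per layer (uniform Top-k).
    Returns a per-layer allocation summing to the same total as the baseline.
    """
    n_layers = len(cei_scores)
    total = k_per_layer * n_layers          # budget to preserve
    spendable = total - min_k * n_layers    # slots beyond the per-layer floor
    score_sum = sum(cei_scores)

    # Proportional allocation, floored, with a guaranteed minimum per layer.
    alloc = [min_k + int(spendable * s / score_sum) for s in cei_scores]

    # Hand leftover slots (lost to integer flooring) to the highest-CEI layers.
    remainder = total - sum(alloc)
    ranked = sorted(range(n_layers), key=lambda i: cei_scores[i], reverse=True)
    for i in ranked[:remainder]:
        alloc[i] += 1
    return alloc

# Example: 4 layers, baseline Top-2 everywhere (total budget = 8 activations).
cei = [0.05, 0.10, 0.60, 0.25]   # layer 2 looks knowledge-intensive
new_k = reallocate_budget(cei, k_per_layer=2)
print(new_k, sum(new_k))         # total remains 8
```

Note that the per-layer floor (`min_k`) keeps every layer minimally active, mirroring the constraint that the total activation count, and hence inference cost, does not change.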
Entities
Institutions
- arXiv