FusionRoute: Token-Level LLM Collaboration Framework
FusionRoute is a framework for token-level multi-LLM collaboration. At each decoding step, a lightweight router selects the most suitable expert model and contributes a complementary logit that is added to the selected expert's logits, refining its output. The design addresses the trade-off between general-purpose large models and specialized small models. The accompanying theoretical analysis shows that pure expert-only routing is fundamentally limited.
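The per-step mechanism described above (pick one expert, then adjust its logits by addition) can be illustrated with a toy sketch. This is not the authors' implementation: the function name `fusion_step`, the argument names, the argmax-based expert selection, and the greedy token choice are all assumptions made for illustration, and the numbers are invented.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fusion_step(expert_logits, router_scores, router_logits):
    """One hypothetical decoding step: select an expert, then add router logits.

    expert_logits: one vocab-sized list of logits per expert
    router_scores: one score per expert (assumed router output)
    router_logits: vocab-sized complementary logits (assumed router output)
    """
    # Assumption: the expert with the highest router score is selected.
    chosen = max(range(len(router_scores)), key=lambda i: router_scores[i])
    # Logit addition: the router's logits are summed with the expert's.
    fused = [e + r for e, r in zip(expert_logits[chosen], router_logits)]
    probs = softmax(fused)
    # Greedy decoding, used here only to keep the sketch deterministic.
    token = max(range(len(probs)), key=lambda i: probs[i])
    return chosen, token, probs


# Toy data: two experts, a three-token vocabulary (values are made up).
expert_logits = [[2.0, 1.0, 0.0], [0.0, 1.0, 3.0]]
router_scores = [0.9, 0.1]
router_logits = [-1.5, 1.0, 0.0]

chosen, token, probs = fusion_step(expert_logits, router_scores, router_logits)
print(chosen, token)
```

In this toy case the selected expert alone would pick token 0, while the fused logits favor token 1, which is the kind of correction that expert selection by itself cannot express.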
Key facts
- arXiv:2601.05106v4
- FusionRoute framework proposed
- Token-level multi-LLM collaboration
- Lightweight router selects expert per decoding step
- Router contributes complementary logit via logit addition
- Addresses the trade-off between general-purpose large models and specialized small models
- Theoretical analysis shows that pure expert-only routing is fundamentally limited
Entities
—