ARTFEED — Contemporary Art Intelligence

Pruning 75% of Experts from MoE LLMs for Translation with Minimal Loss

ai-technology · 2026-05-28

A new method aggressively prunes experts from mixture-of-experts (MoE) large language models to produce efficient translation specialists. It exploits expert specialization and the separability of multilingual capabilities to identify and remove experts that are irrelevant to translation, with no retraining needed at moderate pruning ratios. Pruning 50% of the experts yields negligible degradation and pruning 70% causes only minor losses; at 75%, a short round of supervised fine-tuning (SFT) recovers baseline performance. The result is a substantially lower memory and compute footprint for translation tasks.
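The summary does not say how translation-irrelevant experts are identified. As a rough illustration only, the sketch below assumes one plausible criterion: record which experts the router selects while the model processes translation data, then keep the most frequently used experts in each layer. The function names, the trace format, and the frequency criterion are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def count_expert_usage(routing_trace, num_layers):
    """Count how often each expert is selected in each layer.

    routing_trace is a hypothetical list of (layer, expert_id) pairs
    recorded while running the model on translation calibration data.
    """
    counts = [Counter() for _ in range(num_layers)]
    for layer, expert in routing_trace:
        counts[layer][expert] += 1
    return counts


def select_experts_to_keep(counts, num_experts, prune_ratio):
    """Keep the most-used experts per layer (assumed criterion)."""
    keep_n = max(1, int(round(num_experts * (1 - prune_ratio))))
    kept = []
    for layer_counts in counts:
        # Rank by descending usage; break ties by expert id for determinism.
        ranked = sorted(range(num_experts),
                        key=lambda e: (-layer_counts[e], e))
        kept.append(sorted(ranked[:keep_n]))
    return kept


# Toy trace for a 2-layer model with 8 experts per layer (made-up numbers).
trace = ([(0, 0)] * 5 + [(0, 3)] * 4 + [(0, 1)]
         + [(1, 7)] * 3 + [(1, 2)] * 2 + [(1, 5)])
usage = count_expert_usage(trace, num_layers=2)
kept_75 = select_experts_to_keep(usage, num_experts=8, prune_ratio=0.75)
print(kept_75)
```

In a real model the kept indices would then be used to slice the expert weights and the router's output dimension in each layer; that step depends on the specific architecture and is left out here.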

Key facts

  • Method prunes experts from MoE LLMs for translation
  • Exploits expert specialization and separable multilingual capabilities
  • Pruning 50% of experts yields negligible degradation
  • Pruning 70% causes only minor losses
  • Pruning 75% with short SFT recovers baseline performance
  • No retraining required for moderate pruning
  • Reduces memory and compute requirements
  • Published on arXiv with ID 2605.28042
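To give a feel for the memory claim, the arithmetic below estimates how much of a model remains after expert pruning. The parameter counts are invented for illustration (a hypothetical model with 1B shared parameters and 29B expert parameters) and do not describe any model from the paper; the only assumption is that pruning removes expert weights and leaves shared weights untouched.

```python
def remaining_fraction(shared_params, expert_params, prune_ratio):
    """Fraction of total parameters left after pruning experts.

    Assumes shared weights (attention, embeddings, routers) are kept
    and expert weights shrink in proportion to the pruning ratio.
    """
    total = shared_params + expert_params
    after = shared_params + expert_params * (1 - prune_ratio)
    return after / total


# Hypothetical model: 1B shared parameters, 29B expert parameters.
for ratio in (0.50, 0.70, 0.75):
    frac = remaining_fraction(1e9, 29e9, ratio)
    print(f"prune {ratio:.0%} of experts -> {frac:.1%} of parameters remain")
```

Because experts typically hold most of an MoE model's parameters, the remaining fraction tracks the pruning ratio closely under these assumptions.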

Entities

Institutions

  • arXiv

Sources