CP-MoE Framework Addresses Catastrophic Forgetting in LLMs and VLMs

ai-technology · 2026-05-22

Researchers have introduced CP-MoE, a continual learning framework for large language models (LLMs) and vision-language models (VLMs) that targets catastrophic forgetting, the loss of previously learned capabilities as a model is trained on new tasks. Existing LoRA-based Mixture-of-Experts (MoE) methods either isolate experts too strictly, which hinders knowledge transfer, or allow task-specific updates to overwrite important parameters. CP-MoE adds a transient expert that captures early task-specific updates and helps integrate them into the stable experts. It also uses a consistency-preserving routing bias that estimates representation similarity with the stable experts, together with a transient expert-guided regularization mechanism. The goal is to balance knowledge transfer against forgetting.
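
To make the routing-bias idea concrete, here is a minimal sketch of how a similarity-based bias could shift a router toward experts whose stored representation resembles the current input. It is an illustration under stated assumptions, not the CP-MoE formulation: the per-expert prototype vectors, the use of cosine similarity, and the strength parameter are all hypothetical choices made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def biased_routing(hidden, router_logits, prototypes, strength=1.0):
    """Add a similarity bias to each expert's router logit, then normalize.

    hidden:        representation of the current token
    router_logits: raw per-expert scores from the learned router
    prototypes:    ASSUMPTION: one reference vector per stable expert
    strength:      ASSUMPTION: scale applied to the similarity bias
    """
    biased = [
        logit + strength * cosine(hidden, proto)
        for logit, proto in zip(router_logits, prototypes)
    ]
    return softmax(biased)


# Two experts with identical raw logits; only the bias separates them.
hidden = [1.0, 0.0]
prototypes = [[1.0, 0.0], [0.0, 1.0]]
weights = biased_routing(hidden, [0.0, 0.0], prototypes, strength=2.0)
unbiased = biased_routing(hidden, [0.0, 0.0], prototypes, strength=0.0)
```

In this toy setup the first expert receives the larger routing weight because its prototype matches the input, while setting strength to zero falls back to the unbiased router.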

Key facts

CP-MoE is a continual learning framework for LLMs and VLMs.
It targets catastrophic forgetting in both model families.
Existing LoRA-based MoE methods face a trade-off between knowledge transfer and forgetting.
CP-MoE uses a transient expert to capture early task-specific updates.
It introduces a consistency-preserving routing bias.
The routing bias estimates representation similarity with stable experts.
A transient expert-guided regularization mechanism is included.
The framework aims to improve expert selection and reduce forgetting.
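
For readers unfamiliar with LoRA-based MoE layers, the sketch below shows the general shape of such a layer with one extra slot standing in for a transient expert. It follows the standard LoRA form (a frozen weight plus low-rank updates B A x) combined with gate weights; how CP-MoE actually trains, regularizes, or merges its transient expert is not described in this summary, so the expert layout and the fixed gate values here are assumptions for illustration only.

```python
def matvec(matrix, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in matrix]


def lora_delta(A, B, x):
    """Low-rank update B @ (A @ x); A is r x d_in, B is d_out x r."""
    return matvec(B, matvec(A, x))


def moe_lora_forward(W, experts, gates, x):
    """Frozen base output plus a gate-weighted sum of per-expert LoRA updates."""
    y = matvec(W, x)
    for (A, B), gate in zip(experts, gates):
        delta = lora_delta(A, B, x)
        y = [yi + gate * di for yi, di in zip(y, delta)]
    return y


# Frozen base weight (identity) and two rank-1 experts.
W = [[1.0, 0.0], [0.0, 1.0]]
stable = ([[1.0, 0.0]], [[0.0], [1.0]])     # A is 1x2, B is 2x1
transient = ([[0.0, 1.0]], [[1.0], [0.0]])  # ASSUMPTION: same shape as a stable expert

x = [2.0, 3.0]
# ASSUMPTION: fixed gate weights; a real router would compute these per token.
y_mixed = moe_lora_forward(W, [stable, transient], [0.5, 0.5], x)
y_stable_only = moe_lora_forward(W, [stable, transient], [1.0, 0.0], x)
```

Setting the transient gate to zero removes that expert's contribution entirely, which is one simple way to picture an expert that is present early in a task and absent later.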

Entities

—

Sources

arXiv cs.AI — 2026-05-21