M-ORE: Modality-Decoupled Online Editing for MLLMs
Researchers have introduced M-ORE, a modality-decoupled online recursive editor for the lifelong adaptation of multimodal large language models (MLLMs). Existing text-only LLM editors struggle with MLLMs for two reasons: cross-modal conflict caused by visually dominant activations, and interference between sequential edits. M-ORE uses a unified proximal-projection formulation with a closed-form update computed by Sherman-Morrison recursion, which keeps the overhead of each edit constant. It maintains separate, module-wise locality statistics for the text stack and the visual projector so that updates are not dominated by visual inputs, and it applies sequential edits in a fixed, orthogonal low-rank edit subspace. The method targets online model editing under strict compute and memory constraints.
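To illustrate why a Sherman-Morrison recursion gives constant per-edit cost, here is a minimal NumPy sketch of the generic rank-one inverse update. It is not M-ORE's actual update rule, which this summary does not spell out. The names `P`, `k`, `lam`, and `sherman_morrison_update` are assumptions made for illustration: `P` stands for the inverse of a running statistics matrix and `k` for one new key vector.

```python
import numpy as np

def sherman_morrison_update(P, k):
    """Given P = C^{-1} (symmetric), return (C + k k^T)^{-1} in O(d^2) time.

    Hypothetical helper for illustration; not taken from the M-ORE paper.
    """
    Pk = P @ k                      # d-vector; P is symmetric, so k^T P = (P k)^T
    denom = 1.0 + k @ Pk            # scalar 1 + k^T P k
    return P - np.outer(Pk, Pk) / denom

rng = np.random.default_rng(0)
d = 8
lam = 1.0                           # assumed ridge term so the initial matrix is invertible

C = lam * np.eye(d)                 # running statistics matrix (kept only to check the result)
P = np.eye(d) / lam                 # its inverse, maintained recursively

for _ in range(5):                  # five simulated edits
    k = rng.standard_normal(d)
    C += np.outer(k, k)             # rank-one change to the statistics
    P = sherman_morrison_update(P, k)

# Largest deviation from a direct O(d^3) inversion
max_err = np.max(np.abs(P - np.linalg.inv(C)))
```

Each step touches only a `d x d` matrix and one vector, so the cost per edit does not grow with the number of edits already applied, whereas re-inverting `C` from scratch would cost O(d^3) every time.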
Key facts
- M-ORE addresses cross-modal conflict and inter-edit interference in MLLM editing.
- Uses Sherman-Morrison recursion for constant per-edit overhead.
- Maintains module-wise locality statistics for text stack and visual projector.
- Performs updates in a fixed orthogonal low-rank edit subspace.
- Designed for lifelong adaptation of MLLMs.
- Published on arXiv as paper 2605.20273.
- Proposes a closed-form update for online editing.
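The idea of confining updates to a fixed orthogonal low-rank subspace can be sketched as follows. This is an illustrative assumption, not the paper's construction: the summary does not say how M-ORE chooses its subspace, so the sketch uses a random orthonormal basis as a placeholder, and `make_edit_basis`, `project_update`, `d`, and `r` are hypothetical names.

```python
import numpy as np

def make_edit_basis(d, r, seed=0):
    """Return a d x r matrix with orthonormal columns (placeholder basis)."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((d, r)))  # reduced QR: Q is d x r
    return Q

def project_update(delta, Q):
    """Project a weight update onto the subspace spanned by the columns of Q."""
    return Q @ (Q.T @ delta)

d, r = 16, 4
Q = make_edit_basis(d, r)           # fixed once, then reused for every edit

# A stand-in for a raw weight update produced by some edit
delta = np.random.default_rng(1).standard_normal((d, d))
low_rank_delta = project_update(delta, Q)
```

Because `Q` is fixed, every projected update has rank at most `r` and lies in the same subspace, which is one plausible way to keep successive edits from spreading across the full weight matrix.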
Entities
Institutions
- arXiv