ARTFEED — Contemporary Art Intelligence

Interpretable Concept Decomposition Enables Precise Machine Unlearning in VLMs

ai-technology · 2026-05-16

A new method, ICED (Interpretable Concept-level Unlearning via Interpretable Concept Decomposition), addresses machine unlearning in Vision-Language Models (VLMs). Existing unlearning techniques typically operate at the image or instance level, and they often fail to remove target knowledge cleanly without disturbing unrelated semantics, especially when a single image contains several entangled concepts. ICED uses a multimodal large language model to build a compact, task-specific concept vocabulary from the forgetting set, then decomposes visual representations into sparse, nonnegative combinations of these semantic concepts, enabling precise edits to the model's knowledge. Unlearning is cast as a concept-level optimization that suppresses the concepts to be forgotten while preserving the surrounding context. The paper is available on arXiv under ID 2605.14309.
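The core decomposition step can be sketched in a few lines. The article describes expressing a visual embedding as a sparse, nonnegative combination of concept vectors; a standard way to compute such coefficients is projected gradient descent on a nonnegative, L1-regularized least-squares objective. This is an illustrative sketch only: the solver, regularizer, and function names below are assumptions, not the paper's actual formulation.

```python
import numpy as np

def decompose(v, concepts, lam=0.01, lr=0.01, steps=2000):
    """Approximate embedding `v` as a sparse, nonnegative combination
    of the rows of `concepts` (the concept vocabulary).

    Solves  min_a ||v - concepts.T @ a||^2 / 2 + lam * sum(a),  a >= 0,
    by projected gradient descent. Illustrative sketch; ICED's actual
    optimization may differ.
    """
    a = np.zeros(concepts.shape[0])
    for _ in range(steps):
        residual = concepts.T @ a - v
        grad = concepts @ residual + lam    # lam acts as an L1 penalty on a >= 0
        a = np.maximum(a - lr * grad, 0.0)  # project onto the nonnegative orthant
    return a
```

The nonnegativity constraint is what makes the coefficients readable as "how much of each concept is present," and the L1 term keeps the explanation sparse.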

Key facts

  • ICED is a concept-level unlearning framework for VLMs.
  • It uses a multimodal large language model to build a concept vocabulary.
  • Visual representations are decomposed into sparse, nonnegative concept combinations.
  • Unlearning is performed via concept-level optimization.
  • The method allows precise removal of target knowledge without affecting unrelated semantics.
  • Traditional unlearning is image or instance level, which is imprecise.
  • A single image often contains multiple entangled concepts.
  • The paper is available on arXiv with ID 2605.14309.
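Given such a decomposition, the "precise removal" in the facts above amounts to zeroing out the target concepts' contributions while leaving the other coefficients untouched. A minimal embedding-space sketch follows; ICED frames unlearning as an optimization over the model itself, which this direct edit only approximates, and all names here are hypothetical.

```python
import numpy as np

def unlearn_concepts(v, concepts, coeffs, forget_idx):
    """Subtract the contribution of the concepts in `forget_idx` from
    embedding `v`, keeping the rest of the decomposition intact.

    `coeffs` are the nonnegative decomposition coefficients of `v`
    over the rows of `concepts`. Sketch only, not the paper's method.
    """
    mask = np.zeros_like(coeffs)
    mask[list(forget_idx)] = 1.0        # select only the concepts to forget
    return v - concepts.T @ (coeffs * mask)
```

Because only the selected coefficients are removed, concepts that co-occur in the same image survive the edit, which is the behavior the concept-level framing is meant to guarantee.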

Entities

Institutions

  • arXiv

Sources