Machine Unlearning Blind Spots: Over-Unlearning and Relearning Attacks

other · 2026-06-01

A new arXiv paper (arXiv:2506.01318) identifies two blind spots in machine unlearning (MU): over-unlearning, which degrades model performance on retained data near the forget set, and prototypical relearning attacks, which can resurrect forgotten knowledge from only a few samples. The authors propose a metric, OU@epsilon, to quantify this collateral damage, and introduce Spotter, a plug-and-play objective that includes a masked knowledge-distillation penalty, to counter both issues in class-level unlearning.
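To make the idea of a masked knowledge-distillation penalty concrete, here is a minimal sketch. It is not the paper's formulation. It assumes that "masked" means the forget-class logit is excluded before comparing the original model (teacher) with the unlearned model (student), and that the comparison is a KL divergence. The function names and the toy logits are hypothetical.

```python
import math

def masked_log_softmax(logits, forget_class):
    """Log-softmax over every class except `forget_class` (assumed masking rule)."""
    kept = [z for i, z in enumerate(logits) if i != forget_class]
    m = max(kept)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(z - m) for z in kept))
    return [z - log_norm for z in kept]

def masked_kd_penalty(teacher_logits, student_logits, forget_class):
    """KL(teacher || student) restricted to the non-forget classes."""
    log_p = masked_log_softmax(teacher_logits, forget_class)
    log_q = masked_log_softmax(student_logits, forget_class)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

# Hypothetical logits for one retained sample near the forget set (class 0 is forgotten).
teacher = [2.0, 1.0, 0.5]
student_ok = [-5.0, 1.0, 0.5]     # forget logit suppressed, other classes untouched
student_drift = [-5.0, 0.0, 1.5]  # other classes disturbed as well

penalty_ok = masked_kd_penalty(teacher, student_ok, forget_class=0)
penalty_drift = masked_kd_penalty(teacher, student_drift, forget_class=0)
```

Under these assumptions the penalty is zero when only the forget-class logit changes and positive when the remaining class distribution drifts, which is the kind of collateral change over-unlearning describes.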

Key facts

Paper arXiv:2506.01318 addresses over-unlearning and relearning attacks in machine unlearning.
Over-unlearning degrades model performance on retained data near the forget set.
Prototypical Relearning Attack exploits per-class prototypes to restore pre-unlearning performance.
OU@epsilon metric quantifies over-unlearning in regions proximal to the forget set.
Spotter is a plug-and-play objective with a masked knowledge-distillation penalty.
Focus is on class-level unlearning.
The paper was announced on arXiv as a replace-cross, i.e. a revised version appearing in a cross-listed category.
The attack requires only a few samples of the forget class.
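The prototype idea behind the relearning attack can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure. It assumes the attacker can read feature embeddings from the unlearned model, averages a few forget-class embeddings into a prototype, and classifies by nearest prototype. All names and the toy 2-D embeddings are hypothetical.

```python
import math

def prototype(embeddings):
    """Mean vector of a list of equal-length embeddings."""
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]

def nearest_prototype(x, prototypes):
    """Label of the prototype closest to `x` in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(x, prototypes[label]))

# A few forget-class embeddings (hypothetical values standing in for encoder outputs).
few_shot_forget = [[0.9, 0.1], [1.1, -0.1], [1.0, 0.2]]

prototypes = {
    "retain_a": [0.0, 1.0],
    "retain_b": [-1.0, 0.0],
    "forget": prototype(few_shot_forget),  # rebuilt from only three samples
}

# A new forget-class input is assigned to the forget class again.
query = [0.8, 0.0]
predicted = nearest_prototype(query, prototypes)
```

In this toy setup the query is assigned to the forget class again. That only works if forget-class embeddings still cluster after unlearning, which is the weakness such an attack would rely on.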
