Norm-Anchor Scaling Prevents Model Edit Collapse

ai-technology · 2026-05-07

Researchers have identified a failure mode in sequential locate-and-edit (L&E) model editing: a positive norm-feedback loop in which solved value vectors and edited MLP weights amplify each other, eventually degrading edit quality and eroding the model's capabilities. To address this, they introduced Norm-Anchor Scaling (NAS), a stabilizing plug-in that rescales each solved value vector to match the original model's reference norm. Applied across various LLM backbones, datasets, and L&E editors, NAS extends the effective editing horizon by more than 4x and improves long-run editing performance by 72.2% on average, with a single-line modification and minimal computational expense.
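The rescaling step described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `norm_anchor_scale`, the use of the L2 norm, and the idea that the reference norm is a single precomputed scalar are all assumptions, since the summary does not say how the reference norm is obtained.

```python
import numpy as np

def norm_anchor_scale(value, ref_norm, eps=1e-12):
    """Rescale `value` so its L2 norm equals `ref_norm`.

    Hypothetical helper: `ref_norm` stands in for the original model's
    reference norm, which is assumed here to be a precomputed scalar.
    The direction of `value` is left unchanged.
    """
    norm = np.linalg.norm(value)
    # Guard against division by zero for a degenerate all-zero vector.
    return value * (ref_norm / max(norm, eps))

# Example: a solved value vector whose norm has drifted to 5.
solved_value = np.array([3.0, 4.0])
anchored = norm_anchor_scale(solved_value, ref_norm=1.0)
```

Only the magnitude changes in this sketch; the direction of the solved vector is preserved, which is consistent with NAS being described as a single-line modification.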

Key facts

Sequential L&E model editing can fail abruptly after many edits.
Failure is caused by a positive norm-feedback loop.
The loop involves solved value vectors and edited MLP weights.
Norm growth under standard L&E dynamics is approximately exponential.
Existing regularizers or update clamps do not resolve the issue.
NAS breaks the loop by rescaling each solved value vector to the original model's reference norm.
NAS extends the editing horizon by more than 4x.
NAS improves long-run editing performance by 72.2% on average.
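To make the feedback loop concrete, the toy simulation below tracks a single scalar "weight norm" across sequential edits. It is a deliberately simplified sketch, not the dynamics analyzed in the paper: the parameters `gain`, `step`, and `ref_norm`, and the assumption that each solved value norm is proportional to the current weight norm, are all hypothetical choices made for illustration.

```python
def simulate(num_edits, anchor, gain=1.5, step=0.1, ref_norm=1.0):
    """Toy scalar model of sequential edits (illustrative assumptions only).

    Each edit solves a value whose norm is assumed proportional to the
    current weight norm, then adds a fraction of it back into the weights.
    With `anchor=True`, the value norm is reset to `ref_norm` first.
    """
    weight_norm = 1.0
    history = []
    for _ in range(num_edits):
        value_norm = gain * weight_norm      # larger weights -> larger solved value
        if anchor:
            value_norm = ref_norm            # anchoring step: pin to the reference norm
        weight_norm += step * value_norm     # the edit feeds the value back into the weights
        history.append(weight_norm)
    return history

unanchored = simulate(50, anchor=False)
anchored = simulate(50, anchor=True)
```

In this toy model the unanchored weight norm is multiplied by the constant factor 1 + step * gain at every edit, which is geometric (exponential) growth, while the anchored run grows only by a fixed increment per edit. Real L&E editors operate on full weight matrices, so this shows only the shape of the argument, not its magnitude.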

Entities

—

Sources

arXiv cs.AI — 2026-05-07