Spectral Method Recovers Damaged Language Model Capabilities

ai-technology · 2026-05-22

Researchers propose DG-Hard, a post-hoc repair method for catastrophic forgetting in fine-tuned language models. The method needs only the pretrained checkpoint and its fine-tuned descendant. It treats the fine-tuning update as a low-rank, task-aligned signal embedded in noise and applies the Donoho-Gavish hard singular-value threshold to the weight-delta matrices (fine-tuned weights minus pretrained weights). DG-Hard recovers capabilities damaged by fine-tuning while preserving target-task gains and beneficial held-out improvements. The approach is detailed in arXiv:2605.20296.

Key facts

DG-Hard is a spectral repair method for fine-tuning updates.
It uses only the pretrained checkpoint and fine-tuned descendant.
It applies Donoho-Gavish hard singular-value thresholding.
It recovers damaged capabilities without retraining.
It preserves target-task gains and beneficial held-out improvements.
The method treats the fine-tuning update as low-rank signal plus noise.
The research is described in arXiv:2605.20296.
Catastrophic forgetting can degrade capabilities that the fine-tuning data does not explicitly target.
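
The thresholding step named above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes the unknown-noise variant of the Gavish-Donoho rule (threshold = ω(β) times the median singular value, with the published polynomial approximation for ω), and it assumes the repaired weight is the pretrained weight plus the thresholded delta. The function names `dg_omega`, `dg_hard_threshold`, and `repair_weight` are hypothetical.

```python
from typing import Tuple

import numpy as np


def dg_omega(beta: float) -> float:
    """Approximate Gavish-Donoho coefficient for an unknown noise level.

    beta is the matrix aspect ratio (short side / long side), in (0, 1].
    """
    return 0.56 * beta**3 - 0.95 * beta**2 + 1.82 * beta + 1.43


def dg_hard_threshold(delta: np.ndarray) -> Tuple[np.ndarray, int]:
    """Zero out singular values of `delta` at or below the threshold.

    Returns the thresholded matrix and the number of singular values kept.
    """
    m, n = delta.shape
    beta = min(m, n) / max(m, n)
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    tau = dg_omega(beta) * np.median(s)  # data-driven threshold
    keep = s > tau
    denoised = (u[:, keep] * s[keep]) @ vt[keep, :]
    return denoised, int(keep.sum())


def repair_weight(w_pre: np.ndarray, w_ft: np.ndarray) -> Tuple[np.ndarray, int]:
    """Assumed repair rule: pretrained weight plus the thresholded delta."""
    denoised, rank = dg_hard_threshold(w_ft - w_pre)
    return w_pre + denoised, rank
```

Under these assumptions, `repair_weight` would be applied to each 2-D weight matrix shared by the two checkpoints, leaving other parameters untouched. Whether DG-Hard treats biases, embeddings, or normalization parameters differently is not stated in this summary.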
