ARTFEED — Contemporary Art Intelligence

Self-Distillation Framework Recovers Performance in Large Language Models

ai-technology · 2026-04-20

A new framework, Self-Distillation Fine-Tuning (SDFT), has been developed to address performance degradation in Large Language Models (LLMs). Such degradation is frequently attributed to catastrophic forgetting during Supervised Fine-Tuning (SFT), and it also arises from quantization and pruning. The study offers both a theoretical account of the recovery mechanism and practical guidance for applying it. It argues that an LLM's generative capability is intrinsically tied to the high-dimensional manifold formed by its hidden-layer activations. The researchers use Centered Kernel Alignment (CKA), chosen for its invariance to orthogonal transformations and isotropic scaling, to measure how closely the activation trajectories of the student model align with those of the teacher. Their results indicate a strong correlation between activation alignment and performance recovery. The work, which addresses a significant challenge in sustaining LLM performance under common optimization and compression methods, was published on arXiv under identifier 2604.15794v1.
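
To make the alignment measure concrete, here is a minimal sketch of linear CKA between two activation matrices, written in NumPy. The function name and the toy teacher/student data are illustrative assumptions, not taken from the paper; the formula is the standard linear CKA, which is invariant to orthogonal transformations and isotropic scaling of either representation.

    import numpy as np

    def linear_cka(X, Y):
        """Linear Centered Kernel Alignment between activation matrices.

        X: (n_samples, d1) activations from one model/layer
        Y: (n_samples, d2) activations from another model/layer
        Returns a similarity in [0, 1].
        """
        # Center each feature dimension.
        X = X - X.mean(axis=0, keepdims=True)
        Y = Y - Y.mean(axis=0, keepdims=True)

        # Cross-covariance norm over the product of self-covariance norms.
        dot_xy = np.linalg.norm(X.T @ Y, ord="fro") ** 2
        norm_x = np.linalg.norm(X.T @ X, ord="fro")
        norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
        return dot_xy / (norm_x * norm_y)

    # Example: compare hypothetical teacher vs. student hidden states.
    rng = np.random.default_rng(0)
    teacher_acts = rng.standard_normal((128, 768))
    # A rotated-and-scaled copy should score ~1.0 under CKA.
    Q, _ = np.linalg.qr(rng.standard_normal((768, 768)))
    student_acts = 3.0 * teacher_acts @ Q
    print(round(float(linear_cka(teacher_acts, student_acts)), 4))  # ~1.0

In this sketch, the rotated and uniformly scaled copy of the teacher activations scores close to 1.0, which is exactly the invariance property the paragraph above cites as the reason for choosing CKA.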

Key facts

  • Self-Distillation Fine-Tuning (SDFT) framework recovers LLM performance (a hypothetical loss sketch follows this list)
  • Addresses performance degradation from catastrophic forgetting during SFT
  • Also addresses degradation from quantization and pruning
  • Provides theoretical explanation for recovery mechanism
  • LLM generative capability relies on high-dimensional manifold of hidden layers
  • Uses Centered Kernel Alignment (CKA) to measure activation alignment
  • CKA is invariant to orthogonal transformations and isotropic scaling
  • Published on arXiv with identifier 2604.15794v1
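
The brief describes SDFT only at a high level, so the following is a minimal, hypothetical sketch of what a self-distillation recovery objective could look like in PyTorch: a KL term pulling the student (for example, a quantized, pruned, or fine-tuned model) back toward a frozen teacher's output distribution, blended with ordinary cross-entropy on task labels. The function name, temperature, and weighting are assumptions, not the paper's exact formulation.

    import torch
    import torch.nn.functional as F

    def self_distillation_loss(student_logits, teacher_logits, labels,
                               temperature=2.0, alpha=0.5):
        """Hypothetical recovery objective: teacher KL + label cross-entropy."""
        # Soft-target KL divergence at temperature T, scaled by T^2 as is
        # conventional so gradient magnitudes stay comparable across temperatures.
        kd = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=-1),
            F.softmax(teacher_logits / temperature, dim=-1),
            reduction="batchmean",
        ) * temperature ** 2

        # Standard cross-entropy on the hard task labels.
        ce = F.cross_entropy(student_logits, labels)

        return alpha * kd + (1.0 - alpha) * ce

    # Toy usage with random logits over a small vocabulary.
    student = torch.randn(4, 32000, requires_grad=True)
    teacher = torch.randn(4, 32000)
    labels = torch.randint(0, 32000, (4,))
    loss = self_distillation_loss(student, teacher, labels)
    loss.backward()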

Entities

Institutions

  • arXiv

Sources