Anchored Learning Prevents Catastrophic Forgetting in LLM Fine-Tuning
A new framework called Anchored Learning addresses catastrophic forgetting in large language model (LLM) fine-tuning by explicitly controlling how far the model's output distribution drifts during training. The method maintains a dynamically evolving moving anchor that interpolates between the current model and a frozen reference, turning global fine-tuning into a sequence of local trust-region updates. Theoretically, each iteration admits a linear KL-divergence upper bound, ensuring stable distributional transitions. The paper is published on arXiv (2605.04468) and targets offline fine-tuning scenarios.
Key facts
- Anchored Learning is a framework for stabilizing LLM supervised fine-tuning.
- It addresses catastrophic forgetting caused by excessive distributional drift.
- The method uses a dynamically evolving moving anchor.
- The anchor interpolates between the current model and a frozen reference.
- It transforms global fine-tuning into local trust-region updates.
- The update admits a linear KL-divergence upper bound per iteration.
- The paper is available on arXiv with ID 2605.04468.
- The approach is designed for offline fine-tuning.
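The mechanics described above can be illustrated with a toy sketch. The paper's exact update rule is not given here, so the function names, coefficients (`alpha`, `beta`), and the penalty form below are assumptions: the anchor is taken as a linear interpolation between the current parameters and the frozen reference, and each gradient step includes a pull toward that anchor, which limits per-step distributional drift in the spirit of a local trust region.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def kl(p, q):
    """KL divergence between two categorical distributions."""
    return float(np.sum(p * (np.log(p) - np.log(q))))

def anchored_step(theta, theta_ref, grad, lr=0.1, alpha=0.9, beta=1.0):
    """One hypothetical anchored update (coefficients are illustrative).

    The moving anchor interpolates between the current parameters and the
    frozen reference; the update then adds a pull toward that anchor,
    acting as a local trust region on the parameter (and hence output) drift.
    """
    anchor = alpha * theta + (1.0 - alpha) * theta_ref  # moving anchor
    theta = theta - lr * grad - lr * beta * (theta - anchor)
    return theta

theta_ref = np.zeros(3)             # frozen reference model (toy logits)
grad = np.array([1.0, -1.0, 0.0])   # constant task gradient (illustrative)

theta_plain, theta_anchored = theta_ref.copy(), theta_ref.copy()
for _ in range(20):
    theta_plain = theta_plain - 0.1 * grad  # vanilla SGD baseline
    theta_anchored = anchored_step(theta_anchored, theta_ref, grad)

drift_plain = kl(softmax(theta_plain), softmax(theta_ref))
drift_anchored = kl(softmax(theta_anchored), softmax(theta_ref))
print(drift_anchored < drift_plain)  # anchoring reduces drift from the reference
```

In this toy run the anchored trajectory stays measurably closer (in KL) to the frozen reference than plain SGD does, which is the qualitative behavior the framework's linear per-iteration KL bound is meant to guarantee.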
Entities
Institutions
- arXiv