ARTFEED — Contemporary Art Intelligence

Tree Generation Method Reduces Forgetting in Large Language Models

ai-technology · 2026-04-25

Researchers have introduced Tree Generation (TG), a model-agnostic self-decompression method that addresses catastrophic forgetting in Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs). LLMs tend to forget previously acquired knowledge when post-pretrained or fine-tuned on domain-specific data, and MLLMs such as LLaVA show clear performance declines on language benchmarks relative to the single-modality LLMs they are built on. TG counters this by decompressing the knowledge stored inside an LLM into a training corpus of synthetic supervised fine-tuning (SFT) data for instruction tuning. Incorporating this dumped corpus into the SFT data of MLLMs significantly reduces the forgetting problem. The paper, titled "Preserving Knowledge in Large Language Models with Model-Agnostic Self-Decompression," focuses on the TG-SFT variant and was submitted to arXiv on June 17, 2024.
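
As described, TG-SFT has the LLM itself unroll (decompress) its stored knowledge into instruction-response pairs, which then serve as rehearsal data. The Python sketch below illustrates that general pipeline with the Hugging Face transformers API; the model name, prompt templates, sampling settings, and the simple topic-to-question-to-answer expansion are illustrative assumptions, not the authors' released implementation.

    # Illustrative sketch (not the paper's code): self-decompression of a base LLM
    # into a synthetic SFT corpus. Model name, prompts, and sampling settings are
    # assumptions made for this example.
    import json
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL_NAME = "meta-llama/Llama-2-7b-chat-hf"  # hypothetical base LLM choice
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_NAME, torch_dtype=torch.float16, device_map="auto"
    )

    def ask(prompt: str, max_new_tokens: int = 256) -> str:
        """Sample one completion from the base LLM."""
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        out = model.generate(
            **inputs, max_new_tokens=max_new_tokens,
            do_sample=True, temperature=0.8, top_p=0.95,
        )
        # Drop the prompt tokens and keep only the newly generated text.
        return tokenizer.decode(
            out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
        ).strip()

    def dump_corpus(n_topics: int = 20, questions_per_topic: int = 5) -> list[dict]:
        """Tree-style expansion: topics -> questions -> answers, all produced by the LLM itself."""
        corpus = []
        for _ in range(n_topics):
            topic = ask("Name one broad knowledge topic (e.g. history, biology, law). Topic:", 16)
            for _ in range(questions_per_topic):
                question = ask(f"Write one self-contained question about {topic}.\nQuestion:", 64)
                answer = ask(f"Answer concisely and accurately.\nQuestion: {question}\nAnswer:", 256)
                corpus.append({"instruction": question, "output": answer})
        return corpus

    if __name__ == "__main__":
        synthetic = dump_corpus()
        with open("tg_dumped_corpus.json", "w") as f:
            json.dump(synthetic, f, indent=2)

The property the article highlights is that this procedure is model-agnostic: any base LLM can be "dumped" this way before multimodal fine-tuning begins, and the resulting corpus is what gets mixed into the SFT data.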

Key facts

  • Tree Generation (TG) is a model-agnostic self-decompression method.
  • TG addresses catastrophic forgetting in LLMs and MLLMs.
  • LLMs forget old knowledge when post-pretrained or supervised fine-tuned on domain-specific data.
  • MLLMs like LLaVA show a significant decline in language-benchmark performance relative to their backbone LLMs.
  • TG decompresses knowledge within LLMs into the training corpus.
  • TG-SFT generates synthetic SFT data for instruction tuning.
  • Incorporating the dumped corpus during SFT for MLLMs reduces forgetting (a mixing sketch follows this list).
  • Paper submitted to arXiv on June 17, 2024.
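
The final step, mixing the dumped corpus into the MLLM's SFT data, can be as simple as converting the text-only pairs into the same record format as the visual instruction data and shuffling the two together. The sketch below assumes LLaVA-style conversation records and hypothetical file names.

    # Illustrative sketch: mixing the dumped TG corpus into the MLLM's SFT data.
    # File names and the LLaVA-style record layout are assumptions for illustration.
    import json
    import random

    with open("tg_dumped_corpus.json") as f:
        dumped = json.load(f)  # [{"instruction": ..., "output": ...}, ...]
    with open("llava_instruct_data.json") as f:
        multimodal = json.load(f)  # hypothetical visual instruction-tuning data

    # Convert dumped pairs to the same conversation format, with no image attached,
    # so they can be trained on alongside the visual examples.
    text_only = [
        {
            "conversations": [
                {"from": "human", "value": ex["instruction"]},
                {"from": "gpt", "value": ex["output"]},
            ]
        }
        for ex in dumped
    ]

    mixed = multimodal + text_only
    random.shuffle(mixed)  # interleave text-only rehearsal with visual instruction data
    with open("mixed_sft_data.json", "w") as f:
        json.dump(mixed, f)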

Entities

Institutions

  • arXiv

Sources