ReAD: Reinforcement-Guided Capability Distillation for LLMs
A new framework called ReAD (Reinforcement-guided cApability Distillation) addresses the challenge of compressing large language models (LLMs) into smaller ones while preserving task-specific abilities. Current capability distillation methods treat capabilities as independent training targets, ignoring how improving one capability affects others. ReAD explicitly models capability interdependence under a fixed token budget and uses reinforcement learning to guide the distillation process. The approach builds on two empirical observations: distillation induces systematic cross-capability transfer whose strength depends on the token budget, and additional budget often yields diminishing task-relevant gains while degrading other abilities. By inferring which capabilities are essential to the target task and steering distillation toward them, ReAD aims to produce smaller models that are both more efficient and more effective on downstream tasks.
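The summary above does not spell out ReAD's reward design, but the stated ideas (reward task-essential capability gains, penalize negative cross-capability transfer, respect a fixed token budget) can be sketched concretely. The snippet below is a minimal illustration under those assumptions; the capability names, weights, and the `distillation_reward` function are hypothetical and not the paper's actual implementation.

```python
"""Minimal sketch of a budget-aware, capability-interdependence-aware reward.

All names and numbers are illustrative assumptions, not ReAD's actual API.
"""

CAPABILITIES = ["math_reasoning", "code_generation", "summarization", "dialogue"]
TASK_ESSENTIAL = {"math_reasoning", "code_generation"}  # assumed to be inferred upstream


def distillation_reward(before, after, tokens_used, token_budget,
                        alpha=1.0, beta=0.5, gamma=0.2):
    """Score one distillation step.

    - Rewards gains on task-essential capabilities.
    - Penalizes regressions on the remaining capabilities (negative transfer).
    - Penalizes spending beyond the fixed token budget.
    The weights alpha/beta/gamma are placeholders, not values from the paper.
    """
    gain = sum(after[c] - before[c] for c in TASK_ESSENTIAL)
    regression = sum(max(0.0, before[c] - after[c])
                     for c in CAPABILITIES if c not in TASK_ESSENTIAL)
    overuse = max(0.0, tokens_used - token_budget) / token_budget
    return alpha * gain - beta * regression - gamma * overuse


# Illustrative usage with made-up evaluation scores on a 0-1 scale.
before = {"math_reasoning": 0.42, "code_generation": 0.38,
          "summarization": 0.61, "dialogue": 0.58}
after = {"math_reasoning": 0.51, "code_generation": 0.44,
         "summarization": 0.57, "dialogue": 0.58}
print(distillation_reward(before, after, tokens_used=9e8, token_budget=1e9))
```

In a reinforcement-learning loop, a reward of this shape would score candidate distillation runs (or token allocations), so the policy learns to spend budget where task-essential capabilities improve without eroding the others.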
Key facts
- ReAD is a Reinforcement-guided cApability Distillation framework for LLMs.
- It addresses capability interdependence in knowledge distillation.
- Current methods treat capabilities as independent training targets.
- Distillation induces systematic, budget-dependent cross-capability transfer.
- Additional budget often brings limited task-relevant gains.
- Extra budget can sometimes degrade other useful abilities.
- ReAD explicitly accounts for capability interdependence.
- The framework uses reinforcement learning to guide distillation.