CLP-DD: Closed-Form Linear-Probe Dataset Distillation for Pre-Trained Models
Researchers propose Closed-Form Linear-Probe Dataset Distillation (CLP-DD), a method for compressing large training sets into small synthetic sets for frozen pre-trained vision models. Unlike existing approaches that rely on iterative trajectory matching or neural-tangent-kernel approximations, CLP-DD exploits the closed-form solution of linear probing on pre-trained features, eliminating both inner-loop optimization and infinite-width approximations. In its bilevel formulation, the inner problem (fitting the linear probe induced by the synthetic set) is solved in closed form directly on the pre-trained features, so only the synthetic examples themselves require gradient-based optimization. This targets the modern transfer-learning setting in which a frozen encoder is followed by lightweight linear probing, offering a more efficient and theoretically grounded alternative. The paper is available on arXiv under identifier 2605.07194.
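The summary does not include code, but the inner closed-form step can be illustrated as a minimal sketch, assuming the probe is ridge-regularized least squares on one-hot targets (the function name, the `ridge` parameter, and the one-hot encoding are assumptions, not details from the paper):

```python
import torch

def closed_form_probe(feats: torch.Tensor, labels: torch.Tensor,
                      num_classes: int, ridge: float = 1e-3) -> torch.Tensor:
    """Fit a linear probe W on frozen features Z in closed form.

    Solves min_W ||Z W - Y||^2 + ridge * ||W||^2, whose solution is
    W = (Z^T Z + ridge * I)^{-1} Z^T Y; no iterative inner loop is needed.
    """
    Y = torch.nn.functional.one_hot(labels, num_classes).float()  # (n, c) targets
    d = feats.shape[1]
    gram = feats.T @ feats + ridge * torch.eye(d)                 # (d, d) regularized Gram
    return torch.linalg.solve(gram, feats.T @ Y)                  # (d, c) probe weights
```

Because `torch.linalg.solve` is differentiable, gradients with respect to the synthetic examples can flow through this solve, which is what makes an inner optimization loop unnecessary.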
Key facts
- CLP-DD is a dataset distillation method for pre-trained vision models.
- It uses a closed-form linear-probe solution, avoiding iterative updates.
- The method targets frozen encoders with linear probing.
- It eliminates neural-tangent-kernel approximations and inner-loop trajectories.
- The approach is based on a bilevel optimization formulation whose inner problem is solved in closed form (see the sketch after this list).
- The paper is published on arXiv with ID 2605.07194.
- It compresses large training sets into small synthetic sets.
- The method is designed for visual transfer learning.
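As noted in the bilevel item above, the outer problem optimizes the synthetic set so that the probe it induces performs well on real data. A hypothetical outer step, reusing the `closed_form_probe` sketch and assuming the synthetic set is parameterized in feature space with a cross-entropy outer loss (all of these are illustrative choices, not details from the paper):

```python
import torch

def distill_step(syn_feats, syn_labels, real_feats, real_labels,
                 num_classes, lr=0.1, ridge=1e-3):
    """One outer-loop update of the synthetic features.

    Inner problem: the probe induced by the synthetic set, computed in
    closed form via closed_form_probe above. Outer problem: that probe's
    loss on real data; gradients flow through the closed-form solve.
    """
    syn_feats = syn_feats.detach().requires_grad_(True)
    W = closed_form_probe(syn_feats, syn_labels, num_classes, ridge)
    loss = torch.nn.functional.cross_entropy(real_feats @ W, real_labels)
    loss.backward()                       # backprop through the linear solve
    with torch.no_grad():
        syn_feats -= lr * syn_feats.grad  # gradient step on the synthetic set
    return syn_feats.detach(), loss.item()
```

In this sketch the only trainable variables are the synthetic features, so each outer step costs one linear solve plus one backward pass, with no unrolled training trajectory.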