CLP-DD: Closed-Form Linear-Probe Dataset Distillation for Pre-Trained Models
Researchers propose Closed-Form Linear-Probe Dataset Distillation (CLP-DD), a method for compressing large training sets into small synthetic sets for frozen pre-trained vision models. Unlike existing approaches that rely on iterative trajectory matching or neural-tangent-kernel approximations, CLP-DD exploits the closed-form solution of linear probing on pre-trained features, eliminating both inner-loop optimization and infinite-width approximations. In its bilevel formulation, the inner problem (fitting the linear probe induced by the synthetic set) is solved in closed form directly on the pre-trained features, so only the synthetic examples themselves require gradient-based optimization. This targets the modern transfer-learning setting in which a frozen encoder is followed by lightweight linear probing, offering a more efficient and theoretically grounded alternative. The paper is available on arXiv under identifier 2605.07194.
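The summary does not include code, but the inner closed-form step can be illustrated as a minimal sketch, assuming the probe is ridge-regularized least squares on one-hot targets (the function name, the `ridge` parameter, and the one-hot encoding are assumptions, not details from the paper):

```python
import torch

def closed_form_probe(feats: torch.Tensor, labels: torch.Tensor,
                      num_classes: int, ridge: float = 1e-3) -> torch.Tensor:
    """Fit a linear probe W on frozen features Z in closed form.

    Solves min_W ||Z W - Y||^2 + ridge * ||W||^2, whose solution is
    W = (Z^T Z + ridge * I)^{-1} Z^T Y; no iterative inner loop is needed.
    """
    Y = torch.nn.functional.one_hot(labels, num_classes).float()  # (n, c) targets
    d = feats.shape[1]
    gram = feats.T @ feats + ridge * torch.eye(d)                 # (d, d) regularized Gram
    return torch.linalg.solve(gram, feats.T @ Y)                  # (d, c) probe weights
```

Because `torch.linalg.solve` is differentiable, gradients with respect to the synthetic examples can flow through this solve, which is what makes an inner optimization loop unnecessary.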
Key facts
- CLP-DD is a dataset distillation method for pre-trained vision models.
- It uses a closed-form linear-probe solution, avoiding iterative updates.
- The method targets frozen encoders with linear probing.
- It eliminates neural-tangent-kernel approximations and inner-loop trajectories.
- The approach is based on a bilevel optimization formulation whose inner problem is solved in closed form (see the sketch after this list).
- The paper is published on arXiv with ID 2605.07194.
- It compresses large training sets into small synthetic sets.
- The method is designed for visual transfer learning.
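As noted in the bilevel item above, the outer problem optimizes the synthetic set so that the probe it induces performs well on real data. A hypothetical outer step, reusing the `closed_form_probe` sketch and assuming the synthetic set is parameterized in feature space with a cross-entropy outer loss (all of these are illustrative choices, not details from the paper):

```python
import torch

def distill_step(syn_feats, syn_labels, real_feats, real_labels,
                 num_classes, lr=0.1, ridge=1e-3):
    """One outer-loop update of the synthetic features.

    Inner problem: the probe induced by the synthetic set, computed in
    closed form via closed_form_probe above. Outer problem: that probe's
    loss on real data; gradients flow through the closed-form solve.
    """
    syn_feats = syn_feats.detach().requires_grad_(True)
    W = closed_form_probe(syn_feats, syn_labels, num_classes, ridge)
    loss = torch.nn.functional.cross_entropy(real_feats @ W, real_labels)
    loss.backward()                       # backprop through the linear solve
    with torch.no_grad():
        syn_feats -= lr * syn_feats.grad  # gradient step on the synthetic set
    return syn_feats.detach(), loss.item()
```

In this sketch the only trainable variables are the synthetic features, so each outer step costs one linear solve plus one backward pass, with no unrolled training trajectory.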