ARTFEED — Contemporary Art Intelligence

PODS: A New Framework for Dynamic Data Volume Scheduling in Model Training

publication · 2026-05-16

A new paper on arXiv (2605.14773) introduces PODS (Plug-and-play Oscillatory Data-volume Scheduling), a framework that dynamically adjusts the volume of selected data during model training. Existing data selection methods focus on which samples to choose but hold the selection ratio fixed, so the volume of training data stays static throughout training. The authors show that training on selected data induces an implicit regularization effect governed by the instantaneous selection ratio: lower ratios amplify the regularization, while higher ratios preserve data coverage and optimization fidelity. PODS is a lightweight module that oscillates the data volume over time, improving training efficiency without requiring new sample-scoring metrics. The work revisits data selection from an optimization perspective, highlighting a trade-off between regularization and fidelity.
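To make the idea of an oscillating data volume concrete, here is a minimal Python sketch of a selection-ratio schedule. The summary does not describe the waveform PODS actually uses, so the cosine shape, the function name `oscillating_ratio`, and the parameters `base_ratio`, `amplitude`, `cycles`, and the 0.05 floor are all illustrative assumptions, not the authors' method.

```python
import math


def oscillating_ratio(step, total_steps, base_ratio=0.5, amplitude=0.3, cycles=4):
    """Hypothetical schedule: the selection ratio follows a cosine wave
    around base_ratio (an assumed shape, not taken from the paper)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Phase advances through `cycles` full periods over the training run.
    phase = 2.0 * math.pi * cycles * step / total_steps
    ratio = base_ratio + amplitude * math.cos(phase)
    # Clamp so that some data is always kept and the ratio never exceeds 1.
    return min(1.0, max(0.05, ratio))


if __name__ == "__main__":
    # Print the assumed ratio at a few points of a 200-step run.
    for step in (0, 25, 50, 75, 100):
        print(step, round(oscillating_ratio(step, total_steps=200), 3))
```

Under these assumptions, the low points of the wave correspond to phases where selection-induced regularization is strongest, and the high points to phases where more of the dataset is covered.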

Key facts

  • Paper published on arXiv with ID 2605.14773
  • PODS stands for Plug-and-play Oscillatory Data-volume Scheduling
  • Existing methods hold the selected data volume at a fixed target ratio throughout training
  • Training on selected data induces an implicit regularization effect modulated by the instantaneous selection ratio
  • Lower ratios amplify selection-induced regularization
  • Higher ratios preserve data coverage and optimization fidelity
  • PODS serves as a lightweight module that does not introduce new sample-scoring metrics
  • The work revisits data selection from an optimization perspective
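The plug-and-play point in the facts above can be sketched as follows: the per-sample scores come from whatever selection method is already in use, and only the fraction kept changes from epoch to epoch. This is a hedged Python illustration. The helper `select_top_fraction`, the example scores, and the alternating ratios are hypothetical and are not taken from the paper.

```python
def select_top_fraction(scores, ratio):
    """Return the indices of the highest-scoring samples, keeping a
    `ratio` fraction of them. Scores are assumed to come from an
    existing selection method; no new metric is computed here."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    keep = max(1, round(ratio * len(scores)))
    # Rank sample indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])


if __name__ == "__main__":
    scores = [0.1, 0.9, 0.5, 0.7]          # hypothetical precomputed scores
    ratios_per_epoch = [1.0, 0.5, 1.0, 0.25]  # hypothetical oscillating ratios
    for epoch, ratio in enumerate(ratios_per_epoch):
        subset = select_top_fraction(scores, ratio)
        print(f"epoch {epoch}: ratio={ratio} -> train on samples {subset}")
```

Keeping the scoring and the volume schedule separate is what lets a scheduler of this kind sit on top of an existing selection pipeline without modifying it.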

Entities

Institutions

  • arXiv
