ARTFEED — Contemporary Art Intelligence

Autoguidance and Online Data Curation for Diffusion Model Training

ai-technology · 2026-05-18

A recent arXiv paper (2509.15267) investigates whether autoguidance and online data selection can make training generative diffusion models faster and more sample-efficient. The researchers integrate joint example selection (JEST) and autoguidance into a single codebase to support rapid ablation studies and benchmarking. They evaluate various data curation strategies on a controlled 2-D synthetic data generation task and on (3x64x64)-D image generation, comparing methods at equal wall-clock time and at equal numbers of samples while accounting for the overhead of selection. Across these experiments, autoguidance consistently improves sample quality and diversity. Early AJEST, which applies selection at the start of training, can match or slightly exceed autoguidance alone in data efficiency, but at the cost of added time overhead and complexity.
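
For readers unfamiliar with autoguidance: as the technique is commonly described, sampling is steered by extrapolating away from the prediction of a weaker version of the model and past the prediction of the main model. The sketch below shows only that combination step, with plain Python lists standing in for model outputs. The function name `autoguide`, the weight `w`, and the toy numbers are illustrative assumptions and are not taken from the paper's codebase.

```python
from typing import List, Sequence


def autoguide(main_pred: Sequence[float],
              weak_pred: Sequence[float],
              w: float) -> List[float]:
    """Combine two predictions by extrapolating from the weak one.

    w = 1 returns the main prediction unchanged; w > 1 moves further
    away from the weak prediction (hypothetical helper, for illustration).
    """
    if len(main_pred) != len(weak_pred):
        raise ValueError("predictions must have the same length")
    # guided = weak + w * (main - weak), applied element by element
    return [wk + w * (m - wk) for m, wk in zip(main_pred, weak_pred)]


# Toy 2-D example with made-up numbers (not results from the paper).
main = [1.0, 2.0]
weak = [0.5, 1.0]
guided = autoguide(main, weak, w=2.0)
print(guided)  # [1.5, 3.0]
```

With `w` set to 1 the weak model has no effect, which makes it easy to compare guided and unguided sampling in the same code path.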

Key facts

  • Paper title: Autoguided Online Data Curation for Diffusion Model Training
  • arXiv ID: 2509.15267
  • Announce type: replace-cross
  • Integrates joint example selection (JEST) and autoguidance
  • Evaluated on 2-D synthetic data and (3x64x64)-D image generation
  • Comparisons at equal wall-clock time and equal number of samples
  • Autoguidance consistently improves sample quality and diversity
  • Early AJEST matches or exceeds autoguidance alone in data efficiency
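
The online selection idea in the list above can be illustrated with a deliberately simplified sketch. It scores each candidate independently by the gap between a learner's loss and a reference model's loss, keeps the top `k`, and switches to uniform random sampling after an initial phase. This is an assumption-laden simplification: JEST selects examples jointly rather than one by one, and the exact scoring and schedule used for early AJEST in the paper are not given in this summary. All names (`select_by_learnability`, `choose_indices`, `early_steps`) and all numbers are hypothetical.

```python
import random
from typing import List, Sequence


def select_by_learnability(learner_losses: Sequence[float],
                           reference_losses: Sequence[float],
                           k: int) -> List[int]:
    """Return the indices of the k candidates with the largest loss gap.

    Simplified, per-example scoring (an assumption for illustration).
    """
    if len(learner_losses) != len(reference_losses):
        raise ValueError("loss lists must have the same length")
    if not 0 < k <= len(learner_losses):
        raise ValueError("k must be between 1 and the number of candidates")
    scores = [l - r for l, r in zip(learner_losses, reference_losses)]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def choose_indices(step: int,
                   early_steps: int,
                   learner_losses: Sequence[float],
                   reference_losses: Sequence[float],
                   k: int,
                   rng: random.Random) -> List[int]:
    """Use scored selection early in training, uniform sampling afterwards."""
    if step < early_steps:
        return select_by_learnability(learner_losses, reference_losses, k)
    return sorted(rng.sample(range(len(learner_losses)), k))


# Made-up per-example losses for a candidate pool of four examples.
learner = [4.0, 1.0, 3.0, 2.0]
reference = [1.0, 0.5, 2.5, 0.5]
rng = random.Random(0)

early = choose_indices(0, 100, learner, reference, 2, rng)
late = choose_indices(500, 100, learner, reference, 2, rng)
print(early)  # [0, 3]
print(late)   # two indices drawn uniformly at random
```

Scoring requires extra forward passes over the candidate pool, which is one way to see where the selection overhead in an equal wall-clock comparison comes from.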

Entities

Institutions

  • arXiv

Sources