Score-based diffusion models provably learn low-dimensional data distributions
A recent theoretical study shows that score-based diffusion models can learn data distributions with inherent low-dimensional structure, such as natural images, from a limited number of samples. The paper establishes finite-sample error bounds in Wasserstein-p distance for every p ≥ 1, assuming only finite moments of the target distribution μ; no compact-support, manifold, or smooth-density conditions are required. Given n i.i.d. samples from μ with a finite q-th moment, the convergence rate depends polynomially on the intrinsic dimension of the data rather than on the ambient dimension, which helps explain why diffusion models succeed on high-dimensional but low-complexity data such as images. The guarantees hold under mild regularity conditions on the forward diffusion process and the data distribution, and are presented as the first statistical guarantees that account for the intrinsic low-dimensional structure common in real-world datasets.
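To make the shape of such a result concrete, bounds in this literature often take the following schematic form (this is an illustrative sketch, not the paper's exact theorem; the exponent and constants here are assumptions):

```latex
% Schematic finite-sample guarantee (illustrative only).
% \hat{\mu}_n denotes the distribution produced by a diffusion model
% trained on n i.i.d. samples from \mu, and d is the intrinsic dimension.
\mathbb{E}\, W_p\bigl(\hat{\mu}_n, \mu\bigr) \;\lesssim\; n^{-c/d},
\qquad c = c(p, q) > 0
```

The point of the summarized result is that the exponent is governed by the intrinsic dimension d, while the ambient dimension enters only through milder polynomial factors hidden in the ≲.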
Key facts
- arXiv:2603.03700v2
- Score-based diffusion models
- Finite-sample error bounds in Wasserstein-p distance
- All p ≥ 1
- Finite-moment assumption on μ
- No compact-support, manifold, or smooth-density conditions
- Convergence rate depends on intrinsic dimension
- Mild regularity conditions on forward diffusion and data distribution
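The central quantity in the bounds above is the Wasserstein-p distance between the learned and target distributions. As a hedged illustration of what that distance measures (not code from the paper), the sketch below computes the empirical Wasserstein-p distance between two equal-size 1-D samples, where the optimal coupling is simply the sorted-order matching; the sample data and the helper name `wasserstein_p` are made up for this example.

```python
import numpy as np

def wasserstein_p(x, y, p=1):
    """Empirical Wasserstein-p distance between two equal-size 1-D samples.

    In one dimension the optimal transport plan matches order statistics,
    so W_p reduces to the p-th root of the mean p-th power of sorted gaps.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    assert x.shape == y.shape, "equal sample sizes assumed for simplicity"
    return float(np.mean(np.abs(x - y) ** p) ** (1.0 / p))

rng = np.random.default_rng(0)
mu = rng.normal(size=1000)    # stand-in for samples from the target μ
model = mu + 0.5              # stand-in "learned" samples, shifted by 0.5
print(wasserstein_p(mu, model, p=2))  # a translation by 0.5 gives W_p = 0.5
```

For higher-dimensional samples one would instead use an optimal-transport solver; the 1-D closed form is used here only because it keeps the example self-contained.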
Entities
Institutions
- arXiv