Survey on Diffusion and Flow Matching Models for Tabular Data

publication · 2026-05-25

A new survey on arXiv reviews how diffusion and flow matching models are applied to tabular data generation. Deep generative models have excelled in image, text, audio, and video domains, but tabular data remains challenging because of mixed numerical and categorical attributes, missing values, imbalanced categories, and complex dependencies. Earlier methods based on GANs or VAEs suffered from unstable training, mode collapse, and weak multimodal modeling. Diffusion models, with their noising-and-denoising formulation, offer a flexible and stable alternative for tasks such as tabular synthesis, missing-value imputation, trustworthy data generation, and anomaly detection. The survey covers recent adaptations and highlights the potential of flow matching as an efficient alternative. The paper is available at arXiv:2502.17119.
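To make the noising half of the noising-and-denoising idea concrete, here is a minimal sketch of the standard Gaussian forward process applied to the numerical columns of one table row. It is an illustration only, not the survey's method: the linear beta schedule, the function names `alpha_bar` and `noise_numeric_row`, and the example row are all assumptions, and categorical columns (which typically need a separate treatment) are left out.

```python
import math
import random


def alpha_bar(t, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_s) for s = 1..t.

    Assumes a linear beta schedule; this choice is illustrative.
    """
    prod = 1.0
    for s in range(1, t + 1):
        beta = beta_start + (beta_end - beta_start) * (s - 1) / (num_steps - 1)
        prod *= 1.0 - beta
    return prod


def noise_numeric_row(row, t, rng, num_steps=1000):
    """Forward (noising) step for numerical features:

    x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps,  eps ~ N(0, 1)

    Returns the noisy row and the noise that was added. A denoising
    network would be trained to predict eps from (x_t, t); that part
    is not shown here.
    """
    a_bar = alpha_bar(t, num_steps)
    eps = [rng.gauss(0.0, 1.0) for _ in row]
    noisy = [
        math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * e
        for x, e in zip(row, eps)
    ]
    return noisy, eps


rng = random.Random(0)
row = [0.5, -1.2, 0.3]  # hypothetical standardized numerical features

noisy_early, _ = noise_numeric_row(row, t=1, rng=rng)     # close to the original row
noisy_late, _ = noise_numeric_row(row, t=1000, rng=rng)   # close to pure Gaussian noise
```

At small `t` the row is barely perturbed, while at the final step almost all of the original signal is gone; generation runs this process in reverse with a learned denoiser.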

Key facts

arXiv paper ID: 2502.17119
Survey focuses on diffusion and flow matching models for tabular data
Tabular data challenges: mixed numerical and categorical attributes, missing values, imbalanced categories, complex dependencies
GANs and VAEs have limitations: unstable training, mode collapse, weak multimodal modeling
Diffusion models use a noising-and-denoising formulation
Applications: tabular synthesis, missing-value imputation, trustworthy data generation, anomaly detection
Flow matching is mentioned as an efficient alternative
Published on arXiv (replace-cross announcement type)
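
The flow matching item in the list above can be illustrated with a small sketch of how one training example is built under a straight-line interpolation path, which is one common formulation and not necessarily the one the survey emphasizes. The names `flow_matching_pair` and `squared_error`, the standard normal base distribution, and the example row are assumptions made for illustration.

```python
import random


def flow_matching_pair(x_data, rng):
    """Build one training example for a linear-path flow matching objective.

    x_t interpolates between a noise sample (t = 0) and the data row (t = 1);
    the regression target is the constant velocity along that straight line.
    """
    x_noise = [rng.gauss(0.0, 1.0) for _ in x_data]  # sample from the base distribution
    t = rng.random()                                 # time drawn uniformly from [0, 1)
    x_t = [(1.0 - t) * n + t * d for n, d in zip(x_noise, x_data)]
    target_velocity = [d - n for n, d in zip(x_noise, x_data)]
    return t, x_t, target_velocity


def squared_error(pred, target):
    """Mean squared error between a predicted and a target velocity."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)


rng = random.Random(0)
row = [0.5, -1.2, 0.3]  # hypothetical standardized numerical features

t, x_t, v = flow_matching_pair(row, rng)
# A velocity model v_theta(x_t, t) would be trained to minimize
# squared_error(v_theta(x_t, t), v); the model itself is not shown.
```

Sampling then integrates the learned velocity field from noise to data.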
