Survey on Diffusion and Flow Matching Models for Tabular Data
A new survey published on arXiv reviews how diffusion and flow matching models are applied to tabular data. Deep generative models have excelled in the image, text, audio, and video domains, but tabular data remains challenging because of mixed numerical and categorical attributes, missing values, imbalanced categories, and complex dependencies. Earlier methods based on GANs or VAEs suffered from unstable training, mode collapse, and weak multimodal modeling. Diffusion models, which learn to reverse a gradual noising process, offer a flexible and stable alternative for tasks such as tabular synthesis, missing-value imputation, trustworthy data generation, and anomaly detection. The survey covers recent adaptations of these models to tabular data and highlights flow matching as an efficient alternative. The paper is available at arXiv:2502.17119.
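To make the noising half of the noising-and-denoising idea concrete, here is a minimal sketch of a forward noising step for the numerical columns of one row. It assumes the standard DDPM-style Gaussian forward process with a linear variance schedule, which this summary does not spell out and the survey may not use in this exact form. The names `linear_beta_schedule`, `cumulative_alphas`, and `noise_row`, the schedule endpoints, and the example row are all illustrative. Categorical columns usually need a separate treatment and are left out.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, linearly spaced (an illustrative choice)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def noise_row(x0, alpha_bar, rng):
    """Sample x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    scale_signal = math.sqrt(alpha_bar)
    scale_noise = math.sqrt(1.0 - alpha_bar)
    x_t = [scale_signal * x + scale_noise * e for x, e in zip(x0, eps)]
    return x_t, eps

rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
row = [0.5, -1.2, 2.0]  # hypothetical standardized numerical features
x_t, eps = noise_row(row, alpha_bars[499], rng)  # noised row at step 500
```

In this formulation a denoising network would be trained to predict `eps` from `x_t` and the step index. Generation then runs the process in reverse, starting from pure noise.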
Key facts
- arXiv paper ID: 2502.17119
- Survey focuses on diffusion and flow matching models for tabular data
- Tabular data challenges: mixed numerical and categorical attributes, missing values, imbalanced categories, complex dependencies
- GANs and VAEs have limitations: unstable training, mode collapse, weak multimodal modeling
- Diffusion models use a noising-and-denoising formulation
- Applications: tabular synthesis, missing-value imputation, trustworthy data generation, anomaly detection
- Flow matching is mentioned as an efficient alternative
- Published on arXiv (announcement type: replace-cross, i.e. an updated version of a cross-listed paper)
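For readers unfamiliar with flow matching, the sketch below shows one common formulation of its training objective. It assumes a straight-line path between a Gaussian noise sample and a data row, with the model regressing the constant velocity along that path. This summary gives no details of the variants the survey covers, so this is an illustration of the general idea only. The function names, the example row, and `zero_model` (a stand-in for a neural network) are hypothetical.

```python
import random

def interpolate(noise, data, t):
    """Point on the straight path from noise (t=0) to data (t=1)."""
    return [(1.0 - t) * n + t * d for n, d in zip(noise, data)]

def target_velocity(noise, data):
    """Constant velocity along the straight path; the regression target."""
    return [d - n for n, d in zip(noise, data)]

def flow_matching_loss(predict_velocity, data, rng):
    """Mean squared error for one row, with random t and random noise."""
    t = rng.random()
    noise = [rng.gauss(0.0, 1.0) for _ in data]
    x_t = interpolate(noise, data, t)
    target = target_velocity(noise, data)
    pred = predict_velocity(x_t, t)
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(data)

rng = random.Random(0)
row = [0.5, -1.2, 2.0]  # hypothetical standardized numerical features
zero_model = lambda x_t, t: [0.0 for _ in x_t]  # placeholder for a neural network
loss = flow_matching_loss(zero_model, row, rng)
```

Sampling from a trained model of this kind integrates the learned velocity field from `t = 0` to `t = 1`. This can often be done in fewer steps than a diffusion reverse process, which is one reason flow matching is described as efficient.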
Entities
Institutions
- arXiv