Benchmarking Synthetic Data Methods for Education
A new preprint posted to arXiv presents the first systematic benchmark comparing traditional resampling techniques and deep generative models for synthetic data in education. Using a 10,000-record student performance dataset, the researchers evaluated SMOTE, Bootstrap, and Random Oversampling against an Autoencoder, a Variational Autoencoder, and Copula-GAN. Metrics covered distributional fidelity (Kolmogorov-Smirnov distance, Jensen-Shannon divergence), machine learning utility (Train-on-Synthetic-Test-on-Real scores), and privacy preservation (Distance to Closest Record). Results show that resampling methods achieve near-perfect utility (TSTR: 0.997) but fail on privacy (DCR ~ 0.00, meaning synthetic records essentially duplicate real ones), while deep generative models offer stronger privacy at the cost of utility. The study provides empirical guidance for practitioners selecting synthetic data methods in educational technology.
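To make the utility metric concrete, here is a minimal sketch of the Train-on-Synthetic-Test-on-Real (TSTR) protocol as generally practiced: fit a model on synthetic records only, then score it on held-out real records. The toy data, logistic-regression model, and pass/fail label below are illustrative stand-ins, not the study's actual setup.

```python
# Hedged sketch of TSTR evaluation; data and model are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Stand-in "real" student data: two features and a binary pass/fail label.
X_real = rng.normal(size=(500, 2))
y_real = (X_real[:, 0] + X_real[:, 1] > 0).astype(int)

# Stand-in "synthetic" data drawn from a similar distribution.
X_syn = rng.normal(size=(500, 2))
y_syn = (X_syn[:, 0] + X_syn[:, 1] > 0).astype(int)

# Train on synthetic, test on real: a high TSTR score means the
# synthetic data preserved the feature-label relationship.
model = LogisticRegression().fit(X_syn, y_syn)
tstr = accuracy_score(y_real, model.predict(X_real))
print(f"TSTR accuracy: {tstr:.3f}")
```

A TSTR score close to the train-on-real baseline (as with the 0.997 reported for resampling methods) indicates the synthetic data is nearly as useful for modeling as the original.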
Key facts
- First systematic benchmark comparing resampling and deep generative models for synthetic data in education
- Dataset: 10,000-record student performance dataset
- Resampling methods: SMOTE, Bootstrap, Random Oversampling
- Deep learning models: Autoencoder, Variational Autoencoder, Copula-GAN
- Evaluation metrics: Kolmogorov-Smirnov distance, Jensen-Shannon divergence, TSTR, Distance to Closest Record
- Resampling methods achieved TSTR of 0.997 but DCR ~ 0.00
- Deep models offer better privacy but lower utility
- Study provides empirical guidance for synthetic data selection
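The fidelity and privacy metrics listed above can be sketched in a few lines. This is an illustrative implementation on toy data, assuming the standard definitions of each metric (per-feature KS statistic, JS divergence over shared histogram bins, and DCR as each synthetic row's Euclidean distance to its nearest real row); it is not the paper's code.

```python
# Hedged sketch of KS distance, JS divergence, and Distance to Closest
# Record (DCR) on illustrative data; standard definitions assumed.
import numpy as np
from scipy.stats import ks_2samp
from scipy.spatial.distance import jensenshannon

rng = np.random.default_rng(1)
real = rng.normal(0.0, 1.0, size=(1000, 3))
syn = rng.normal(0.1, 1.0, size=(1000, 3))

# KS distance per feature: max gap between empirical CDFs (0 = identical).
ks = [ks_2samp(real[:, j], syn[:, j]).statistic for j in range(real.shape[1])]

# JS divergence of the first feature over shared histogram bins.
bins = np.histogram_bin_edges(np.concatenate([real[:, 0], syn[:, 0]]), bins=20)
p, _ = np.histogram(real[:, 0], bins=bins, density=True)
q, _ = np.histogram(syn[:, 0], bins=bins, density=True)
js = jensenshannon(p, q) ** 2  # scipy returns JS distance; square it

# DCR: each synthetic row's Euclidean distance to its nearest real row.
# A mean DCR near zero flags privacy risk, since synthetic rows nearly
# copy real ones -- which is why resampling methods score ~0.00 here.
dists = np.linalg.norm(syn[:, None, :] - real[None, :, :], axis=2)
dcr = dists.min(axis=1).mean()
print(f"mean KS: {np.mean(ks):.3f}, JS: {js:.3f}, mean DCR: {dcr:.3f}")
```

Together these capture the trade-off the study reports: resampling copies real records (perfect fidelity and utility, zero DCR), while generative models place synthetic points away from real ones at some cost to utility.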