ARTFEED — Contemporary Art Intelligence

Synthetic Data Hurts Most Time Series Models, Helps Some

ai-technology · 2026-05-09

A large empirical study of synthetic data augmentation for time series forecasting finds that its impact depends heavily on model architecture. Across 4,218 experiments, channel-mixing models such as TimesNet and iTransformer tended to benefit from augmentation, while channel-independent models such as DLinear and PatchTST consistently degraded. In low-resource settings the gains can be striking: TimesNet trained on just 10% of the Weather dataset with synthetic augmentation beat the full-data baseline in 4 of 16 dataset-sparsity combinations. Overall, however, augmentation hurt performance in 67% of trials across all architectures, and the Seasonal-Trend generator was the only synthetic-signal method that reliably improved results across the benchmarks tested. The study, available on arXiv, evaluates five architectures, four synthetic signal types, and seven datasets.
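The paper's exact Seasonal-Trend generator is not specified in this summary, but the general idea behind such generators (an additive series of linear trend, sinusoidal seasonality, and noise, stacked onto the real training pool) can be sketched as follows. All function names, parameters, and shapes here are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def seasonal_trend_series(n_steps, period=24, trend_slope=0.01,
                          season_amp=1.0, noise_std=0.1, seed=0):
    """Illustrative synthetic series: linear trend + sine seasonality + Gaussian noise.

    This is a generic sketch of a seasonal-trend generator, not the
    method from the arXiv paper.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_steps)
    trend = trend_slope * t                              # linear trend component
    seasonal = season_amp * np.sin(2 * np.pi * t / period)  # periodic component
    noise = rng.normal(0.0, noise_std, size=n_steps)     # observation noise
    return trend + seasonal + noise

# Hypothetical augmentation step: append synthetic series to a small real set.
real = np.random.rand(10, 96)  # placeholder: 10 real series of length 96
synthetic = np.stack([seasonal_trend_series(96, seed=s) for s in range(10)])
augmented = np.concatenate([real, synthetic], axis=0)  # shape (20, 96)
```

In this sketch, augmentation simply enlarges the training pool; whether a forecaster benefits from the extra series is exactly the architecture-dependent question the study investigates.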

Key facts

  • Study conducted 4,218 runs across nine experiment groups.
  • Channel-mixing models (TimesNet, iTransformer) tend to benefit from synthetic augmentation.
  • Channel-independent models (DLinear, PatchTST) are consistently degraded by it.
  • TimesNet with 10% of Weather data plus synthetic augmentation beats the full-data baseline in 4 of 16 dataset-sparsity combinations.
  • Augmentation hurts 67% of trials across all architectures.
  • Only the Seasonal-Trend generator reliably helps across benchmarks.
  • Study evaluates five architectures, four synthetic signal types, and seven datasets.
  • Published on arXiv under identifier 2605.06032.

Entities

Institutions

  • arXiv

Sources