RDB-PFN: First Relational Foundation Model Trained on Synthetic Data
Researchers introduced RDB-PFN, the first foundation model for relational databases (RDBs) trained purely on synthetic data. A Relational Prior Generator creates diverse synthetic RDBs, which supply more than 2 million single-table and relational tasks for pre-training. RDB-PFN adapts to new databases through in-context learning and achieves strong few-shot performance on 19 real-world benchmarks.
Key facts
- RDB-PFN is the first relational foundation model trained purely on synthetic data.
- It uses a Relational Prior Generator to create diverse synthetic RDBs.
- Pre-trained on over 2 million synthetic single-table and relational tasks.
- Achieves strong few-shot performance on 19 real-world benchmarks.
- Learns to adapt to new databases instantly via in-context learning.
- Inspired by Prior-Data Fitted Networks (PFNs) and Structural Causal Models (SCMs).
- Addresses data scarcity and structural heterogeneity of real RDBs.
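To make the idea concrete, here is a minimal Python sketch of how an SCM-style prior could produce a synthetic relational task for in-context learning. It is not the authors' Relational Prior Generator. The function names (`sample_scm_table`, `sample_synthetic_rdb`, `make_in_context_task`), the two-table schema, the tanh mechanism, and the labeling rule are all assumptions chosen for illustration.

```python
import math
import random


def sample_scm_table(rng, n_rows, n_cols):
    """Sample a table whose columns follow a random structural causal model.

    Column j is a nonlinear function of a random subset of earlier columns
    plus Gaussian noise, so column order is a topological order of the DAG.
    """
    # Random lower-triangular weights define the causal graph (assumed prior).
    weights = [
        [rng.gauss(0, 1) if rng.random() < 0.5 else 0.0 for _ in range(j)]
        for j in range(n_cols)
    ]
    rows = []
    for _ in range(n_rows):
        row = []
        for j in range(n_cols):
            parent_signal = sum(w * v for w, v in zip(weights[j], row))
            row.append(math.tanh(parent_signal) + rng.gauss(0, 0.1))
        rows.append(row)
    return rows


def sample_synthetic_rdb(seed, n_parents=20, n_children=60, n_cols=3):
    """Sample a toy two-table database linked by a foreign key."""
    rng = random.Random(seed)
    parent_feats = sample_scm_table(rng, n_parents, n_cols)
    child_feats = sample_scm_table(rng, n_children, n_cols)
    children = [
        {"parent_id": rng.randrange(n_parents), "x": feats}
        for feats in child_feats
    ]
    parents = []
    for pid, feats in enumerate(parent_feats):
        # Aggregate the first child feature over rows that reference this parent.
        linked = [c["x"][0] for c in children if c["parent_id"] == pid]
        agg = sum(linked) / len(linked) if linked else 0.0
        # Hypothetical labeling rule mixing parent and child information.
        label = int(feats[0] + agg > 0)
        parents.append({"id": pid, "x": feats, "agg": agg, "y": label})
    return parents, children


def make_in_context_task(parents, n_context):
    """Split parent rows into labeled context rows and unlabeled query rows."""
    context = [(p["x"] + [p["agg"]], p["y"]) for p in parents[:n_context]]
    queries = [p["x"] + [p["agg"]] for p in parents[n_context:]]
    targets = [p["y"] for p in parents[n_context:]]
    return context, queries, targets


parents, children = sample_synthetic_rdb(seed=0)
context, queries, targets = make_in_context_task(parents, n_context=15)
print(len(context), len(queries), len(children))  # 15 5 60
```

Each seed yields a different database and task. A PFN-style model would be trained across many such tasks to predict the query labels from the labeled context rows in a single forward pass, which is what lets it handle a new database in context at inference time.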