RDDG Framework Uses LLMs and Bayesian Calibration for Relational Data Synthesis
A new framework called RDDG (Relational Data generator with Dynamic Guidance) addresses the challenge of imbalanced data in real-world applications by synthesizing rare-class relational data. Documented in arXiv preprint 2604.16817v1, the approach employs large language models (LLMs) within an in-context learning framework to generate structured tabular data. Unlike existing methods, RDDG incorporates a feedback mechanism that continuously optimizes data quality throughout synthesis. The framework first selects representative samples from the original data via core set selection, then uses in-context learning to discover patterns and correlations among attributes. It employs progressive chain-of-thought steps to enhance downstream imbalanced-classification performance. The work highlights the underexplored application of LLMs to relational data synthesis while addressing the lack of effective feedback mechanisms in current approaches. This research contributes to mitigating data scarcity through controllable synthesis techniques.
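The pipeline described above (core set selection followed by in-context prompting) can be sketched in Python. This is a minimal illustration, not the paper's implementation: the greedy k-center selection criterion, the prompt format, and the `core_set_select`/`build_icl_prompt` helper names are all assumptions for illustration only.

```python
import numpy as np

def core_set_select(X, k):
    """Greedy k-center selection: pick k rows that spread out over
    the feature space (an assumed stand-in for RDDG's core set step;
    the paper's exact selection criterion may differ)."""
    # Start from the row closest to the dataset mean.
    idx = [int(np.argmin(np.linalg.norm(X - X.mean(axis=0), axis=1)))]
    for _ in range(k - 1):
        # Distance from every row to its nearest already-selected row.
        d = np.min(np.linalg.norm(X[:, None] - X[idx][None], axis=2), axis=1)
        idx.append(int(np.argmax(d)))  # farthest uncovered row
    return idx

def build_icl_prompt(rows, header):
    """Format the selected rows as few-shot examples so an LLM can
    infer attribute patterns and emit a new rare-class row."""
    lines = [", ".join(header)]
    lines += [", ".join(map(str, r)) for r in rows]
    return "\n".join(lines) + "\nGenerate one more rare-class row:"

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))          # toy tabular features
sel = core_set_select(X, 5)
prompt = build_icl_prompt(X[sel].round(2).tolist(), ["f1", "f2", "f3"])
```

The resulting `prompt` would be sent to an LLM; the response would then be parsed back into a table row before entering the feedback stage.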
Key facts
- RDDG framework synthesizes relational data for imbalanced classification
- Uses LLMs with in-context learning for structured tabular data generation
- Incorporates dynamic feedback mechanism for continuous optimization
- Employs core set selection to identify representative samples
- Utilizes progressive chain-of-thought steps in synthesis process
- Addresses data scarcity problems for rare classes in real-world applications
- Documented in arXiv preprint 2604.16817v1 as a cross-listed announcement
- Focuses on discovering inherent patterns and correlations among attributes
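The dynamic feedback mechanism listed above can be sketched as an accept/reject loop in which low-quality synthetic rows are fed back as guidance for later generations. This is a hedged sketch only: the `feedback_loop` name, the threshold test, and the use of rejected rows as negative guidance are assumptions, not RDDG's published scoring or guidance signals.

```python
def feedback_loop(generate, score, rounds=5, threshold=0.5):
    """Accept a synthetic row only when its quality score passes a
    threshold; otherwise keep it as guidance so the generator can
    steer away from it next round (illustrative, not the paper's
    actual mechanism)."""
    accepted, guidance = [], []
    for _ in range(rounds):
        row = generate(guidance)   # e.g., an LLM call conditioned on guidance
        if score(row) >= threshold:
            accepted.append(row)
        else:
            guidance.append(row)   # rejected row becomes feedback
    return accepted, guidance

# Deterministic stand-ins for the LLM and the quality scorer:
generate = lambda guidance: len(guidance)   # improves as guidance accrues
score = lambda row: row / 4                 # toy quality score in [0, 1]
accepted, guidance = feedback_loop(generate, score)
```

In a real pipeline, `score` might be a downstream classifier's confidence or a distributional similarity measure, and `guidance` would be folded back into the in-context prompt.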
Entities
Institutions
- arXiv