BuddyBench: Multi-Task Benchmark for Pediatric Social-Communication

other · 2026-05-28

BuddyBench is a privacy-constrained, multi-task benchmark for personalizing pediatric social-communication training. It links drill-level learning trajectories, standardized clinical assessments, BuddyPlan self-reports, and randomized-treatment endpoints in a single framework. This sets it apart from existing neurodevelopmental datasets, which emphasize imaging, genetics, or cross-sectional phenotyping. The benchmark comprises two cohorts: ND-03, an observational cohort with dense drill coverage used for Tasks 1-2 (n=189), and ND-02, a randomized controlled trial cohort used for Tasks 3-4 (n=86, intention-to-treat). Together the cohorts support knowledge tracing, next-drill recommendation, clinical prediction, and causal inference, connecting behavioral personalization to clinical assessment. A synthetic companion dataset, BuddyBench-Sim, is provided for reproducible evaluation. Baselines show signal across tasks while operating under the benchmark's pediatric privacy constraints.
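To make the knowledge-tracing task concrete, the sketch below applies standard Bayesian Knowledge Tracing (BKT) to a sequence of correct/incorrect drill attempts. This is an illustration only: the summary does not say which models the BuddyBench baselines use, and the parameter values, `BKTParams`, `bkt_update`, and `trace` are hypothetical names and numbers chosen for the example, not part of the benchmark.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BKTParams:
    # Illustrative values only; not taken from BuddyBench.
    p_init: float = 0.2    # P(skill already mastered before the first drill)
    p_learn: float = 0.15  # P(unmastered -> mastered after one attempt)
    p_slip: float = 0.1    # P(incorrect answer despite mastery)
    p_guess: float = 0.25  # P(correct answer without mastery)


def bkt_update(p_known: float, correct: bool, params: BKTParams) -> float:
    """Return P(mastered) after observing one drill attempt."""
    if correct:
        num = p_known * (1 - params.p_slip)
        den = num + (1 - p_known) * params.p_guess
    else:
        num = p_known * params.p_slip
        den = num + (1 - p_known) * (1 - params.p_guess)
    posterior = num / den  # Bayes update on the observed outcome
    # Account for the chance of learning during this attempt.
    return posterior + (1 - posterior) * params.p_learn


def trace(attempts: List[bool], params: Optional[BKTParams] = None) -> List[float]:
    """Run BKT over a drill sequence and return the mastery estimate after each attempt."""
    params = params or BKTParams()
    p = params.p_init
    history = []
    for correct in attempts:
        p = bkt_update(p, correct, params)
        history.append(p)
    return history


if __name__ == "__main__":
    # Hypothetical attempt sequence for one child on one skill.
    for step, p in enumerate(trace([False, True, True, True]), start=1):
        print(f"attempt {step}: P(mastered) = {p:.3f}")
```

A per-skill mastery estimate of this kind is one plausible input to a next-drill recommender, for example by preferring drills for skills whose estimate is still low; whether BuddyBench frames the recommendation task this way is not stated in the summary.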

Key facts

BuddyBench is a privacy-constrained multi-task benchmark for pediatric social-communication personalization.
It links drill-level learning trajectories, clinical assessments, BuddyPlan self-report, and randomized-treatment endpoints.
Two cohorts: ND-03 (observational, n=189) and ND-02 (RCT, n=86 ITT).
Supports knowledge tracing, next-drill recommendation, clinical prediction, and causal inference.
BuddyBench-Sim is a synthetic companion dataset for reproducible evaluation.
Baselines show signal across tasks while respecting pediatric privacy constraints.
Published on arXiv with ID 2605.28089.
Announcement type: new.
