BuddyBench: Multi-Task Benchmark for Pediatric Social-Communication
BuddyBench is a privacy-constrained multi-task benchmark for pediatric social-communication personalization. It links drill-level learning trajectories, standardized clinical assessments, BuddyPlan self-report, and randomized-treatment endpoints in a single framework, which sets it apart from existing neurodevelopmental datasets that focus on imaging, genetics, or cross-sectional phenotyping. The benchmark comprises two cohorts: ND-03, an observational cohort with extensive drill coverage used for Tasks 1-2 (n=189), and ND-02, a randomized controlled trial cohort used for Tasks 3-4 (n=86, intention-to-treat). Together, the cohorts support knowledge tracing, next-drill recommendation, clinical prediction, and causal inference, connecting behavioral personalization with clinical assessment. A synthetic companion dataset, BuddyBench-Sim, is provided for reproducible evaluation. Baselines show signal across tasks while respecting pediatric privacy constraints.
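The cohort-to-task layout described above can be written down as a small lookup structure. This is a minimal illustrative sketch in Python, not the benchmark's actual data schema: the class and field names (`Cohort`, `design`, `tasks`) and the helper `cohort_for_task` are hypothetical, and the pairing of task numbers with task names assumes the tasks are numbered in the order the summary lists them, which the source does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cohort:
    """Hypothetical record for one BuddyBench cohort (field names are illustrative)."""
    name: str
    design: str
    n: int
    tasks: tuple


# Cohort names, designs, sizes, and task ranges are taken from the summary.
COHORTS = (
    Cohort(name="ND-03", design="observational", n=189, tasks=(1, 2)),
    Cohort(name="ND-02", design="randomized controlled trial (ITT)", n=86, tasks=(3, 4)),
)

# ASSUMPTION: task numbers follow the order in which the summary lists the
# task families; the source does not give an explicit numbering.
TASK_NAMES = {
    1: "knowledge tracing",
    2: "next-drill recommendation",
    3: "clinical prediction",
    4: "causal inference",
}


def cohort_for_task(task_id: int) -> Cohort:
    """Return the cohort whose task range contains task_id."""
    for cohort in COHORTS:
        if task_id in cohort.tasks:
            return cohort
    raise KeyError(f"unknown task id: {task_id}")
```

Under these assumptions, `cohort_for_task(1)` resolves to the observational ND-03 cohort and `cohort_for_task(4)` to the ND-02 trial cohort, which mirrors the split between the behavioral-personalization tasks and the clinical and causal tasks.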
Key facts
- BuddyBench is a privacy-constrained multi-task benchmark for pediatric social-communication personalization.
- It links drill-level learning trajectories, clinical assessments, BuddyPlan self-report, and randomized-treatment endpoints.
- Two cohorts: ND-03 (observational, n=189) and ND-02 (RCT, n=86 ITT).
- Supports knowledge tracing, next-drill recommendation, clinical prediction, and causal inference.
- BuddyBench-Sim is a synthetic companion dataset for reproducible evaluation.
- Baselines show signal across tasks while respecting pediatric privacy constraints.
- Published on arXiv with ID 2605.28089.
- Announcement type: new.
Entities
Institutions
- arXiv