ARTFEED — Contemporary Art Intelligence

BuddyBench: Multi-Task Benchmark for Pediatric Social-Communication

other · 2026-05-28

BuddyBench is a privacy-constrained multi-task benchmark for pediatric social-communication personalization. It links drill-level learning trajectories, standardized clinical assessments, BuddyPlan self-reports, and randomized-treatment endpoints in a single framework, which sets it apart from existing neurodevelopmental datasets that focus on imaging, genetics, or cross-sectional phenotyping. The benchmark comprises two cohorts: ND-03, an observational cohort with extensive drill coverage for Tasks 1-2 (n=189), and ND-02, a randomized controlled trial cohort for Tasks 3-4 (n=86, intention-to-treat). Together the cohorts support knowledge tracing, next-drill recommendation, clinical prediction, and causal inference, connecting behavioral personalization to clinical assessment. A synthetic companion dataset, BuddyBench-Sim, is provided for reproducible evaluation, and baselines show signal across tasks while staying within pediatric privacy constraints.
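
The cohort-to-task layout described above can be written down as a small lookup, which may help when reading the task numbers. This is only an illustrative sketch: the `Cohort` class and `cohort_for_task` helper are hypothetical names, not part of any released BuddyBench code, and only the cohort names, designs, sizes, and task ranges are taken from the summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cohort:
    """Hypothetical container; not an official BuddyBench type."""
    name: str
    design: str
    n: int
    tasks: tuple  # task numbers evaluated on this cohort


# Values below are the ones stated in the summary.
COHORTS = (
    Cohort("ND-03", "observational", 189, (1, 2)),
    Cohort("ND-02", "randomized controlled trial (ITT)", 86, (3, 4)),
)


def cohort_for_task(task_id: int) -> Cohort:
    """Return the cohort on which a given task number is evaluated."""
    for cohort in COHORTS:
        if task_id in cohort.tasks:
            return cohort
    raise ValueError(f"unknown task id: {task_id}")


if __name__ == "__main__":
    for task_id in (1, 2, 3, 4):
        c = cohort_for_task(task_id)
        print(f"Task {task_id}: {c.name} ({c.design}, n={c.n})")
```

The sketch deliberately does not attach task names to task numbers, since the summary does not state which number corresponds to which task.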

Key facts

  • BuddyBench is a privacy-constrained multi-task benchmark for pediatric social-communication personalization.
  • It links drill-level learning trajectories, clinical assessments, BuddyPlan self-report, and randomized-treatment endpoints.
  • Two cohorts: ND-03 (observational, n=189) and ND-02 (RCT, n=86 ITT).
  • Supports knowledge tracing, next-drill recommendation, clinical prediction, and causal inference.
  • BuddyBench-Sim is a synthetic companion dataset for reproducible evaluation.
  • Baselines show signal across tasks while respecting pediatric privacy constraints.
  • Published on arXiv with ID 2605.28089.
  • Announcement type: new.
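
The "ITT" label on the ND-02 cohort stands for intention-to-treat: participants are analyzed in the arm they were randomized to, whether or not they completed the intervention. A minimal sketch of that idea follows, using invented toy records (not BuddyBench data) and hypothetical field names. It shows a plain difference in mean outcomes by assigned arm, which is not necessarily the estimator the benchmark's baselines use.

```python
from statistics import mean

# Toy records invented for illustration only; NOT BuddyBench data.
# Field names ("assigned", "adhered", "outcome") are hypothetical.
records = [
    {"assigned": "treatment", "adhered": True, "outcome": 12.0},
    {"assigned": "treatment", "adhered": False, "outcome": 8.0},
    {"assigned": "control", "adhered": True, "outcome": 7.0},
    {"assigned": "control", "adhered": True, "outcome": 9.0},
]


def itt_effect(rows):
    """Difference in mean outcome between assigned arms.

    Adherence is ignored on purpose: under intention-to-treat,
    every randomized participant counts in the arm they were assigned to.
    """
    treated = [r["outcome"] for r in rows if r["assigned"] == "treatment"]
    control = [r["outcome"] for r in rows if r["assigned"] == "control"]
    return mean(treated) - mean(control)


effect = itt_effect(records)

if __name__ == "__main__":
    print(f"ITT difference in means: {effect}")
```

Keeping the non-adherent participant in the treatment arm is the point of the design: it preserves the balance created by randomization, at the cost of estimating the effect of being assigned the treatment instead of the effect of receiving it.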

Entities

Institutions

  • arXiv

Sources