ARTFEED — Contemporary Art Intelligence

CBT-Audio Dataset for Patient Distress Estimation in Therapy Sessions

ai-technology · 2026-05-20

Researchers have introduced CBT-Audio, a dataset designed to evaluate audio language models for estimating patient distress intensity from spoken cognitive behavioural therapy (CBT) sessions. The dataset comprises 1,802 patient turns extracted from 96 publicly available recordings. Current AI systems for CBT are largely text-based, missing vocal cues that therapists rely on to detect discrepancies between what patients say and how they say it. CBT-Audio addresses this gap by enabling models to analyze audio directly. The work is published on arXiv under identifier 2605.17370.

Key facts

  • CBT-Audio is a dataset for evaluating patient distress estimation from spoken CBT sessions.
  • It contains 1,802 patient turns from 96 publicly available recordings.
  • Current AI systems for CBT are largely text-based and miss vocal cues.
  • The dataset lets audio language models analyze patient speech directly.
  • The research is published on arXiv with identifier 2605.17370.
  • Therapists rely on vocal cues to detect mismatches between what patients say and how they say it.
  • Spoken CBT data are scarce due to ethical and privacy constraints.
  • CBT is delivered through spoken conversation where how patients speak matters.

Entities

Institutions

  • arXiv
