MP-IB Framework for Bipolar Agitation Detection via Voice Biomarkers
Researchers have introduced MP-IB, the first framework to treat mixed-precision quantization as an information bottleneck for clinical trait-state separation, applied here to detecting agitation in bipolar disorder from voice biomarkers. The framework pairs an FP16 trait head (1,024 bits) for speaker identity with an INT4 state head (128 bits) for agitation, yielding an 8x information asymmetry without adversarial training, and incorporates Dynamic Precision Scheduling and Multi-Scale Temporal Fusion. On the Bridge2AI-Voice dataset (N=833, 4 sessions per participant, strict speaker-independent cross-validation), MP-IB achieved rho = 0.117 (95% CI: [0.089, 0.145], p=0.003 vs. chance), outperforming a 94M-parameter WavLM-Adapter with continued in-domain self-supervised pretraining (rho = -0.042) and beta-VAE disentanglement (rho = 0.089). The design targets resource-constrained edge devices, enabling continuous on-device monitoring of agitation in bipolar disorder from voice.
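The reported bit budgets pin down the asymmetry arithmetic. As a minimal sketch (the head dimensionalities of 64 and 32 are assumptions chosen to match the stated 1,024-bit and 128-bit budgets, not figures from the paper):

```python
def head_bits(dims: int, bits_per_value: int) -> int:
    """Total information budget of an embedding head, in bits."""
    return dims * bits_per_value

# Hypothetical dimensions consistent with the reported budgets:
trait_bits = head_bits(64, 16)  # FP16 trait head -> 1,024 bits
state_bits = head_bits(32, 4)   # INT4 state head -> 128 bits
asymmetry = trait_bits / state_bits

print(trait_bits, state_bits, asymmetry)  # 1024 128 8.0
```

Any (dims, bits) pair with the same products would give the same 8x ratio; the key design point is that precision, not just width, sets the bottleneck.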
Key facts
- MP-IB is the first framework to treat mixed-precision quantization as an information bottleneck for clinical trait-state separation.
- FP16 trait head uses 1,024 bits for speaker identity; INT4 state head uses 128 bits for agitation.
- 8x information asymmetry achieved without adversarial training.
- Dynamic Precision Scheduling and Multi-Scale Temporal Fusion are incorporated.
- Evaluated on Bridge2AI-Voice dataset with N=833, 4 sessions per participant.
- Achieved rho = 0.117 (95% CI: [0.089, 0.145], p=0.003 vs. chance).
- Outperformed 94M-parameter WavLM-Adapter (rho = -0.042) and beta-VAE (rho = 0.089).
- Designed for on-device continuous monitoring of bipolar disorder agitation.
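To make the INT4 state head concrete, here is a minimal sketch of uniform symmetric 4-bit quantization, the standard scheme in which each value is rounded to one of 16 integer codes in [-8, 7]. The `scale` parameter and the round-then-clamp logic are generic illustration, not the paper's specific quantizer:

```python
def quantize_int4(values, scale):
    """Uniform symmetric INT4 quantization: round to step `scale`, clamp to [-8, 7]."""
    codes = []
    for v in values:
        code = round(v / scale)
        codes.append(max(-8, min(7, code)))
    return codes

def dequantize(codes, scale):
    """Map INT4 codes back to real values."""
    return [c * scale for c in codes]

# With scale=0.5: 1.0 -> code 2, -5.0 clamps to -8, 0.7 -> code 1.
print(quantize_int4([1.0, -5.0, 0.7], 0.5))  # [2, -8, 1]
```

Each quantized value carries at most 4 bits, so a 32-dim state embedding is capped at 128 bits regardless of what the network tries to encode, which is what lets quantization act as a hard information bottleneck.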
Entities
Source
- arXiv