Soft-Label Learning Enhances Uncertainty Decomposition in Subjective NLP

other · 2026-05-26

A new method combines cyclical stochastic gradient Markov chain Monte Carlo (cSG-MCMC) with soft-label learning to improve uncertainty decomposition in subjective NLP tasks. The approach trains a linear head on a frozen RoBERTa model, using the empirical annotator distribution of each example as the training target. On the GoEmotions benchmark, it is reported to outperform Monte Carlo Dropout and Deep Ensemble on three evaluation axes: Jensen-Shannon divergence to the annotator distribution, Spearman correlation between aleatoric uncertainty and annotator disagreement, and selective-prediction AURC and AUROC. The work addresses annotator disagreement in emotion classification, which reflects intrinsic ambiguity in emotion concepts, and is presented as the first to integrate soft-label learning with Bayesian deep learning for multi-axis uncertainty evaluation.
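To make "uncertainty decomposition" concrete, here is a minimal sketch of the standard entropy-based split into aleatoric and epistemic parts, together with the Jensen-Shannon divergence used to compare a prediction with the annotator distribution. It assumes the posterior samples (for example, cSG-MCMC snapshots of the linear head) have already been turned into class probabilities. The function names are hypothetical, and the summary does not confirm that the paper uses exactly this decomposition.

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy in nats; probabilities are clipped to avoid log(0)."""
    p = np.clip(p, eps, 1.0)
    return -(p * np.log(p)).sum(axis=axis)

def decompose_uncertainty(sample_probs):
    """Split predictive uncertainty for one example.

    sample_probs: array of shape (S, C) holding class probabilities from
    S posterior samples (assumed input; how the samples are drawn is not shown).
    Returns (total, aleatoric, epistemic), all in nats.
    """
    mean_probs = sample_probs.mean(axis=0)
    total = entropy(mean_probs)                 # entropy of the averaged prediction
    aleatoric = entropy(sample_probs).mean()    # average entropy of each sample
    epistemic = total - aleatoric               # mutual information between label and weights
    return total, aleatoric, epistemic

def js_divergence(p, q):
    """Jensen-Shannon divergence in nats between two distributions."""
    m = 0.5 * (p + q)
    return entropy(m) - 0.5 * (entropy(p) + entropy(q))

# Two posterior samples that agree on an ambiguous example: all uncertainty is aleatoric.
agree = np.array([[0.5, 0.5], [0.5, 0.5]])
# Two confident samples that contradict each other: all uncertainty is epistemic.
disagree = np.array([[1.0, 0.0], [0.0, 1.0]])
print(decompose_uncertainty(agree))
print(decompose_uncertainty(disagree))
```

Under this decomposition, the aleatoric term is the quantity one would correlate with annotator disagreement, while the averaged prediction is what one would compare with the annotator distribution via `js_divergence`.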

Key facts

Method uses cyclical SG-MCMC and soft-label learning
Linear head trained on frozen RoBERTa
Targets empirical annotator distribution
Evaluated on GoEmotions benchmark with 28 labels (27 emotions plus neutral)
Outperforms Monte Carlo Dropout and Deep Ensemble on three axes
First integration of soft-label learning with Bayesian deep learning for uncertainty decomposition
Addresses annotator disagreement in emotion classification
Five-axis evaluation framework
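The training ingredients listed above can be sketched as follows: annotator votes are normalized into a soft target, the linear head is scored with a cross-entropy against that target, and the step size follows the cosine schedule commonly used for cyclical SG-MCMC. This is an illustrative sketch, not the authors' implementation. All function names, the vote-count input format, and the hyperparameter values are assumptions, and the sampler's noise injection is omitted.

```python
import math
import numpy as np

def soft_labels(vote_counts):
    """Turn per-class annotator vote counts into empirical distributions.

    vote_counts: array-like of shape (N, C). Rows with zero total votes
    are not handled here and would divide by zero.
    """
    counts = np.asarray(vote_counts, dtype=float)
    return counts / counts.sum(axis=-1, keepdims=True)

def soft_cross_entropy(logits, targets):
    """Mean cross-entropy between soft targets and softmax(logits)."""
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return float(-(targets * log_probs).sum(axis=-1).mean())

def cyclical_step_size(k, total_steps, num_cycles, base_step):
    """Cosine cyclical step size for 0-indexed iteration k.

    Each cycle starts at base_step (exploration) and decays toward zero
    (sampling), then restarts at the next cycle.
    """
    cycle_len = math.ceil(total_steps / num_cycles)
    position = (k % cycle_len) / cycle_len
    return 0.5 * base_step * (math.cos(math.pi * position) + 1.0)

# Three annotators: two chose class 0, one chose class 1.
targets = soft_labels([[2, 1, 0]])
loss = soft_cross_entropy(np.zeros((1, 3)), targets)  # uniform prediction
steps = [cyclical_step_size(k, total_steps=100, num_cycles=4, base_step=0.1)
         for k in range(100)]
print(targets, loss, steps[0], steps[24], steps[25])
```

Because the encoder is frozen, only the linear head's weights would be sampled, which keeps each posterior snapshot small compared with storing full model copies.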

Sources

arXiv cs.AI — 2026-05-26