Soft-Label Learning Enhances Uncertainty Decomposition in Subjective NLP
A new method that combines cyclical stochastic gradient Markov chain Monte Carlo (cSG-MCMC) with soft-label learning improves uncertainty decomposition in subjective NLP tasks. The approach trains a linear head on top of a frozen RoBERTa encoder, using the empirical annotator distribution as the training target. On the GoEmotions benchmark, it outperforms Monte Carlo Dropout and Deep Ensemble on three evaluation axes: Jensen-Shannon divergence to the annotator distribution, Spearman correlation between aleatoric uncertainty and annotator disagreement, and selective prediction measured by AURC and AUROC. The work addresses annotator disagreement in emotion classification, which reflects intrinsic ambiguity in emotion concepts, and is the first to integrate soft-label learning with Bayesian deep learning for multi-axis uncertainty evaluation.
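To make "uncertainty decomposition" and the Jensen-Shannon axis concrete, here is a minimal sketch in plain Python. It assumes the common entropy-based split (total = entropy of the averaged prediction, aleatoric = average entropy of the individual predictions, epistemic = their difference); the summary does not state which decomposition the authors use, so treat this as an illustration only. The function names are hypothetical, and `samples` stands in for class-probability vectors produced by several posterior samples such as cSG-MCMC snapshots.

```python
import math

def entropy(p):
    """Shannon entropy in nats; zero-probability terms contribute 0."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def mean_distribution(samples):
    """Element-wise average of several categorical distributions."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]

def decompose_uncertainty(samples):
    """Return (total, aleatoric, epistemic) for one input, in nats.

    samples: list of class-probability vectors, one per posterior sample.
    Assumption: the standard entropy / mutual-information decomposition.
    """
    total = entropy(mean_distribution(samples))
    aleatoric = sum(entropy(p) for p in samples) / len(samples)
    epistemic = total - aleatoric  # mutual information, non-negative
    return total, aleatoric, epistemic

def js_divergence(p, q):
    """Jensen-Shannon divergence in nats, bounded by ln 2."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(m) - 0.5 * (entropy(p) + entropy(q))

# Two hypothetical posterior samples for a 3-class input,
# compared against a hypothetical annotator distribution.
samples = [[0.7, 0.2, 0.1], [0.5, 0.4, 0.1]]
annotators = [0.6, 0.3, 0.1]
total, aleatoric, epistemic = decompose_uncertainty(samples)
jsd = js_divergence(mean_distribution(samples), annotators)
```

Under this split, the aleatoric term is the quantity one would correlate with annotator disagreement, while the epistemic term shrinks toward zero as the posterior samples agree with each other.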
Key facts
- Method uses cyclical SG-MCMC and soft-label learning
- Linear head trained on frozen RoBERTa
- Targets empirical annotator distribution
- Evaluated on the GoEmotions benchmark with 28 emotion labels
- Outperforms Monte Carlo Dropout and Deep Ensemble on three axes
- First integration of soft-label learning with Bayesian deep learning for uncertainty decomposition
- Addresses annotator disagreement in emotion classification
- Five-axis evaluation framework
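The soft-label setup listed above (a linear head on frozen features, trained toward the empirical annotator distribution) can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes each annotator contributes exactly one label per example, uses a softmax cross-entropy against the soft target, and omits the cSG-MCMC sampler entirely. All function names and the toy feature vector are hypothetical.

```python
import math

def soft_label(votes, num_classes):
    """Turn annotator votes (class indices) into an empirical distribution.

    Assumption: one vote per annotator per example.
    """
    counts = [0] * num_classes
    for v in votes:
        counts[v] += 1
    return [c / len(votes) for c in counts]

def linear_head(features, weights, bias):
    """Logits from a linear layer applied to frozen encoder features."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]

def log_softmax(logits):
    """Numerically stable log-softmax."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_z for z in logits]

def soft_cross_entropy(logits, target):
    """Cross-entropy between a soft target and the softmax of the logits."""
    log_probs = log_softmax(logits)
    return -sum(t * lp for t, lp in zip(target, log_probs))

# Three annotators: two chose class 0, one chose class 2.
target = soft_label([0, 0, 2], num_classes=3)

# Toy 2-dimensional "frozen" feature vector and a 3-class head.
features = [1.0, 2.0]
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bias = [0.0, 0.0, 0.5]
logits = linear_head(features, weights, bias)
loss = soft_cross_entropy(logits, target)
```

Because the target keeps the full vote distribution rather than a majority label, the loss is minimized when the predicted distribution matches the annotators' split, which is what lets the model's own entropy track disagreement.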