Multi-Level Speaker-Adaptive Network for Emotion Recognition in Conversations
A new machine learning model, the Multi-Level Speaker-Adaptive Network (ML-SAN), addresses the challenge of individual differences in emotional expression. Current emotion recognition systems treat all speakers uniformly, failing to account for personal expressive traits. ML-SAN adapts to each speaker's unique style at multiple levels, improving accuracy in multi-turn dialogues. The research, published on arXiv (2604.25383), aims to enhance human-machine empathy by enabling machines to distinguish varied expressions of the same emotion, such as happiness conveyed through words versus actions.
Key facts
- ML-SAN stands for Multi-Level Speaker-Adaptive Network
- Addresses individual expressive traits in emotion recognition
- Focuses on multimodal emotion recognition in conversations
- Improves recognition in multi-turn dialogues
- Published on arXiv with ID 2604.25383
- Aims to enhance empathy between humans and machines
- Current models are described as 'static'
- Proposes a novel multi-level speaker adaptation approach
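The summary does not specify ML-SAN's architecture, but the core idea, conditioning an emotion classifier on a per-speaker representation at several levels, can be illustrated with a minimal sketch. Everything below is hypothetical: the class names, the use of FiLM-style feature modulation, and the per-speaker embedding table are assumptions for illustration, not the paper's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

class SpeakerAdaptiveLayer:
    """One adaptation level: scales and shifts utterance features with
    speaker-conditioned parameters (FiLM-style modulation).
    Hypothetical illustration; not the paper's actual layer."""
    def __init__(self, feat_dim, emb_dim):
        self.W_scale = rng.standard_normal((emb_dim, feat_dim)) * 0.1
        self.W_shift = rng.standard_normal((emb_dim, feat_dim)) * 0.1

    def __call__(self, feats, spk_emb):
        scale = 1.0 + spk_emb @ self.W_scale   # per-speaker feature scaling
        shift = spk_emb @ self.W_shift         # per-speaker feature offset
        return feats * scale + shift

class MLSANSketch:
    """Minimal sketch of multi-level speaker adaptation: one learned
    embedding per speaker modulates features at every level, then a
    linear head scores emotion classes."""
    def __init__(self, feat_dim=16, emb_dim=8, n_levels=3, n_classes=4):
        self.emb_dim = emb_dim
        self.spk_embs = {}  # speaker id -> embedding (assumed design)
        self.levels = [SpeakerAdaptiveLayer(feat_dim, emb_dim)
                       for _ in range(n_levels)]
        self.head = rng.standard_normal((feat_dim, n_classes)) * 0.1

    def speaker_embedding(self, speaker_id):
        if speaker_id not in self.spk_embs:
            self.spk_embs[speaker_id] = rng.standard_normal(self.emb_dim) * 0.1
        return self.spk_embs[speaker_id]

    def predict(self, feats, speaker_id):
        emb = self.speaker_embedding(speaker_id)
        h = feats
        for layer in self.levels:          # adapt at each level in turn
            h = np.tanh(layer(h, emb))
        return h @ self.head               # emotion logits

model = MLSANSketch()
utterance = rng.standard_normal(16)        # stand-in for utterance features
logits_a = model.predict(utterance, "speaker_A")
logits_b = model.predict(utterance, "speaker_B")
```

The key property this sketch captures is that identical utterance features yield different emotion scores for different speakers, which is what distinguishes a speaker-adaptive model from the "static" models the paper criticizes.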
Entities
Institutions
- arXiv