ARTFEED — Contemporary Art Intelligence

Third-Place Solution in Hume-ABAW10 Emotional Mimicry Intensity Challenge

ai-technology · 2026-05-23

A team placed third in the Hume-ABAW10 Emotional Mimicry Intensity (EMI) Challenge with a two-stage multimodal framework. The challenge asks participants to predict six continuous emotion-intensity dimensions (Admiration, Amusement, Determination, Empathic Pain, Excitement, and Joy) from in-the-wild multimodal video clips. The framework combines textual, acoustic, and visual inputs with an optional motion branch. Modality-specific encoders are first trained independently and then fused by a lightweight regressor that uses modality dropout and controlled encoder adaptation. The best validation result, an average Pearson correlation of 0.4722, came from the text–audio–vision–motion fusion model under a 4:1 split. The motion branch yielded only slight gains but offered insights for further research.
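
The reported score is an average Pearson correlation, which is typically computed per emotion dimension and then averaged. The following is a minimal sketch of that calculation in Python with NumPy; the function name mean_pearson and the array layout (one row per clip, one column per emotion) are assumptions for illustration, not the challenge's official scoring code.

```python
import numpy as np

# The six target dimensions named in the challenge description.
EMOTIONS = ["Admiration", "Amusement", "Determination",
            "Empathic Pain", "Excitement", "Joy"]

def mean_pearson(y_true, y_pred):
    """Return (average, per-dimension) Pearson correlation.

    Both inputs are arrays of shape (num_clips, num_dimensions).
    A dimension with zero variance produces NaN, since the
    correlation is undefined in that case.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.ndim != 2 or y_true.shape != y_pred.shape:
        raise ValueError("expected two arrays of identical 2-D shape")

    # Center each column, then apply the Pearson formula column-wise.
    t = y_true - y_true.mean(axis=0)
    p = y_pred - y_pred.mean(axis=0)
    denom = np.sqrt((t ** 2).sum(axis=0) * (p ** 2).sum(axis=0))
    per_dim = (t * p).sum(axis=0) / denom
    return float(per_dim.mean()), per_dim

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = rng.random((200, len(EMOTIONS)))                  # synthetic intensities
    preds = labels + 0.5 * rng.standard_normal(labels.shape)   # noisy predictions
    avg, per_dim = mean_pearson(labels, preds)
    for name, r in zip(EMOTIONS, per_dim):
        print(f"{name:>14}: {r:.4f}")
    print(f"{'Average':>14}: {avg:.4f}")
```

Because Pearson correlation is invariant to positive scaling and shifting of the predictions, a model is rewarded for ranking intensities correctly rather than for matching their absolute values.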

Key facts

  • Team placed third in Hume-ABAW10 EMI Challenge
  • Predicts six emotion dimensions: Admiration, Amusement, Determination, Empathic Pain, Excitement, Joy
  • Two-stage multimodal framework combines text, audio, vision, and optional motion
  • Best validation average Pearson correlation: 0.4722 (text–audio–vision–motion fusion, 4:1 split)
  • Model uses modality dropout and controlled encoder adaptation
  • Motion branch yields slight gains
  • Challenge focuses on in-the-wild multimodal video clips
  • Framework trains modality-specific encoders independently before fusion
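
Modality dropout, mentioned in the facts above, generally means hiding whole modalities at random during training so the fusion model does not lean on any single input. The sketch below illustrates the general idea with NumPy; the function names, the zero-masking strategy, the rule of keeping at least one modality per sample, and the concatenate-then-linear regressor are all assumptions for illustration and are not taken from the team's implementation.

```python
import numpy as np

def modality_dropout(features, drop_prob, rng):
    """Randomly zero out whole modalities per sample.

    features: dict mapping modality name -> array of shape (batch, dim).
    Returns the masked features and a boolean keep mask of shape
    (batch, num_modalities). At least one modality is always kept
    for every sample (an assumption of this sketch).
    """
    names = list(features)
    batch = features[names[0]].shape[0]

    # True means the modality is kept for that sample.
    keep = rng.random((batch, len(names))) >= drop_prob

    # Re-enable one random modality for samples that lost all of them.
    rows = np.flatnonzero(~keep.any(axis=1))
    keep[rows, rng.integers(len(names), size=rows.size)] = True

    masked = {n: features[n] * keep[:, i:i + 1] for i, n in enumerate(names)}
    return masked, keep

def fuse_and_regress(features, weights, bias):
    """Hypothetical lightweight regressor: concatenate, then one linear layer."""
    fused = np.concatenate([features[n] for n in features], axis=1)
    return fused @ weights + bias

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dims = {"text": 8, "audio": 8, "vision": 8, "motion": 4}   # illustrative sizes
    feats = {n: rng.standard_normal((4, d)) for n, d in dims.items()}
    masked, keep = modality_dropout(feats, drop_prob=0.3, rng=rng)
    W = rng.standard_normal((sum(dims.values()), 6)) * 0.1
    b = np.zeros(6)
    print("kept modalities per sample:\n", keep.astype(int))
    print("prediction shape:", fuse_and_regress(masked, W, b).shape)
```

At inference time the dropout step would be skipped, so the regressor sees every available modality; a model trained this way can also still produce a prediction when one input, such as the optional motion branch, is missing.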

Entities

Institutions

  • Hume-ABAW10
  • EMI Challenge
  • arXiv
