EmoMind: Decoding Affective Captions from fMRI Brain Signals
A team of researchers has introduced EmoMind, the first end-to-end pipeline for decoding affective captions directly from fMRI signals. Existing brain-to-text systems recover semantic content but discard affect, while language models generate emotional text only from broad categorical labels that collapse inter-subject variability. EmoMind works in two stages: it first retrieves a neutral, semantically grounded scene description from brain-decoded visual features, then rewrites that description using a continuous 34-dimensional emotion vector obtained from the same fMRI recording. Classifier-free guidance against an identity-preserving null branch controls the balance between preserving content and expressing emotion, which allows smooth interpolation between semantic fidelity and affective expressivity. Because the emotion representation is continuous rather than a discrete category, it can capture differences in emotional experience across subjects. The findings are detailed in arXiv:2605.16739.
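The first stage, obtaining a neutral scene description from brain-decoded visual features, can be pictured as a nearest-neighbour lookup. The sketch below is only an illustration: the summary does not say how EmoMind performs this step, so the use of cosine similarity, the `caption_bank` structure, the function names, and the toy 3-dimensional embeddings are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_neutral_caption(decoded_features, caption_bank):
    """Return the caption whose embedding is closest to the decoded features.

    caption_bank is a hypothetical list of (caption, embedding) pairs;
    the real system's candidate pool and similarity measure may differ.
    """
    best = max(caption_bank, key=lambda item: cosine(decoded_features, item[1]))
    return best[0]


# Toy embeddings, made up purely for illustration.
caption_bank = [
    ("A dog runs across a field.", [0.9, 0.1, 0.0]),
    ("A crowded street at night.", [0.0, 0.2, 0.9]),
]
decoded = [0.8, 0.3, 0.1]  # stand-in for features decoded from fMRI

neutral = retrieve_neutral_caption(decoded, caption_bank)
print(neutral)
```

In this picture the retrieved caption carries only scene content; the emotion vector enters in the second stage, when the caption is rewritten.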
Key facts
- EmoMind is the first end-to-end pipeline for decoding affective captions from fMRI signals.
- Current brain-to-text systems recover semantic content but discard affect.
- Existing language models generate emotional text only from categorical labels, which collapse inter-subject variability.
- EmoMind retrieves a neutral scene description from brain-decoded visual features.
- It rewrites the description using a continuous 34-dimensional emotion vector from the same fMRI recording.
- Classifier-free guidance against an identity-preserving null branch controls content-affect balance.
- The system enables smooth interpolation between semantic fidelity and affective expressivity.
- The paper is available on arXiv as arXiv:2605.16739.
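The content-affect trade-off described in the facts above can be sketched with the standard classifier-free guidance formula applied to next-token logits. This is a minimal illustration, not the paper's implementation: it assumes the null branch scores tokens so as to reproduce the neutral caption, that the conditional branch is driven by the emotion vector, and that a single scalar `scale` blends the two. The vocabulary and logit values are invented.

```python
import math


def guided_logits(cond_logits, null_logits, scale):
    """Standard classifier-free guidance: null + scale * (cond - null).

    scale = 0 returns the null (identity-preserving) branch unchanged,
    scale = 1 returns the emotion-conditioned branch, and larger values
    push further in the affective direction.
    """
    return [n + scale * (c - n) for c, n in zip(cond_logits, null_logits)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_word(vocab, logits):
    """Most probable word under the softmax of the given logits."""
    probs = softmax(logits)
    return vocab[probs.index(max(probs))]


# Invented three-word vocabulary and logits for one decoding step.
vocab = ["runs", "bounds", "joyfully"]
null_logits = [3.0, 1.0, 0.0]  # null branch keeps the neutral wording
cond_logits = [1.0, 2.0, 2.5]  # emotion-conditioned branch prefers affective words

for scale in (0.0, 0.5, 1.0, 2.0):
    word = top_word(vocab, guided_logits(cond_logits, null_logits, scale))
    print(f"scale={scale}: {word}")
```

Sweeping `scale` from 0 upward moves the output from the neutral wording toward more affective wording, which is one way to read the smooth interpolation between semantic fidelity and affective expressivity.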
Entities
Institutions
- arXiv