ARTFEED — Contemporary Art Intelligence

Explainability and Fairness in VLMs for Wellbeing Assessment

ai-technology · 2026-04-29

A new study from arXiv (arXiv:2604.23786) investigates fairness and explainability in Vision-Language Models (VLMs) for wellbeing assessment and depression prediction. Researchers evaluated models across laboratory (AFAR-BSFT) and naturalistic (E-DAIC) datasets, finding significant performance disparities: Phi3.5-Vision achieved 80.4% accuracy on E-DAIC, while Qwen2-VL scored only 33.9%. Both models showed a tendency to over-predict depression on AFAR-BSFT, raising concerns about diagnostic reliability and demographic fairness. The work highlights the under-explored intersection of Explainable AI (XAI) and multimodal foundation models in clinical mental health monitoring.

Key facts

  • Study investigates fairness and explainability in VLMs for wellbeing assessment
  • Evaluated on laboratory (AFAR-BSFT) and naturalistic (E-DAIC) datasets
  • Phi3.5-Vision achieved 80.4% accuracy on E-DAIC
  • Qwen2-VL achieved 33.9% accuracy on E-DAIC
  • Both models over-predicted depression on AFAR-BSFT
  • Application of XAI to VLMs for depression prediction is under-explored
  • Research published on arXiv (arXiv:2604.23786)
  • Concerns about transparency and bias in clinical deployment of VLMs
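The accuracy figures and the "over-predicts depression" finding above can be quantified with two simple metrics. The sketch below is illustrative only: the labels and predictions are made-up toy data, not values from the study, and the `over_prediction_rate` definition (predicted positive rate minus true prevalence) is one plausible way to measure the reported tendency, not necessarily the paper's own metric.

```python
# Hypothetical binary labels: 1 = depressed, 0 = not depressed.
# Toy data for illustration -- NOT from the study's evaluation sets.
labels      = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
predictions = [1, 0, 1, 1, 1, 0, 1, 1, 0, 1]

def accuracy(preds, golds):
    """Fraction of predictions that match the gold labels."""
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)

def over_prediction_rate(preds, golds, positive=1):
    """How much more often the model predicts the positive class
    (depression) than it actually occurs in the gold labels --
    a positive value indicates over-prediction."""
    pred_rate = sum(p == positive for p in preds) / len(preds)
    true_rate = sum(g == positive for g in golds) / len(golds)
    return pred_rate - true_rate

print(f"accuracy: {accuracy(predictions, labels):.1%}")
print(f"over-prediction: {over_prediction_rate(predictions, labels):+.1%}")
```

On this toy data the model labels 70% of cases positive against a true prevalence of 30%, so it over-predicts by 40 percentage points while reaching only 60% accuracy, which is the same pattern the study reports on AFAR-BSFT, where raw accuracy alone would mask the systematic bias toward positive diagnoses.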

Entities

Institutions

  • arXiv

Sources