ARTFEED — Contemporary Art Intelligence

Human-Centered Audit Reveals Shapley Value Benchmarks Misaligned with User Needs

ai-technology · 2026-04-27

A recent study posted to arXiv (2604.22662) questions how Shapley values are assessed in explainable AI (XAI), finding that traditional metrics such as sparsity and faithfulness do not correlate with human perceptions of clarity or decision-making effectiveness. The researchers used a unified amortized framework to disentangle the semantic differences among eight Shapley variants while meeting the low-latency requirements of operational risk workflows. They then ran an extensive empirical evaluation on four risk datasets and in a practical fraud-detection setting involving 3,735 case reviews by professional analysts. No variant improved analyst performance, and explanations did not improve decision-making in critical situations. The study argues that XAI evaluation needs human-centered benchmarks.
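For context, Shapley values attribute a model's prediction to individual features by averaging each feature's marginal contribution over all coalitions of the remaining features. The sketch below computes exact Shapley values for a toy risk score; the names (model_fn, baseline, x) are illustrative assumptions, and the code is not the study's amortized framework or any of the eight variants it compares, which approximate this quantity at much lower cost.

```python
# Minimal sketch of exact Shapley value attribution for one prediction.
# Illustrative only: model_fn, baseline, and x are assumed names for this
# example; the study's eight variants and amortized framework are not shown.
from itertools import combinations
from math import comb

import numpy as np


def shapley_values(model_fn, x, baseline):
    """Exact Shapley attributions for instance x relative to a baseline.

    Features absent from a coalition are replaced by the baseline value,
    one common (but not the only) convention for the value function.
    """
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Classic Shapley weight: |S|! (n - |S| - 1)! / n! = 1 / (n * C(n-1, |S|))
            weight = 1.0 / (n * comb(n - 1, size))
            for subset in combinations(others, size):
                with_i = baseline.copy()
                without_i = baseline.copy()
                for j in subset:
                    with_i[j] = x[j]
                    without_i[j] = x[j]
                with_i[i] = x[i]
                phi[i] += weight * (model_fn(with_i) - model_fn(without_i))
    return phi


if __name__ == "__main__":
    # Toy linear "risk score" over three features; for a linear model the
    # exact Shapley values recover each feature's contribution vs. baseline.
    weights = np.array([0.5, -1.0, 2.0])
    model_fn = lambda v: float(weights @ v)
    x = np.array([1.0, 2.0, 0.5])
    baseline = np.zeros(3)
    print(shapley_values(model_fn, x, baseline))  # approx. [0.5, -2.0, 1.0]
```

Exact computation is exponential in the number of features, which is why practical variants differ in how they approximate the sum and how they handle missing features, the semantic differences the study sets out to isolate.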

Key facts

  • Study from arXiv (2604.22662) evaluates Shapley value variants in high-stakes settings
  • Eight Shapley variants compared under low-latency constraints
  • Evaluation used four risk datasets and a fraud-detection environment
  • 3,735 case reviews conducted by professional analysts
  • Standard metrics (sparsity, faithfulness) decoupled from human-perceived clarity (see the sketch after this list)
  • No formulation improved objective analyst performance
  • Calls for human-centered benchmarks in XAI evaluation
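
The sparsity and faithfulness metrics flagged above are typically computed directly from the attributions, with no human in the loop, which is one reason they can decouple from perceived clarity. Below is a minimal sketch assuming one common definition of each (fraction of near-zero attributions; correlation between each feature's attribution and the effect of perturbing that feature alone), not necessarily the study's exact formulations.

```python
# Hedged sketch of two standard attribution metrics; the function names
# (sparsity, faithfulness_corr) and definitions are illustrative assumptions,
# not the study's exact benchmark implementations.
import numpy as np


def sparsity(attributions, eps=1e-6):
    """Fraction of features whose attribution is (near) zero.

    Higher values mean the explanation concentrates on fewer features.
    """
    attributions = np.asarray(attributions, dtype=float)
    return float(np.mean(np.abs(attributions) < eps))


def faithfulness_corr(model_fn, x, baseline, attributions):
    """Correlation between each feature's attribution and the drop in model
    output when that feature alone is replaced by its baseline value."""
    x = np.asarray(x, dtype=float)
    drops = []
    for i in range(len(x)):
        x_pert = x.copy()
        x_pert[i] = baseline[i]
        drops.append(model_fn(x) - model_fn(x_pert))
    return float(np.corrcoef(attributions, drops)[0, 1])


if __name__ == "__main__":
    weights = np.array([0.5, -1.0, 2.0])
    model_fn = lambda v: float(weights @ v)
    x = np.array([1.0, 2.0, 0.5])
    baseline = np.zeros(3)
    attributions = weights * x  # exact Shapley values for this linear model
    print(sparsity(attributions))                                  # 0.0
    print(faithfulness_corr(model_fn, x, baseline, attributions))  # 1.0
```

Both quantities are purely model-facing; the study's point is that scoring well on them says little about whether an analyst finds the explanation clear or makes better decisions with it.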

Entities

Institutions

  • arXiv

Sources