MLLM Concept-Based Explanations Degrade Accuracy in Few-Shot ICL
A new study on arXiv (2605.28215v1) systematically evaluates concept-based explainability of frozen multimodal large language models (MLLMs) under few-shot in-context learning (ICL). Using five conditions of increasing formal rigour, from baseline classification to Description Logics (DL) axiom generation, the authors test four state-of-the-art MLLMs via an independent LLM-as-a-judge pipeline. Results show that generating formally structured, concept-based explanations degrades predictive accuracy monotonically, from 93.8% to 90.1%, contradicting the assumption that explicit reasoning universally aids performance. The paper argues that explaining is genuinely harder than predicting alone, and that Chain-of-Thought prompting may not reflect true internal computation.
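To put the reported figures in perspective, the short sketch below computes the size of the drop in absolute and relative terms. It assumes that 93.8% is the accuracy in the baseline condition and 90.1% the accuracy in the most formal condition, as the summary implies. The function name `accuracy_drop` is introduced here for illustration and does not come from the paper.

```python
# Accuracy figures as reported in the summary (assumed: baseline vs. most formal condition).
baseline_acc = 93.8
formal_acc = 90.1


def accuracy_drop(before: float, after: float) -> tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop in percent of `before`)."""
    absolute = before - after
    relative = absolute / before * 100
    return absolute, relative


absolute, relative = accuracy_drop(baseline_acc, formal_acc)

# The same change expressed as an increase in error rate.
error_before = 100 - baseline_acc
error_after = 100 - formal_acc
error_ratio = error_after / error_before

print(f"absolute drop: {absolute:.1f} percentage points")
print(f"relative drop: {relative:.1f}% of baseline accuracy")
print(f"error rate: {error_before:.1f}% -> {error_after:.1f}% ({error_ratio:.2f}x)")
```

Under these assumptions the drop is 3.7 percentage points, or roughly 3.9% of baseline accuracy. Seen from the error side, the rate rises from 6.2% to 9.9%, which is a more noticeable change than the headline accuracy numbers suggest.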
Key facts
- arXiv paper 2605.28215v1 evaluates concept-based explainability of MLLMs under few-shot ICL
- Five conditions of increasing formal rigour tested: baseline to Description Logics axiom generation
- Four state-of-the-art MLLMs evaluated via LLM-as-a-judge pipeline
- Predictive accuracy dropped from 93.8% to 90.1% with formal concept-based explanations
- Chain-of-Thought prompting may not reflect true internal computation
- Explaining is harder than predicting alone
- Study contradicts assumption that explicit reasoning universally aids performance
- Published on arXiv with announcement type 'new'
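For readers unfamiliar with Description Logics, the following is a purely illustrative axiom of the general kind a formal concept-based explanation might contain. The concept and role names (Zebra, Equine, hasPattern, Stripes) are hypothetical and are not taken from the paper.

```latex
\[
\mathsf{Zebra} \sqsubseteq \mathsf{Equine} \sqcap \exists\, \mathsf{hasPattern}.\mathsf{Stripes}
\]
```

Read informally: every zebra is an equine that has at least one pattern that is stripes. Producing statements in this strict form, rather than free text, is what the most rigorous condition would demand of a model.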
Entities
Institutions
- arXiv