Causal Probing Reveals How MLLMs Encode Visual Concepts
A new paper on arXiv proposes a causal framework that uses activation steering to probe internal visual representations in Multimodal Large Language Models (MLLMs). The study finds that entities are encoded through distinct, localized memorization, whereas abstract concepts are distributed globally across the network. This divergence has direct implications for scaling: increasing model depth is crucial for encoding abstract concepts, while entity localization remains invariant to scale. Reverse steering further shows that blocking a concept's explicit output triggers surges in its latent activation.
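As a concrete illustration of what such a steering intervention could look like, here is a minimal PyTorch sketch. It assumes a LLaMA-style HuggingFace model whose decoder layers live at `model.model.layers` and return tuples whose first element is the hidden state of shape (batch, seq_len, d_model); the concept vector, steering coefficient `alpha`, and target token are hypothetical placeholders, not the paper's actual setup.

```python
# A minimal sketch of activation steering as a causal probe.
# All names here (concept_vector, alpha, target_id) are illustrative
# assumptions, not values from the paper.
import torch

def steer_layer(model, layer_idx, concept_vector, alpha=4.0):
    """Add a scaled concept direction to one layer's residual stream;
    returns the hook handle so the intervention can be undone."""
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + alpha * concept_vector  # the causal intervention
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden
    return model.model.layers[layer_idx].register_forward_hook(hook)

def causal_effect_by_layer(model, input_ids, concept_vector, target_id):
    """Sweep the intervention across depth and record how much each layer's
    steering shifts the probability of a concept-bearing target token."""
    effects = []
    with torch.no_grad():
        base = model(input_ids).logits[0, -1].softmax(-1)[target_id]
        for layer_idx in range(len(model.model.layers)):
            handle = steer_layer(model, layer_idx, concept_vector)
            steered = model(input_ids).logits[0, -1].softmax(-1)[target_id]
            handle.remove()
            effects.append((steered - base).item())
    # A sharp peak at a few layers suggests localized encoding;
    # a flat profile across depth suggests distributed encoding.
    return effects
```

On this reading, an entity concept would show a narrow spike at a few layers, while an abstract concept would show comparable effects across many layers, matching the localization versus distribution contrast reported in the paper.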
Key facts
- Paper titled "Causal Probing for Internal Visual Representations in Multimodal Large Language Models"
- Published on arXiv with ID 2605.05593v1
- Proposes a causal framework based on activation steering
- Systematic intervention across four visual concept categories
- Entities exhibit distinct localized memorization
- Abstract concepts are globally distributed across the network
- Increasing model depth is indispensable for encoding abstract concepts
- Entity localization remains invariant to scale
- Reverse steering uncovers latent activation surges when explicit output is blocked (see the sketch after this list)
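The reverse-steering result in the last bullet can be illustrated with the same hook pattern. Below is a hedged sketch of one simple way to realize it in a single forward pass: the concept direction is subtracted at an earlier layer to suppress ("block") its explicit expression, and its projection is read out at a deeper layer; a "surge" would appear as the blocked projection exceeding the free one. Again, `concept_vector`, the layer indices, and `alpha` are hypothetical stand-ins, and the paper's actual procedure may differ.

```python
# A hedged sketch of reverse steering: suppress a concept upstream and
# measure its latent activation downstream. Layer indices and the concept
# vector are illustrative assumptions.
import torch

def blocked_vs_free(model, input_ids, concept_vector,
                    block_idx, read_idx, alpha=4.0):
    """Compare the concept direction's latent activation at a deep layer
    (read_idx) with and without suppressing it at an earlier layer
    (block_idx). Requires block_idx < read_idx so the block is upstream
    of the probe."""
    captured = {}

    def read_hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        # project the last position onto the concept direction
        captured["proj"] = (hidden[0, -1] @ concept_vector).item()

    def block_hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden - alpha * concept_vector  # steer away from the concept
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

    layers = model.model.layers
    reader = layers[read_idx].register_forward_hook(read_hook)
    with torch.no_grad():
        model(input_ids)          # unperturbed pass
    free = captured["proj"]
    blocker = layers[block_idx].register_forward_hook(block_hook)
    with torch.no_grad():
        model(input_ids)          # pass with the concept suppressed upstream
    blocked = captured["proj"]
    blocker.remove()
    reader.remove()
    return free, blocked  # blocked > free would be the reported latent surge
```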