VLMs Suppress Female Representations Under Ambiguous Input
A recent study finds that vision-language models (VLMs) tend to associate gender-ambiguous images with male identities, even for occupations stereotypically linked to women. The study introduces LALS (Latent Association Learning Score), a zero-shot metric that assesses internal concept associations by mapping visual-token activations into text-embedding space. Analyzing over 800 gender-ambiguous images across 15 occupations and four VLMs, the researchers found a consistent disconnect: models frequently encode female associations internally yet predominantly produce male outputs. The gap suggests that current alignment techniques fall short on ambiguous inputs, which are common in real-world use.
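To make the idea of mapping visual-token activations into text-embedding space concrete, here is a minimal sketch. It is not the study's definition of LALS, which this summary does not spell out. It assumes a hypothetical projection matrix from the model's hidden size to the text-embedding size, and it scores each token at each layer as the cosine similarity to a "female" concept embedding minus the cosine similarity to a "male" one. The function name, the shapes, and the random toy inputs are all illustrative assumptions.

```python
import numpy as np


def unit(x):
    """Scale vectors along the last axis to unit length."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def association_scores(visual_acts, projection, female_emb, male_emb):
    """Hypothetical per-layer, per-token association score (not the paper's formula).

    visual_acts: (layers, tokens, d_model) visual-token activations
    projection:  (d_model, d_text) assumed map into text-embedding space
    female_emb, male_emb: (d_text,) concept embeddings
    Returns (layers, tokens); positive values lean female, negative lean male.
    """
    projected = unit(visual_acts @ projection)   # (layers, tokens, d_text)
    female_sim = projected @ unit(female_emb)    # cosine similarity to "female"
    male_sim = projected @ unit(male_emb)        # cosine similarity to "male"
    return female_sim - male_sim


# Toy inputs: random numbers standing in for real model activations.
rng = np.random.default_rng(0)
acts = rng.normal(size=(4, 6, 16))       # 4 layers, 6 visual tokens, d_model = 16
proj = rng.normal(size=(16, 8))          # d_text = 8
female = rng.normal(size=8)
male = rng.normal(size=8)
scores = association_scores(acts, proj, female, male)
```

With a real model, the activations would come from intermediate layers at the visual-token positions, and the concept embeddings from the model's own text-embedding table. Both choices are assumptions of this sketch.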
Key facts
- VLMs default to male associations for gender-ambiguous images
- Even female-stereotyped occupations trigger male defaults
- LALS metric measures internal concept associations per token and layer
- Study tested 15 occupations, over 800 images, and four VLMs
- Internal representations and outputs are systematically decoupled
- Models often encode female associations internally but output male
- Minimal prompting pressure exposes occupation-gender defaults
- Ambiguous inputs are common in practice yet rarely studied
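One simple way to quantify the decoupling between internal representations and outputs is sketched below. It is an illustrative assumption, not the study's procedure: for each image it takes an internal score (positive meaning female-leaning) and the gender the model actually output, then reports the fraction of internally female-leaning images that still yielded a male output. The function name and the toy records are hypothetical.

```python
def decoupling_rate(records):
    """Fraction of internally female-leaning images whose output is male.

    records: iterable of (internal_score, output_label) pairs, where a
    positive internal_score is read as a female-leaning representation.
    """
    female_internal = [(score, label) for score, label in records if score > 0]
    if not female_internal:
        return 0.0
    flipped = sum(1 for _, label in female_internal if label == "male")
    return flipped / len(female_internal)


# Toy records, made up for illustration; not data from the study.
toy = [(0.4, "male"), (0.2, "female"), (-0.1, "male"), (0.3, "male")]
rate = decoupling_rate(toy)  # 2 of the 3 female-leaning images output "male"
```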
Entities
Institutions
- arXiv