MoE Vision Models Show Animate-Inanimate Expert Partitioning

ai-technology · 2026-05-22

A recent arXiv preprint (2605.20610) examines how experts in Mixture-of-Experts (MoE) vision models specialize. The authors trained sparsely gated convolutional MoE models with contrastive learning on natural images, then analyzed expert specialization using tools from visual neuroscience. The analysis proceeds from the gating level down to individual experts, measuring per-expert category separability and each expert's tuning to its most strongly activating inputs. Tuning was also interpreted through semantic dimensions from the THINGS dataset, which are derived from human behavior. The central finding is that experts partition along an animate-inanimate distinction, visible from the gating stage through to the expert representations.
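To make the gating-level analysis concrete, here is a minimal sketch of top-k sparse gating and a per-category routing summary. It is an illustration under stated assumptions, not the paper's implementation: the function names, the choice of k, and the toy gate logits are all hypothetical, and the study's actual gating scheme may differ.

```python
import math
from collections import defaultdict


def softmax(logits):
    """Numerically stable softmax over a list of gate logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_gate(logits, k=1):
    """Keep the k largest gate probabilities, renormalize them, zero the rest."""
    probs = softmax(logits)
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in keep)
    return [probs[i] / mass if i in keep else 0.0 for i in range(len(probs))]


def routing_share_by_category(gate_logits, labels, k=1):
    """Mean gate weight each expert receives, computed separately per category."""
    sums = {}
    counts = defaultdict(int)
    for logits, label in zip(gate_logits, labels):
        weights = top_k_gate(logits, k)
        if label not in sums:
            sums[label] = [0.0] * len(weights)
        sums[label] = [s + w for s, w in zip(sums[label], weights)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


# Hypothetical gate logits for four images routed over three experts.
gate_logits = [
    [2.0, 0.1, -1.0],
    [1.5, 0.3, -0.5],
    [-1.0, 0.2, 2.5],
    [-0.5, 0.0, 1.8],
]
labels = ["animate", "animate", "inanimate", "inanimate"]

shares = routing_share_by_category(gate_logits, labels, k=1)
for category, per_expert in shares.items():
    print(category, per_expert)
```

If the per-category rows of `shares` concentrate on different experts, the gate itself is already separating the categories, which is the kind of gating-level signal the study reports for animate versus inanimate inputs.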

Key facts

Study uses MoE models with contrastive learning on natural images.
Tools from visual neuroscience applied to analyze expert specialization.
Per-expert category separability and tuning measured.
Semantic dimensions from THINGS dataset used for interpretation.
Animate-inanimate distinction dominates expert partitioning.
Stability of expertise-allocation assessed across initializations.
Analysis extends from gating-level to expert-level.
Preprint available on arXiv under ID 2605.20610.
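The per-expert separability measurement listed above can be sketched with a discriminability index (d'), a common measure in visual neuroscience. This is an assumption for illustration: the source does not state which separability metric the authors used, and the expert names and response values below are made up.

```python
import math
from statistics import mean, pvariance


def d_prime(animate, inanimate):
    """Difference of class means divided by the pooled standard deviation."""
    pooled = math.sqrt(0.5 * (pvariance(animate) + pvariance(inanimate)))
    if pooled == 0.0:
        raise ValueError("d' is undefined when both classes have zero variance")
    return (mean(animate) - mean(inanimate)) / pooled


# Hypothetical scalar responses of two experts to animate and inanimate images.
expert_responses = {
    "expert_0": {"animate": [0.9, 1.1, 1.0, 1.2], "inanimate": [0.1, 0.3, 0.2, 0.0]},
    "expert_1": {"animate": [0.2, 0.1, 0.3, 0.2], "inanimate": [1.0, 0.8, 1.2, 1.0]},
}

scores = {
    name: d_prime(r["animate"], r["inanimate"])
    for name, r in expert_responses.items()
}
for name, score in scores.items():
    # Positive values indicate an animate preference, negative an inanimate one.
    print(name, round(score, 2))
```

Repeating such a measurement for models trained from different random seeds would be one simple way to check whether the same split reappears across initializations.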
