Director-Experts: Modular Network for Multi-Modality Medical Vision
Director-Experts (DEX) is a new modular network architecture that addresses the challenge of Non-IID feature statistics in multi-modality medical vision foundation models. DEX regulates specialization and coordination across stacked modules. Each module contains a pool of experts, which specialize in modality-dominant statistics through an image-wise activation strategy, and a director, which distills multi-expert knowledge into a shared space using a group exponential moving average. The approach aims to keep representations from collapsing toward modality-dominant shortcuts, so that modular representations can emerge and support semantic integration across heterogeneous imaging modalities.
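To make the two mechanisms named above more concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the summary does not give the exact formulas, so the sketch assumes that "image-wise activation" means one expert is chosen per image by averaging per-token gating scores, and that the "group exponential moving average" moves the director's weights toward the mean of the expert weights. The names `route_image`, `group_ema_update`, and `momentum` are hypothetical.

```python
from typing import List

Vector = List[float]


def route_image(token_scores: List[List[float]]) -> int:
    """Pick ONE expert for a whole image (assumed reading of "image-wise").

    token_scores[t][e] is a hypothetical gating score of token t for expert e.
    Scores are averaged over tokens, so all tokens of the image share one expert.
    """
    num_tokens = len(token_scores)
    num_experts = len(token_scores[0])
    mean_scores = [
        sum(tok[e] for tok in token_scores) / num_tokens
        for e in range(num_experts)
    ]
    # Index of the expert with the highest image-level score.
    return max(range(num_experts), key=lambda e: mean_scores[e])


def group_ema_update(
    director: Vector, experts: List[Vector], momentum: float = 0.99
) -> Vector:
    """Move director weights toward the mean of the expert group (assumed form)."""
    # Element-wise mean over the group of expert weight vectors.
    group_mean = [sum(column) / len(experts) for column in zip(*experts)]
    # Standard EMA step: keep `momentum` of the old value, blend in the rest.
    return [
        momentum * d + (1.0 - momentum) * g
        for d, g in zip(director, group_mean)
    ]


# Three tokens, two experts: expert 1 has the higher average score.
scores = [[0.1, 0.9], [0.3, 0.7], [0.6, 0.4]]
chosen = route_image(scores)

# Two experts with two weights each; the director starts at zero.
director = group_ema_update([0.0, 0.0], [[1.0, 2.0], [3.0, 4.0]], momentum=0.5)
```

Under these assumptions, routing at the image level keeps every token of an image on the same expert, while the director is never trained directly on a single modality and only follows the experts' averaged weights.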
Key facts
- arXiv:2605.21861
- Multi-modality medical vision foundation models face Non-IID feature statistics
- Monolithic self-supervised optimization induces conflicting gradients
- Representations collapse toward modality-dominant shortcuts
- Director-Experts (DEX) is a modular network
- Each DEX module has a pool of experts and a director
- Image-wise activation strategy for expert specialization
- Group exponential moving average for director update
Entities
Institutions
- arXiv