Federated Multi-Label Prompt Tuning for Vision-Language Models
A new paper on arXiv (2605.28347) introduces FedMPT, the first method specifically designed for federated multi-label recognition (MLR) using vision-language models (VLMs). The approach targets a specific failure mode: when VLMs are adapted to private, heterogeneous client data in decentralized settings, they tend to overfit to spurious label correlations. FedMPT applies a causal model with front-door adjustment, decoupling the MLR process through intermediate variables that amplify oracle label co-occurrence. This steers the model toward generalizable conditions and reduces erroneous label activations.
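For readers unfamiliar with front-door adjustment, the sketch below illustrates the textbook identity P(y | do(x)) = sum_m P(m | x) * sum_x' P(y | m, x') * P(x') on a toy discrete model. It is not FedMPT's implementation: the binary variables, the probability tables, and the reading of X as the input, M as an intermediate variable, Y as a label, and U as a hidden source of spurious correlation are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical toy model with binary variables (all numbers are made up).
# U: unobserved confounder, X: input, M: intermediate variable, Y: label.
P_U = [0.5, 0.5]
P_X_GIVEN_U = [[0.9, 0.1], [0.2, 0.8]]        # indexed [u][x]
P_M_GIVEN_X = [[0.8, 0.2], [0.3, 0.7]]        # indexed [x][m]
P_Y_GIVEN_MU = [[[0.9, 0.1], [0.5, 0.5]],     # indexed [m][u][y]
                [[0.4, 0.6], [0.1, 0.9]]]


def observed_joint():
    """Return P(x, m, y) with the confounder U marginalised out."""
    joint = {}
    for x, m, y in product(range(2), repeat=3):
        joint[x, m, y] = sum(
            P_U[u] * P_X_GIVEN_U[u][x] * P_M_GIVEN_X[x][m] * P_Y_GIVEN_MU[m][u][y]
            for u in range(2)
        )
    return joint


def front_door(joint, x, y):
    """Estimate P(y | do(x)) using only the observational table P(x, m, y)."""
    p_x = [sum(joint[a, m, b] for m in range(2) for b in range(2)) for a in range(2)]
    p_xm = {(a, m): joint[a, m, 0] + joint[a, m, 1]
            for a in range(2) for m in range(2)}
    total = 0.0
    for m in range(2):
        p_m_given_x = p_xm[x, m] / p_x[x]
        # Inner sum over x': P(y | m, x') * P(x')
        inner = sum(joint[a, m, y] / p_xm[a, m] * p_x[a] for a in range(2))
        total += p_m_given_x * inner
    return total


def true_effect(x, y):
    """Ground-truth P(y | do(x)), computed with access to the hidden U."""
    return sum(P_M_GIVEN_X[x][m] * P_Y_GIVEN_MU[m][u][y] * P_U[u]
               for m in range(2) for u in range(2))


def naive_conditional(joint, x, y):
    """Plain P(y | x), which absorbs the spurious path through U."""
    p_x = sum(joint[x, m, b] for m in range(2) for b in range(2))
    return sum(joint[x, m, y] for m in range(2)) / p_x


if __name__ == "__main__":
    joint = observed_joint()
    for x in range(2):
        print(x, front_door(joint, x, 1), true_effect(x, 1),
              naive_conditional(joint, x, 1))
```

In this toy setup the front-door estimate matches the interventional ground truth even though U is never observed, while the plain conditional P(y | x) does not. That gap is the kind of spurious correlation the adjustment is meant to remove.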
Key facts
- Paper ID: arXiv:2605.28347
- Title: FedMPT: Federated Multi-label Prompt Tuning of Vision-Language Models
- First method specifically designed for federated multi-label recognition (MLR) with vision-language models
- Uses causal model with front-door adjustment
- Addresses overfitting to spurious label correlations
- Decouples MLR via intermediate variables
- Amplifies oracle label co-occurrence
- Focuses on decentralized, heterogeneous client data
Entities
Institutions
- arXiv