Flow Matching Method for Few-Shot Vision-Language Adaptation
A new arXiv paper (arXiv:2605.05054) critiques existing flow matching (FM) methods for few-shot adaptation of vision-language models. Analyzing these methods from a polar decomposition perspective, which separates features into radial and angular components, the authors identify three limitations: coupling between the radial and angular components distorts the angular dynamics; feature normalization discards modality confidence, so the radial dynamics are neglected; and generation is unconditional and therefore context-agnostic. The authors propose Direct Product Flow Matching, which decouples the radial and angular dynamics, with the aim of improving adaptation performance.
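The radial-angular coupling described above can be illustrated with a small NumPy sketch. This is not the paper's Direct Product Flow Matching, whose details are not given in this summary. It only assumes that the polar decomposition splits a feature vector into its L2 norm (radius) and unit direction, and it compares a straight-line path with a hypothetical decoupled path that interpolates the radius linearly and the direction along a great circle. All function names are illustrative.

```python
import numpy as np


def polar_decompose(x, eps=1e-12):
    """Split a feature vector into its radius (L2 norm) and unit direction."""
    r = np.linalg.norm(x)
    u = x / max(r, eps)
    return r, u


def linear_path(x0, x1, t):
    """Straight-line interpolation in Euclidean space."""
    return (1.0 - t) * x0 + t * x1


def decoupled_path(x0, x1, t, eps=1e-8):
    """Illustrative decoupled path (an assumption, not the paper's method):
    the radius is interpolated linearly, the direction along a great circle."""
    r0, u0 = polar_decompose(x0)
    r1, u1 = polar_decompose(x1)
    omega = np.arccos(np.clip(np.dot(u0, u1), -1.0, 1.0))
    sin_omega = np.sin(omega)
    if sin_omega < eps:
        # Nearly parallel directions: fall back to a renormalized straight line.
        # Antipodal directions are not handled by this sketch.
        _, u = polar_decompose(linear_path(u0, u1, t))
    else:
        # Spherical linear interpolation (slerp) between the unit directions.
        u = (np.sin((1.0 - t) * omega) * u0 + np.sin(t * omega) * u1) / sin_omega
    r = (1.0 - t) * r0 + t * r1
    return r * u


def angle_between(a, b):
    """Angle in degrees between two nonzero vectors."""
    _, ua = polar_decompose(a)
    _, ub = polar_decompose(b)
    return np.degrees(np.arccos(np.clip(np.dot(ua, ub), -1.0, 1.0)))


if __name__ == "__main__":
    x0 = np.array([2.0, 0.0])
    x1 = np.array([0.0, 2.0])
    for t in (0.25, 0.5):
        lin = linear_path(x0, x1, t)
        dec = decoupled_path(x0, x1, t)
        print(t, np.linalg.norm(lin), angle_between(x0, lin),
              np.linalg.norm(dec), angle_between(x0, dec))
```

For two orthogonal vectors of norm 2, the straight-line midpoint has norm √2 ≈ 1.414 even though both endpoints have norm 2, and at t = 0.25 it has rotated by less than a quarter of the total angle. The decoupled path keeps the norm at 2 and rotates at a constant rate, which is one way to see why a single Euclidean path ties the radial and angular behavior together.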
Key facts
- Paper arXiv:2605.05054
- Announce type: cross (cross-listed on arXiv)
- Critiques existing flow matching methods
- Uses polar decomposition perspective
- Identifies three limitations: angular dynamics distortion, radial dynamics neglect, context-agnostic unconditional generation
- Proposes Direct Product Flow Matching
- Aims to improve few-shot adaptation of vision-language models
Entities
Institutions
- arXiv