PermFlow: A Flow Matching Framework for Multimodal Permutation Learning
PermFlow is a conditional flow matching framework that learns permutations and addresses the collapse of multimodal distributions on ambiguous inputs. Traditional differentiable techniques rely on entropy-regularized Sinkhorn methods, which yield a single softened output. PermFlow instead operates directly on the affine subspace of matrices whose row and column sums all equal one. A closed-form tangent-space projector preserves these constraints exactly along each trajectory, with no iterative correction. A nearest-target coupling routes distinct noisy initializations to distinct valid permutations, which lets the model capture multimodal permutation distributions. On visual sorting with blended-digit ambiguity and on symmetric linear assignment, PermFlow achieves high accuracy on unambiguous inputs and recovers both valid permutations under ambiguity. The full framework is described in arXiv:2605.16755.
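The constraint-preserving idea can be illustrated with a small NumPy sketch. It assumes the tangent space is the set of square matrices whose rows and columns each sum to zero, for which the orthogonal projector has the standard closed form of double centering. The paper may state its projector differently. The function names (`project_tangent`, `sample_start`, `euler_integrate`), the Euler integrator, and the random stand-in for a learned velocity field are all hypothetical and used only for illustration.

```python
import numpy as np


def project_tangent(v: np.ndarray) -> np.ndarray:
    """Project a square matrix onto the zero-row-sum, zero-column-sum subspace.

    This is the standard double-centering formula (an assumption about the
    form of the projector, not taken from the paper).
    """
    row_mean = v.mean(axis=1, keepdims=True)  # shape (n, 1)
    col_mean = v.mean(axis=0, keepdims=True)  # shape (1, n)
    return v - row_mean - col_mean + v.mean()


def sample_start(n: int, rng: np.random.Generator) -> np.ndarray:
    """Hypothetical noisy start: uniform matrix plus projected Gaussian noise.

    The uniform matrix has unit row and column sums, and the projected noise
    has zero row and column sums, so the result lies in the affine subspace.
    """
    return np.full((n, n), 1.0 / n) + project_tangent(rng.normal(size=(n, n)))


def euler_integrate(x0, velocity_fn, steps=10):
    """Euler steps with a projected velocity, so row/column sums stay at one."""
    x = x0.copy()
    dt = 1.0 / steps
    for k in range(steps):
        x = x + dt * project_tangent(velocity_fn(x, k * dt))
    return x


rng = np.random.default_rng(0)
start = sample_start(4, rng)
# A random velocity stands in for a trained network here.
end = euler_integrate(start, lambda x, t: rng.normal(size=x.shape))
print(end.sum(axis=0), end.sum(axis=1))
```

Because every update direction is projected before it is applied, the row and column sums are preserved at each step up to floating-point error, and no Sinkhorn-style renormalization loop is needed in this sketch.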
Key facts
- PermFlow is a conditional flow matching framework for learning permutations.
- It addresses multimodal permutation distributions under ambiguity.
- Existing differentiable methods based on entropy-regularized Sinkhorn collapse to a single softened output under ambiguity.
- PermFlow operates on the affine subspace of matrices with unit row and column sums.
- A closed-form tangent-space projector preserves constraints exactly.
- Nearest-target coupling routes distinct initializations to distinct permutations.
- Tested on visual sorting with blended-digit ambiguity and symmetric linear assignment.
- Achieves high accuracy on unambiguous inputs and recovers both valid permutations under ambiguity.
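The nearest-target coupling in the facts above can be sketched as follows. This is a minimal illustration under stated assumptions: "nearest" is taken to mean smallest Frobenius distance, and the training pair uses the straight-line interpolation common in conditional flow matching. The source specifies neither choice. The function names `nearest_target` and `flow_matching_pair` are hypothetical.

```python
import numpy as np


def nearest_target(x0: np.ndarray, targets: list) -> int:
    """Index of the target permutation matrix closest to x0.

    Distance is the Frobenius norm (an assumption; the source does not
    name the metric).
    """
    dists = [np.linalg.norm(x0 - p) for p in targets]
    return int(np.argmin(dists))


def flow_matching_pair(x0: np.ndarray, targets: list, t: float):
    """Interpolated point and regression target for one training sample.

    Uses a straight-line path from x0 to its nearest target (assumed).
    """
    x1 = targets[nearest_target(x0, targets)]
    xt = (1.0 - t) * x0 + t * x1
    return xt, x1 - x0


# Two equally valid permutations for an ambiguous 2-element input.
identity = np.eye(2)
swap = np.eye(2)[::-1]
targets = [identity, swap]

near_identity = np.array([[0.9, 0.1], [0.1, 0.9]])
near_swap = np.array([[0.2, 0.8], [0.8, 0.2]])

print(nearest_target(near_identity, targets))
print(nearest_target(near_swap, targets))
```

In this toy case the two starting points are paired with different targets, so neither permutation is averaged away during training. When the start and the target both have unit row and column sums, their difference has zero row and column sums, so the regression target in this sketch already lies in the tangent space.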
Entities
Institutions
- arXiv