Training-Free Flow Matching Boosts Diversity Without Losing Quality

ai-technology · 2026-05-22

A new training-free, inference-time control mechanism improves diversity in flow-based text-to-image models while maintaining image quality. The method encourages lateral spread among trajectories through a feature-space objective and reintroduces uncertainty through a time-scheduled stochastic perturbation. The perturbation is projected to be orthogonal to the generation flow, a geometric constraint that lets variation grow without degrading image details or prompt fidelity. Theoretically, the design monotonically increases a volume measure, which underpins the gain in diversity. The mechanism targets a known limitation of flow-based models: their trajectories are deterministic, so exploring diverse modes is expensive under a limited sampling budget. Existing approaches typically require retraining or sacrifice quality. The work is detailed in arXiv:2510.09060v2.
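To make the geometric constraint concrete, here is a minimal NumPy sketch of one sampling step in which Gaussian noise has its component along the flow direction removed before it is added. It is an illustration under stated assumptions, not the paper's implementation: the function names, the linear schedule in `noise_scale`, the `sigma_max` value, and the Euler-style update are all hypothetical choices made for this example.

```python
import numpy as np


def orthogonal_noise(velocity, rng, scale):
    """Sample Gaussian noise and remove its component along `velocity`."""
    noise = rng.standard_normal(velocity.shape)
    v = velocity.reshape(-1)
    n = noise.reshape(-1)
    denom = float(v @ v)
    if denom > 0.0:
        # Subtract the projection of the noise onto the flow direction,
        # so the remainder is orthogonal to the velocity.
        n = n - (n @ v) / denom * v
    return scale * n.reshape(velocity.shape)


def noise_scale(t, sigma_max=0.1):
    """Hypothetical time schedule: largest at t=0, zero at t=1."""
    return sigma_max * (1.0 - t)


def perturbed_euler_step(x, velocity, t, dt, rng):
    """One Euler step along the flow plus orthogonal, time-scheduled noise.

    `velocity` stands in for the model's predicted velocity at (x, t);
    calling the model is outside the scope of this sketch.
    """
    kick = orthogonal_noise(velocity, rng, noise_scale(t))
    return x + dt * velocity + np.sqrt(dt) * kick
```

Because the perturbation has no component along the velocity, it moves samples sideways relative to the flow rather than along it. That matches the intuition in the summary: variation is added laterally while progress along the generation direction is left alone.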

Key facts

Training-free inference-time control mechanism for flow-based text-to-image models
Enhances diversity without degrading image fidelity
Encourages lateral spread among trajectories via a feature-space objective
Reintroduces uncertainty through time-scheduled stochastic perturbation
Perturbation is projected orthogonal to the generation flow
Geometric constraint preserves image details and prompt fidelity
Shown theoretically to monotonically increase a volume measure
Addresses limitation of deterministic trajectories in flow-based models
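The summary does not say which volume measure is used. As one possible illustration of the idea, the sketch below scores a batch of feature vectors by the log-determinant of their Gram matrix, a common way to quantify how much volume a set of vectors spans. The choice of measure, the function name `log_volume`, and the `eps` regularizer are assumptions made for this example and may differ from the paper.

```python
import numpy as np


def log_volume(features, eps=1e-6):
    """Log-determinant of the regularized Gram matrix of row vectors.

    Assumed stand-in for a "volume measure": larger values mean the
    feature vectors span more volume, i.e. the samples are more spread out.
    """
    f = np.asarray(features, dtype=np.float64)
    # eps * I keeps the matrix positive definite when rows are nearly parallel.
    gram = f @ f.T + eps * np.eye(f.shape[0])
    _, logdet = np.linalg.slogdet(gram)
    return float(logdet)


rng = np.random.default_rng(0)
spread = rng.standard_normal((4, 16))                           # distinct samples
collapsed = spread[:1] + 0.01 * rng.standard_normal((4, 16))    # near-duplicates
```

Under this measure, `log_volume(spread)` should come out larger than `log_volume(collapsed)`, since near-duplicate samples span almost no volume. An objective that pushes such a score upward would push trajectories apart in feature space.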
