RLFSeg: Rectified Flow for Text-Based Segmentation
RLFSeg is a new framework that applies Rectified Flow to text-based image segmentation and reports better results than diffusion-based methods. Text-based segmentation delineates object boundaries from a text prompt, which makes it more flexible than segmentation over a fixed set of categories. Earlier methods use diffusion models as feature extractors and therefore inherit generative behavior that is harmful for a discriminative task. RLFSeg instead learns a direct mapping from the image to the segmentation mask in latent space, which avoids the noise-denoise process and the need for time step optimization. The method is reported to perform substantially better, especially on zero-shot tasks. The paper is available on arXiv as 2605.04590.
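To make the idea of a direct latent-space mapping more concrete, here is a minimal sketch of the generic Rectified Flow recipe, assuming the source of the flow is the image latent and the target is the mask latent. It is not the authors' implementation: the function names, the `text_emb` argument, the latent shapes, and the oracle velocity function are all hypothetical, and a real system would replace the oracle with a trained, text-conditioned network.

```python
import numpy as np

def interpolate(z_image, z_mask, t):
    """Point on the straight path from the image latent (t=0) to the mask latent (t=1)."""
    return (1.0 - t) * z_image + t * z_mask

def target_velocity(z_image, z_mask):
    """Rectified Flow regression target: the constant velocity of the straight path."""
    return z_mask - z_image

def flow_matching_loss(velocity_fn, z_image, z_mask, text_emb, t):
    """Mean squared error between predicted and target velocity at time t."""
    z_t = interpolate(z_image, z_mask, t)
    pred = velocity_fn(z_t, t, text_emb)
    return float(np.mean((pred - target_velocity(z_image, z_mask)) ** 2))

def euler_sample(velocity_fn, z_image, text_emb, num_steps=4):
    """Integrate the learned velocity field from the image latent toward a mask latent."""
    z = z_image.copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        z = z + dt * velocity_fn(z, t, text_emb)
    return z

# Toy data standing in for encoder outputs (shapes are assumptions).
rng = np.random.default_rng(0)
z_image = rng.normal(size=(4, 8, 8))   # hypothetical image latent
z_mask = rng.normal(size=(4, 8, 8))    # hypothetical mask latent
text_emb = rng.normal(size=(16,))      # hypothetical text-prompt embedding

# Oracle velocity: what a perfectly trained network would output for this pair.
def oracle(z_t, t, text_emb):
    return z_mask - z_image

loss = flow_matching_loss(oracle, z_image, z_mask, text_emb, t=0.3)
z_pred = euler_sample(oracle, z_image, text_emb, num_steps=1)
```

In this sketch the path starts at the image latent rather than at Gaussian noise, so no noise is added and later removed. Because the ideal path is a straight line, a single Euler step already reaches the target when the velocity is exact. Whether RLFSeg uses exactly this parameterization is an assumption here.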
Key facts
- RLFSeg uses Rectified Flow for text-based segmentation.
- It outperforms previous diffusion-based methods.
- Text-based segmentation offers higher flexibility than fixed-category tasks.
- Diffusion models used as feature extractors carry generative behavior that is harmful for discriminative tasks.
- RLFSeg learns a direct mapping from image to segmentation mask in latent space.
- It avoids the noise-denoise process and time step optimization.
- Performance is substantially better, especially on zero-shot tasks.
- The paper is arXiv:2605.04590.
Entities
Institutions
- arXiv