ARTFEED — Contemporary Art Intelligence

FullFlow: Upgrading Text-to-Image Models for Bidirectional Generation

ai-technology · 2026-05-22

FullFlow is a parameter-efficient method that upgrades a pretrained rectified-flow text-to-image model into a bidirectional vision-language generator. It trains only LoRA adapters and lightweight text heads. Images stay in the model's continuous flow, while text is generated through a discrete insertion process. Because images and text have separate timesteps, a single backbone supports text-to-image, image-to-text, joint sampling, and partial-text prediction.
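One way to picture how separate timesteps could select a generation mode is the small Python sketch below. It is illustrative only and not taken from FullFlow: the convention that t = 0 means "fully given" and t = 1 means "fully unknown", the `Timesteps` class, and the `generation_mode` function are all assumptions made for this example.

```python
from dataclasses import dataclass

# Assumed convention (not from the source): t = 0 means the modality is
# fully given, t = 1 means it is fully unknown and must be generated.
GIVEN = 0.0
UNKNOWN = 1.0


@dataclass(frozen=True)
class Timesteps:
    """Hypothetical pair of per-modality timesteps."""
    t_img: float
    t_txt: float


def generation_mode(ts: Timesteps) -> str:
    """Name the mode implied by a pair of timesteps under the assumed convention."""
    for t in (ts.t_img, ts.t_txt):
        if not 0.0 <= t <= 1.0:
            raise ValueError("timesteps must lie in [0, 1]")

    img_given = ts.t_img == GIVEN
    txt_given = ts.t_txt == GIVEN

    if img_given and txt_given:
        return "nothing to generate"
    if txt_given:
        # Text is the condition, the image is produced.
        return "text-to-image"
    if img_given:
        # Image is the condition; text is either fully or partly missing.
        if ts.t_txt == UNKNOWN:
            return "image-to-text"
        return "partial-text prediction"
    # Neither modality is given, so both are sampled together.
    return "joint sampling"


if __name__ == "__main__":
    for pair in [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.0, 0.5)]:
        print(pair, "->", generation_mode(Timesteps(*pair)))
```

The point of the sketch is only that two independent timesteps give a two-dimensional space of conditioning states, so one network can cover several tasks without separate models.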

Key facts

  • FullFlow upgrades text-to-image models to bidirectional vision-language generation.
  • It uses LoRA adapters and lightweight text heads.
  • Images remain in continuous flow; text is added via discrete insertion.
  • Separate timesteps for image and text enable multiple generation modes.
  • The method is parameter-efficient, avoiding large-scale retraining.
  • It works with rectified-flow text-to-image models.
  • The approach preserves the strong image prior of the original model.
  • FullFlow enables text-to-image, image-to-text, joint sampling, and partial-text prediction.
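
The parameter-efficiency claim can be made concrete with a generic LoRA sketch in plain Python. This shows the standard low-rank update W + (alpha / r) * B A, not FullFlow's actual configuration: the layer size, rank, and alpha below are hypothetical, and the helper names are invented for this example.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, the standard LoRA-adapted weight.

    w: d_out x d_in frozen weight, b: d_out x r, a: r x d_in.
    """
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def lora_trainable_params(d_out, d_in, r):
    """Number of trainable values in the two low-rank factors."""
    return r * (d_in + d_out)


if __name__ == "__main__":
    random.seed(0)
    d_out, d_in, r = 4, 6, 2  # tiny hypothetical layer
    w = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
    a = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(r)]
    b = [[0.0] * r for _ in range(d_out)]  # usual LoRA init: B starts at zero

    # With B = 0 the adapted weight equals the frozen weight, so the
    # pretrained behaviour is unchanged before any training step.
    print(lora_weight(w, a, b, alpha=4.0) == w)

    # Hypothetical 1024 x 1024 layer at rank 8: low-rank vs. full update size.
    print(lora_trainable_params(1024, 1024, 8), "vs", 1024 * 1024)
```

Because the low-rank factors start from a zero update and the original weights stay frozen, an adapter of this kind begins from the pretrained model's behaviour, which is consistent with the article's point about preserving the image prior.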
