VAGS: Velocity-Adaptive Guidance Scale for Image Editing and Generation
Velocity-Adaptive Guidance Scale (VAGS) is a training-free modification of classifier-free guidance (CFG) in diffusion models that varies the guidance strength along the ODE trajectory. Standard CFG applies one fixed scale at every step, even though early steps are noise-dominated and carry weak semantic signal, while late steps commit image structure and call for stronger directional commitment. VAGS instead multiplies the nominal scale by a bounded factor that combines a temporal signal-level term with the cosine similarity between task-relevant velocity fields. For inversion-free editing, it measures the alignment between the source-guided and target-guided velocities, so the edit strength at each step reflects how compatible the two fields are locally. VAGS is proposed as a replacement for fixed guidance scales in flow-based samplers. The method is described in arXiv paper 2605.15661.
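To make the idea of a bounded, step-dependent scale concrete, here is a minimal NumPy sketch for the generation case. It is not the paper's formula, which this summary does not give. The names `adaptive_scale` and `guided_velocity`, the bounds `lo` and `hi`, the use of `t` itself as the signal-level term, the choice of conditional and unconditional velocities as the compared fields, and the product of the two terms are all assumptions made for illustration.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two arrays, flattened to vectors."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def adaptive_scale(w, t, v_cond, v_uncond, lo=0.5, hi=1.5):
    """Return the nominal scale w times a factor bounded in [lo, hi].

    ASSUMPTIONS (not taken from the paper): t lies in [0, 1] with 0 = pure
    noise and 1 = clean image; the signal-level term is t itself; alignment
    is the cosine similarity rescaled to [0, 1]; the two terms are multiplied.
    """
    signal = float(np.clip(t, 0.0, 1.0))
    cos = cosine_similarity(v_cond, v_uncond)
    align = float(np.clip(0.5 * (1.0 + cos), 0.0, 1.0))
    factor = lo + (hi - lo) * signal * align
    return w * factor


def guided_velocity(w, t, v_cond, v_uncond):
    """Standard CFG combination, with the adaptive scale replacing the constant."""
    s = adaptive_scale(w, t, v_cond, v_uncond)
    return v_uncond + s * (v_cond - v_uncond)
```

Under these assumptions the effective scale stays between `lo * w` and `hi * w`: it is weakest at the noisy start of the trajectory and strongest late in sampling when the two velocity fields agree.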
Key facts
- VAGS is a training-free replacement for fixed-scale classifier-free guidance.
- It multiplies the nominal scale by a bounded factor that combines a temporal signal-level term with a cosine-similarity term.
- Standard CFG uses a fixed scale across the entire ODE trajectory.
- Early steps are noise-dominated with weak semantic signal.
- Late steps commit image structure and demand stronger directional commitment.
- VAGS measures alignment between source- and target-guided velocities for inversion-free editing.
- Edit strength at each step reflects local compatibility between velocity fields.
- The paper is available on arXiv with ID 2605.15661.
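The editing facts above can be illustrated with a single Euler step. This is a hedged sketch and not the paper's update rule: it assumes an inversion-free scheme in which the edited latent moves along the difference between the target-guided and source-guided velocities, and it assumes that higher alignment between the two fields permits a stronger edit. The function names, the `strength` parameter, and the bounds `lo` and `hi` are hypothetical.

```python
import numpy as np


def alignment(v_src, v_tgt, eps=1e-8):
    """Cosine similarity of the two guided velocities, rescaled to [0, 1]."""
    a = np.asarray(v_src, dtype=np.float64).ravel()
    b = np.asarray(v_tgt, dtype=np.float64).ravel()
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    return float(np.clip(0.5 * (1.0 + cos), 0.0, 1.0))


def edit_step(z_edit, v_src, v_tgt, t, dt, strength=1.0, lo=0.5, hi=1.5):
    """One Euler step of a hypothetical inversion-free edit.

    ASSUMPTIONS (not taken from the paper): the edit direction is
    v_tgt - v_src; t lies in [0, 1] with 0 = pure noise; the per-step
    factor is bounded in [lo, hi] and grows with t and with alignment.
    Returns the updated latent and the factor that was applied.
    """
    signal = float(np.clip(t, 0.0, 1.0))
    factor = lo + (hi - lo) * signal * alignment(v_src, v_tgt)
    z_next = z_edit + dt * strength * factor * (v_tgt - v_src)
    return z_next, factor
```

For example, with orthogonal velocities at `t = 1.0` the alignment term is 0.5 and the factor is 1.0, so the step reduces to a plain Euler update along `v_tgt - v_src`.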
Entities
Institutions
- arXiv