NoiseShift: Training-Free Method Improves Low-Resolution Text-to-Image Generation
A new method called NoiseShift addresses the degradation of text-to-image diffusion models when generating at lower resolutions. Prior work focused on high-resolution generation, but NoiseShift targets low-resolution inference to reduce computational cost. The key insight is that the same scheduled noise level can correspond to different perceptual corruption at lower resolutions, causing a train-test mismatch. NoiseShift recalibrates the noise conditioning by re-indexing the denoiser's noise embedding without changing the sampling schedule or requiring additional training. This restores local forward process alignment and improves image quality at reduced resolutions.
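The recalibration described above can be sketched as a sampling loop in which the step sizes come from the original schedule while the value passed to the denoiser as conditioning is looked up separately. This is a minimal illustration under stated assumptions, not the authors' implementation: the Euler-style velocity update, the function name `sample_with_recalibrated_conditioning`, and the `cond_sigmas` table are all hypothetical, and the summary does not say how the re-indexed values are obtained.

```python
from typing import Callable, List, Sequence

# A denoiser takes the current sample and a conditioning noise level
# and returns a velocity with the same length as the sample.
Denoiser = Callable[[List[float], float], List[float]]


def sample_with_recalibrated_conditioning(
    denoiser: Denoiser,
    x: Sequence[float],
    sigmas: Sequence[float],
    cond_sigmas: Sequence[float],
) -> List[float]:
    """Euler-style sampling where only the conditioning value is re-indexed.

    sigmas      : the original, unchanged sampling schedule (decreasing).
    cond_sigmas : hypothetical per-step conditioning values, one for each
                  step; how they are calibrated is outside this sketch.
    """
    if len(cond_sigmas) != len(sigmas) - 1:
        raise ValueError("need exactly one conditioning value per step")
    x = list(x)
    for step in range(len(sigmas) - 1):
        sigma, sigma_next = sigmas[step], sigmas[step + 1]
        # The network is told the re-indexed noise level ...
        velocity = denoiser(x, cond_sigmas[step])
        # ... but the update still uses the scheduled step size.
        dt = sigma_next - sigma
        x = [xi + dt * vi for xi, vi in zip(x, velocity)]
    return x


# Toy usage: a stand-in denoiser that records what it was conditioned on.
seen: List[float] = []


def toy_denoiser(sample: List[float], sigma_cond: float) -> List[float]:
    seen.append(sigma_cond)
    return [0.0 for _ in sample]  # zero velocity leaves the sample unchanged


schedule = [1.0, 0.5, 0.0]   # original schedule, left untouched
conditioning = [1.0, 0.6]    # made-up re-indexed values for illustration
out = sample_with_recalibrated_conditioning(
    toy_denoiser, [1.0, -1.0], schedule, conditioning
)
```

In this sketch, setting `cond_sigmas` equal to `sigmas[:-1]` recovers ordinary sampling, which makes clear that the only thing changed is what the denoiser is told, not the trajectory of noise levels.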
Key facts
- NoiseShift is a training-free recalibration method for low-resolution image generation.
- It addresses train-test mismatch in noise conditioning for diffusion models.
- The method keeps the original noise sampling schedule unchanged.
- It re-indexes the noise conditioning of the denoiser.
- It targets lower-resolution generation to cut computational cost.
- Prior work emphasized higher-resolution generation.
- The same scheduled noise level can correspond to different perceptual corruption at lower resolutions.
- NoiseShift restores local forward process alignment.
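One way to build intuition for why the same noise level can mean different corruption at different resolutions is to look at what happens to pixel noise under downsampling. The sketch below is an illustration chosen here, not an analysis taken from the source: it assumes i.i.d. Gaussian pixel noise and average pooling as the downsampling operator. Averaging non-overlapping k x k blocks shrinks the noise standard deviation by a factor of k, so a high-resolution image noised at sigma looks, at the low-resolution scale, less noisy than a low-resolution image noised directly at the same sigma.

```python
import random
import statistics
from typing import List

Grid = List[List[float]]


def gaussian_noise(size: int, sigma: float, rng: random.Random) -> Grid:
    """Return a size x size grid of i.i.d. N(0, sigma^2) samples."""
    return [[rng.gauss(0.0, sigma) for _ in range(size)] for _ in range(size)]


def average_pool(grid: Grid, k: int) -> Grid:
    """Downsample by averaging non-overlapping k x k blocks."""
    n = len(grid) // k
    return [
        [
            sum(grid[i * k + di][j * k + dj]
                for di in range(k) for dj in range(k)) / (k * k)
            for j in range(n)
        ]
        for i in range(n)
    ]


def grid_std(grid: Grid) -> float:
    """Population standard deviation over all entries of the grid."""
    return statistics.pstdev([v for row in grid for v in row])


rng = random.Random(0)  # fixed seed so the run is repeatable
sigma = 0.5

# Noise added at 128 x 128, then viewed at 64 x 64 via 2 x 2 averaging.
std_pooled = grid_std(average_pool(gaussian_noise(128, sigma, rng), 2))

# Noise added directly at 64 x 64 with the same nominal sigma.
std_direct = grid_std(gaussian_noise(64, sigma, rng))

print(f"pooled: {std_pooled:.3f}  direct: {std_direct:.3f}")
```

Under these assumptions `std_pooled` should come out close to `sigma / 2` and `std_direct` close to `sigma`, which is one concrete sense in which a single scheduled noise level is not resolution-neutral.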