StitchVM: Aligning Diffusion Models via Stitched Value Functions
StitchVM is a model-stitching framework that transfers reward models trained on clean images into noisy latent space for diffusion alignment. It addresses a mismatch that arises when aligning diffusion-based generative models with task-specific rewards: rewards are defined on clean outputs, but alignment requires value estimates at noisy intermediate latents. Existing approaches trade cost against accuracy. Tweedie estimates are efficient but biased, while Monte Carlo estimates are more accurate but computationally expensive. StitchVM starts from a pretrained reward model and adapts it to operate on noisy latents. The paper is available on arXiv as 2605.19804.
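The trade-off between the two existing estimators can be illustrated with a toy example. The sketch below is not StitchVM and is not taken from the paper. It assumes a hypothetical 1-D setting in which the clean sample is x0 ~ N(0, 1), the noisy latent is x_t = x0 + sigma * eps, and the reward is r(x0) = x0^2, chosen only because the posterior and the exact value are then available in closed form. All function names are illustrative.

```python
import random


def posterior_params(x_t, sigma):
    """Mean and variance of x0 | x_t in the toy model (assumption):
    x0 ~ N(0, 1), x_t = x0 + sigma * eps, eps ~ N(0, 1)."""
    mean = x_t / (1.0 + sigma**2)
    var = sigma**2 / (1.0 + sigma**2)
    return mean, var


def reward(x0):
    """Toy reward defined on clean samples only (assumption)."""
    return x0**2


def tweedie_value(x_t, sigma):
    """Tweedie-style estimate: apply the reward to the posterior mean.

    Tweedie's formula gives E[x0 | x_t] = x_t + sigma^2 * score(x_t),
    where score is the gradient of the log marginal density of x_t.
    Here the marginal is N(0, 1 + sigma^2), so the score is analytic.
    """
    score = -x_t / (1.0 + sigma**2)
    x0_hat = x_t + sigma**2 * score
    return reward(x0_hat)


def monte_carlo_value(x_t, sigma, n_samples, rng):
    """Monte Carlo estimate of E[r(x0) | x_t]: average the reward over
    posterior samples. In a real diffusion model each sample would need
    a full denoising rollout, which is where the cost comes from."""
    mean, var = posterior_params(x_t, sigma)
    std = var**0.5
    total = 0.0
    for _ in range(n_samples):
        total += reward(rng.gauss(mean, std))
    return total / n_samples


def exact_value(x_t, sigma):
    """Closed-form E[x0^2 | x_t] = mean^2 + var for the toy model."""
    mean, var = posterior_params(x_t, sigma)
    return mean**2 + var


if __name__ == "__main__":
    rng = random.Random(0)
    x_t, sigma = 1.0, 1.0
    print("exact      :", exact_value(x_t, sigma))    # 0.75
    print("Tweedie    :", tweedie_value(x_t, sigma))  # 0.25
    print("Monte Carlo:", monte_carlo_value(x_t, sigma, 20000, rng))
```

In this toy case the Tweedie-style estimate misses the posterior variance term entirely, so its error does not shrink with more compute, whereas the Monte Carlo estimate converges to the exact value only as the number of samples grows.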
Key facts
- StitchVM is a model stitching framework for diffusion alignment.
- It transfers reward models from clean images to noisy latents.
- Existing methods include Tweedie (biased, efficient) and Monte Carlo (accurate, expensive).
- The paper is on arXiv: 2605.19804.
- The approach addresses reward alignment for diffusion models.
- It uses pretrained reward models as a starting point.
- The method aims to reduce computational cost while maintaining accuracy.
- The framework is designed for task-specific rewards like prompt fidelity.
Entities
Institutions
- arXiv