ZeNO: Gradient-Free Noise Optimization for Generative Model Alignment
A new method called ZeNO (Zeroth-order Noise Optimization) enables reward alignment in generative models without backpropagation. Developed for diffusion and flow models, ZeNO formulates noise optimization as a path-integral control problem that can be solved using only zeroth-order reward evaluations, that is, reward values queried at sampled points rather than reward gradients. When the reference process is instantiated as an Ornstein-Uhlenbeck process, the resulting update connects to Langevin dynamics that implicitly target a reward-tilted distribution. The framework supports inference-time scaling and is reported to perform well across diverse generators and reward functions, including protein structure generation, where backpropagation is infeasible. The paper is available on arXiv under reference 2605.11347.
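For readers unfamiliar with the term, a reward-tilted distribution and its Langevin dynamics are commonly written as below. This is a textbook form given for orientation only, not the paper's own notation: the standard Gaussian prior p_0, the reward r, and the temperature \lambda are assumptions that this summary does not specify.

```latex
\begin{align}
  p^{*}(z) &\propto p_{0}(z)\,\exp\!\big(r(z)/\lambda\big),
  \qquad p_{0} = \mathcal{N}(0, I), \\
  \mathrm{d}z_t &= \Big(-z_t + \tfrac{1}{\lambda}\,\nabla r(z_t)\Big)\,\mathrm{d}t
  + \sqrt{2}\,\mathrm{d}W_t .
\end{align}
```

With r = 0 the second line reduces to an Ornstein-Uhlenbeck process whose stationary law is p_0. A gradient-free method has to replace \nabla r with a quantity built from reward evaluations alone.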
Key facts
- ZeNO is a gradient-free framework for reward alignment in generative models.
- It formulates noise optimization as a path-integral control problem.
- The method uses zeroth-order reward evaluations without backpropagation.
- It instantiates an Ornstein-Uhlenbeck reference process.
- The update connects to Langevin dynamics targeting a reward-tilted distribution.
- ZeNO enables effective inference-time scaling.
- It demonstrates strong performance on protein structure generation.
- The paper is available on arXiv with ID 2605.11347.
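The key facts above can be illustrated with a small toy sketch. It is not the ZeNO algorithm from the paper. It is a generic zeroth-order Langevin update under stated assumptions: a standard Gaussian prior on the noise vector, a temperature `lam`, and an antithetic Gaussian-smoothing estimator standing in for the reward gradient. All function names and parameters (`zo_gradient`, `zo_langevin_step`, `toy_reward`, `sigma`, `num_pairs`) are hypothetical and chosen for illustration.

```python
import math
import random


def zo_gradient(reward, z, sigma, num_pairs, rng):
    """Estimate grad r(z) from reward values only (no backpropagation).

    Uses antithetic Gaussian perturbations z +/- sigma * eps; this is a
    standard smoothing estimator, assumed here for illustration.
    """
    dim = len(z)
    grad = [0.0] * dim
    for _ in range(num_pairs):
        eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        r_plus = reward([zi + sigma * e for zi, e in zip(z, eps)])
        r_minus = reward([zi - sigma * e for zi, e in zip(z, eps)])
        scale = (r_plus - r_minus) / (2.0 * sigma * num_pairs)
        for i in range(dim):
            grad[i] += scale * eps[i]
    return grad


def zo_langevin_step(reward, z, step, lam, sigma, num_pairs, rng):
    """One Euler step of Langevin dynamics for N(0, I) * exp(r(z) / lam).

    The -z drift plus sqrt(2 * step) noise is the Ornstein-Uhlenbeck part;
    the reward term uses the zeroth-order estimate above.
    """
    grad_r = zo_gradient(reward, z, sigma, num_pairs, rng)
    noise_scale = math.sqrt(2.0 * step)
    return [
        zi + step * (-zi + gi / lam) + noise_scale * rng.gauss(0.0, 1.0)
        for zi, gi in zip(z, grad_r)
    ]


def toy_reward(z, target):
    """Hypothetical black-box reward: negative half squared distance to target."""
    return -0.5 * sum((zi - ti) ** 2 for zi, ti in zip(z, target))
```

In a real pipeline the reward would be evaluated on the generator's output for the noise `z`, and the step would be repeated for as many iterations as the inference budget allows. For this quadratic toy reward the tilted target is itself Gaussian, with mean `target / (1 + lam)`, which makes the sketch easy to sanity-check.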
Entities
Institutions
- arXiv