FAV: A General Framework for Aligning Few-Step Generative Models
FAV (Few-step Generative Models Alignment via Sample-based Variational Inference) is a new framework for aligning few-step generative models. It requires only sample access to the generator and the reference distribution. FAV reframes alignment as sampling from a reward-tilted distribution anchored to the reference: it uses Stein Variational Gradient Descent (SVGD) as a sample-based variational inference method to move generated particles toward that distribution, then amortizes the particle updates into the generator's parameters through fixed-point regression. In evaluations on robotics manipulation and image generator alignment, FAV outperformed existing baselines on 56 offline and 30 offline-to-online RL tasks. The paper is available on arXiv.
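For readers unfamiliar with the terminology, the two ingredients named above can be written out as follows. This is a sketch under assumptions, not the paper's own formulation: the summary does not give the exact form of the reward-tilted distribution, so the first line uses the exponential tilting common in alignment work, with a hypothetical reward r, temperature β, and reference density p_ref. The second line is the standard SVGD particle update with kernel k and step size ε. How FAV evaluates the score term with sample access only is not described in this summary.

```latex
% Assumed form of the reward-tilted target (r, \beta are illustrative notation)
p^{*}(x) \;\propto\; p_{\mathrm{ref}}(x)\,\exp\!\big(r(x)/\beta\big)

% Standard SVGD update for particles x_1, \dots, x_n
x_i \leftarrow x_i + \epsilon\,\hat{\phi}(x_i), \qquad
\hat{\phi}(x) = \frac{1}{n}\sum_{j=1}^{n}
  \Big[\, k(x_j, x)\,\nabla_{x_j}\log p^{*}(x_j) + \nabla_{x_j} k(x_j, x) \Big]
```

The first term inside the sum pulls particles toward high-density regions of the target, and the second term pushes them apart so they do not collapse onto a single mode.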
Key facts
- FAV requires only sample access to the generator and reference distribution.
- It uses Stein Variational Gradient Descent for sample-based variational inference.
- Particle updates are amortized into generator parameters via fixed-point regression.
- Evaluated on robotics manipulation and image generator alignment.
- Outperforms baselines on 56 offline and 30 offline-to-online RL tasks.
- Paper available on arXiv with ID 2605.26552.
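The SVGD and fixed-point regression steps listed above can be illustrated with a toy NumPy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the reference is a standard Gaussian with a known score (the paper works from samples only), the reward is a quadratic pulling samples toward a point `m`, the generator is a hypothetical linear map `z @ W + b`, and the regression is solved by ordinary least squares. Under these assumptions the tilted target is a Gaussian with mean `m / 2`.

```python
import numpy as np

def svgd_direction(x, score, h=1.0):
    """SVGD direction for particles x of shape (n, d), RBF kernel with bandwidth h."""
    n = x.shape[0]
    diff = x[:, None, :] - x[None, :, :]          # diff[i, j] = x_i - x_j
    k = np.exp(-np.sum(diff ** 2, axis=-1) / h)   # k[i, j] = exp(-|x_i - x_j|^2 / h)
    attract = k @ score(x)                        # kernel-weighted target scores
    repulse = (2.0 / h) * np.sum(k[:, :, None] * diff, axis=1)  # pushes particles apart
    return (attract + repulse) / n

def generator(z, W, b):
    """Hypothetical one-step generator: a linear map of the noise z."""
    return z @ W + b

def align(m, steps=300, n=128, eps=0.1, seed=0):
    rng = np.random.default_rng(seed)
    d = m.shape[0]
    W, b = np.eye(d), np.zeros(d)                 # start at the N(0, I) reference
    # Toy tilted score: reference score (-x) plus reward gradient (-(x - m)).
    score = lambda x: -x - (x - m)
    for _ in range(steps):
        z = rng.standard_normal((n, d))
        x = generator(z, W, b)
        # Fixed-point target: current samples after one SVGD step, held constant.
        target = x + eps * svgd_direction(x, score)
        # Regress the generator onto the moved particles (least squares in W, b).
        A = np.hstack([z, np.ones((n, 1))])
        theta, *_ = np.linalg.lstsq(A, target, rcond=None)
        W, b = theta[:d], theta[d]
    return W, b

m = np.array([2.0, -1.0])
W, b = align(m)
samples = generator(np.random.default_rng(1).standard_normal((2000, 2)), W, b)
```

After training, `samples.mean(axis=0)` should sit close to `m / 2`, showing that the particle updates have been absorbed into the generator's parameters, so no SVGD iterations are needed at sampling time.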
Entities
Institutions
- arXiv