AutoRubric-T2I: Automated Rubric Learning for Text-to-Image Alignment
Researchers propose AutoRubric-T2I, a framework that automatically synthesizes and selects explicit rubrics to guide Vision-Language Model (VLM) judges in evaluating Text-to-Image (T2I) generation models. Existing reward models are trained on large human preference corpora, and the summary describes them as costly and opaque. AutoRubric-T2I instead generates candidate rubrics from preference pairs and has a VLM judge score images against them, aiming for robust, interpretable alignment with human preferences.
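The rubric selection idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the selection criterion (agreement with human preference labels), the names `PreferencePair`, `rubric_agreement`, `select_rubrics`, and the keyword-matching `toy_judge` that stands in for a real VLM call are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class PreferencePair:
    """One human preference label over two images for the same prompt."""
    prompt: str
    image_a: str          # image identifier (hypothetical; a path or URL in practice)
    image_b: str
    human_prefers_a: bool


# Hypothetical judge signature: (rubric, prompt, image) -> score.
# A real system would call a VLM here; this sketch only fixes the interface.
Judge = Callable[[str, str, str], float]


def rubric_agreement(rubric: str, pairs: Sequence[PreferencePair], judge: Judge) -> float:
    """Fraction of pairs where the judge, guided by `rubric`, picks the human-preferred image."""
    if not pairs:
        return 0.0
    hits = 0
    for pair in pairs:
        score_a = judge(rubric, pair.prompt, pair.image_a)
        score_b = judge(rubric, pair.prompt, pair.image_b)
        if score_a == score_b:
            continue  # a tie is counted as a miss
        if (score_a > score_b) == pair.human_prefers_a:
            hits += 1
    return hits / len(pairs)


def select_rubrics(candidates: Sequence[str], pairs: Sequence[PreferencePair],
                   judge: Judge, top_k: int = 1) -> List[str]:
    """Keep the `top_k` candidate rubrics with the highest agreement (assumed criterion)."""
    ranked = sorted(candidates,
                    key=lambda r: rubric_agreement(r, pairs, judge),
                    reverse=True)
    return ranked[:top_k]


def toy_judge(rubric: str, prompt: str, image: str) -> float:
    """Stand-in for a VLM: scores 1.0 if the rubric's last word appears in the image id."""
    keyword = rubric.split()[-1].lower()
    return 1.0 if keyword in image.lower() else 0.0


pairs = [
    PreferencePair("a red cat", "red_cat.png", "blue_cat.png", human_prefers_a=True),
    PreferencePair("a red car", "green_car.png", "red_car.png", human_prefers_a=False),
]
candidates = [
    "object has color red",
    "object has color blue",
    "image contains a cat",
]
selected = select_rubrics(candidates, pairs, toy_judge, top_k=1)
print(selected)  # ['object has color red']
```

Because each kept rubric is a readable sentence, a reader can see why the judge preferred one image over another, which is the interpretability benefit the summary attributes to the approach.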
Key facts
- AutoRubric-T2I is presented as the first rubric learning framework for T2I
- It synthesizes reasoning traces from preference pairs into candidate rubrics
- Uses a VLM judge to score paired images
- Aims to reduce cost and opacity of existing reward models
- Published on arXiv with ID 2605.17602
- Focuses on aligning T2I models with human preferences
- Addresses limitations of Bradley-Terry preference models
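For context, the standard Bradley-Terry formulation used in reward modeling is given below. The summary does not spell out the paper's own notation, so the symbols are assumptions: $x$ is the prompt, $y_w$ and $y_l$ are the preferred and rejected images, $r_\theta$ is a learned scalar reward, and $\sigma$ is the logistic function.

```latex
P(y_w \succ y_l \mid x)
  = \frac{\exp\big(r_\theta(x, y_w)\big)}{\exp\big(r_\theta(x, y_w)\big) + \exp\big(r_\theta(x, y_l)\big)}
  = \sigma\big(r_\theta(x, y_w) - r_\theta(x, y_l)\big),
\qquad
\mathcal{L}(\theta) = -\log \sigma\big(r_\theta(x, y_w) - r_\theta(x, y_l)\big)
```

Under this model each image is reduced to a single scalar reward, so the fitted model does not by itself state which criteria drove a preference.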
Entities
Institutions
- arXiv