ARTFEED — Contemporary Art Intelligence

OpenVTON-Bench: 100K High-Resolution Benchmark for Virtual Try-On Evaluation

ai-technology · 2026-05-07

OpenVTON-Bench is a new benchmark for evaluating virtual try-on (VTON) systems. It comprises roughly 100,000 high-resolution image pairs at resolutions up to 1536x1536 pixels. Samples are selected with DINOv3-based hierarchical clustering, and Gemini-powered dense captioning is used to ensure a uniform distribution across 20 garment categories. The evaluation protocol scores results along five dimensions: background consistency, identity fidelity, texture fidelity, shape plausibility, and overall realism. It targets a known weakness of traditional metrics, which struggle to quantify texture details and semantic consistency. The benchmark aims to match commercial standards for scale and diversity and is available on arXiv as 2601.22725.
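To make the idea of cluster-based sampling concrete, here is a minimal sketch of drawing an equal number of items from each cluster. It is an illustration only, not the benchmark's actual pipeline: the function name `balanced_sample`, the `per_cluster` parameter, and the toy cluster labels are all hypothetical, and the labels are assumed to have been computed beforehand (for example, by hierarchically clustering image embeddings).

```python
import random
from collections import defaultdict


def balanced_sample(cluster_ids, per_cluster, seed=0):
    """Pick up to `per_cluster` item indices from every cluster.

    cluster_ids[i] is the (hypothetical) cluster label of item i,
    assumed to come from a prior clustering step over image embeddings.
    """
    rng = random.Random(seed)

    # Group item indices by their cluster label.
    members = defaultdict(list)
    for index, cluster in enumerate(cluster_ids):
        members[cluster].append(index)

    # Draw the same number of items from each cluster, or the whole
    # cluster when it holds fewer than `per_cluster` items.
    chosen = []
    for cluster in sorted(members):
        pool = members[cluster]
        k = min(per_cluster, len(pool))
        chosen.extend(rng.sample(pool, k))
    return sorted(chosen)


# Toy data: cluster 0 is heavily over-represented in the raw pool.
cluster_ids = [0] * 8 + [1] * 3 + [2] * 2
picked = balanced_sample(cluster_ids, per_cluster=2)
```

In this toy run the raw pool is dominated by cluster 0, while the sample holds two items from each of the three clusters. Capping the draw per cluster is what keeps over-represented groups from dominating the final set.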

Key facts

  • OpenVTON-Bench includes approximately 100K high-resolution image pairs.
  • Images are up to 1536x1536 pixels.
  • Dataset uses DINOv3-based hierarchical clustering for sampling.
  • Gemini-powered dense captioning ensures uniform distribution across 20 garment categories.
  • Evaluation protocol measures five dimensions: background consistency, identity fidelity, texture fidelity, shape plausibility, overall realism.
  • Addresses limitations of traditional metrics in quantifying texture details and semantic consistency.
  • Aims to meet commercial standards in scale and diversity.
  • Published on arXiv (2601.22725).

Entities

Institutions

  • arXiv
