TempGlitch Benchmark for Temporal Glitch Detection in Games
TempGlitch is a benchmark for assessing how well vision-language models (VLMs) identify temporal glitches in gameplay videos. Most current evaluations treat glitches as static visual anomalies observable in a single frame; TempGlitch instead targets glitches that become apparent only through changes across ordered frames. A preliminary study indicates that VLMs find temporal glitches substantially harder to detect than spatial ones. The benchmark covers five types of temporal glitches with balanced samples per category, alongside glitch-free videos for reliable binary evaluation. The study, available on arXiv (2605.21443v1), evaluates 12 proprietary models and addresses a previously overlooked area in AI-driven video game quality assurance.
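To make the spatial/temporal distinction concrete, here is a toy sketch that is not taken from the benchmark or from any evaluated model. It assumes a hypothetical tracked object whose position is known in every frame, and a hypothetical "teleport" glitch (not claimed to be one of the five TempGlitch categories). Each frame passes a single-frame plausibility check, yet the sequence contains a jump that only a comparison of consecutive frames reveals. The names `frame_is_plausible`, `find_temporal_jumps`, and `max_step` are illustrative assumptions.

```python
from typing import List, Tuple

Position = Tuple[float, float]


def frame_is_plausible(pos: Position, width: float, height: float) -> bool:
    """Single-frame (spatial) check: is the object inside the playable area?"""
    x, y = pos
    return 0 <= x <= width and 0 <= y <= height


def find_temporal_jumps(track: List[Position], max_step: float) -> List[int]:
    """Cross-frame (temporal) check: return indices of frames where the object
    moved farther than max_step since the previous frame."""
    jumps = []
    for i in range(1, len(track)):
        dx = track[i][0] - track[i - 1][0]
        dy = track[i][1] - track[i - 1][1]
        if (dx * dx + dy * dy) ** 0.5 > max_step:
            jumps.append(i)
    return jumps


# Hypothetical track: smooth motion, then a sudden jump at frame 3.
track = [(10.0, 10.0), (12.0, 10.0), (14.0, 10.0), (90.0, 60.0), (92.0, 60.0)]

# Every frame looks fine when inspected in isolation...
every_frame_ok = all(frame_is_plausible(p, 100.0, 100.0) for p in track)

# ...but the ordered sequence exposes the glitch.
jump_frames = find_temporal_jumps(track, max_step=5.0)
```

Here `every_frame_ok` is `True` while `jump_frames` is `[3]`, which is the situation the paragraph above describes: no single frame carries the evidence.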
Key facts
- TempGlitch is a controlled gameplay video benchmark for temporal glitch detection.
- It covers five temporal glitch types with balanced per-category samples.
- It includes paired glitch-free videos for reliable binary evaluation.
- A preliminary study shows temporal glitches are substantially harder for VLMs to detect than spatial ones.
- The study evaluated 12 proprietary vision-language models.
- It is published on arXiv as 2605.21443v1.
- Most existing evaluations treat glitches as static visual anomalies.
- Temporal glitches become evident only through changes across ordered frames.
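The role of glitch-free videos and balanced categories in binary evaluation can be sketched as follows. This is a minimal illustration under stated assumptions, not the benchmark's actual scoring code: the `Sample` record, the placeholder category names `type_a` and `type_b`, and the choice of accuracy, precision, and recall are all hypothetical. The general point is standard: without glitch-free negatives, a model that always answers "glitch" would score perfect recall, so negatives are needed for precision and accuracy to be meaningful.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional


class Sample(NamedTuple):
    category: Optional[str]   # hypothetical glitch type; None for glitch-free
    has_glitch: bool          # ground-truth label
    predicted_glitch: bool    # model's binary answer


def binary_metrics(samples: List[Sample]) -> Dict[str, float]:
    """Accuracy, precision, and recall over glitch vs. glitch-free videos."""
    tp = sum(s.has_glitch and s.predicted_glitch for s in samples)
    fn = sum(s.has_glitch and not s.predicted_glitch for s in samples)
    fp = sum(not s.has_glitch and s.predicted_glitch for s in samples)
    tn = sum(not s.has_glitch and not s.predicted_glitch for s in samples)
    return {
        "accuracy": (tp + tn) / len(samples) if samples else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def recall_by_category(samples: List[Sample]) -> Dict[str, float]:
    """Per-category detection rate, computed on glitch videos only."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for s in samples:
        if s.has_glitch and s.category is not None:
            totals[s.category] += 1
            hits[s.category] += int(s.predicted_glitch)
    return {c: hits[c] / totals[c] for c in totals}


# Made-up predictions: two glitch videos per category, four glitch-free videos.
samples = [
    Sample("type_a", True, True),
    Sample("type_a", True, False),
    Sample("type_b", True, True),
    Sample("type_b", True, True),
    Sample(None, False, False),
    Sample(None, False, True),
    Sample(None, False, False),
    Sample(None, False, False),
]

metrics = binary_metrics(samples)
per_category = recall_by_category(samples)
```

With this made-up data, accuracy, precision, and recall are each 0.75, while per-category recall is 0.5 for `type_a` and 1.0 for `type_b`. Keeping the number of samples equal across categories means the aggregate recall is not dominated by whichever glitch type happens to be most common.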
Entities
Institutions
- arXiv