AutoVDC Framework Uses VLMs to Clean Autonomous Driving Datasets
Researchers have developed AutoVDC (Automated Vision Data Cleaning), a framework that uses Vision-Language Models (VLMs) to automatically detect incorrect annotations in vision datasets for autonomous driving. The approach aims to cut the time and cost of manually reviewing large datasets, whose human-produced labels frequently contain errors. The team evaluated AutoVDC on KITTI and nuImages, two widely used object detection benchmarks for autonomous driving, generating dataset variants with deliberately injected annotation errors to measure the framework's error detection rate. The result is a way to improve data quality and correct labeling mistakes without manual effort.
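The core idea, checking each annotation against a VLM's independent reading of the image region, can be sketched roughly as follows. This is a minimal illustration, not the paper's actual implementation: `query_vlm` is a stand-in for a real model call, and the prompt, matching rule, and data layout are all assumptions made for the example.

```python
# Hypothetical sketch of VLM-based annotation checking in the spirit of
# AutoVDC. `query_vlm` is a placeholder: a real system would send the
# image crop and a question to an actual Vision-Language Model.

def query_vlm(image_crop, question):
    """Placeholder for a VLM call. Returns canned answers so the
    sketch is runnable; here the 'model' simply reports the object
    class stored in the mock crop."""
    return image_crop["true_class"]

def check_annotations(samples):
    """Flag annotations whose label disagrees with the VLM's answer."""
    flagged = []
    for sample in samples:
        answer = query_vlm(sample["crop"], "What object is in this region?")
        if answer.lower() != sample["label"].lower():
            flagged.append(sample["id"])
    return flagged

samples = [
    {"id": 0, "label": "car",        "crop": {"true_class": "car"}},
    {"id": 1, "label": "pedestrian", "crop": {"true_class": "cyclist"}},  # mislabeled
]
print(check_annotations(samples))  # → [1]
```

In practice the comparison step would need to handle synonyms and the free-form nature of VLM answers, which is part of what makes automated cleaning nontrivial.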
Key facts
- AutoVDC stands for Automated Vision Data Cleaning
- The framework uses Vision-Language Models (VLMs) to detect annotation errors
- Validated on KITTI and nuImages datasets
- Datasets contain object detection benchmarks for autonomous driving
- Intentionally injected errors were used to test detection rate
- Human annotations are imperfect and require multiple iterations
- Manual review of large datasets is laborious and expensive
- The approach enhances data quality automatically
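The evaluation protocol described above, injecting known errors and measuring how many are caught, can be sketched as follows. The corruption scheme (random label swaps) and the helper names are assumptions for illustration; the paper's exact injection method is not reproduced here.

```python
import random

def inject_label_errors(annotations, classes, rate, seed=0):
    """Replace a random fraction of labels with a different class.
    Mirrors, in spirit, the deliberate error injection used to test
    detection; the exact corruption scheme here is an assumption."""
    rng = random.Random(seed)
    corrupted, error_ids = [], set()
    for ann in annotations:
        label = ann["label"]
        if rng.random() < rate:
            # Deterministically pick the next class in the list.
            label = classes[(classes.index(label) + 1) % len(classes)]
            error_ids.add(ann["id"])
        corrupted.append({"id": ann["id"], "label": label})
    return corrupted, error_ids

def detection_rate(flagged_ids, error_ids):
    """Fraction of injected errors that the cleaner flagged."""
    if not error_ids:
        return 0.0
    return len(set(flagged_ids) & error_ids) / len(error_ids)

anns = [{"id": i, "label": "car"} for i in range(10)]
corrupted, errs = inject_label_errors(anns, ["car", "pedestrian", "cyclist"], rate=0.3)
# A perfect cleaner flags exactly the corrupted ids:
print(detection_rate(list(errs), errs))
```

Because the ground-truth error set is known by construction, the detection rate is a direct, objective score for the cleaning framework.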