AutoVDC Framework Uses VLMs to Clean Autonomous Driving Datasets
Researchers have developed AutoVDC (Automated Vision Data Cleaning), a framework that uses Vision-Language Models (VLMs) to automatically detect incorrect annotations in vision datasets for autonomous driving. The approach aims to cut the time and cost of manually reviewing large datasets, whose human-produced labels frequently contain errors. The team evaluated AutoVDC on KITTI and nuImages, two widely used object detection benchmarks for autonomous driving, generating dataset variants with deliberately injected annotation errors to measure the framework's error detection rate. The result is a way to improve data quality and correct labeling mistakes without manual effort.
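The core idea, checking each annotation against a VLM's independent reading of the image region, can be sketched roughly as follows. This is a minimal illustration, not the paper's actual implementation: `query_vlm` is a stand-in for a real model call, and the prompt, matching rule, and data layout are all assumptions made for the example.

```python
# Hypothetical sketch of VLM-based annotation checking in the spirit of
# AutoVDC. `query_vlm` is a placeholder: a real system would send the
# image crop and a question to an actual Vision-Language Model.

def query_vlm(image_crop, question):
    """Placeholder for a VLM call. Returns canned answers so the
    sketch is runnable; here the 'model' simply reports the object
    class stored in the mock crop."""
    return image_crop["true_class"]

def check_annotations(samples):
    """Flag annotations whose label disagrees with the VLM's answer."""
    flagged = []
    for sample in samples:
        answer = query_vlm(sample["crop"], "What object is in this region?")
        if answer.lower() != sample["label"].lower():
            flagged.append(sample["id"])
    return flagged

samples = [
    {"id": 0, "label": "car",        "crop": {"true_class": "car"}},
    {"id": 1, "label": "pedestrian", "crop": {"true_class": "cyclist"}},  # mislabeled
]
print(check_annotations(samples))  # → [1]
```

In practice the comparison step would need to handle synonyms and the free-form nature of VLM answers, which is part of what makes automated cleaning nontrivial.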
Key facts
- AutoVDC stands for Automated Vision Data Cleaning
- The framework uses Vision-Language Models (VLMs) to detect annotation errors
- Validated on KITTI and nuImages datasets
- Datasets contain object detection benchmarks for autonomous driving
- Intentionally injected errors were used to test detection rate
- Human annotations are imperfect and require multiple iterations
- Manual review of large datasets is laborious and expensive
- The approach enhances data quality automatically
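The evaluation protocol described above, injecting known errors and measuring how many are caught, can be sketched as follows. The corruption scheme (random label swaps) and the helper names are assumptions for illustration; the paper's exact injection method is not reproduced here.

```python
import random

def inject_label_errors(annotations, classes, rate, seed=0):
    """Replace a random fraction of labels with a different class.
    Mirrors, in spirit, the deliberate error injection used to test
    detection; the exact corruption scheme here is an assumption."""
    rng = random.Random(seed)
    corrupted, error_ids = [], set()
    for ann in annotations:
        label = ann["label"]
        if rng.random() < rate:
            # Deterministically pick the next class in the list.
            label = classes[(classes.index(label) + 1) % len(classes)]
            error_ids.add(ann["id"])
        corrupted.append({"id": ann["id"], "label": label})
    return corrupted, error_ids

def detection_rate(flagged_ids, error_ids):
    """Fraction of injected errors that the cleaner flagged."""
    if not error_ids:
        return 0.0
    return len(set(flagged_ids) & error_ids) / len(error_ids)

anns = [{"id": i, "label": "car"} for i in range(10)]
corrupted, errs = inject_label_errors(anns, ["car", "pedestrian", "cyclist"], rate=0.3)
# A perfect cleaner flags exactly the corrupted ids:
print(detection_rate(list(errs), errs))
```

Because the ground-truth error set is known by construction, the detection rate is a direct, objective score for the cleaning framework.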