Synthetic Data Pipeline for Industrial Defect Detection
SynSur, a new end-to-end generative pipeline, addresses data scarcity in industrial defect detection by creating synthetic labeled defect samples. It combines Vision-Language-Model prompts, LoRA-adapted diffusion, mask-guided inpainting, and automatic label derivation to generate realistic training data. Evaluation on pitting defects in ball screw drives and on the MSD mobile phone screen dataset demonstrates cross-domain transfer. The authors analyze prompt construction, LoRA selection, and sample filtering to improve detector performance.
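As a rough illustration of the automatic label derivation step: with mask-guided inpainting, the mask that tells the model where to paint the defect can also serve as the source of the annotation. The summary does not say how SynSur derives its labels, so the sketch below is an assumption, not the authors' method. The function name mask_to_yolo_box, the YOLO-style normalized box format, and the toy mask are all hypothetical.

```python
from typing import List, Optional, Tuple

# Rows of 0/1 values; 1 marks inpainted (defect) pixels. Hypothetical format.
Mask = List[List[int]]
Box = Tuple[int, float, float, float, float]


def mask_to_yolo_box(mask: Mask, class_id: int = 0) -> Optional[Box]:
    """Derive a normalized (class, cx, cy, w, h) box from a binary mask.

    Returns None for an empty mask so such samples can be skipped.
    This is an illustrative assumption, not SynSur's documented procedure.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    # Indices of rows and columns that contain at least one defect pixel.
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(width) if any(row[x] for row in mask)]
    if not rows or not cols:
        return None
    x_min, x_max = cols[0], cols[-1] + 1  # exclusive upper bound
    y_min, y_max = rows[0], rows[-1] + 1
    return (
        class_id,
        (x_min + x_max) / 2 / width,   # box center x, normalized
        (y_min + y_max) / 2 / height,  # box center y, normalized
        (x_max - x_min) / width,       # box width, normalized
        (y_max - y_min) / height,      # box height, normalized
    )


# Toy 4x4 mask with a 2x2 defect region in the middle.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(mask_to_yolo_box(mask))  # (0, 0.5, 0.5, 0.5, 0.5)
```

Because the label is computed from the same mask that guided generation, no manual annotation is needed for the synthetic sample. A real pipeline would likely also need to handle several disconnected defect regions per image, which this sketch merges into a single box.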
Key facts
- SynSur is an end-to-end pipeline for synthetic defect generation and annotation.
- It uses Vision-Language-Model-based prompts, LoRA-adapted diffusion, mask-guided inpainting, and sample filtering.
- It is evaluated on pitting defects in ball screw drives.
- It is also tested on the mobile phone screen surface defect segmentation dataset (MSD).
- The pipeline aims to overcome the scarcity of labeled defect data.
- The study analyzes prompt construction, LoRA selection, and sample filtering.
- arXiv:2604.26633v1.
- Announced on arXiv as a cross-listing (announce type: cross).
Entities
Institutions
- arXiv