Decoupled Pipeline for Automated Defect Detection and Report Generation in Wind Turbine Inspection
A new hybrid vision-language architecture automates defect reasoning and report generation for industrial inspection, specifically targeting wind turbine blades. The system, described in arXiv:2605.26533, decouples the task into three components: Eyes (a YOLO26-x-obb detector for defect localization), Bridge (a deterministic encoding module that maps bounding boxes to grid-referenced spatial tokens), and Brain (a 4-bit quantized Qwen-2.5-1.5B model fine-tuned with QLoRA on 947 synthetic reports). Retrieval-Augmented Fine-Tuning (RAFT) grounds the model's recommendations in indexed maintenance data. The pipeline is edge-deployable and produces structured JSON reports, addressing a gap in current practice, where the linguistic interpretation of detected defects still relies on human experts.
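To make the Bridge step concrete, here is a minimal Python sketch of a deterministic mapping from oriented boxes to grid-referenced tokens. It is an illustration only, not the paper's implementation: the `OrientedBox` fields, the 4x4 grid, the `<label@RrCc>` token format, and the function names are all assumptions, since the source does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrientedBox:
    """Hypothetical detector output; field names are assumptions."""
    label: str        # defect class name
    cx: float         # box centre x, normalised to [0, 1]
    cy: float         # box centre y, normalised to [0, 1]
    width: float      # normalised box width
    height: float     # normalised box height
    angle_deg: float  # rotation of the oriented box


def to_spatial_token(box: OrientedBox, rows: int = 4, cols: int = 4) -> str:
    """Map a box centre to a grid cell and emit an assumed token format."""
    # Clamp so centres on (or slightly outside) the image edge stay in range.
    col = max(0, min(int(box.cx * cols), cols - 1))
    row = max(0, min(int(box.cy * rows), rows - 1))
    return f"<{box.label}@R{row}C{col}>"


def encode(boxes: list[OrientedBox], rows: int = 4, cols: int = 4) -> str:
    """Join tokens in a fixed order so identical detections give identical text."""
    ordered = sorted(boxes, key=lambda b: (b.cy, b.cx, b.label))
    return " ".join(to_spatial_token(b, rows, cols) for b in ordered)


if __name__ == "__main__":
    detections = [
        OrientedBox("crack", 0.10, 0.90, 0.05, 0.20, 15.0),
        OrientedBox("erosion", 1.00, 0.00, 0.10, 0.10, 0.0),
    ]
    print(encode(detections))
```

Sorting before encoding keeps the output independent of detector ordering, which is one simple way to get the determinism the Bridge is described as having.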
Key facts
- arXiv:2605.26533 describes a decoupled pipeline for wind turbine blade inspection.
- The system uses YOLO26-x-obb for oriented bounding-box detection.
- A deterministic encoding module maps bounding boxes to grid-referenced spatial tokens.
- The language model is a 4-bit quantized Qwen-2.5-1.5B adapted with QLoRA.
- Training data consists of 947 synthetically generated maintenance reports.
- Retrieval-Augmented Fine-Tuning (RAFT) grounds recommendations in indexed maintenance data.
- The pipeline is designed for edge deployment.
- It generates structured JSON reports automatically.
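Because the reports are structured JSON, a downstream consumer could check the language model's output before storing it. The sketch below shows one way to do that in Python. The schema is hypothetical: the keys `defects`, `summary`, and `recommendation` and the function name `parse_report` are assumptions for illustration, as the source does not list the report fields.

```python
import json

# Hypothetical schema: the source does not specify the report fields.
REQUIRED_KEYS = {"defects", "summary", "recommendation"}


def parse_report(raw: str) -> dict:
    """Parse model output and reject anything that is not a well-formed report."""
    report = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(report, dict):
        raise ValueError("report must be a JSON object")
    missing = REQUIRED_KEYS - report.keys()
    if missing:
        raise ValueError(f"report is missing keys: {sorted(missing)}")
    if not isinstance(report["defects"], list):
        raise ValueError("'defects' must be a list")
    return report


if __name__ == "__main__":
    raw_output = json.dumps({
        "defects": [{"type": "crack", "location": "R3C0"}],
        "summary": "One crack near the lower-left region of the blade image.",
        "recommendation": "Schedule a close-up inspection.",
    })
    print(parse_report(raw_output)["defects"])
```

Failing loudly on malformed output is a deliberate choice here: a small quantized model can occasionally emit truncated or invalid JSON, and a rejected report is easier to handle than a silently incomplete one.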
Entities
Institutions
- arXiv