JUDO Framework Enhances Industrial Anomaly QA with Domain Knowledge
Researchers have introduced JUDO (Juxtaposed Domain-Oriented Multimodal Reasoner), a framework designed to improve large multimodal models (LMMs) for industrial anomaly detection and question answering. JUDO addresses the lack of domain-specific knowledge in LMMs by building that knowledge into both visual and textual reasoning. On the visual side, it segments defect regions by comparing the query image against a normal reference image. On the textual side, it uses supervised fine-tuning (SFT) followed by reinforcement learning with Group Relative Policy Optimization (GRPO) to strengthen domain understanding. The goal is more accurate answers in complex industrial scenarios.
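To make the idea of comparing a query image with a normal image concrete, here is a minimal sketch in Python. It is not JUDO's segmentation method, which the summary does not detail. It assumes aligned grayscale images stored as nested lists with intensities in [0, 1], and it uses a simple per-pixel difference threshold. The function name `defect_mask` and the `threshold` value are hypothetical choices for illustration.

```python
from typing import List

# Assumption: a grayscale image is a list of rows of intensities in [0, 1].
Image = List[List[float]]


def defect_mask(query: Image, normal: Image, threshold: float = 0.2) -> List[List[int]]:
    """Mark pixels where the query deviates from the normal reference.

    Illustrative stand-in only: a per-pixel absolute difference with a
    fixed threshold, not the comparison procedure used by JUDO.
    """
    same_height = len(query) == len(normal)
    same_widths = all(len(q) == len(n) for q, n in zip(query, normal))
    if not (same_height and same_widths):
        raise ValueError("query and normal images must have the same shape")
    return [
        [1 if abs(q - n) > threshold else 0 for q, n in zip(q_row, n_row)]
        for q_row, n_row in zip(query, normal)
    ]


# Toy example: two pixels in the query differ strongly from the reference.
normal = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
query = [[0.5, 0.9, 0.5], [0.5, 0.5, 0.1]]
mask = defect_mask(query, normal)
print(mask)
```

A real pipeline would need image alignment and a learned comparison rather than a fixed threshold, but the sketch shows why a normal reference helps: the defect is defined relative to what the part should look like.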
Key facts
- JUDO is a framework for industrial anomaly QA.
- It incorporates domain knowledge into LMMs.
- Visual reasoning juxtaposes query and normal images.
- Supervised fine-tuning (SFT) enhances context understanding.
- Reinforcement learning (GRPO) guides domain reasoning.
- The work is published on arXiv (2605.20284).
- General-purpose LMMs currently lack domain-specific knowledge for industrial settings.
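For readers unfamiliar with GRPO, its core step can be sketched briefly: several responses are sampled for the same prompt, each is scored, and every response's advantage is its reward normalized against the group's mean and standard deviation. The Python sketch below shows only that normalization, under assumed toy rewards. How JUDO defines its rewards is not stated in this summary, and the function name `group_relative_advantages` and the `eps` value are hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against its sampling group.

    Computes (r_i - mean) / (std + eps) over one group of responses to the
    same prompt. eps is an assumed small constant that avoids division by
    zero when every response receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Toy rewards (assumed): 1.0 for a correct answer, 0.0 for an incorrect one.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Responses that beat the group average get positive advantages and are reinforced, while below-average ones are discouraged. No separate value network is needed, since the group itself serves as the baseline.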
Entities
Institutions
- arXiv