COMO: Closed-Loop Optical Molecule Recognition with Minimum Risk Training
COMO (Closed-loop Optical Molecule recOgnition) is a new AI framework for optical chemical structure recognition (OCSR) in real-world documents. OCSR translates molecular images into machine-readable formats such as SMILES strings or molecular graphs, but it struggles with the variety of chemical structures, shorthand conventions, and visual noise found in practice. Existing deep-learning methods train with teacher forcing and token-level Maximum Likelihood Estimation (MLE). This setup suffers from exposure bias, because the model is trained on ground-truth prefixes but must condition on its own predictions at inference time, and it does not optimize molecular-level criteria such as chemical validity and structural similarity. COMO introduces Minimum Risk Training (MRT) to OCSR, creating a closed-loop framework that optimizes molecular-level evaluation metrics directly and thereby mitigates exposure bias. The paper is available on arXiv under identifier 2604.23546.
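To make the contrast between the two training objectives concrete, the block below writes out the generic textbook forms of token-level MLE and MRT. This summary does not state COMO's exact objective, so the notation is an assumption: x is the input image, y the reference sequence, S(x) a set of candidate sequences sampled from the model, Δ a molecular-level risk (for example, one minus a structural similarity score), and α a sharpness hyperparameter.

```latex
% Token-level MLE with teacher forcing: each step conditions on the
% ground-truth prefix y_{<t}, never on the model's own output.
\mathcal{L}_{\mathrm{MLE}}(\theta)
  = -\sum_{t=1}^{T} \log P(y_t \mid y_{<t}, x; \theta)

% MRT: expected sequence-level risk over candidates sampled from the model.
\mathcal{R}(\theta)
  = \sum_{y' \in \mathcal{S}(x)} Q(y' \mid x; \theta, \alpha)\, \Delta(y', y),
\qquad
Q(y' \mid x; \theta, \alpha)
  = \frac{P(y' \mid x; \theta)^{\alpha}}
         {\sum_{y'' \in \mathcal{S}(x)} P(y'' \mid x; \theta)^{\alpha}}
```

Because the candidates y' come from the model itself and Δ is scored on whole sequences, the MRT term exposes the model to its own predictions during training and lets the evaluation metric shape the gradient.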
Key facts
- COMO is a closed-loop framework for optical chemical structure recognition
- It uses Minimum Risk Training to mitigate exposure bias
- Existing methods rely on token-level Maximum Likelihood Estimation
- OCSR translates molecular images into SMILES strings or molecular graphs
- The paper is on arXiv with ID 2604.23546
- Real-world documents have inexhaustible variations in chemical structures
- Token-level MLE does not directly optimize for chemical validity or structural similarity
- COMO directly optimizes over molecular-level evaluation criteria
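As a rough illustration of the closed-loop idea in the facts above, the following Python sketch computes an MRT-style expected risk over a handful of sampled candidates. It is not COMO's implementation: the function names `string_risk` and `mrt_loss`, the candidate SMILES strings, and their probabilities are all hypothetical, and a plain string-similarity score stands in for the molecular-level metrics a real system would compute with a cheminformatics toolkit.

```python
import math
from difflib import SequenceMatcher


def string_risk(candidate: str, reference: str) -> float:
    """Toy risk in [0, 1]: 0.0 for an exact match, larger for less similar strings.

    Stand-in only (assumption): a real OCSR system would score chemical
    validity and structural similarity, not raw character overlap.
    """
    return 1.0 - SequenceMatcher(None, candidate, reference).ratio()


def mrt_loss(log_probs, risks, alpha=1.0):
    """Expected risk over a sampled candidate set.

    log_probs: sequence-level log P(y' | x) for each candidate
    risks:     risk of each candidate with respect to the reference
    alpha:     sharpness of the renormalized candidate distribution
    """
    scaled = [alpha * lp for lp in log_probs]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    q = [w / total for w in weights]  # renormalize over the candidates only
    return sum(qi * ri for qi, ri in zip(q, risks))


# Hypothetical example: three candidates sampled for one molecule image.
reference = "CCO"
candidates = ["CCO", "CCN", "C1CC1O"]
log_probs = [math.log(0.5), math.log(0.3), math.log(0.2)]

risks = [string_risk(c, reference) for c in candidates]
loss = mrt_loss(log_probs, risks)
print(f"expected risk: {loss:.4f}")
```

In an actual training loop the log-probabilities would come from a differentiable model, so minimizing this quantity would shift probability mass toward low-risk candidates. Here the values are fixed numbers purely to show the arithmetic.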
Entities
Institutions
- arXiv