DiagramNet Dataset and Framework for System-Level Diagram Recognition
Researchers have introduced DiagramNet, the first multimodal dataset built for system-level diagrams, which targets the difficulty of recognizing non-standard symbols in chip design schematics. The dataset comprises 10,977 connection annotations and 15,515 chain-of-thought QA pairs spanning four tasks: Listing, Localization, Connection, and Circuit QA. The authors also propose a progressive training pipeline and a decoupled multi-agent workflow that breaks visual reasoning into three stages: Perception, Reasoning, and Knowledge. A 3B-parameter model paired with this workflow surpasses the winner of the 2025 EDA Elite Challenge and outperforms GPT-5, Claude-Sonnet-4, and Gemini. The paper is available on arXiv.
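To make the idea of a decoupled, staged workflow more concrete, the sketch below chains three stages named after those in the summary. It is only an illustration and not the authors' implementation: the `DiagramState` class, the stage functions, the toy detections, and the glossary are all hypothetical, and hand-written inputs stand in for a real vision model. Each stage reads and writes a shared state object, so any one stage could in principle be swapped out without touching the others.

```python
from dataclasses import dataclass, field


@dataclass
class DiagramState:
    """Intermediate results passed from one stage to the next (hypothetical)."""
    components: dict = field(default_factory=dict)   # name -> (x0, y0, x1, y1)
    connections: list = field(default_factory=list)  # directed (source, target) pairs
    reached: list = field(default_factory=list)      # components downstream of the query
    notes: dict = field(default_factory=dict)        # name -> background fact


def perception(state, detections, links):
    """Stand-in for a vision model: record components and keep only
    connections whose endpoints were both detected."""
    state.components = dict(detections)
    state.connections = [
        (a, b) for a, b in links
        if a in state.components and b in state.components
    ]
    return state


def reasoning(state, source):
    """Breadth-first walk over the directed connections starting at `source`."""
    reached, queue = [], [source]
    while queue:
        node = queue.pop(0)
        for a, b in state.connections:
            if a == node and b != source and b not in reached:
                reached.append(b)
                queue.append(b)
    state.reached = reached
    return state


def knowledge(state, glossary):
    """Attach background facts for the components the reasoning stage reached."""
    state.notes = {n: glossary[n] for n in state.reached if n in glossary}
    return state


def run_workflow(detections, links, glossary, source):
    """Run the three stages in order over one shared state object."""
    state = DiagramState()
    state = perception(state, detections, links)
    state = reasoning(state, source)
    state = knowledge(state, glossary)
    return state


# Toy input: hand-written detections instead of a real diagram image.
detections = {"CPU": (0, 0, 10, 10), "Bus": (20, 0, 30, 10), "DRAM": (40, 0, 50, 10)}
links = [("CPU", "Bus"), ("Bus", "DRAM"), ("Bus", "GPU")]  # "GPU" was never detected
glossary = {"DRAM": "volatile main memory"}

state = run_workflow(detections, links, glossary, source="CPU")
print(state.reached)  # ['Bus', 'DRAM']
print(state.notes)    # {'DRAM': 'volatile main memory'}
```

In this toy version, the four dataset tasks map loosely onto the state fields: listing and localization onto `components`, connection onto `connections`, and question answering onto the combined `reached` and `notes` output. That mapping is an assumption made for illustration, not something the summary states.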
Key facts
- DiagramNet is the first multimodal dataset for system-level diagrams.
- Dataset contains 10,977 connection annotations and 15,515 chain-of-thought QA pairs.
- Four tasks: Listing, Localization, Connection, and Circuit QA.
- Progressive training pipeline and decoupled multi-agent workflow proposed.
- Workflow decomposes visual reasoning into Perception, Reasoning, and Knowledge stages.
- 3B-parameter model with workflow surpasses 2025 EDA Elite Challenge winner.
- Outperforms GPT-5, Claude-Sonnet-4, and Gemini.
- Published on arXiv with ID 2605.01338.
Entities
Institutions
- arXiv