Multi-Graph Reasoning Benchmark for Vision-Language Models

ai-technology · 2026-04-27

A new benchmark for assessing multi-graph reasoning in Vision-Language Models (VLMs) has been released. It covers four graph types: knowledge graphs, flowcharts, mind maps, and route maps. Graphs are presented in both homogeneous and heterogeneous groupings, with tasks of increasing complexity. Several state-of-the-art VLMs were evaluated with a multi-dimensional scoring framework that measures graph parsing, reasoning consistency, and instruction-following accuracy. The authors also fine-tune multiple open-source models to improve their multi-graph joint reasoning, a challenge that remains underexplored in VLM research.
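
As a rough illustration of how a multi-dimensional score might be combined into one number, the sketch below averages the three dimensions named above. The summary does not say how the benchmark aggregates them, so the [0, 1] score range, the equal default weights, and the names `DimensionScores` and `aggregate` are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DimensionScores:
    """Per-response scores, each assumed to lie in [0, 1]."""
    graph_parsing: float
    reasoning_consistency: float
    instruction_following: float


def aggregate(scores, weights=(1.0, 1.0, 1.0)):
    """Weighted mean of the three dimensions.

    Equal weights are an assumption; the benchmark's real weighting
    (if it uses one at all) is not given in the summary.
    """
    values = (
        scores.graph_parsing,
        scores.reasoning_consistency,
        scores.instruction_following,
    )
    if any(not 0.0 <= v <= 1.0 for v in values):
        raise ValueError("each dimension score must be in [0, 1]")
    if any(w < 0 for w in weights) or sum(weights) == 0:
        raise ValueError("weights must be non-negative and not all zero")
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


if __name__ == "__main__":
    example = DimensionScores(
        graph_parsing=1.0,
        reasoning_consistency=0.5,
        instruction_following=0.0,
    )
    print(aggregate(example))
```

Keeping the dimensions separate until the final step makes it possible to report each one on its own, which is usually the point of a multi-dimensional framework.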

Key facts

First comprehensive benchmark for multi-graph reasoning in VLMs
Covers four graph types: knowledge graphs, flowcharts, mind maps, route maps
Includes homogeneous and heterogeneous graph groupings
Tasks of increasing complexity
Evaluates state-of-the-art VLMs
Multi-dimensional scoring framework: graph parsing, reasoning consistency, instruction-following accuracy
Fine-tunes multiple open-source models
Addresses underexplored challenge of multi-graph joint reasoning
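
The distinction between homogeneous and heterogeneous groupings can be sketched in a few lines. This assumes the natural reading, which the summary does not spell out: a group is homogeneous when all of its graphs share one type and heterogeneous otherwise. `GraphType` and `grouping_kind` are hypothetical names introduced for this example.

```python
from enum import Enum


class GraphType(Enum):
    """The four graph types the benchmark is reported to cover."""
    KNOWLEDGE_GRAPH = "knowledge graph"
    FLOWCHART = "flowchart"
    MIND_MAP = "mind map"
    ROUTE_MAP = "route map"


def grouping_kind(graph_types):
    """Label a multi-graph group by the mix of graph types it contains.

    Assumption: "homogeneous" means every graph has the same type,
    and "heterogeneous" means at least two types are mixed.
    """
    if len(graph_types) < 2:
        raise ValueError("a multi-graph group needs at least two graphs")
    return "homogeneous" if len(set(graph_types)) == 1 else "heterogeneous"


if __name__ == "__main__":
    print(grouping_kind([GraphType.FLOWCHART, GraphType.FLOWCHART]))
    print(grouping_kind([GraphType.MIND_MAP, GraphType.ROUTE_MAP]))
```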

Sources

arXiv cs.AI — 2026-04-27