Multi-Graph Reasoning Benchmark for Vision-Language Models
A new benchmark for assessing multi-graph reasoning in Vision-Language Models (VLMs) has been introduced. It covers four graph types: knowledge graphs, flowcharts, mind maps, and route maps. Graphs are presented in both homogeneous and heterogeneous groupings, with tasks of increasing complexity. Several state-of-the-art VLMs were evaluated with a multi-dimensional scoring framework that measures graph parsing, reasoning consistency, and instruction-following accuracy. The study also fine-tunes multiple open-source models to improve their multi-graph joint reasoning, a challenge that remains underexplored in VLM research.
Key facts
- First comprehensive benchmark for multi-graph reasoning in VLMs
- Covers four graph types: knowledge graphs, flowcharts, mind maps, route maps
- Includes homogeneous and heterogeneous graph groupings
- Tasks of increasing complexity
- Evaluates state-of-the-art VLMs
- Multi-dimensional scoring framework: graph parsing, reasoning consistency, instruction-following accuracy
- Fine-tunes multiple open-source models
- Addresses the underexplored challenge of multi-graph joint reasoning
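The grouping and scoring ideas listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the benchmark's actual code: the names `GraphType`, `grouping_kind`, and `DimensionScores` are hypothetical, a group is assumed to be "homogeneous" when all of its graphs share one type and "heterogeneous" otherwise, and the equal-weight average is a placeholder because the summary does not say how the three dimensions are combined.

```python
from dataclasses import dataclass
from enum import Enum
from statistics import mean


class GraphType(Enum):
    # The four graph types named in the summary.
    KNOWLEDGE_GRAPH = "knowledge graph"
    FLOWCHART = "flowchart"
    MIND_MAP = "mind map"
    ROUTE_MAP = "route map"


def grouping_kind(graph_types):
    """Classify a multi-graph group by the types it contains.

    Assumption: "homogeneous" means every graph has the same type,
    "heterogeneous" means at least two different types are present.
    """
    if len(graph_types) < 2:
        raise ValueError("a multi-graph group needs at least two graphs")
    return "homogeneous" if len(set(graph_types)) == 1 else "heterogeneous"


@dataclass
class DimensionScores:
    # One score per evaluation dimension, assumed to lie in [0, 1].
    graph_parsing: float
    reasoning_consistency: float
    instruction_following: float

    def unweighted_mean(self):
        # Hypothetical aggregation: equal weights. The source does not
        # state how (or whether) the dimensions are combined.
        return mean(
            [
                self.graph_parsing,
                self.reasoning_consistency,
                self.instruction_following,
            ]
        )


if __name__ == "__main__":
    group = [GraphType.FLOWCHART, GraphType.ROUTE_MAP]
    print(grouping_kind(group))
    print(DimensionScores(0.5, 0.75, 1.0).unweighted_mean())
```

Keeping the three dimensions as separate fields, and not storing only a single number, makes it possible to report them individually even if a different aggregation is chosen later.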