InterChart Benchmark Tests VLMs on Multi-Chart Reasoning
Researchers have introduced InterChart, a diagnostic benchmark for assessing how well vision-language models (VLMs) reason across interconnected charts. This capability matters for practical applications such as scientific reporting, financial analysis, and public-policy dashboards. Unlike prior benchmarks built around isolated, visually similar charts, InterChart poses diverse question types, including entity inference, trend correlation, numerical estimation, and complex multi-step reasoning grounded in two to three related charts. The benchmark spans three difficulty tiers: factual reasoning over individual charts, integrative analysis across aligned chart sets, and semantic inference over visually complex, real-world chart pairs. Evaluations show that both open- and closed-source VLMs suffer steep accuracy drops as chart complexity rises.
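A tiered benchmark like this is typically scored by grouping questions per difficulty tier and reporting accuracy for each. The sketch below illustrates that idea only; the tier names, record fields, and model stub are assumptions for illustration, not InterChart's actual data format or API.

```python
from collections import defaultdict

# Hypothetical question records; fields and tier labels are illustrative,
# not the benchmark's real schema.
QUESTIONS = [
    {"tier": "factual-single", "question": "Peak value in chart A?", "answer": "42"},
    {"tier": "integrative-aligned", "question": "Which series rises in both charts?", "answer": "exports"},
    {"tier": "semantic-real-world", "question": "Do the two dashboards agree on the 2020 trend?", "answer": "no"},
]

def dummy_vlm(question: str) -> str:
    """Stand-in for a real VLM call; always answers '42'."""
    return "42"

def accuracy_by_tier(questions, model):
    """Group questions by tier and compute exact-match accuracy per tier."""
    correct, total = defaultdict(int), defaultdict(int)
    for q in questions:
        total[q["tier"]] += 1
        if model(q["question"]).strip().lower() == q["answer"].lower():
            correct[q["tier"]] += 1
    return {tier: correct[tier] / total[tier] for tier in total}

print(accuracy_by_tier(QUESTIONS, dummy_vlm))
```

With the stub model above, accuracy is nonzero only on the tier whose answer happens to be "42", mirroring how per-tier reporting exposes where a model's reasoning degrades.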
Key facts
- InterChart is a diagnostic benchmark for vision-language models.
- It evaluates reasoning across multiple related charts.
- Tasks include entity inference, trend correlation, numerical estimation, and multi-step reasoning.
- The benchmark has three tiers of increasing difficulty.
- Tiers cover individual charts, synthetically aligned chart sets, and real-world chart pairs.
- State-of-the-art VLMs show steep accuracy declines with complexity.
- InterChart targets applications in scientific reporting, financial analysis, and policy dashboards.
- The benchmark was introduced in arXiv:2508.07630v2.
Entities
Institutions
- arXiv