GraphARC Benchmark Tests AI on Graph-Based Abstract Reasoning

other · 2026-06-01

Researchers have introduced GraphARC, a benchmark for assessing abstract reasoning on graph-structured data. Whereas earlier benchmarks are limited to grids or text, GraphARC extends the few-shot transformation learning approach of the Abstraction and Reasoning Corpus (ARC) to graphs. Each task requires inferring a transformation rule from a few input-output graph pairs and applying it to a new test graph; the transformations span local, global, and hierarchical changes. Instances can be generated across various graph families and sizes, which supports systematic evaluation of generalization. Tests on state-of-the-art language models reveal a comprehension-execution gap: the models can answer questions about graph properties but often fail at full graph transformation tasks, and performance degrades further as complexity increases. The benchmark is described in a paper available on arXiv (2605.31031).

Key facts

GraphARC is a benchmark for abstract reasoning on graph-structured data.
It generalizes the few-shot transformation learning paradigm of the Abstraction and Reasoning Corpus (ARC).
Each task requires inferring a transformation rule from a few input-output pairs and applying it to a new test graph.
Transformations cover local, global, and hierarchical graph changes.
GraphARC instances can be generated at scale across diverse graph families and sizes.
State-of-the-art language models show a comprehension-execution gap on GraphARC.
Models can answer questions about graph properties but often fail to solve full transformation tasks.
Performance further degrades with increasing complexity.
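
To make the task format concrete, here is a minimal Python sketch of a few-shot graph transformation task. Everything in it is an illustrative assumption rather than something taken from the GraphARC paper: the adjacency-set representation, the two candidate rules (remove_leaves, remove_max_degree), and the brute-force selection over candidates are all hypothetical. It shows only the general shape of the problem: pick a rule consistent with every demonstration pair, then apply it to an unseen graph.

```python
# Hypothetical illustration only: the rules, names, and data layout below are
# assumptions for this sketch, not the GraphARC paper's actual task format.

def make_graph(edges):
    """Build an undirected graph as a dict mapping node -> set of neighbours."""
    g = {}
    for u, v in edges:
        g.setdefault(u, set()).add(v)
        g.setdefault(v, set()).add(u)
    return g

def remove_leaves(g):
    """Candidate rule: delete every node of degree 1 (a local change)."""
    keep = {n for n, nbrs in g.items() if len(nbrs) != 1}
    return {n: g[n] & keep for n in keep}

def remove_max_degree(g):
    """Candidate rule: delete every node that has the maximum degree."""
    top = max((len(nbrs) for nbrs in g.values()), default=0)
    keep = {n for n, nbrs in g.items() if len(nbrs) != top}
    return {n: g[n] & keep for n in keep}

def fits(rule, pairs):
    """A rule fits if it reproduces the output of every demonstration pair."""
    return all(rule(x) == y for x, y in pairs)

# Few-shot demonstrations: (input graph, output graph) pairs.
path = make_graph([(0, 1), (1, 2), (2, 3)])
star = make_graph([(0, 1), (0, 2), (0, 3)])
train = [
    (path, {1: {2}, 2: {1}}),  # both endpoints of the path removed
    (star, {0: set()}),        # only the hub of the star remains
]

# Unseen test graph: a triangle (0, 1, 2) with a tail node 3 attached to 2.
test_graph = make_graph([(0, 1), (1, 2), (2, 0), (2, 3)])

# Infer the rule from the demonstrations, then apply it to the test graph.
candidates = [remove_max_degree, remove_leaves]
chosen = next(rule for rule in candidates if fits(rule, train))
prediction = chosen(test_graph)
print(chosen.__name__, prediction)
```

In this toy setup the hypothesis space is two hand-written functions; a real solver has to search a far larger space of rules, which is where a gap between understanding graph properties and executing a full transformation can appear.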
