RAG Failures Explained Through Circuit-Level Attribution Graphs

ai-technology · 2026-05-16

A new study posted to arXiv (2605.14192) investigates why Retrieval-Augmented Generation (RAG) systems produce incorrect answers even when they have access to external evidence. Using circuit tracing, the researchers constructed attribution graphs that model how information flows through transformer layers during decoding. These graphs expose the interactions among retrieved context, intermediate activations, and generated tokens. Across multiple question-answering benchmarks, a consistent structural difference emerged: correct predictions exhibit deeper reasoning pathways than incorrect ones. The work offers a model-internal, circuit-level view of how external evidence is integrated into reasoning, and with it a new perspective on RAG failures.

Key facts

Study examines why RAG fails despite external evidence
Uses circuit tracing to build attribution graphs
Graphs model information flow through transformer layers
Correct predictions show deeper reasoning pathways
Findings consistent across multiple QA benchmarks
Paper available on arXiv with ID 2605.14192
Provides circuit-level view of evidence integration
Focuses on model-internal mechanisms
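
To make the "deeper reasoning pathways" finding concrete, here is a minimal sketch of one way to measure pathway depth in an attribution graph. It is not the paper's method: the summary does not say how depth is defined, so this assumes the graph is a directed acyclic graph and that depth means the number of edges on the longest path from a retrieved-context node to the output token. The function name, node names, and both toy graphs are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def longest_path_depth(edges, sources, target):
    """Longest path length (in edges) from any source node to target.

    Assumes `edges` describes a directed acyclic graph.
    Returns -1 if the target cannot be reached from any source.
    """
    graph = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for src, dst in edges:
        graph[src].append(dst)
        indegree[dst] += 1
        nodes.update((src, dst))

    # Kahn's algorithm: visit nodes in topological order.
    order = []
    ready = [n for n in nodes if indegree[n] == 0]
    while ready:
        node = ready.pop()
        order.append(node)
        for nxt in graph[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Propagate the best known depth forward from the source nodes only.
    depth = {s: 0 for s in sources}
    for node in order:
        if node not in depth:
            continue  # not reachable from any source
        for nxt in graph[node]:
            depth[nxt] = max(depth.get(nxt, -1), depth[node] + 1)
    return depth.get(target, -1)


# Hypothetical graph for a correct answer: evidence passes through
# several intermediate features before reaching the output token.
correct_edges = [
    ("ctx:doc1", "layer4:feat_a"),
    ("layer4:feat_a", "layer12:feat_b"),
    ("layer12:feat_b", "layer20:feat_c"),
    ("layer20:feat_c", "out:answer"),
]

# Hypothetical graph for an incorrect answer: the evidence takes a short
# route, and a feature unrelated to the context also feeds the output.
incorrect_edges = [
    ("ctx:doc1", "layer4:feat_a"),
    ("layer4:feat_a", "out:answer"),
    ("prior:feat_x", "out:answer"),
]

correct_depth = longest_path_depth(correct_edges, ["ctx:doc1"], "out:answer")
incorrect_depth = longest_path_depth(incorrect_edges, ["ctx:doc1"], "out:answer")
print(correct_depth, incorrect_depth)
```

In this toy setup the first graph has the longer context-to-output path. Real attribution graphs also carry edge weights and are typically pruned before analysis, which this sketch leaves out.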
