CRM Detects When LLMs Rely on Memory Over Retrieved Context

ai-technology · 2026-05-27

A new preprint on arXiv (arXiv:2605.26778) identifies an 'attribution blind spot' in retrieval-augmented generation (RAG): a language model can produce output that is consistent with the retrieved context while actually drawing on parametric memory rather than the external evidence. The authors introduce Computational Reality Monitoring (CRM), adapted from cognitive science, to detect this failure by comparing the model's internal representations with and without retrieved context. CRM reveals representational divergence that output-level monitors miss, offering a way to check whether retrieved context actually governs generation. The paper addresses a critical gap for high-stakes deployment of RAG systems.

Key facts

arXiv paper 2605.26778 identifies the attribution blind spot in RAG
The standard assumption that context-consistent output implies context-governed output is flawed
Models can produce faithful-looking text from parametric memory when retrieved documents overlap with pretraining data
Computational Reality Monitoring (CRM) compares internal representations with and without context
CRM is adapted from cognitive science's reality monitoring framework
CRM detects membership-conditioned representational divergence missed by output-level monitors
CRM does not certify which source an output originates from
The study addresses a prerequisite for high-stakes deployment of RAG
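
The idea of comparing internal representations with and without context can be illustrated with a minimal sketch. This is not the paper's implementation: the cosine-based divergence, the per-layer comparison, the function names, and the threshold value are all assumptions made for illustration, and the hidden states are toy vectors rather than activations from a real model.

```python
import math
from typing import List, Sequence


def cosine_divergence(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity between two hidden-state vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("zero vector has no direction")
    return 1.0 - dot / (norm_a * norm_b)


def layerwise_divergence(
    with_context: Sequence[Sequence[float]],
    without_context: Sequence[Sequence[float]],
) -> List[float]:
    """Compare one hidden state per layer from two runs of the same query:
    one run with the retrieved passage in the prompt, one without it."""
    if len(with_context) != len(without_context):
        raise ValueError("both runs must cover the same layers")
    return [
        cosine_divergence(h_ctx, h_no_ctx)
        for h_ctx, h_no_ctx in zip(with_context, without_context)
    ]


def context_seems_ignored(divergences: Sequence[float], threshold: float = 0.05) -> bool:
    """Hypothetical decision rule: if no layer moves by more than `threshold`
    when the context is added, flag the output for review. The threshold is
    an arbitrary illustrative value, not one taken from the paper."""
    return max(divergences) < threshold


# Toy hidden states (two layers, two dimensions), not real model activations.
with_ctx = [[1.0, 0.0], [0.0, 1.0]]
without_ctx = [[1.0, 0.0], [1.0, 0.0]]

divergences = layerwise_divergence(with_ctx, without_ctx)
flagged = context_seems_ignored(divergences)
```

In a real setting the two lists would come from running the same model twice and reading out its hidden states. A low divergence in this sketch is only a signal worth investigating; consistent with the key facts above, it does not certify which source an output originates from.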
