ARTFEED — Contemporary Art Intelligence

VerbatimRAG: Hallucination-Free QA for Research Papers

ai-technology · 2026-05-22

Researchers have developed VerbatimRAG, an extractive question answering system that avoids hallucinations in AI-assisted research by mapping user queries directly to verbatim text spans in retrieved documents, so every answer is quoted from a source rather than generated. The system is applied to the ACL Anthology and uses a novel ground truth dataset built from synthetic queries with the ScIRGen methodology and annotated by NLP researchers. A 150M-parameter ModernBERT model is trained and evaluated on this benchmark. The approach addresses the tendency of LLMs to produce factually inaccurate output and offers a reliable way to collect high-quality information from trusted sources.
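To make the idea of verbatim span extraction concrete, here is a minimal Python sketch. It is not the authors' implementation: a simple token-overlap score stands in for the trained ModernBERT model, and the names Span, extract_spans, is_verbatim and the min_overlap parameter are hypothetical, chosen for illustration only. What it shows is the property the summary describes: each answer is a character-offset slice of a retrieved document, so it can be checked against the source.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    doc_id: str
    start: int  # character offset into the document, inclusive
    end: int    # character offset into the document, exclusive
    text: str


def _tokens(s: str) -> set[str]:
    # Lowercased alphanumeric tokens; a toy stand-in for a real tokenizer.
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def extract_spans(query: str, docs: dict[str, str], min_overlap: int = 2) -> list[Span]:
    """Return sentence-level spans sharing at least `min_overlap` tokens with the query.

    Assumption: token overlap replaces the learned relevance model here.
    """
    query_tokens = _tokens(query)
    spans: list[Span] = []
    for doc_id, text in docs.items():
        # Locate sentence candidates by offset so each span is a literal slice.
        for match in re.finditer(r"[^.!?]+[.!?]?", text):
            raw = match.group()
            stripped = raw.strip()
            if not stripped:
                continue
            start = match.start() + (len(raw) - len(raw.lstrip()))
            end = start + len(stripped)
            if len(query_tokens & _tokens(stripped)) >= min_overlap:
                spans.append(Span(doc_id, start, end, text[start:end]))
    return spans


def is_verbatim(span: Span, docs: dict[str, str]) -> bool:
    # The check that matters: the answer text is exactly what the source says.
    return docs[span.doc_id][span.start:span.end] == span.text


docs = {
    "paper-1": (
        "ModernBERT is an encoder model. It was trained on a large corpus. "
        "Extractive QA selects spans from text."
    )
}
results = extract_spans("How does extractive QA select spans?", docs)
for span in results:
    print(span.doc_id, span.start, span.end, repr(span.text))
```

Because the output is a set of offsets rather than free text, a downstream interface can highlight the quoted passage in the original paper, and any span that fails is_verbatim can be rejected outright.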

Key facts

  • VerbatimRAG is an extractive QA system for research papers.
  • It maps user queries to verbatim text spans in retrieved documents.
  • Applied to the ACL Anthology.
  • Uses a novel ground truth dataset based on synthetic queries and the ScIRGen methodology.
  • Human annotation performed by NLP researchers.
  • A 150M-parameter ModernBERT model is trained and evaluated.
  • Addresses the LLM hallucination problem in research.
  • arXiv paper ID: 2605.21102.

Entities

Institutions

  • ACL Anthology
  • arXiv

Sources