ClaimRAG-LAW: A Bilingual Benchmark for Legal RAG Systems

ai-technology · 2026-05-22

Researchers have introduced ClaimRAG-LAW, a dataset for evaluating retrieval-augmented generation (RAG) systems in the legal domain. The benchmark covers French and English and targets both legal experts and non-experts. Legal RAG systems are used to mitigate hallucinations in large language model (LLM) responses, but fine-grained frameworks for evaluating them are lacking: existing benchmarks are largely English-only and focused on expert queries. ClaimRAG-LAW includes diverse question types so that retrieval and generation performance can be analyzed separately. The dataset is designed to improve the reliability of LLM-based legal question answering.

Key facts

ClaimRAG-LAW is a dataset for legal RAG evaluation.
It supports French and English.
It targets both legal experts and non-experts.
Existing legal RAG benchmarks lack granularity and are largely English-only and expert-focused.
RAG systems are used to reduce hallucinations in LLMs.
The dataset includes diverse question types.
It enables separate evaluation of retrieval and generation.
The work is published on arXiv.
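
To illustrate what evaluating retrieval and generation separately can look like, here is a minimal Python sketch. It is not the benchmark's actual protocol: the metrics (recall@k for retrieval, token-overlap F1 for generation), the function names, and the example record fields are all assumptions chosen for illustration.

```python
from collections import Counter


def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant passages found in the top-k retrieved list."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(retrieved_ids[:k]) & relevant) / len(relevant)


def token_f1(prediction, reference):
    """Token-overlap F1, a simple stand-in for a generation quality metric."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return 0.0
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def evaluate(example, k=3):
    """Score the two stages independently so a failure can be attributed to one."""
    return {
        "retrieval_recall": recall_at_k(example["retrieved"], example["relevant"], k),
        "generation_f1": token_f1(example["answer"], example["reference"]),
    }


# Hypothetical record: passage ids and answers are made up for the example.
example = {
    "retrieved": ["a", "b", "c", "d"],
    "relevant": ["b", "d"],
    "answer": "the tenant may appeal",
    "reference": "the tenant may appeal",
}
scores = evaluate(example, k=3)
```

Keeping the two scores apart makes it possible to tell whether a poor answer stems from missing passages or from the model misusing passages that were retrieved correctly.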
