ClaimRAG-LAW: A Bilingual Benchmark for Legal RAG Systems
Researchers have introduced ClaimRAG-LAW, a dataset for evaluating retrieval-augmented generation (RAG) systems in the legal domain. The benchmark covers French and English and targets both legal experts and non-experts. It addresses the lack of fine-grained evaluation frameworks for legal RAG systems, which are used to mitigate hallucinations in large language model (LLM) responses. Existing benchmarks are largely English-only and focused on expert queries. ClaimRAG-LAW includes diverse question types so that retrieval and generation performance can be analyzed separately. The dataset is intended to improve the reliability of LLM-based legal question answering.
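To make the idea of analyzing retrieval and generation separately more concrete, here is a minimal sketch in Python. It is not the benchmark's own scoring method: the metrics (recall@k for retrieval, token-overlap F1 for generation), the record fields, and the passage identifiers are all assumptions chosen for illustration.

```python
from collections import Counter


def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of relevant passages that appear in the top-k retrieved list."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    top_k = set(retrieved_ids[:k])
    return len(top_k & relevant) / len(relevant)


def token_f1(prediction, reference):
    """Token-overlap F1 between a generated answer and a reference answer."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most as often
    # as it occurs in both answers.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical record; field names and values are illustrative only.
example = {
    "retrieved_ids": ["art_12", "art_7", "art_3"],
    "relevant_ids": ["art_7", "art_9"],
    "generated_answer": "the tenant must give three months notice",
    "reference_answer": "the tenant must give notice three months in advance",
}

# The two stages are scored independently, so a weak retriever and a
# weak generator can be told apart.
retrieval_score = recall_at_k(example["retrieved_ids"], example["relevant_ids"], k=3)
generation_score = token_f1(example["generated_answer"], example["reference_answer"])

print(f"retrieval recall@3: {retrieval_score:.2f}")
print(f"generation token F1: {generation_score:.2f}")
```

Keeping the two scores apart shows whether a poor answer stems from missing source passages or from the model's use of the passages it was given.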
Key facts
- ClaimRAG-LAW is a dataset for legal RAG evaluation.
- It supports French and English.
- It targets both legal experts and non-experts.
- Existing legal RAG benchmarks lack fine-grained evaluation and are largely English-only and expert-focused.
- RAG systems are used to reduce hallucinations in LLMs.
- The dataset includes diverse question types.
- It enables separate evaluation of retrieval and generation.
- The work is published on arXiv.
Entities
Institutions
- arXiv