ARTFEED — Contemporary Art Intelligence

Semantic Context Boosts LLM Accuracy on Database Queries by 17-23 Points

ai-technology · 2026-04-30

A recent study posted on arXiv (2604.25149) finds that supplying LLMs with a semantic context document substantially improves their accuracy on natural-language queries against analytical databases. Researchers evaluated three leading models (Claude Opus 4.7, Claude Sonnet 4.6, and GPT-5.4) on 100 questions over the Cleaned Contoso Retail Dataset in ClickHouse. Using a paired single-shot protocol, each model answered every question twice: once with only the warehouse schema, and once with the schema plus a 4 KB markdown document detailing measures, conventions, and disambiguation rules. Adding the document lifted accuracy by +17 to +23 percentage points for every model. With the document, the models scored between 67.7% and 68.7%; without it, their scores were statistically indistinguishable from one another. The authors note that both wrong answers and confident hallucinations arise when models try to infer business semantics that the schema does not encode.
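
The paired protocol described above can be sketched as follows. This is an illustrative mock-up, not the paper's code: the table, the semantic-context snippets, and the `build_prompt` helper are all hypothetical, and the real context document is a full 4 KB markdown file.

```python
# Sketch of the paired single-shot protocol: each question is asked twice,
# once with the schema alone and once with the schema plus a semantic
# context document. All names here are illustrative, not from the paper.

SCHEMA = """\
CREATE TABLE sales (
    order_date Date,
    product_key UInt32,
    net_price Decimal(18, 2),
    quantity UInt16
) ENGINE = MergeTree ORDER BY order_date;
"""

# ~4 KB in the study; abbreviated here. Spells out measures, conventions,
# and disambiguation rules that the schema alone cannot express.
SEMANTIC_CONTEXT = """\
## Measures
- "revenue" means SUM(net_price * quantity), not SUM(net_price).
## Conventions
- "last year" is the most recent complete calendar year in the data.
## Disambiguation
- "best-selling" ranks by quantity unless revenue is requested explicitly.
"""

def build_prompt(question: str, with_context: bool) -> str:
    """Assemble one single-shot prompt; the paired run flips with_context."""
    parts = ["You answer questions by writing one ClickHouse SQL query.",
             "Schema:\n" + SCHEMA]
    if with_context:
        parts.append("Semantic context:\n" + SEMANTIC_CONTEXT)
    parts.append("Question: " + question)
    return "\n\n".join(parts)

question = "What was our revenue last year?"
bare = build_prompt(question, with_context=False)
enriched = build_prompt(question, with_context=True)
# Only the enriched prompt tells the model what "revenue" means; with the
# bare prompt the model must guess, which is where hallucinations arise.
```

The point of the design is that the schema alone fixes syntax but not semantics: both prompts yield valid SQL targets, but only the enriched one disambiguates business terms like "revenue".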

Key facts

  • Study published on arXiv under ID 2604.25149
  • Benchmarked Claude Opus 4.7, Claude Sonnet 4.6, and GPT-5.4
  • Used Cleaned Contoso Retail Dataset in ClickHouse
  • 100 natural-language questions tested
  • Paired single-shot protocol applied
  • Accuracy improved by +17 to +23 percentage points with semantic context
  • With context, models scored 67.7-68.7%
  • Without context, models were statistically indistinguishable

Entities

Institutions

  • arXiv

Sources