Multi-Agent NL2SQL Method Achieves 78.1% Accuracy on BIRD Benchmark

ai-technology · 2026-05-20

A new multi-agent method for natural language to SQL (NL2SQL) conversion reaches 78.1% semantic accuracy on the BIRD benchmark, according to a paper posted on arXiv. The approach builds a semantically enriched representation of the user-provided schema and incorporates user-provided business rules when generating SQL queries. Its key contributions are an optimized orchestrator for the multi-agent solution, which uses LLMs for planning, orchestration, reflection, and self-correction, and an advanced schema enrichment technique. The study addresses the persistent accuracy gap between LLM-based NL2SQL systems and human expert SQL writers, aiming to improve accuracy for practical applications that rely on relational databases.

Key facts

The method achieves 78.1% semantic accuracy on the BIRD benchmark.
It uses a semantically enriched representation of the user-provided schema.
User-provided business rules are incorporated into the process.
The solution is multi-agent with an optimized orchestrator.
LLMs are used for planning, orchestration, reflection, and self-correction.
Advanced schema enrichment was developed as part of the method.
The paper is published on arXiv with ID 2605.19010.
The work targets the NL2SQL problem for relational databases.
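
To make the ideas in the list above concrete, here is a minimal sketch of how an enriched schema, business rules, and a self-correction loop might fit together. It is not the paper's implementation: the table, column descriptions, business rule, prompt layout, and every function name (build_prompt, generate_with_self_correction, stub_llm, demo) are hypothetical, and a stub stands in for a real LLM call. The loop simply feeds the database error message back into the next prompt, which is one common way to implement self-correction.

```python
import sqlite3
from typing import Callable, List, Optional, Tuple

# Hypothetical enriched schema: each column carries a plain-language description.
ENRICHED_SCHEMA = {
    "orders": {
        "id": "unique order identifier",
        "amount": "order total in USD",
        "status": "one of 'paid' or 'refunded'",
    }
}

# Hypothetical user-provided business rule.
BUSINESS_RULES = ["Revenue counts only orders with status = 'paid'."]


def build_prompt(question: str, feedback: Optional[str] = None) -> str:
    """Render schema descriptions, rules, and any error feedback as prompt text."""
    lines = ["Schema:"]
    for table, columns in ENRICHED_SCHEMA.items():
        for column, description in columns.items():
            lines.append(f"  {table}.{column}: {description}")
    lines.append("Rules:")
    lines.extend(f"  {rule}" for rule in BUSINESS_RULES)
    lines.append(f"Question: {question}")
    if feedback:
        lines.append(f"Previous attempt failed: {feedback}")
    return "\n".join(lines)


def generate_with_self_correction(
    conn: sqlite3.Connection,
    question: str,
    llm: Callable[[str], str],
    max_attempts: int = 3,
) -> Tuple[str, List[tuple]]:
    """Ask the model for SQL, run it, and retry with the error message on failure."""
    feedback = None
    for _ in range(max_attempts):
        sql = llm(build_prompt(question, feedback))
        try:
            return sql, conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            feedback = str(exc)  # reflection input for the next attempt
    raise RuntimeError(f"no valid SQL after {max_attempts} attempts: {feedback}")


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model: wrong column first, fixed once feedback appears."""
    if "Previous attempt failed" in prompt:
        return "SELECT SUM(amount) FROM orders WHERE status = 'paid'"
    return "SELECT SUM(total) FROM orders WHERE status = 'paid'"


def demo() -> Tuple[str, List[tuple]]:
    """Build a tiny in-memory database and run the loop against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, status TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, 10.0, "paid"), (2, 5.0, "refunded"), (3, 2.5, "paid")],
    )
    return generate_with_self_correction(conn, "What is total revenue?", stub_llm)


if __name__ == "__main__":
    final_sql, result_rows = demo()
    print(final_sql)
    print(result_rows)
```

In this sketch the first generated query references a column that does not exist, the database error is appended to the prompt, and the second attempt succeeds. A real system would replace stub_llm with a model call and would likely also check results for semantic errors, not just execution failures.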
