FINER-SQL: Fine-Grained Feedback Boosts Small Language Models for Text-to-SQL

ai-technology · 2026-05-07

FINER-SQL is a reinforcement learning framework for improving small language models (SLMs) on Text-to-SQL generation. Large language models (LLMs) have driven recent progress in this area, but their high computational cost, latency, and data privacy risks make them impractical for many real-world deployments. SLMs allow efficient, private on-premise deployment, yet they often reason poorly and follow instructions unreliably. Conventional reinforcement learning relies on sparse binary rewards (0/1), which give almost no learning signal when the generated SQL is incorrect and lead to unstable training. FINER-SQL replaces this sparse supervision with dense, interpretable rewards that provide continuous feedback even for erroneous queries. The framework is built on group relative policy optimization and is described as scalable and reusable.
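
To make the contrast between sparse and dense rewards concrete, the following is a minimal sketch, not FINER-SQL's actual reward design. The source does not say how FINER-SQL computes its rewards, so `dense_reward` and its row-overlap (Jaccard) score are assumptions chosen purely for illustration, as are the toy `users` table and the helper names.

```python
import sqlite3

def run_query(conn, sql):
    """Execute sql and return its rows as a set, or None if the query fails.
    Using a set ignores row order and duplicates (a simplification)."""
    try:
        return set(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None

def binary_reward(conn, pred_sql, gold_sql):
    """Sparse 0/1 reward: 1 only if the predicted result equals the gold result."""
    pred, gold = run_query(conn, pred_sql), run_query(conn, gold_sql)
    return 1.0 if pred is not None and pred == gold else 0.0

def dense_reward(conn, pred_sql, gold_sql):
    """Hypothetical dense reward (an assumption, not the paper's definition):
    Jaccard overlap between predicted and gold result rows."""
    pred, gold = run_query(conn, pred_sql), run_query(conn, gold_sql)
    if pred is None or gold is None:
        return 0.0
    union = pred | gold
    return 1.0 if not union else len(pred & gold) / len(union)

# Toy in-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "Ann", 30), (2, "Bob", 17), (3, "Cy", 45)],
)

gold = "SELECT name FROM users WHERE age >= 18"
near_miss = "SELECT name FROM users WHERE age > 30"   # returns only part of the answer
broken = "SELEC name FROM users"                      # syntax error

for label, sql in [("near miss", near_miss), ("broken", broken)]:
    print(label, binary_reward(conn, sql, gold), dense_reward(conn, sql, gold))
```

Under the binary reward, the near-miss query and the broken query are indistinguishable (both score 0), whereas the illustrative dense score separates a partially correct query from one that does not run at all.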

Key facts

FINER-SQL is a reinforcement learning framework for small language models (SLMs) in Text-to-SQL.
Large language models (LLMs) have driven advances in Text-to-SQL but have high computational cost, latency, and privacy concerns.
SLMs enable efficient and private on-premise deployment but have weak reasoning and instruction following.
Conventional reinforcement learning uses sparse binary rewards (0/1) that provide little learning signal for incorrect SQL queries.
FINER-SQL replaces sparse supervision with dense and interpretable rewards offering continuous feedback.
The framework is built on group relative policy optimization.
FINER-SQL aims to enhance SLMs for Text-to-SQL without the cost, latency, and privacy drawbacks of LLMs.
The approach is scalable and reusable.
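
The link between binary rewards and weak learning signal can be sketched with the group-relative advantage used in group relative policy optimization, where each sampled output is scored against the mean and standard deviation of its own group. This is a simplified sketch under stated assumptions: the function name, the `eps` constant, the use of the population standard deviation, and the example reward values are illustrative and not taken from FINER-SQL.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its group: (r - mean) / (std + eps).
    eps (an assumed small constant) avoids division by zero when all rewards match."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four sampled SQL outputs for one question, all incorrect.
binary_rewards = [0.0, 0.0, 0.0, 0.0]   # sparse 0/1 reward: every sample scores 0
dense_rewards = [0.0, 0.2, 0.5, 0.1]    # hypothetical partial-credit scores

print(group_relative_advantages(binary_rewards))  # all advantages are zero
print(group_relative_advantages(dense_rewards))   # samples are ranked against each other
```

When every sample in a group is wrong, binary rewards yield identical scores and therefore zero advantages, so that group contributes no gradient. Graded rewards keep the advantages non-zero, which lets the policy still prefer the less-wrong queries.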
