ARTFEED — Contemporary Art Intelligence

FinRAG-12B: Banking LLM Achieves 12% Refusal Rate with 143M Tokens

ai-technology · 2026-05-09

Researchers introduced FinRAG-12B, a 12-billion-parameter language model for grounded question answering in banking. The model is trained with a data-efficient pipeline that uses only 143 million tokens, combining LLM-as-a-Judge filtering, citation annotation, and curriculum learning. It outperforms GPT-4.1 on citation grounding while maintaining high answer quality. A calibrated refusal mechanism, trained on a data mix containing 22% unanswerable examples, yields a 12% 'I don't know' rate, up from the base model's unsafely low 4.3%. The work addresses banking industry demands for accuracy, regulatory compliance, and verifiable responses.
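
The refusal calibration described above amounts to controlling the fraction of training examples whose correct target is a refusal. A minimal sketch of that idea, with hypothetical names (`build_refusal_mix`, the 22% constant is from the article, everything else is an assumption):

```python
import random

# Fraction of training examples with no supporting context, per the article.
UNANSWERABLE_FRACTION = 0.22

def build_refusal_mix(answerable, unanswerable, fraction=UNANSWERABLE_FRACTION, seed=0):
    """Return a shuffled training mix with the given unanswerable fraction.

    `answerable` is a list of (question, grounded_answer) pairs; items from
    `unanswerable` get the refusal target "I don't know" so the model learns
    to abstain instead of fabricating an answer.
    """
    rng = random.Random(seed)
    # Solve n_unans / (n_answerable + n_unans) = fraction for n_unans.
    n_unans = round(len(answerable) * fraction / (1 - fraction))
    picked = rng.sample(unanswerable, min(n_unans, len(unanswerable)))
    mix = [{"q": q, "target": a} for q, a in answerable]
    mix += [{"q": q, "target": "I don't know"} for q in picked]
    rng.shuffle(mix)
    return mix
```

This is only the data-mixing step; the article's "calibrated" rate (12% observed refusals at inference) would additionally depend on how well the trained model generalizes the abstention behavior.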

Key facts

  • FinRAG-12B is a 12B parameter LLM for banking question answering.
  • Training uses only 143M tokens with LLM-as-a-Judge filtering.
  • Outperforms GPT-4.1 on citation grounding.
  • Calibrated refusal mechanism yields 12% 'I don't know' rate.
  • Base model's refusal rate was an unsafely low 4.3%.
  • Trained on 22% unanswerable examples.
  • Addresses banking industry demands for accuracy and compliance.
  • Uses curriculum learning and citation annotation.
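
The filtering and curriculum stages listed above can be sketched as two passes over the candidate data: keep only examples a judge model rates highly, then order the survivors easy-to-hard. All names here (`judge_score`, `difficulty`, `curate`, the score scale) are illustrative assumptions, not the paper's API:

```python
def judge_score(example):
    # Placeholder for an LLM-as-a-Judge call: a real pipeline would prompt
    # a strong LLM to rate answer quality and citation grounding (e.g. 1-5).
    return example.get("quality", 0)

def difficulty(example):
    # Placeholder curriculum signal, e.g. how many source passages
    # the answer must cite.
    return len(example.get("citations", []))

def curate(examples, min_score=4):
    """Keep examples the judge rates highly, then order them easy-to-hard."""
    kept = [ex for ex in examples if judge_score(ex) >= min_score]
    return sorted(kept, key=difficulty)
```

The design point is that both stages are cheap relative to pretraining, which is how a pipeline like this can get by on 143M curated tokens.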

Entities

Institutions

  • arXiv
