ARTFEED — Contemporary Art Intelligence

TPA: New Method Detects LLM Hallucinations by Attributing Token Probabilities

ai-technology · 2026-04-24

Researchers have introduced TPA (Next Token Probability Attribution), a technique for pinpointing hallucinations in Retrieval-Augmented Generation (RAG) systems. Whereas earlier methods framed hallucination as a binary conflict between the model's internal knowledge and the retrieved data, TPA decomposes each token's probability into contributions from seven distinct sources: the query, the RAG context, past tokens, the self token, the feedforward networks (FFN), the final LayerNorm, and the initial embedding. By mathematically relating each token's likelihood to these sources, TPA quantifies their roles in the generation process, and aggregating the attribution scores by Part-of-Speech tag shows how the different model components contribute to hallucinations. The full paper is available on arXiv as 2512.07515.
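To make the idea of attributing a token's probability to additive sources concrete, here is a minimal, hypothetical sketch: in a transformer, the hidden state feeding the unembedding is (up to the final LayerNorm) a sum of residual-stream components, so projecting each component separately yields a per-source contribution to the token's logit. The source names, dimensions, and random values below are illustrative assumptions, not the paper's actual formulation, and the nonlinear final LayerNorm is omitted to keep the decomposition exactly linear.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab = 16, 50

# Illustrative component vectors, named after six of the article's sources
# (the final LayerNorm is left out of this purely linear toy example).
sources = ["query", "rag_context", "past_token", "self_token",
           "ffn", "initial_embedding"]
components = {name: rng.normal(size=d_model) for name in sources}

hidden = sum(components.values())          # residual stream = sum of parts
W_U = rng.normal(size=(d_model, vocab))    # unembedding matrix (made up here)

token_id = 7
total_logit = hidden @ W_U[:, token_id]
attributions = {name: vec @ W_U[:, token_id]
                for name, vec in components.items()}

# Linearity: the per-source contributions sum exactly to the token's logit.
assert np.isclose(sum(attributions.values()), total_logit)
```

Because the map from hidden state to logit is linear, the per-source scores add up to the total, which is what makes this style of attribution well defined.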

Key facts

  • TPA attributes token probability to seven sources: query, RAG context, past token, self token, FFN, final LayerNorm, initial embedding.
  • Prior methods only considered binary conflict between FFN and retrieved context.
  • Aggregation by Part-of-Speech tags quantifies component contributions.
  • Paper available on arXiv as arXiv:2512.07515 (v4, replace-cross).
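The Part-of-Speech aggregation mentioned above can be sketched as grouping per-token attribution scores by POS tag and averaging. The tokens, tags, and scores below are invented for illustration; the paper's actual scores and tag set may differ.

```python
from collections import defaultdict

# Hypothetical per-token attribution scores for one source (say, the FFN),
# paired with Part-of-Speech tags; all values here are made up.
tokens = [("The", "DET", 0.05), ("cat", "NOUN", 0.42),
          ("sat", "VERB", 0.31), ("quickly", "ADV", 0.12),
          ("on", "ADP", 0.08), ("mat", "NOUN", 0.55)]

totals, counts = defaultdict(float), defaultdict(int)
for _, pos, score in tokens:
    totals[pos] += score
    counts[pos] += 1

# Mean attribution per POS tag: one number per grammatical category.
mean_by_pos = {pos: totals[pos] / counts[pos] for pos in totals}
print(mean_by_pos)   # e.g. NOUN -> (0.42 + 0.55) / 2 = 0.485
```

Averaging within each tag turns token-level scores into a compact per-category summary, which is how a component's influence can be compared across word classes.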

Entities

Institutions

  • arXiv

Sources