Innovation as an Almost Characterization of LLM Hallucination

ai-technology · 2026-05-27

A recent paper on arXiv (2605.26808) introduces "innovation," a property of large language models that measures their tendency to produce outputs outside their training data. The authors show that innovation is implied by the hallucination condition of Kalai and Vempala (STOC 2024) and present it as an almost characterization of hallucination. The paper addresses two questions: what makes hallucination inevitable in calibrated LLMs, and whether giving up calibration can prevent it. The work builds on the probabilistic framework of Kalai and Vempala, which defines calibration and hallucination and shows that calibrated LLMs hallucinate at a rate corresponding to the "missing mass."
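For readers unfamiliar with the term, "missing mass" is the total probability of outcomes that never appear in a sample. A standard way to estimate it is the Good-Turing estimator: the fraction of samples whose value occurs exactly once. The sketch below is illustrative only. The function name and the toy data are hypothetical, and the exact quantities used in the paper may be defined differently.

```python
from collections import Counter


def good_turing_missing_mass(samples):
    """Good-Turing estimate of the probability mass of unseen outcomes.

    Returns (number of values seen exactly once) / (number of samples).
    """
    if not samples:
        raise ValueError("need at least one sample")
    counts = Counter(samples)
    # Values that occur exactly once in the sample ("singletons").
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(samples)


# Hypothetical toy corpus: each string stands in for one observed fact.
facts = ["a", "a", "b", "c", "c", "c", "d", "e"]
estimate = good_turing_missing_mass(facts)
print(estimate)  # 3 singletons (b, d, e) out of 8 samples -> 0.375
```

A corpus in which every fact is repeated gives an estimate of 0, while one in which every fact appears only once gives 1. Under the result quoted above, the second kind of corpus is where a calibrated model would be expected to hallucinate most.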

Key facts

Paper on arXiv: 2605.26808
Introduces property called 'innovation'
Innovation measures tendency to produce outputs outside training data
Innovation is implied by Kalai and Vempala's hallucination condition
Innovation is an almost characterization of hallucination
Addresses two fundamental questions about LLM hallucination
Builds on Kalai and Vempala (STOC 2024) framework
Kalai and Vempala showed calibrated LLMs hallucinate at rate of 'missing mass'
