ARTFEED — Contemporary Art Intelligence

LLMs Show Systematic Epistemic-Rhetorical Miscalibration

ai-technology · 2026-04-24

A recent study posted to arXiv (2604.19768) presents a framework for measuring the gap between rhetorical intensity and epistemic grounding in large language models. The researchers developed a triadic taxonomy of epistemic-rhetorical markers (ERM), operationalized through three composite metrics: form-meaning divergence (FMD), the genuine-to-performed epistemic ratio (GPR), and rhetorical device distribution entropy (RDDE). Analyzing 225 argumentative texts (~0.6M tokens) drawn from human experts, human non-experts, and LLM-generated sources, they found a consistent epistemic signature across models: LLM texts use tricolon nearly twice as often as experts (Δ=0.95), human authors use erotema at more than twice the LLM rate, and performed hesitancy markers appear in LLM output at double the density of human text, indicating substantial form-meaning divergence in LLM content.
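The paper's exact metric definitions are not reproduced here, but two of the three can be sketched plausibly from their names: RDDE reads as a Shannon entropy over a text's rhetorical-device counts, and GPR as a simple ratio of genuine to performed epistemic markers. A minimal Python illustration under those assumptions (the marker counts below are hypothetical, not the study's data):

```python
from collections import Counter
from math import log2

def device_entropy(device_counts: Counter) -> float:
    """Shannon entropy (bits) of a text's rhetorical device distribution.
    Higher values mean usage is spread more evenly across device types;
    a sketch of what an RDDE-style metric might compute."""
    total = sum(device_counts.values())
    probs = [c / total for c in device_counts.values() if c > 0]
    return -sum(p * log2(p) for p in probs)

def genuine_to_performed_ratio(genuine: int, performed: int) -> float:
    """GPR-style ratio of genuine to performed epistemic markers.
    A small epsilon guards against division by zero when a text
    contains no performed markers."""
    return genuine / (performed + 1e-9)

# Hypothetical per-text counts, chosen only to mirror the study's
# direction of effect (tricolon-heavy, erotema-light LLM output).
llm_devices = Counter({"tricolon": 19, "erotema": 4, "anaphora": 7})

print(device_entropy(llm_devices))             # entropy in bits
print(genuine_to_performed_ratio(12, 24))      # ratio below 1 = more performed
```

A distribution concentrated on one device drives the entropy toward 0, while an even spread over k devices approaches log2(k); a GPR below 1 means performed markers outnumber genuine ones, matching the reported doubling of performed hesitancy in LLM text.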

Key facts

  • Study from arXiv:2604.19768
  • Proposes framework for quantifying epistemic-rhetorical miscalibration
  • Triadic ERM taxonomy with FMD, GPR, RDDE metrics
  • Analyzed 225 argumentative texts (~0.6M tokens)
  • Compared human expert, human non-expert, and LLM sub-corpora
  • LLMs use tricolon at nearly twice expert rate (Δ=0.95)
  • Humans use erotema at more than twice LLM rate
  • Performed hesitancy markers twice as dense in LLM output

Entities

Institutions

  • arXiv

Sources