ARTFEED — Contemporary Art Intelligence

Simpson's Paradox in Machine-Generated Text Detection

other · 2026-05-09

A new arXiv paper (2605.06294) argues that the dominant method for distinguishing human-written from machine-generated text suffers from Simpson's paradox. The likelihood hypothesis, which assumes machine-generated text receives higher likelihood under a detector model, breaks down because the token-level signal is non-uniform across the detector's hidden space: naively averaging likelihood scores across regions with different statistical structure destroys strong local signals. The authors propose a learned local calibration step, grounded in Bayesian decision theory, that uses lightweight predictors of score distributions conditioned on position in the hidden space to correct these aggregation errors.
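
To make the aggregation failure concrete, here is a small numerical illustration in Python (toy numbers, not the paper's data): within each of two regions of the detector's hidden space, machine text scores higher than human text, yet the naive token-weighted average across regions reverses the ordering because the two classes concentrate in different regions.

```python
# Toy illustration (not the paper's data) of Simpson's paradox in naive
# averaging of likelihood scores across regions with different structure.

# Per-region mean log-likelihood and token count under a hypothetical detector.
# Within EACH region, machine text scores higher than human text.
regions = {
    "region_A": {"human": (-6.0, 10), "machine": (-5.5, 90)},
    "region_B": {"human": (-2.0, 90), "machine": (-1.5, 10)},
}

def pooled_mean(label):
    """Naive token-weighted average across regions for one label."""
    pairs = [regions[r][label] for r in regions]
    total = sum(mean * n for mean, n in pairs)
    count = sum(n for _, n in pairs)
    return total / count

for name, scores in regions.items():
    h, m = scores["human"][0], scores["machine"][0]
    print(f"{name}: human={h:.2f}  machine={m:.2f}  machine higher: {m > h}")

h_all, m_all = pooled_mean("human"), pooled_mean("machine")
# The pooled ordering flips: machine text now looks LESS probable overall.
print(f"pooled:   human={h_all:.2f}  machine={m_all:.2f}  machine higher: {m_all > h_all}")
```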

Key facts

  • arXiv paper 2605.06294 addresses detection of machine-generated text.
  • Dominant approach uses likelihood hypothesis: machine text appears more probable.
  • Token-level signal is non-uniform across hidden space of detector model.
  • Naive averaging causes Simpson's paradox, destroying strong local signals.
  • Proposed solution: learned local calibration step based on Bayesian decision theory.
  • Calibration uses lightweight predictors of score distributions conditioned on position (a minimal sketch follows this list).
  • Paper demonstrates that inappropriate aggregation is a key flaw in current detectors.
  • The authors frame reliable detection of machine-generated text as a problem of broad societal importance.
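
The calibration step itself is only described at a high level above; the following is a minimal sketch under assumed details. The hidden states, the raw scores, and the choice of ridge regressors as the "lightweight predictors" are illustrative stand-ins, not the authors' implementation: each token's score is standardized against a predicted position-conditional mean and spread before any averaging, so that regions are no longer pooled on incompatible scales.

```python
# Minimal sketch, NOT the paper's implementation: a learned local
# calibration that standardizes per-token likelihood scores against a
# predicted position-conditional score distribution before aggregation.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Stand-in calibration data: hidden-state positions and raw log-likelihood
# scores of tokens from known human-written text.
H_calib = rng.normal(size=(2000, 16))                        # hidden states
s_calib = H_calib @ rng.normal(size=16) + rng.normal(scale=0.5, size=2000)

# Lightweight predictors of the local score distribution: one model for the
# conditional mean, one for the conditional spread (via absolute residuals).
mean_model = Ridge(alpha=1.0).fit(H_calib, s_calib)
scale_model = Ridge(alpha=1.0).fit(
    H_calib, np.abs(s_calib - mean_model.predict(H_calib))
)

def calibrated_document_score(hidden_states, raw_scores):
    """Standardize each token score against its predicted local distribution,
    then aggregate; this illustrates the aggregation correction, not the
    paper's exact estimator."""
    mu = mean_model.predict(hidden_states)
    sigma = np.maximum(scale_model.predict(hidden_states), 1e-3)
    z = (raw_scores - mu) / sigma         # locally calibrated token scores
    return z.mean()                       # aggregate only after calibration

# Usage with a hypothetical test passage of 120 tokens.
H_test = rng.normal(size=(120, 16))
s_test = H_test @ rng.normal(size=16) + 0.8                  # raw detector scores
print("calibrated document score:", calibrated_document_score(H_test, s_test))
```

A document-level decision can then follow Bayesian decision theory, for example flagging text as machine-generated only when the calibrated score exceeds a threshold chosen from class priors and misclassification costs.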

Sources

  • arXiv:2605.06294