Study Reveals Skepticism Shift: Humans Increasingly Distrust Real Audio After Deepfake Exposure

ai-technology · 2026-05-27

A large-scale study of how people perceive audio deepfakes, posted on arXiv, reports a 'skepticism shift.' Human accuracy on fake audio stayed roughly flat relative to a 2021 baseline (72.9% to 71.2%), while accuracy on genuine speech fell from 72.7% to 64.1%, meaning listeners more often mistook real recordings for fakes. The study collected 35,532 judgments from 1,768 participants across 138 text-to-speech and voice conversion systems. Commercial systems and autoregressive language-model-based systems were the hardest to detect (61.3% to 65.9% accuracy), whereas traditional seq2seq and flow-matching models were easier to spot (75.4% to 76.8%). A machine learning detector maintained over 94.5% accuracy in all scenarios, suggesting that the main effect of deepfakes on listeners is eroded trust in real audio rather than a decline in their ability to detect fakes.

Key facts

35,532 judgments from 1,768 participants
138 text-to-speech and voice conversion systems tested
Human accuracy on fake samples: 72.9% (2021 baseline) to 71.2% (current)
Human accuracy on real samples dropped from 72.7% to 64.1%
Commercial and autoregressive language model systems hardest to detect (61.3-65.9%)
Traditional seq2seq and flow-matching models easier to spot (75.4-76.8%)
ML detector maintained over 94.5% accuracy
Study published on arXiv (2605.26136)
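
To make the size of the shift concrete, the reported figures can be turned into percentage-point changes with a few lines of Python. This is only an illustrative sketch built from the numbers listed above; the variable and function names are hypothetical, and the per-participant figure is a simple average that assumes nothing about how the study actually distributed samples.

```python
# Figures as reported in the key facts above (percent accuracy).
FAKE_BASELINE, FAKE_CURRENT = 72.9, 71.2
REAL_BASELINE, REAL_CURRENT = 72.7, 64.1
JUDGMENTS, PARTICIPANTS = 35_532, 1_768


def point_change(before: float, after: float) -> float:
    """Change in percentage points, rounded to one decimal place."""
    return round(after - before, 1)


fake_shift = point_change(FAKE_BASELINE, FAKE_CURRENT)
real_shift = point_change(REAL_BASELINE, REAL_CURRENT)

# Simple average only; individual participants may have rated more or fewer samples.
mean_judgments = round(JUDGMENTS / PARTICIPANTS, 1)

print(f"Fake-audio accuracy change: {fake_shift} points")
print(f"Real-audio accuracy change: {real_shift} points")
print(f"Average judgments per participant: {mean_judgments}")
```

On these numbers the drop on real audio is about five times the drop on fake audio, which is the asymmetry the headline describes.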
