AI Text Detection Models Struggle with Distribution Shift
A recent arXiv paper (2605.03969) evaluates transformer-based detectors of AI-generated text, training them on HC3 PLUS and testing them across domains and generators. The detectors reach up to 99.5% balanced accuracy in-domain, but performance drops under distribution shift when they are transferred to M4 and AI-Text-Detection-Pile. Augmenting the detectors with attention-based linguistic feature fusion improves transfer, and DeBERTa-v3-base+FeatAttn performs best.
Key facts
- arXiv paper 2605.03969
- Trains on HC3 PLUS
- Tests on M4 benchmark and AI-Text-Detection-Pile
- In-domain balanced accuracy up to 99.5%
- Performance degrades under distribution shift
- Feature augmentation improves transfer
- Best model: DeBERTa-v3-base+FeatAttn
- Single decision threshold fixed across tests
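To make the evaluation setup concrete, here is a minimal sketch of scoring a detector with balanced accuracy at one fixed decision threshold on both an in-domain and a shifted test set. The scores, labels, and the 0.5 threshold are invented for illustration; they are not taken from the paper, which this summary does not quote in that detail.

```python
def balanced_accuracy(labels, scores, threshold):
    """Mean of true-positive rate and true-negative rate at a fixed threshold.

    labels: 1 = AI-generated, 0 = human-written.
    scores: detector probability that the text is AI-generated.
    """
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        pred = 1 if s >= threshold else 0
        if y == 1:
            if pred == 1:
                tp += 1
            else:
                fn += 1
        else:
            if pred == 0:
                tn += 1
            else:
                fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    tpr = tp / (tp + fn)  # recall on AI-generated text
    tnr = tn / (tn + fp)  # recall on human-written text
    return (tpr + tnr) / 2


# Hypothetical toy data: the detector separates classes cleanly in-domain,
# but its scores drift toward the threshold on the shifted set.
in_domain_labels = [1, 1, 1, 1, 0, 0, 0, 0]
in_domain_scores = [0.97, 0.91, 0.88, 0.76, 0.05, 0.12, 0.20, 0.31]
shifted_labels = [1, 1, 1, 1, 0, 0, 0, 0]
shifted_scores = [0.93, 0.62, 0.44, 0.38, 0.08, 0.27, 0.55, 0.71]

THRESHOLD = 0.5  # assumed value; chosen once and reused for every test set

in_domain_bacc = balanced_accuracy(in_domain_labels, in_domain_scores, THRESHOLD)
shifted_bacc = balanced_accuracy(shifted_labels, shifted_scores, THRESHOLD)
print(f"in-domain: {in_domain_bacc:.2f}, shifted: {shifted_bacc:.2f}")
```

Keeping the threshold fixed, rather than re-tuning it per test set, means any drop on the shifted data reflects how well the detector's scores transfer and not a per-dataset calibration step.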
Entities
Institutions
- arXiv