AI-Generated Text Detection Fails in Real-World Scenarios, Study Finds
A new study posted to arXiv (2603.23146) reveals that AI-generated text detectors can fail in real-world settings despite high benchmark accuracy. The researchers propose an interpretable detection framework that combines linguistic feature engineering, machine learning, and explainable AI. Their model, trained on 30 linguistic features, achieved an F1 score of 0.9734 on the PAN CLEF 2025 and COLING 2025 benchmarks. However, systematic cross-domain and cross-generator evaluation showed substantial generalization failure, suggesting that detectors often exploit dataset-specific artifacts rather than signals of genuine machine authorship.
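To make the general approach concrete, here is a minimal sketch of a feature-based detector: hand-crafted linguistic features fed to a standard classifier and scored with F1. The paper's actual 30 features, classifier, and training data are not reproduced here; the five toy features, the tiny corpus, and the scikit-learn model below are illustrative assumptions only.

```python
# Minimal sketch (not the paper's implementation): a few hand-crafted
# linguistic features plus a scikit-learn classifier, scored with F1.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score


def linguistic_features(text: str) -> list[float]:
    """Toy stand-ins for hand-crafted linguistic features."""
    words = text.split()
    n_words = max(len(words), 1)
    sentences = [s for s in text.replace("!", ".").replace("?", ".").split(".") if s.strip()]
    return [
        len(words),                                 # document length
        len({w.lower() for w in words}) / n_words,  # type-token ratio
        sum(len(w) for w in words) / n_words,       # mean word length
        len(words) / max(len(sentences), 1),        # mean sentence length
        text.count(",") / n_words,                  # comma density
    ]


# Hypothetical labelled corpus: label 1 = machine-generated, 0 = human-written.
corpus = [
    ("The experimental results were consistent across all evaluated settings.", 1),
    ("In summary, the proposed approach offers several notable advantages.", 1),
    ("Furthermore, the analysis demonstrates robust performance overall.", 1),
    ("honestly no idea why it broke again, it worked fine yesterday??", 0),
    ("we grabbed coffee, argued about the draft, fixed two typos, done.", 0),
    ("can someone reply to the reviewer, I'm swamped this week.", 0),
]

X = np.array([linguistic_features(t) for t, _ in corpus])
y = np.array([label for _, label in corpus])

# Fit and score on the same tiny corpus purely to show the mechanics;
# a real evaluation would use held-out benchmark splits.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("F1 on training data:", f1_score(y, clf.predict(X)))
```

Interpretable features of this kind are what allow the explainable-AI component of such a framework to attribute a detection decision to specific linguistic properties rather than to an opaque embedding.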
Key facts
- Study published on arXiv (2603.23146)
- Model trained on 30 linguistic features
- Achieved an F1 score of 0.9734 on the PAN CLEF 2025 and COLING 2025 benchmarks
- Cross-domain and cross-generator evaluation revealed substantial generalization failure (illustrated in the sketch after this list)
- Detectors may exploit dataset-specific artifacts
- Framework integrates linguistic feature engineering, machine learning, and explainable AI
- Research investigates whether detectors identify machine authorship or dataset artifacts
- Widespread adoption of LLMs makes detection a pressing challenge
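The generalization finding can be illustrated with a leave-one-domain-out protocol: train on all domains but one, then test on the held-out domain. The sketch below uses synthetic data in which the class signal sits in a different feature for each domain, mimicking a dataset-specific artifact that does not transfer; the domain names and data are hypothetical and are not the paper's corpora.

```python
# Illustrative leave-one-domain-out check (synthetic data, not the study's corpora).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(0)

n_per_domain, n_features = 200, 30            # 30 features, matching the study's count
domains = ["news", "essays", "reviews"]       # hypothetical domains

X, y, groups = [], [], []
for d_idx, d in enumerate(domains):
    labels = rng.integers(0, 2, n_per_domain)
    signal = np.zeros((n_per_domain, n_features))
    # The "artifact": the label signal lives in a different feature per domain,
    # so a detector trained in-domain learns a cue that does not transfer.
    signal[:, d_idx] = labels * 2.0
    X.append(signal + rng.normal(size=(n_per_domain, n_features)))
    y.append(labels)
    groups.extend([d] * n_per_domain)

X, y, groups = np.vstack(X), np.concatenate(y), np.array(groups)

# Train on two domains, test on the third; low held-out F1 signals
# reliance on dataset-specific artifacts rather than transferable cues.
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    held_out = groups[test_idx][0]
    print(f"held-out domain={held_out}: F1={f1_score(y[test_idx], clf.predict(X[test_idx])):.3f}")
```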
Entities
Venues and benchmarks
- arXiv
- PAN CLEF 2025
- COLING 2025