New Algorithm Detects Human vs. LLM Text Segments
Researchers propose algorithms that segment text co-authored by humans and large language models (LLMs) by adapting change point detection from time-series analysis. The method identifies which parts of a passage were written by a human and which by an LLM, addressing a limitation of existing binary classifiers, which assign a single label to an entire text. The authors develop a weighted algorithm and a generalized algorithm to handle varying detection scores, and prove that the procedure is minimax optimal. Empirical results reported in the paper (arXiv:2605.03723) show strong performance.
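To make the change point idea concrete, here is a minimal sketch of single change point detection over per-sentence detector scores. It is not the paper's algorithm: the function name `best_split`, the least-squares cost, and the optional `weights` argument are all assumptions chosen for illustration. The scores are assumed to come from some existing detector, with higher values meaning "more likely LLM-written".

```python
from typing import Optional, Sequence, Tuple


def best_split(scores: Sequence[float],
               weights: Optional[Sequence[float]] = None) -> Tuple[int, float]:
    """Return (k, cost), where k (1 <= k < n) is the split index that minimizes
    the weighted squared error when scores[:k] and scores[k:] are each
    approximated by their own weighted mean.

    Illustrative only: a generic least-squares change point estimator,
    not the procedure from arXiv:2605.03723.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to place a change point")
    w = [1.0] * n if weights is None else [float(x) for x in weights]
    if len(w) != n or any(x <= 0 for x in w):
        raise ValueError("weights must be positive and match scores in length")

    # Prefix sums of w, w*s and w*s^2 give each segment cost in O(1).
    W, S, Q = [0.0], [0.0], [0.0]
    for s, wi in zip(scores, w):
        W.append(W[-1] + wi)
        S.append(S[-1] + wi * s)
        Q.append(Q[-1] + wi * s * s)

    def segment_cost(a: int, b: int) -> float:
        # Weighted sum of squared deviations from the weighted mean on [a, b).
        mass = W[b] - W[a]
        total = S[b] - S[a]
        return (Q[b] - Q[a]) - total * total / mass

    best_k, best_cost = 1, float("inf")
    for k in range(1, n):
        cost = segment_cost(0, k) + segment_cost(k, n)
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, best_cost
```

For hypothetical scores such as `[0.1, 0.2, 0.1, 0.9, 0.8, 0.9]`, the split lands at index 3, separating the low-scoring opening sentences from the high-scoring remainder. Passages with several authorship switches would need a multiple change point extension, which this sketch does not attempt.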
Key facts
- The paper (arXiv:2605.03723) proposes segmenting human-LLM co-authored text.
- The approach adapts change point detection from time-series analysis.
- A weighted algorithm and a generalized algorithm are developed.
- The authors prove that the procedure is minimax optimal.
- Empirical results demonstrate strong performance.
- Existing detectors only provide binary classification for entire passages.
- The work addresses the need to localize specific authored segments.
- The spread of large language models creates an urgent need to verify text authenticity.
Entities
Institutions
- arXiv