ARTFEED — Contemporary Art Intelligence

Code Evaluation Metrics Tested for Plagiarism Detection

other · 2026-04-30

A new study on arXiv (2604.25778) examines how well Code Evaluation Metrics (CEMs) can detect source code plagiarism across six levels of modification, L1 through L6. The researchers evaluated five metrics, CodeBLEU, CrystalBLEU, RUBY, Tree Structured Edit Distance (TSED), and CodeBERTScore, on the ConPlag and IRPlag datasets, and compared them against the established detection tools JPlag and Dolos. The results indicate that, without preprocessing, these metrics cannot reliably identify plagiarized code.
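
For intuition, here is a minimal sketch of the underlying setup: a similarity metric scores a suspect submission against each reference program, and the highest-scoring pairs become plagiarism candidates. Everything in the sketch (the tokenize and similarity helpers, and the use of difflib.SequenceMatcher as the metric) is an illustrative stand-in, not one of the CEMs or tools the paper evaluates.

```python
import difflib
import re


def tokenize(source: str) -> list[str]:
    """Crude lexer: split source code into identifier, number,
    and punctuation tokens; whitespace is discarded."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)


def similarity(candidate: str, reference: str) -> float:
    """Token-level similarity in [0, 1]; a simplistic stand-in for a
    real CEM such as CodeBLEU or TSED."""
    matcher = difflib.SequenceMatcher(None, tokenize(candidate), tokenize(reference))
    return matcher.ratio()


def rank_references(suspect: str, corpus: dict[str, str]) -> list[tuple[str, float]]:
    """Score the suspect against every reference program and return
    (name, score) pairs, most similar first."""
    scores = [(name, similarity(suspect, src)) for name, src in corpus.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    original = "int total = 0; for (int i = 0; i < n; i++) { total += a[i]; }"
    renamed = "int sum = 0; for (int k = 0; k < n; k++) { sum += a[k]; }"
    unrelated = "String s = in.readLine(); System.out.println(s.trim());"
    for name, score in rank_references(renamed, {"original": original, "unrelated": unrelated}):
        print(f"{name}: similarity {score:.2f}")
```

Simple identifier renaming (a low-level modification) barely changes the token stream, so such a pair scores high; higher-level modifications restructure the code and erode this kind of surface similarity, which is the regime the study probes.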

Key facts

  • Study compares five CEMs against the state-of-the-art source code plagiarism detection tools (SCPDTs) JPlag and Dolos
  • Uses ConPlag (raw and template-free) and IRPlag datasets
  • Evaluates plagiarism across modification levels L1-L6
  • CEMs tested: CodeBLEU, CrystalBLEU, RUBY, TSED, CodeBERTScore
  • Threshold-free ranking-based measures used for evaluation (see the sketch after this list)
  • Findings indicate CEMs cannot reliably detect plagiarism without preprocessing
  • Published on arXiv with ID 2604.25778
  • Focuses on academic integrity in software engineering education
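
As referenced above, a threshold-free ranking-based evaluation judges a metric by how it orders labelled code pairs rather than by a fixed similarity cutoff. The sketch below uses one common ranking measure, an AUC-style statistic: the probability that a plagiarized pair receives a higher score than a non-plagiarized pair. The digest does not say which ranking measures the paper adopts, so treat this as an illustrative assumption.

```python
def ranking_auc(plagiarized: list[float], innocent: list[float]) -> float:
    """Threshold-free ranking measure: probability that a randomly chosen
    plagiarized pair outscores a randomly chosen non-plagiarized pair
    (ties count as half). Equivalent to the area under the ROC curve."""
    wins = 0.0
    for p in plagiarized:
        for q in innocent:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(plagiarized) * len(innocent))


# Hypothetical similarity scores a metric assigned to labelled code pairs.
plagiarized_scores = [0.92, 0.81, 0.64, 0.55]  # known plagiarism cases
innocent_scores = [0.70, 0.48, 0.33, 0.21]     # independent solutions

print(f"ranking AUC = {ranking_auc(plagiarized_scores, innocent_scores):.2f}")
# 1.0: every plagiarized pair outranks every innocent pair; 0.5: chance level.
```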

Entities

Institutions

  • arXiv

Sources