ARTFEED — Contemporary Art Intelligence

AI-Generated Python Refactoring Pull Requests Show Mixed Quality and Security Results

other · 2026-05-22

A new empirical study posted on arXiv (2605.21453) examines the quality and security of AI-generated Python refactoring pull requests using the AIDev dataset. Researchers applied PyQu, an ML-based quality assessment tool, alongside the static analyzers Pylint and Bandit to measure changes across five quality attributes. Results show that agentic commits improve a quality attribute in 22.5% of changes, with usability improvements being the most common. However, security and maintainability issues persist after the AI edits, highlighting risks in AI-driven code contributions.
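As a rough illustration of the before/after measurement described above, the sketch below compares static-analysis findings for one change and computes the share of changes that improved. It is not the study's pipeline: PyQu is not reproduced, and the function names, the labeling rule (fewer total findings means "improved"), and the sample reports are all assumptions made for illustration. The inputs are assumed to be already-parsed JSON reports, such as those produced by `pylint --output-format=json` and `bandit -f json`.

```python
from collections import Counter


def pylint_counts(messages):
    """Count Pylint messages by type from a parsed JSON report (a list of dicts)."""
    return Counter(m["type"] for m in messages)


def bandit_counts(report):
    """Count Bandit findings by severity from a parsed JSON report (a dict with "results")."""
    return Counter(r["issue_severity"] for r in report.get("results", []))


def classify(before, after):
    """Label a change by total finding count. Hypothetical rule, not the study's."""
    total_before, total_after = sum(before.values()), sum(after.values())
    if total_after < total_before:
        return "improved"
    if total_after > total_before:
        return "regressed"
    return "unchanged"


def share_improved(labels):
    """Percentage of changes labeled 'improved' (0.0 for an empty list)."""
    if not labels:
        return 0.0
    return 100.0 * sum(1 for label in labels if label == "improved") / len(labels)


# Made-up sample reports for a single change; these are not data from the study.
pylint_before = [
    {"type": "warning", "symbol": "unused-variable"},
    {"type": "convention", "symbol": "missing-function-docstring"},
]
pylint_after = [{"type": "warning", "symbol": "unused-variable"}]
bandit_before = {"results": [{"issue_severity": "MEDIUM", "test_id": "B602"}]}
bandit_after = {"results": [{"issue_severity": "MEDIUM", "test_id": "B602"}]}

labels = [
    classify(pylint_counts(pylint_before), pylint_counts(pylint_after)),
    classify(bandit_counts(bandit_before), bandit_counts(bandit_after)),
]
print(labels, share_improved(labels))
```

In this sample the Pylint findings drop from two to one while the Bandit finding is untouched, which mirrors the pattern the study reports: a change can improve one aspect of quality and still leave a security issue in place.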

Key facts

  • Study analyzes Python refactoring PRs from the AIDev dataset
  • Uses PyQu, Pylint, and Bandit for quality and security assessment
  • Agentic commits improve a quality attribute in 22.5% of changes
  • Usability is the most improved quality attribute
  • Security and maintainability issues remain after AI edits
  • Research addresses gap in empirical evidence on AI code contributions
  • Findings published on arXiv with identifier 2605.21453
  • Study focuses on real-world GitHub repositories

Entities

Institutions

  • arXiv
  • AIDev
