SAFEdit: Multi-Agent Framework for Reliable Code Editing
SAFEdit is a multi-agent framework designed to improve the reliability of code editing performed by large language models (LLMs). On the EditBench benchmark, 39 of 40 evaluated models achieve a task success rate (TSR) below 60%, exposing a substantial gap between general code generation and instruction-based editing under executable tests. SAFEdit decomposes the editing workflow into distinct roles: a Planner Agent produces a clear, visibility-aware editing plan; an Editor Agent applies minimal, literal changes to the code; and a Verifier Agent runs the actual tests. When tests fail, a Failure Abstraction Layer (FAL) converts the raw test logs into structured diagnostics, which the Editor uses for further refinement. The paper compares the framework against prior single-model approaches.
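The control flow described above can be sketched as a simple loop. This is a hypothetical illustration, not the paper's implementation: the agent internals (`plan_fn`, `edit_fn`, `run_tests_fn`) are stubbed callables, and the toy FAL merely parses an assumed `FAILED <name>: <reason>` log line into a structured record.

```python
from dataclasses import dataclass

@dataclass
class Diagnostic:
    """Structured feedback produced by the Failure Abstraction Layer (sketch)."""
    failing_test: str
    summary: str

def failure_abstraction_layer(raw_log: str) -> Diagnostic:
    # Toy FAL: pull the first "FAILED <name>: <reason>" line out of a raw log.
    # The real FAL's log format and parsing are not specified here.
    for line in raw_log.splitlines():
        if line.startswith("FAILED "):
            name, _, reason = line[len("FAILED "):].partition(": ")
            return Diagnostic(failing_test=name, summary=reason)
    return Diagnostic(failing_test="<unknown>", summary=raw_log.strip())

def safedit_loop(code, instruction, plan_fn, edit_fn, run_tests_fn, max_rounds=3):
    """Planner -> Editor -> Verifier loop with FAL-mediated refinement."""
    plan = plan_fn(code, instruction)           # Planner: editing strategy
    feedback = None
    for _ in range(max_rounds):
        code = edit_fn(code, plan, feedback)    # Editor: minimal, literal change
        passed, raw_log = run_tests_fn(code)    # Verifier: execute real tests
        if passed:
            return code, True
        feedback = failure_abstraction_layer(raw_log)  # FAL: logs -> diagnostics
    return code, False
```

In this sketch the Editor only sees structured `Diagnostic` objects, never raw logs, which mirrors the FAL's role as the interface between test execution and refinement.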
Key facts
- SAFEdit is a multi-agent framework for instructed code editing.
- It decomposes editing into Planner, Editor, and Verifier agents.
- A Failure Abstraction Layer (FAL) converts test logs into structured feedback.
- EditBench benchmark shows 39 of 40 models have TSR below 60%.
- The framework aims to reduce unintended code changes.
- Iterative refinement is supported via feedback loops.
- The paper is on arXiv with ID 2604.25737.
- SAFEdit is compared against prior single-model approaches.
Entities
Institutions
- arXiv