BUILD-AND-FIND: New Protocol Measures Codebase Clarity for Downstream Agents
Researchers have introduced BUILD-AND-FIND, a protocol for assessing whether downstream coding agents can accurately recover intended design decisions from generated repositories. The protocol responds to the growing prevalence of agent-driven, repository-level engineering, in which one agent creates a repository that later agents inspect, audit, or extend. In this setting, a generated repository is both a solution to a task and a communication artifact for future work. Even when agents meet observable behavioral goals, repositories can differ in how clearly they expose intended behaviors and design choices. BUILD-AND-FIND measures both the accuracy of downstream recovery and the inspection effort it requires. A builder constructs a codebase from a hidden repository specification; a finder then sees only that codebase and a bank of specification-traced multiple-choice questions. The protocol is described in a paper on arXiv (2605.06136) and is intended for evaluating agent-managed codebases.
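The scoring side of such a protocol can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Question` and `FinderAnswer` schemas, the `score` function, and the use of "files inspected" as a proxy for inspection effort are all assumptions made here for illustration, since the summary above does not say how the authors define either metric.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Question:
    """One specification-traced multiple-choice question (hypothetical schema)."""
    prompt: str
    options: tuple[str, ...]
    correct_index: int  # answer key derived from the hidden specification


@dataclass(frozen=True)
class FinderAnswer:
    """A finder's response to one question (hypothetical schema)."""
    chosen_index: int
    files_inspected: int  # assumed proxy for inspection effort


def score(questions: Sequence[Question],
          answers: Sequence[FinderAnswer]) -> dict[str, float]:
    """Return recovery accuracy and mean inspection effort per question."""
    if not questions:
        raise ValueError("question bank is empty")
    if len(questions) != len(answers):
        raise ValueError("expected exactly one answer per question")
    n = len(questions)
    # Count the questions where the finder picked the specification's answer.
    correct = sum(q.correct_index == a.chosen_index
                  for q, a in zip(questions, answers))
    # Total effort across all questions, under the assumed proxy.
    effort = sum(a.files_inspected for a in answers)
    return {"accuracy": correct / n, "mean_files_inspected": effort / n}


# Illustrative data only; these questions are invented, not from the paper.
questions = [
    Question("Which cache eviction policy is used?", ("LRU", "FIFO", "LFU"), 0),
    Question("Where are retry limits set?", ("config file", "hard-coded", "env vars"), 2),
]
answers = [FinderAnswer(chosen_index=0, files_inspected=3),
           FinderAnswer(chosen_index=1, files_inspected=5)]
result = score(questions, answers)
print(result)
```

Reporting the two numbers separately, rather than folding them into one score, keeps it visible when a repository yields correct answers only after heavy inspection.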
Key facts
- BUILD-AND-FIND is a protocol for evaluating agent-managed codebases.
- It measures downstream agents' ability to recover intended design choices.
- The protocol assesses both accuracy and inspection effort.
- A builder creates a codebase from a hidden specification.
- A finder sees only the codebase and a bank of specification-traced multiple-choice questions.
- The approach targets repository-level engineering in which one agent builds a repository and later agents inspect, audit, or extend it.
- Repositories are seen as communication artifacts for future work.
- The paper is available on arXiv (2605.06136).
Entities
Institutions
- arXiv