Refute-or-Promote Methodology Boosts LLM-Assisted Defect Discovery with 83% Kill Rate
Refute-or-Promote is a new adversarial multi-agent review methodology aimed at the precision problem in LLM-assisted defect discovery, where a high share of incorrect reports undermines the credibility of the valid ones. Candidates are generated with Stratified Context Hunting and then reviewed under adversarial kill mandates and context asymmetry. A Cross-Model Critic adds review by a different model family to catch correlated blind spots that same-family review might miss, and cold-start reviewers are included to reduce anchoring cascades during review. In a 31-day evaluation across seven targets, including security libraries and the ISO C++ standard, the pipeline eliminated approximately 79% of 171 candidates before disclosure. In a consolidated-protocol subset of 30 candidates from lcms2 and wolfSSL, the prospective kill rate reached 83%. The campaign yielded four CVEs (three publicly disclosed, one still embargoed) and the acceptance of LWG 4549.
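The refute-or-promote idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the names `Candidate`, `Reviewer`, `refute_or_promote`, and `kill_rate` are hypothetical, and the two toy reviewers stand in for what would be LLM agents from different model families. Each reviewer sees only the candidate itself and never another reviewer's verdict, which is one simple way to model cold-start review.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Candidate:
    """A suspected defect awaiting review (hypothetical structure)."""
    title: str
    evidence: str


# A reviewer returns a refutation string, or None if it cannot refute.
Reviewer = Callable[[Candidate], Optional[str]]


def refute_or_promote(
    candidates: List[Candidate], reviewers: List[Reviewer]
) -> Tuple[List[Candidate], List[Tuple[Candidate, str]]]:
    """Promote a candidate only if no reviewer manages to refute it."""
    promoted: List[Candidate] = []
    killed: List[Tuple[Candidate, str]] = []
    for cand in candidates:
        reason = None
        for review in reviewers:
            # Each reviewer gets the bare candidate, not earlier verdicts.
            reason = review(cand)
            if reason is not None:
                break  # one successful refutation kills the candidate
        if reason is None:
            promoted.append(cand)
        else:
            killed.append((cand, reason))
    return promoted, killed


def kill_rate(promoted: list, killed: list) -> float:
    """Fraction of reviewed candidates that were killed."""
    total = len(promoted) + len(killed)
    return len(killed) / total if total else 0.0


# Toy stand-ins for reviewers; real ones would call LLMs (assumption).
def needs_evidence(c: Candidate) -> Optional[str]:
    return "no supporting evidence" if not c.evidence.strip() else None


def rejects_unreachable(c: Candidate) -> Optional[str]:
    if "unreachable" in c.evidence.lower():
        return "code path marked unreachable"
    return None


demo = [
    Candidate("A", "reproducer attached"),
    Candidate("B", ""),
    Candidate("C", "Unreachable branch in parser"),
]
promoted, killed = refute_or_promote(demo, [needs_evidence, rejects_unreachable])
```

In this toy run only candidate A survives both reviewers. The design choice worth noting is the default: a candidate is promoted only when every attempt to refute it fails, so the burden of proof sits with the report rather than with the reviewer.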
Key facts
- Methodology named Refute-or-Promote improves LLM-assisted defect discovery precision
- Combines Stratified Context Hunting, adversarial kill mandates, context asymmetry, and Cross-Model Critic
- 31-day campaign tested across 7 targets including security libraries and ISO C++ standard
- Pipeline killed roughly 79% of 171 candidates before disclosure
- Consolidated-protocol subset (lcms2, wolfSSL; n=30) showed 83% prospective kill rate
- Resulted in 4 CVEs (3 public, 1 embargoed)
- LWG 4549 was accepted
- Cold-start reviewers reduce anchoring cascades
- Cross-family review catches correlated blind spots
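
The percentages above can be turned into approximate candidate counts. The short Python sketch below does that arithmetic; the helper `implied_counts` is a hypothetical name, and the resulting counts are estimates implied by the rounded percentages, not figures reported in the source.

```python
def implied_counts(total: int, kill_rate: float) -> tuple:
    """Estimate (killed, surviving) counts from a rounded kill rate."""
    killed = round(total * kill_rate)
    return killed, total - killed


# Full campaign: roughly 79% of 171 candidates killed.
campaign_killed, campaign_survived = implied_counts(171, 0.79)

# Consolidated-protocol subset (lcms2, wolfSSL): 83% of 30 candidates.
subset_killed, subset_survived = implied_counts(30, 0.83)
```

Under these assumptions, about 135 of the 171 campaign candidates were killed and about 36 survived to disclosure, while the subset works out to about 25 killed and 5 surviving (25/30 is roughly 83%).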