Refute-or-Promote Methodology Boosts LLM-Assisted Defect Discovery with 83% Kill Rate

ai-technology · 2026-04-22

Refute-or-Promote is a new adversarial multi-agent review methodology aimed at the precision problem in LLM-assisted defect discovery, where a high share of incorrect reports undermines credibility. Candidates are generated with Stratified Context Hunting and then reviewed under adversarial kill mandates and context asymmetry. Cold-start reviewers are included to limit anchoring cascades during review, and a Cross-Model Critic adds cross-family review to catch correlated blind spots that same-family review might miss. Over a 31-day evaluation across seven targets, including security libraries and the ISO C++ standard, the pipeline killed roughly 79% of 171 candidates before disclosure. In a consolidated-protocol subset covering lcms2 and wolfSSL (30 candidates), the prospective kill rate reached 83%. The campaign yielded four CVEs, three publicly disclosed and one still embargoed, along with the acceptance of LWG 4549.
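To make the refute-or-promote idea concrete, here is a minimal Python sketch of how such a gate might be wired. It is an illustration under stated assumptions, not the authors' implementation: the names Candidate, refute_or_promote, reviewer_a and reviewer_b are hypothetical, the two reviewers are simple keyword checks standing in for LLM reviewers from different model families, and the rule that a single refutation kills a candidate is assumed rather than taken from the source.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Candidate:
    ident: str
    claim: str
    refutations: List[str] = field(default_factory=list)


# A reviewer returns a refutation string if it can kill the claim, else None.
Reviewer = Callable[[str], Optional[str]]


def refute_or_promote(
    candidates: List[Candidate], reviewers: List[Reviewer]
) -> Tuple[List[Candidate], List[Candidate]]:
    """Promote a candidate only if no reviewer manages to refute it."""
    promoted, killed = [], []
    for cand in candidates:
        for review in reviewers:
            # Assumed cold start: each reviewer sees only the claim text,
            # never the verdicts of earlier reviewers.
            verdict = review(cand.claim)
            if verdict is not None:
                cand.refutations.append(verdict)
        (killed if cand.refutations else promoted).append(cand)
    return promoted, killed


# Hypothetical stand-ins for reviewers from two different model families.
def reviewer_a(claim: str) -> Optional[str]:
    return "code path is unreachable" if "unreachable" in claim else None


def reviewer_b(claim: str) -> Optional[str]:
    return "documented behaviour" if "by design" in claim else None


candidates = [
    Candidate("c1", "overflow in parser, unreachable from public API"),
    Candidate("c2", "null dereference in decoder"),
    Candidate("c3", "silent truncation, by design per docs"),
]
promoted, killed = refute_or_promote(candidates, [reviewer_a, reviewer_b])
kill_rate = len(killed) / len(candidates)
print([c.ident for c in promoted], [c.ident for c in killed], round(kill_rate, 2))
```

In this toy run, candidate c3 survives reviewer_a and is killed only by reviewer_b, which is the kind of gap a second reviewer with different blind spots is meant to close.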

Key facts

Methodology named Refute-or-Promote improves LLM-assisted defect discovery precision
Combines Stratified Context Hunting, adversarial kill mandates, context asymmetry, and Cross-Model Critic
31-day campaign tested across 7 targets including security libraries and ISO C++ standard
Pipeline killed roughly 79% of 171 candidates before disclosure
Consolidated-protocol subset (lcms2, wolfSSL; n=30) showed 83% prospective kill rate
Resulted in 4 CVEs (3 public, 1 embargoed)
LWG 4549 was accepted
Cold-start reviewers reduce anchoring cascades
Cross-family review catches correlated blind spots
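
The percentages above can be turned into approximate candidate counts with a short calculation. The counts below are back-of-envelope values implied by the rounded percentages, not figures reported by the source, and the helper name implied_counts is hypothetical.

```python
def implied_counts(total: int, kill_rate: float) -> tuple:
    """Return (killed, surviving) counts implied by a rounded kill rate."""
    killed = round(total * kill_rate)
    return killed, total - killed


campaign = implied_counts(171, 0.79)  # full 31-day campaign
subset = implied_counts(30, 0.83)     # lcms2 and wolfSSL subset

print(campaign)  # (135, 36)
print(subset)    # (25, 5)
```

Under these assumptions, roughly 36 of the 171 candidates would have survived review in the full campaign, and about 5 of 30 in the consolidated-protocol subset.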

Sources

arXiv cs.AI — 2026-04-22