LLM Vulnerability Detectors Easily Evaded by Syntax-Preserving Edits
A recent study finds that LLM-based vulnerability detectors, which are increasingly used for CI/CD security gating, can be evaded with code edits that preserve syntax, compilation, and behavior. The researchers evaluated five attack variants across four carrier families of behavior-preserving transformations on a unified C/C++ benchmark of 5,000 samples. They introduce a metric, Complete Resistance (CR): the percentage of correctly detected vulnerabilities that remain detected under every attack variant. The results show a large robustness gap: models with over 70% clean recall have a CR as low as 0.12%, and over 87% of detected vulnerabilities can be evaded by at least one syntax-preserving edit. Universal adversarial strings optimized on a 14B surrogate model transfer to black-box APIs, including GPT-4o, and on-target optimization raises the attack success rate (ASR) to as much as 92.5%. The study concludes that clean benchmark accuracy alone is inadequate for assessing real-world robustness.
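To make the CR and ASR definitions concrete, here is a minimal sketch of how they could be computed from per-sample detector verdicts. The function names, the boolean-list data layout, and the toy numbers are assumptions made for illustration; the summary does not describe the study's actual evaluation code.

```python
from typing import Dict, List


def _detected_indices(clean_detected: List[bool]) -> List[int]:
    # Only vulnerabilities the detector flags on the unmodified code count
    # toward CR and ASR; samples missed on clean input are excluded.
    return [i for i, ok in enumerate(clean_detected) if ok]


def complete_resistance(
    clean_detected: List[bool],
    attacked_detected: Dict[str, List[bool]],
) -> float:
    """Fraction of cleanly detected samples still detected under every attack."""
    idx = _detected_indices(clean_detected)
    if not idx:
        return 0.0
    resistant = sum(
        all(verdicts[i] for verdicts in attacked_detected.values())
        for i in idx
    )
    return resistant / len(idx)


def attack_success_rate(
    clean_detected: List[bool],
    attacked: List[bool],
) -> float:
    """Fraction of cleanly detected samples that one attack flips to a miss."""
    idx = _detected_indices(clean_detected)
    if not idx:
        return 0.0
    return sum(not attacked[i] for i in idx) / len(idx)


# Toy data (hypothetical): five vulnerable samples, two attack variants.
clean = [True, True, True, True, False]
attacks = {
    "variant_a": [True, False, True, True, False],
    "variant_b": [True, True, False, True, True],
}

cr = complete_resistance(clean, attacks)
print(f"CR = {cr:.2f}, evadable by at least one variant = {1 - cr:.2f}")
print(f"ASR(variant_a) = {attack_success_rate(clean, attacks['variant_a']):.2f}")
```

Because CR requires a sample to survive every variant, it can never exceed the detection rate under the single strongest attack, which is why it can be far lower than clean recall suggests.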
Key facts
- LLM-based vulnerability detectors are used in CI/CD security gating
- Study evaluates five attack variants across four carrier families
- Unified C/C++ benchmark of 5,000 samples used
- Complete Resistance (CR) metric introduced
- Models with 70%+ clean recall have CR as low as 0.12%
- Over 87% of detected vulnerabilities can be evaded by at least one syntax-preserving edit
- Universal adversarial strings transfer to GPT-4o
- On-target optimization achieves an attack success rate (ASR) of up to 92.5%
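As an illustration of what a syntax-preserving edit can look like, the sketch below embeds a string in a C block comment, so the code still compiles and behaves identically. The comment carrier, the helper names, and the placeholder payload are assumptions chosen for illustration; the summary does not name the four carrier families the study used.

```python
import re


def add_comment_carrier(source: str, payload: str) -> str:
    """Prepend the payload to C source inside a block comment (hypothetical carrier)."""
    # Break up any comment terminator in the payload so it cannot close
    # the comment early and leak text into the compiled code.
    safe = payload.replace("*/", "* /")
    return f"/* {safe} */\n{source}"


def strip_block_comments(source: str) -> str:
    """Remove /* ... */ comments (naive: ignores markers inside string literals)."""
    return re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)


original = "int main(void) { char buf[8]; return buf[0]; }"
edited = add_comment_carrier(original, "placeholder adversarial string */ here")

# The edit changes what a detector reads, but not what the compiler sees.
assert strip_block_comments(edited).strip() == original
print(edited)
```

A check like the final assertion is one simple way to confirm that an edit touches only comment text; verifying full behavioral equivalence for other kinds of transformations would require compiling and testing the code.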