LLM Vulnerability Detectors Easily Evaded by Syntax-Preserving Edits

ai-technology · 2026-05-07

A recent study finds that LLM-based vulnerability detectors, which are increasingly used for CI/CD security gating, can be evaded by syntax- and compilation-preserving code edits. The researchers evaluated five attack variants across four families of behavior-preserving transformations on a unified C/C++ benchmark of 5,000 samples. They introduce a metric, Complete Resistance (CR), defined as the share of correctly detected vulnerabilities that remain detected under every attack variant. The results show a wide gap between clean accuracy and robustness: models with more than 70% clean recall have a CR as low as 0.12 (about 12%), meaning that more than 87% of detected vulnerabilities can be evaded by at least one syntax-preserving edit. Universal adversarial strings optimized on a 14B surrogate model transfer to black-box APIs, including GPT-4o, and optimizing directly against the target model raises the attack success rate (ASR) to as much as 92.5%. Clean benchmark accuracy alone is therefore not a sufficient measure of real-world robustness.
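The CR metric described above can be illustrated with a minimal Python sketch. It is not the study's code: the function name, the data layout (one boolean per vulnerable sample), the attack labels and the toy values are all assumptions made for illustration. It only encodes the definition given in the summary, that a detected vulnerability counts as resistant when it is still detected under every attack variant.

```python
from typing import Dict, List


def complete_resistance(
    clean_detected: List[bool],
    detected_under_attack: Dict[str, List[bool]],
) -> float:
    """Share of cleanly detected vulnerabilities still detected under every attack.

    clean_detected[i] is True if the detector flags vulnerable sample i
    on the unmodified code. detected_under_attack[name][i] is True if
    sample i is still flagged after the attack called `name` is applied.
    (Hypothetical data layout, chosen for this sketch.)
    """
    # Only samples detected on clean code count toward the denominator.
    detected_idx = [i for i, hit in enumerate(clean_detected) if hit]
    if not detected_idx:
        return 0.0

    # A sample is resistant only if no attack variant flips the verdict.
    resistant = sum(
        all(results[i] for results in detected_under_attack.values())
        for i in detected_idx
    )
    return resistant / len(detected_idx)


# Toy data: five vulnerable samples and two made-up attack variants.
clean = [True, True, True, True, False]
attacked = {
    "attack_a": [True, False, True, True, False],
    "attack_b": [True, True, False, True, True],
}

cr = complete_resistance(clean, attacked)
evadable = 1.0 - cr  # share evaded by at least one attack variant
print(f"CR = {cr:.2f}, evadable = {evadable:.2f}")
```

In the toy data, four samples are detected on clean code and two of them survive both attacks, so CR is 0.5 even though clean recall is 0.8. The evadable share is the complement of CR, which is how a low CR translates into a high fraction of bypassable detections.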

Key facts

LLM-based vulnerability detectors are used in CI/CD security gating
Study evaluates five attack variants across four carrier families
Unified C/C++ benchmark of 5,000 samples used
Complete Resistance (CR) metric introduced
Models with 70%+ clean recall have CR as low as 0.12 (about 12%)
Over 87% of detected vulnerabilities can be evaded by at least one syntax-preserving edit
Universal adversarial strings transfer to GPT-4o
On-target optimization achieves up to 92.5% ASR

Entities

—

Sources

arXiv cs.AI — 2026-05-07