GRIEF Fuzzer Uncovers 15 Vulnerabilities in LLM Serving Systems

ai-technology · 2026-05-13

A team of researchers has introduced GRIEF, a greybox fuzzer for LLM inference engines that targets vulnerabilities arising from shared-state behavior in the serving layer. Whereas traditional evaluations concentrate on model safety or API correctness, GRIEF treats timed multi-request traces as first-class inputs and uses lightweight oracles to detect crashes, hangs, performance pathologies, and silent output corruption. Failures that reproduce are confirmed through controlled replay with log-probability checks. In early campaigns on vLLM and SGLang, GRIEF found 15 vulnerabilities, 10 of which were confirmed by engine developers, including 2 CVEs. The vulnerabilities span KV-cache, batching, prefix sharing, speculative decoding, adapters, and multi-tenant scheduling, underscoring that LLM serving infrastructure is security-critical under realistic concurrent workloads.
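
To make the idea of a "timed multi-request trace" concrete, the following is a minimal sketch of how such an input might be represented and mutated. It is not GRIEF's implementation: the names `TimedRequest` and `mutate_timing`, the millisecond offsets, and the 50 ms shift bound are all hypothetical choices made for illustration.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TimedRequest:
    """One request in a trace (hypothetical structure, not GRIEF's)."""
    send_at_ms: int       # offset from trace start at which the request is sent
    prompt: str
    max_tokens: int = 16


def mutate_timing(trace, rng, max_shift_ms=50):
    """Shift one request's send time by a random amount and re-sort the trace.

    `trace` is a tuple of TimedRequest. Changing relative arrival times is
    one plausible way to vary how requests overlap inside the engine.
    """
    if not trace:
        return trace
    i = rng.randrange(len(trace))
    shift = rng.randint(-max_shift_ms, max_shift_ms)
    # Clamp at zero so no request is scheduled before the trace starts.
    moved = replace(trace[i], send_at_ms=max(0, trace[i].send_at_ms + shift))
    mutated = trace[:i] + (moved,) + trace[i + 1:]
    return tuple(sorted(mutated, key=lambda r: r.send_at_ms))


seed_trace = (
    TimedRequest(0, "Summarize: the quick brown fox"),
    TimedRequest(10, "Summarize: the quick brown dog"),  # shares a prefix
    TimedRequest(40, "Translate to French: hello"),
)
mutated_trace = mutate_timing(seed_trace, random.Random(0))
```

The point of the sketch is that the unit being mutated is the whole schedule of requests rather than a single prompt, which is what lets a fuzzer reach behavior that depends on requests overlapping in time.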

Key facts

GRIEF is a greybox fuzzer for LLM inference engines.
It targets vulnerabilities in the serving layer, not model behavior.
It uses timed multi-request traces as first-class inputs.
Lightweight oracles detect crashes, hangs, performance pathologies, and silent output corruption.
Controlled replay with log-probability checks confirms reproducible failures.
Early campaigns on vLLM and SGLang discovered 15 vulnerabilities.
10 vulnerabilities were confirmed by engine developers.
2 CVEs were included among the confirmed vulnerabilities.
Vulnerabilities span KV-cache, batching, prefix sharing, speculative decoding, adapters, and multi-tenant scheduling.
The work underscores the security-critical nature of LLM serving systems.
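
As an illustration of what a log-probability check during replay could look like, here is a minimal sketch that compares the output of a request run in isolation against the output of the same request replayed inside a concurrent trace. The source does not describe GRIEF's actual check; the function name `first_divergence`, the `(token, logprob)` pair format, and the tolerance value are assumptions.

```python
import math


def first_divergence(baseline, observed, tol=1e-2):
    """Return the index of the first diverging position, or None if none.

    `baseline` and `observed` are lists of (token, logprob) pairs: the first
    from an isolated run, the second from a replay under concurrency.
    `tol` is an assumed absolute tolerance for numerical noise.
    """
    for i, ((tok_a, lp_a), (tok_b, lp_b)) in enumerate(zip(baseline, observed)):
        if tok_a != tok_b:
            return i
        if not math.isclose(lp_a, lp_b, rel_tol=0.0, abs_tol=tol):
            return i
    # One output is a strict prefix of the other: diverges where it ends.
    if len(baseline) != len(observed):
        return min(len(baseline), len(observed))
    return None


isolated = [("The", -0.10), ("fox", -0.52), ("runs", -1.30)]
same = [("The", -0.10), ("fox", -0.521), ("runs", -1.30)]
corrupted = [("The", -0.10), ("dog", -0.48), ("runs", -1.30)]

print(first_divergence(isolated, same))
print(first_divergence(isolated, corrupted))
```

Comparing log-probabilities as well as tokens lets a check of this kind flag cases where the emitted text happens to match but the underlying distribution has shifted, which is one way silent output corruption could otherwise go unnoticed.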
