ARTFEED — Contemporary Art Intelligence

GRIEF Fuzzer Uncovers 15 Vulnerabilities in LLM Serving Systems

ai-technology · 2026-05-13

Researchers have introduced GRIEF, a greybox fuzzer for LLM inference engines that targets vulnerabilities arising from shared state in the serving layer. Whereas traditional evaluations concentrate on model safety or API correctness, GRIEF treats timed multi-request traces as first-class inputs and uses lightweight oracles to detect crashes, hangs, performance anomalies, and silent output corruption. Reproducible failures are then confirmed through controlled replay with log-probability checks. In early campaigns on vLLM and SGLang, GRIEF found 15 vulnerabilities; engine developers confirmed 10 of them, 2 of which received CVEs. The vulnerabilities span the KV cache, batching, prefix sharing, speculative decoding, adapters, and multi-tenant scheduling, underscoring that LLM serving infrastructure is security-critical under realistic concurrent workloads.
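To make "timed multi-request traces" and "lightweight oracles" concrete, here is a minimal Python sketch. It is not GRIEF's implementation: the names `TimedRequest`, `Observation`, and `classify`, as well as the timeout and slowdown thresholds, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TimedRequest:
    """One request in a trace, with a send time relative to trace start."""
    send_at_ms: int
    prompt: str
    max_tokens: int = 16


@dataclass
class Observation:
    """What the harness saw for one request (hypothetical shape)."""
    crashed: bool
    latency_ms: Optional[float]  # None if the request never completed
    output: Optional[str]


def classify(obs: Observation,
             baseline_output: str,
             baseline_latency_ms: float,
             hang_timeout_ms: float = 30_000.0,   # assumed threshold
             slowdown_factor: float = 10.0) -> str:  # assumed threshold
    """Cheap oracle: compare one observation against an isolated baseline run."""
    if obs.crashed:
        return "crash"
    if obs.latency_ms is None or obs.latency_ms >= hang_timeout_ms:
        return "hang"
    if obs.latency_ms > slowdown_factor * baseline_latency_ms:
        return "performance"
    if obs.output != baseline_output:
        return "output-mismatch"  # candidate silent corruption, needs replay
    return "ok"


# A trace is an ordered list of timed requests. Here two requests share a
# prompt prefix and arrive 5 ms apart, so they may interact through shared
# serving state such as batching or prefix caching.
trace: List[TimedRequest] = sorted(
    [
        TimedRequest(send_at_ms=5, prompt="Translate to French: good night"),
        TimedRequest(send_at_ms=0, prompt="Translate to French: good morning"),
    ],
    key=lambda r: r.send_at_ms,
)
```

The point of the sketch is that the fuzzer's input is the whole schedule, including the gaps between requests, rather than any single prompt. A mismatch flagged this way is only a candidate and would still need to be confirmed by replay.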

Key facts

  • GRIEF is a greybox fuzzer for LLM inference engines.
  • It targets vulnerabilities in the serving layer, not model behavior.
  • It uses timed multi-request traces as first-class inputs.
  • Lightweight oracles detect crashes, hangs, performance pathologies, and silent output corruption.
  • Controlled replay with log-probability checks confirms reproducible failures.
  • Early campaigns on vLLM and SGLang discovered 15 vulnerabilities.
  • 10 vulnerabilities were confirmed by engine developers.
  • 2 of the confirmed vulnerabilities were assigned CVEs.
  • Vulnerabilities span KV-cache, batching, prefix sharing, speculative decoding, adapters, and multi-tenant scheduling.
  • The work underscores the security-critical nature of LLM serving systems.
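The replay step with log-probability checks listed above can be sketched as a comparison between a reference run and a replayed run of the same request. This is an illustrative assumption about how such a check might look, not the method described by the authors: the function name `first_divergence`, the `(token, logprob)` pair format, and the tolerance value are all hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple

TokenLogprob = Tuple[str, float]  # (token text, log-probability)


def first_divergence(reference: Sequence[TokenLogprob],
                     replayed: Sequence[TokenLogprob],
                     atol: float = 1e-2) -> Optional[int]:
    """Return the index of the first position where the two runs disagree.

    Two positions disagree if the tokens differ or the log-probabilities
    differ by more than `atol` (an assumed tolerance). Returns None when
    the runs match over their full length.
    """
    for i, ((tok_a, lp_a), (tok_b, lp_b)) in enumerate(zip(reference, replayed)):
        if tok_a != tok_b or not math.isclose(lp_a, lp_b, abs_tol=atol):
            return i
    if len(reference) != len(replayed):
        # One run is a strict prefix of the other.
        return min(len(reference), len(replayed))
    return None


isolated = [("Bon", -0.10), ("jour", -0.50)]
concurrent = [("Bon", -0.10), ("soir", -0.50)]
print(first_divergence(isolated, concurrent))
```

Comparing log-probabilities with a tolerance, rather than requiring exact text equality, is one plausible way to separate genuine corruption from small numeric differences between runs; the right tolerance would depend on the engine and hardware.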

Entities

Institutions

  • arXiv
