ARTFEED — Contemporary Art Intelligence

ExploitBench: Capability Ladder Benchmark for LLM Cybersecurity Agents

other · 2026-05-16

ExploitBench is a benchmark that decomposes exploitation into 16 measurable capability flags, including coverage, crash, sandbox primitives, arbitrary read/write, control-flow hijack, and arbitrary code execution. Each capability is verified by a deterministic oracle: primitives are checked with a per-run randomized challenge-response, progress is measured by differential execution against ground-truth binaries, and code execution is confirmed with a signal-handler proof. The benchmark is instantiated on 41 V8 bugs. Whereas existing LLM security benchmarks treat a crash as exploitation success, ExploitBench treats exploitation as a ladder of capabilities, from merely reaching the vulnerable code to achieving arbitrary code execution.
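
The difference between ladder scoring and crash-only scoring can be illustrated with a minimal Python sketch. It is not ExploitBench's actual data model: the `Rung` enum covers only the six flags named above (the remaining flags are not listed here), the ordering simply follows the order in which they are named, and `highest_rung` and `crash_only_score` are hypothetical helper names.

```python
from enum import IntEnum


class Rung(IntEnum):
    """Hypothetical ladder holding only the flags named in the summary.

    The numeric order is an assumption taken from the order of mention.
    """
    COVERAGE = 1
    CRASH = 2
    SANDBOX_PRIMITIVE = 3
    ARBITRARY_READ_WRITE = 4
    CONTROL_FLOW_HIJACK = 5
    ARBITRARY_CODE_EXECUTION = 6


def highest_rung(verified):
    """Return the highest verified rung, or None if nothing was verified."""
    return max(verified, default=None)


def crash_only_score(verified):
    """Binary scoring that counts any crash as success, shown for contrast."""
    return Rung.CRASH in verified
```

Under these assumptions, an agent that only triggers a crash passes `crash_only_score` but sits on the second rung of the ladder, which is the distinction the benchmark is designed to expose.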

Key facts

  • ExploitBench decomposes exploitation into 16 measurable flags.
  • Flags include coverage, crash, sandbox primitives, arbitrary read/write, control-flow hijack, and arbitrary code execution.
  • Each capability is verified by a deterministic oracle.
  • The oracle uses per-run randomized challenge-response for primitives.
  • Differential execution against ground-truth binaries measures progress.
  • A signal-handler proof is used for code execution.
  • ExploitBench is instantiated on 41 V8 bugs.
  • Existing LLM security benchmarks treat a crash as exploitation success.
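
The per-run randomized challenge-response idea can be sketched for a read primitive as follows. This is a toy illustration under stated assumptions, not the benchmark's oracle: `ToyTarget` stands in for the target's memory with a plain byte array, and `issue_challenge`, `verify_read_primitive`, and the `exploit_read` callback are hypothetical names. The point is that a fresh nonce is planted on every run, so a hard-coded or replayed answer cannot pass.

```python
import secrets


class ToyTarget:
    """Stand-in for a target process: a flat byte array acts as its memory."""

    def __init__(self, size=4096):
        self.memory = bytearray(size)


def issue_challenge(target, nonce_len=16):
    """Plant a fresh random nonce at a random address; return (address, nonce)."""
    nonce = secrets.token_bytes(nonce_len)
    address = secrets.randbelow(len(target.memory) - nonce_len)
    target.memory[address:address + nonce_len] = nonce
    return address, nonce


def verify_read_primitive(target, exploit_read, nonce_len=16):
    """Pass only if the exploit returns exactly the bytes planted for this run."""
    address, nonce = issue_challenge(target, nonce_len)
    answer = bytes(exploit_read(target, address, nonce_len))
    return secrets.compare_digest(answer, nonce)
```

A callback that genuinely reads the requested bytes passes this check, while one that returns a fixed guess fails except with negligible probability. Given the challenge and the answer, the verdict itself is a plain byte comparison.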
