PocketAgents: Manifest-Driven Autonomous Defense Agents for LLM Security

ai-technology · 2026-05-23

A team of researchers has developed PocketAgents, a manifest-driven library of autonomous defense agents for LLM security. Each agent is defined by three data files: a manifest, a prompt, and runtime context. A shared runtime gives the agents bounded telemetry access and accepts only typed reports whose actions are listed in the manifest. The library was implemented on the Perry cyber arena, a cyber-deception testbed, and evaluated against a DarkSide-inspired attack on a small enterprise topology. Two agents, Command and Control and Exfiltration, were evaluated across 18 closed-loop trials: 13 produced validated network-block actions that contained the attack, 4 failed schema validation, and 1 produced an unspecified outcome. The approach underscores the need for proactive decision-making in LLM defense.

Key facts

PocketAgents is a manifest-driven library of autonomous defense agents.
Each agent consists of three data files: manifest, prompt, and runtime context.
The shared runtime provides bounded telemetry access and accepts only typed reports with actions listed in the manifest.
The library is implemented on the Perry cyber arena, a cyber-deception testbed.
Two agents were evaluated: Command and Control and Exfiltration.
18 closed-loop trials of a DarkSide-inspired attack on a small enterprise topology were conducted.
13 trials produced validated network-block actions containing the attack.
4 trials failed schema validation; 1 produced an unspecified outcome.
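
To make the manifest-gated reporting concrete, the following is a minimal Python sketch of how a shared runtime might accept only typed reports whose action appears in an agent's manifest. It is an illustration under stated assumptions, not the PocketAgents implementation: the names Manifest, REQUIRED_FIELDS, Outcome, and validate_report, the report fields, and the action strings are all hypothetical, and the real schema is not described in the source.

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical report schema: the source does not list the actual fields.
REQUIRED_FIELDS = ("action", "target", "rationale")


class Outcome(Enum):
    ACCEPTED = "accepted"
    SCHEMA_ERROR = "schema_error"
    ACTION_NOT_ALLOWED = "action_not_allowed"


@dataclass(frozen=True)
class Manifest:
    """Assumed shape of a manifest: an agent name plus its permitted actions."""
    agent_name: str
    allowed_actions: frozenset


def validate_report(manifest: Manifest, report: dict) -> Outcome:
    """Reject reports that are malformed or that request an unlisted action."""
    # Schema check: every required field must be present and be a string.
    for field in REQUIRED_FIELDS:
        if not isinstance(report.get(field), str):
            return Outcome.SCHEMA_ERROR
    # Manifest check: the requested action must be listed in the manifest.
    if report["action"] not in manifest.allowed_actions:
        return Outcome.ACTION_NOT_ALLOWED
    return Outcome.ACCEPTED


# Hypothetical manifest for a command-and-control defense agent.
c2_manifest = Manifest(
    agent_name="command_and_control",
    allowed_actions=frozenset({"block_network", "no_action"}),
)

well_formed = {"action": "block_network", "target": "10.0.0.5", "rationale": "beaconing"}
missing_fields = {"action": "block_network"}
unlisted_action = {"action": "delete_host", "target": "10.0.0.5", "rationale": "cleanup"}

for report in (well_formed, missing_fields, unlisted_action):
    print(validate_report(c2_manifest, report).value)
```

In this sketch a report is checked against the schema first and against the manifest second, so a malformed report is rejected before its action is considered. Whether the actual runtime orders its checks this way is not stated in the source.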
