ARTFEED — Contemporary Art Intelligence

PocketAgents: Manifest-Driven Autonomous Defense Agents for LLM Security

ai-technology · 2026-05-23

Researchers have developed PocketAgents, a manifest-driven library of autonomous defense agents for LLM security. Each agent is defined by three data files: a manifest, a prompt, and a runtime context. A shared runtime gives the agents bounded telemetry access and accepts only typed reports whose actions are listed in the manifest. The library was tested on the Perry cyber arena, a cyber-deception testbed, against a DarkSide-inspired attack on a small enterprise topology. Two agents, Command and Control and Exfiltration, were evaluated over 18 closed-loop trials: 13 produced validated network-block actions that contained the attack, 4 failed schema validation, and 1 produced an unspecified outcome. The approach highlights the need for proactive decision-making in LLM defense.

Key facts

  • PocketAgents is a manifest-driven library of autonomous defense agents.
  • Each agent consists of three data files: manifest, prompt, and runtime context.
  • The shared runtime provides bounded telemetry access and accepts only typed reports with actions listed in the manifest.
  • PocketAgents is implemented on the Perry cyber arena, a cyber-deception testbed.
  • Two agents were evaluated: Command and Control and Exfiltration.
  • 18 closed-loop trials of a DarkSide-inspired attack on a small enterprise topology were conducted.
  • 13 trials produced validated network-block actions containing the attack.
  • 4 trials failed schema validation; 1 produced an unspecified outcome.
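
The manifest gate described above (typed reports only, with actions restricted to those the manifest lists) can be illustrated with a minimal Python sketch. This is not the PocketAgents implementation: the Manifest and Report classes, the field names, the validate_report function, and the action names are all hypothetical, chosen only to show how a shared runtime might reject a report before any action is taken.

```python
from dataclasses import dataclass


# Hypothetical manifest shape; the field names are illustrative assumptions,
# not taken from PocketAgents.
@dataclass(frozen=True)
class Manifest:
    agent: str
    allowed_actions: frozenset


# Hypothetical typed report that an agent hands to the shared runtime.
@dataclass(frozen=True)
class Report:
    agent: str
    action: str
    target: str


class ValidationError(Exception):
    """Raised when a report does not conform to the agent's manifest."""


def validate_report(manifest: Manifest, raw: dict) -> Report:
    """Return a typed Report, or raise ValidationError if the raw report
    is malformed or names an action the manifest does not list."""
    required = {"agent": str, "action": str, "target": str}

    # Schema check: every required field is present and is a string.
    for key, expected_type in required.items():
        if not isinstance(raw.get(key), expected_type):
            raise ValidationError(f"field {key!r} is missing or not a string")

    # Schema check: no fields beyond the declared ones.
    extra = sorted(set(raw) - set(required))
    if extra:
        raise ValidationError(f"unexpected fields: {extra}")

    # Manifest checks: the report belongs to this agent and uses a listed action.
    if raw["agent"] != manifest.agent:
        raise ValidationError("report was issued for a different agent")
    if raw["action"] not in manifest.allowed_actions:
        raise ValidationError(f"action {raw['action']!r} is not in the manifest")

    return Report(**raw)


# Usage: a hypothetical manifest for a command-and-control agent.
c2_manifest = Manifest(
    agent="command_and_control",
    allowed_actions=frozenset({"block_network", "no_action"}),
)
accepted = validate_report(
    c2_manifest,
    {"agent": "command_and_control", "action": "block_network", "target": "10.0.0.5"},
)
```

In this sketch, a report that fails any check never reaches the action layer, which is one plausible reading of how a trial could end in a schema-validation failure instead of a network block.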

Entities

Institutions

  • arXiv
  • Perry

Sources