PocketAgents: Manifest-Driven Autonomous Defense Agents for LLM Security
A team of researchers has developed PocketAgents, a manifest-driven library of autonomous defense agents for LLM security. Each agent is defined by three data files: a manifest, a prompt, and a runtime context. A shared runtime gives the agents bounded telemetry access and accepts only typed reports whose actions are listed in the manifest. The library was implemented on the Perry cyber arena, a cyber-deception testbed, and evaluated against a DarkSide-inspired attack on a small enterprise topology. Two agents, Command and Control and Exfiltration, were assessed over 18 closed-loop trials: 13 produced validated network-block actions that contained the attack, 4 failed schema validation, and 1 produced an unspecified outcome. The approach emphasizes proactive decision-making in LLM defense.
Key facts
- PocketAgents is a manifest-driven library of autonomous defense agents.
- Each agent consists of three data files: manifest, prompt, and runtime context.
- The shared runtime provides bounded telemetry access and accepts only typed reports with actions listed in the manifest.
- PocketAgents is implemented on the Perry cyber arena, a cyber-deception testbed.
- Two agents were evaluated: Command and Control and Exfiltration.
- 18 closed-loop trials of a DarkSide-inspired attack on a small enterprise topology were conducted.
- 13 trials produced validated network-block actions that contained the attack.
- 4 trials failed schema validation; 1 produced an unspecified outcome.
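The manifest gate described in the key facts (the runtime accepts only typed reports whose actions appear in the manifest) can be illustrated with a minimal Python sketch. This is not the PocketAgents implementation: the `Manifest` and `Report` classes, the field names `agent`, `action`, and `target`, and the action name `block_ip` are all hypothetical, chosen only to show the idea of schema validation followed by an allow-list check.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Manifest:
    """Hypothetical manifest: names the agent and the actions it may request."""
    agent: str
    allowed_actions: FrozenSet[str]


@dataclass(frozen=True)
class Report:
    """Hypothetical typed report emitted by an agent."""
    agent: str
    action: str
    target: str


class ReportRejected(ValueError):
    """Raised when a report fails schema or manifest checks."""


# Assumed report schema: field name -> required Python type.
REPORT_SCHEMA = {"agent": str, "action": str, "target": str}


def validate_report(manifest: Manifest, raw: dict) -> Report:
    """Return a typed Report, or raise ReportRejected if any check fails."""
    # Schema check: exactly the expected keys, no extras and none missing.
    if set(raw) != set(REPORT_SCHEMA):
        raise ReportRejected(f"unexpected or missing fields: {sorted(raw)}")
    # Schema check: every field has the expected type.
    for key, expected_type in REPORT_SCHEMA.items():
        if not isinstance(raw[key], expected_type):
            raise ReportRejected(f"field {key!r} must be {expected_type.__name__}")
    # Manifest check: the report must come from the agent the manifest names.
    if raw["agent"] != manifest.agent:
        raise ReportRejected(f"report is not from agent {manifest.agent!r}")
    # Manifest check: the requested action must be on the allow-list.
    if raw["action"] not in manifest.allowed_actions:
        raise ReportRejected(f"action {raw['action']!r} is not in the manifest")
    return Report(**raw)


# Example: a manifest that permits a single (hypothetical) network-block action.
exfil_manifest = Manifest(agent="exfiltration", allowed_actions=frozenset({"block_ip"}))
accepted = validate_report(
    exfil_manifest,
    {"agent": "exfiltration", "action": "block_ip", "target": "10.0.0.5"},
)
```

Under this sketch, a report that requests an unlisted action or omits a field is rejected before anything acts on it. That is one plausible reading of how a trial could end in a schema-validation failure rather than a network block.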
Entities
Institutions
- arXiv
- Perry