AgentMark: Behavioral Watermarking for LLM Agents
Researchers propose AgentMark, a framework that embeds multi-bit identifiers into the planning behaviors of LLM-based agents, such as tool and subgoal choices, for IP protection and regulatory provenance. Unlike content watermarking, which attributes generated outputs, AgentMark targets the high-level decision-making layer. Two challenges stand in the way: distributional deviations in decision-making can degrade utility, and many agents operate as black boxes. AgentMark addresses both by eliciting an explicit behavior distribution from the agent and applying distribution-preserving conditional sampling to it. The paper is available on arXiv (ID 2601.03294).
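This summary does not spell out AgentMark's sampling algorithm, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes the agent has already produced an explicit probability list over candidate behaviors, and it swaps in a simple keyed-offset construction: a multi-bit message shifts the sampling point, and a secret-keyed pseudorandom offset keeps that point uniform so the behavior distribution is left unchanged. All names here (keyed_offset, choose_behavior, decode, KEY, TOOL_PROBS) are hypothetical.

```python
import hashlib
from bisect import bisect_right
from itertools import accumulate


def keyed_offset(key: bytes, step: int) -> float:
    """Pseudorandom offset in [0, 1), derived from a secret key and the step index."""
    digest = hashlib.sha256(key + step.to_bytes(8, "big")).digest()
    # 48 bits fit exactly in a float, so the result is always strictly below 1.0.
    return int.from_bytes(digest[:6], "big") / 2**48


def choose_behavior(probs, message, n_bits, key, step):
    """Pick a behavior index by inverse-CDF lookup at a message-shifted point."""
    # If the keyed offset is uniform, u is uniform whatever the message is,
    # so each behavior is still chosen with its original probability.
    u = (message / 2**n_bits + keyed_offset(key, step)) % 1.0
    cdf = list(accumulate(probs))
    # The clamp guards against a CDF that sums to slightly less than 1.0.
    return min(bisect_right(cdf, u), len(probs) - 1)


def decode(observations, n_bits, key):
    """Return every message consistent with the observed (probs, choice) pairs."""
    candidates = set(range(2**n_bits))
    for step, (probs, chosen) in enumerate(observations):
        candidates = {
            m for m in candidates
            if choose_behavior(probs, m, n_bits, key, step) == chosen
        }
    return candidates


# Toy usage with assumed values: an 8-bit identifier and a fixed
# three-way distribution standing in for the agent's elicited tool choices.
KEY = b"demo-secret"
N_BITS = 8
MESSAGE = 0b10110010
TOOL_PROBS = [0.5, 0.3, 0.2]

trace = [
    (TOOL_PROBS, choose_behavior(TOOL_PROBS, MESSAGE, N_BITS, KEY, t))
    for t in range(64)
]
print(sorted(decode(trace, N_BITS, KEY)))
```

In this sketch each observed decision only rules out some candidate identifiers, so a short trajectory may leave several candidates and a longer one narrows the set. The true identifier is always among the survivors because the verifier replays the same keyed computation.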
Key facts
- AgentMark is a behavioral watermarking framework for LLM-based agents.
- It embeds multi-bit identifiers into planning decisions.
- It targets high-level planning behaviors like tool and subgoal choices.
- Content watermarking attributes generated outputs and does not identify planning behaviors.
- Even minor deviations from the agent's behavior distribution can degrade utility in long-term operation.
- Many agents operate as black boxes.
- AgentMark elicits an explicit behavior distribution and applies distribution-preserving conditional sampling to it.
- The paper is on arXiv with ID 2601.03294.
Entities
Institutions
- arXiv