MCP-TDP Benchmark Reveals Tool Description Poisoning Risks for LLM Agents
A new research paper from arXiv (2605.24069) introduces the MCP-TDP Security Benchmark, a sandbox environment designed to evaluate Tool Description Poisoning (TDP) attacks on Large Language Model (LLM) agents using the Model Context Protocol (MCP). TDP is a semantic attack where malicious instructions are injected into a tool's descriptive metadata rather than its executable code, targeting the agent's cognitive planning layer. The benchmark comprises 32 realistic test cases across 6 risk categories. The study evaluated 8 mainstream LLMs, revealing vulnerabilities in how agents interpret tool descriptions. The research highlights a covert attack surface introduced by MCP's interoperability, which enables autonomous execution by integrating external knowledge and tools.
Key facts
- Research paper from arXiv (2605.24069) introduces MCP-TDP Security Benchmark
- TDP attacks inject malicious instructions into tool descriptive metadata
- Benchmark includes 32 realistic test cases across 6 risk categories
- Evaluated 8 mainstream LLMs
- MCP standardizes tool use for LLM agents
- Attack targets cognitive planning layer of agents
- Vulnerabilities found in how agents interpret tool descriptions
- Interoperability of MCP introduces covert attack surface
Entities
Institutions
- arXiv