EnvTrustBench: Benchmarking Evidence-Grounding Defects in LLM Agents
A new benchmarking framework, EnvTrustBench, has been introduced to assess evidence-grounding defects in large language model agents. These agents increasingly depend on external resources such as files, web pages, APIs, and logs, which shape their tool choices and action sequences, yet the reliability of those observations is not guaranteed. Existing benchmarks evaluate task performance or specific threats such as prompt injection, but they do not test whether agents stay aligned with the authentic state of the environment when their observations are outdated or deceptive. EnvTrustBench characterizes an evidence-grounding defect (EGD) as a behavioral failure in which an agent wrongly accepts an erroneous environment-facing observation as authoritative. The framework covers context admission, evidence provenance, freshness checking, verification policy, action gating, and model reasoning. The research is published on arXiv under ID 2605.08828.
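To make the defect concrete, here is a minimal, hypothetical sketch of how a harness might flag an EGD; the names (Trace, flag_egd) and the service-status scenario are illustrative assumptions, not EnvTrustBench's actual API or scoring rule.

```python
# Hypothetical illustration of flagging an evidence-grounding defect (EGD):
# the agent acts on an observation that disagrees with the true environment
# state. All names and the scenario are assumptions, not the paper's API.
from dataclasses import dataclass

@dataclass
class Trace:
    observation: str     # what the agent read (file, page, API reply, log)
    ground_truth: str    # the environment's actual state, known to the harness
    acted_on_obs: bool   # did the agent treat the observation as authoritative?

def flag_egd(trace: Trace) -> bool:
    """An EGD occurs when the observation is wrong AND the agent still
    conditioned its action on it without verification."""
    return trace.acted_on_obs and trace.observation != trace.ground_truth

# Example: a stale log claims a service is up; the agent deploys anyway.
stale = Trace(observation="service: up", ground_truth="service: down",
              acted_on_obs=True)
assert flag_egd(stale)  # flagged: incorrect observation treated as authoritative
```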
Key facts
- EnvTrustBench is a new framework for benchmarking evidence-grounding defects in LLM agents.
- LLM agents consume environment-facing evidence such as files, web pages, APIs, and logs, which shapes their tool use and action order.
- Existing benchmarks overlook whether agents remain grounded in the true environment state.
- Evidence-grounding defect (EGD) defined as treating incorrect observations as authoritative.
- Framework covers context admission, evidence provenance, freshness checking, verification policy, action gating, and model reasoning (see the sketch after this list).
- Published on arXiv with ID 2605.08828.
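As referenced above, the following is a speculative sketch of what the context-admission, provenance, freshness, and action-gating dimensions could look like inside an agent scaffold; every identifier (Observation, admit_evidence, act, MAX_AGE_S) and the 300-second freshness window are assumptions for illustration, not details taken from the benchmark.

```python
# Speculative sketch: gate agent actions on provenance and freshness checks
# instead of treating any observation as authoritative. Names and thresholds
# are illustrative assumptions, not EnvTrustBench's actual design.
import time
from dataclasses import dataclass

MAX_AGE_S = 300  # assumed freshness window; the benchmark's thresholds are unknown

@dataclass
class Observation:
    source: str        # provenance: which file/page/API/log produced this
    payload: str       # the raw content the agent would condition on
    fetched_at: float  # unix timestamp when it was read

def admit_evidence(obs: Observation, trusted_sources: set) -> bool:
    """Context admission: an observation may influence tool selection or
    action order only if it passes provenance and freshness gates."""
    if obs.source not in trusted_sources:          # provenance gate
        return False
    if time.time() - obs.fetched_at > MAX_AGE_S:   # freshness gate
        return False
    return True

def act(obs: Observation, trusted_sources: set) -> str:
    # Action gating: refuse to act on inadmissible evidence and re-verify
    # instead; acting anyway would be the EGD the benchmark probes.
    if not admit_evidence(obs, trusted_sources):
        return "re-verify"
    return f"proceed using evidence from {obs.source}"

# Usage:
fresh = Observation("deploy.log", "build succeeded", time.time())
print(act(fresh, {"deploy.log"}))  # proceed: trusted source, fresh read
print(act(fresh, {"ci-api"}))      # re-verify: untrusted provenance
```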
Entities
Institutions
- arXiv