Trajel: Auditing Trajectory-Level Hallucinations in Multi-Agent Workflows
Trajel is a new dataset and evaluation framework for detecting hallucinations in multi-agent industrial workflows. It audits intermediate steps rather than only final outputs. The framework introduces a five-type taxonomy of hallucinations (factual, referential, logical, procedural, and scope-based) and applies it to expert-annotated agent traces from AssetOpsBench. Supervised detection models are benchmarked at the subtask, trajectory, and long-context levels. The results show three things: existing benchmarks miss common failure modes, nearly half of hallucinated trajectories involve more than one hallucination type, and automated detectors with high binary accuracy still misclassify the subtler types. The work is published on arXiv under identifier 2605.24219.
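To make the idea of step-level labels rolling up to a trajectory concrete, here is a minimal Python sketch. It is not Trajel's actual schema: the names `HallucinationType`, `Step`, `trajectory_types`, and `multi_type_share` are hypothetical, and the toy traces are invented for illustration only. Only the five type names come from the taxonomy described above.

```python
from dataclasses import dataclass, field
from enum import Enum


class HallucinationType(Enum):
    """The five taxonomy types named in the summary."""
    FACTUAL = "factual"
    REFERENTIAL = "referential"
    LOGICAL = "logical"
    PROCEDURAL = "procedural"
    SCOPE = "scope-based"


@dataclass
class Step:
    """One intermediate step of an agent trace (hypothetical structure)."""
    agent: str
    output: str
    labels: set = field(default_factory=set)  # annotated HallucinationType values


def trajectory_types(steps):
    """Union of all hallucination types annotated anywhere in a trajectory."""
    found = set()
    for step in steps:
        found |= step.labels
    return found


def multi_type_share(trajectories):
    """Fraction of hallucinated trajectories that contain more than one type."""
    hallucinated = [t for t in trajectories if trajectory_types(t)]
    if not hallucinated:
        return 0.0
    multi = [t for t in hallucinated if len(trajectory_types(t)) > 1]
    return len(multi) / len(hallucinated)


# Invented toy traces: one clean, one single-type, one multi-type.
toy_trajectories = [
    [Step("planner", "list sensors"), Step("executor", "returned 4 sensors")],
    [Step("planner", "query chiller 9", {HallucinationType.REFERENTIAL})],
    [
        Step("planner", "skip validation", {HallucinationType.PROCEDURAL}),
        Step("executor", "also retuned pump", {HallucinationType.SCOPE}),
    ],
]

print(multi_type_share(toy_trajectories))
```

Keeping labels on individual steps, and deriving the trajectory-level view from them, is one way to audit intermediate behaviour instead of judging only the final answer.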
Key facts
- Trajel is a dataset and evaluation framework for trajectory-level hallucinations.
- It focuses on multi-agent industrial workflows.
- The framework uses a five-type hallucination taxonomy: factual, referential, logical, procedural, and scope-based.
- Expert-annotated agent traces come from AssetOpsBench.
- Supervised detection models are benchmarked at subtask, trajectory, and long-context levels.
- Existing benchmarks miss common failure modes.
- Nearly half of hallucinated trajectories involve multiple hallucination types.
- Automated detectors with high binary accuracy still misclassify subtle types.
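The last point, that high binary accuracy can coexist with poor type classification, can be illustrated with a small Python sketch. The functions `binary_accuracy` and `type_accuracy` and the labels below are assumptions made up for this example; they are not the paper's metrics, detectors, or results.

```python
def binary_accuracy(gold, pred):
    """Share of steps where the hallucinated / not-hallucinated decision matches.

    A label of None means "no hallucination"; any string is a hallucination type.
    """
    hits = sum((g is not None) == (p is not None) for g, p in zip(gold, pred))
    return hits / len(gold)


def type_accuracy(gold, pred):
    """Share of truly hallucinated steps whose predicted type is exactly right."""
    pairs = [(g, p) for g, p in zip(gold, pred) if g is not None]
    if not pairs:
        return 0.0
    return sum(g == p for g, p in pairs) / len(pairs)


# Invented toy labels: the detector flags every hallucinated step,
# but assigns the wrong type to two of the four.
gold = ["factual", None, "scope-based", "procedural", None, "referential"]
pred = ["factual", None, "factual", "logical", None, "referential"]

print(binary_accuracy(gold, pred))  # 1.0
print(type_accuracy(gold, pred))    # 0.5
```

In this toy case the detector looks perfect under the binary metric while getting half of the types wrong, which is why a typed taxonomy can expose errors that a binary score hides.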
Entities
Institutions
- arXiv
- AssetOpsBench