TriEx: A Game-Based Framework for Explaining Multi-Agent LLM Reasoning

ai-technology · 2026-04-24

The paper presents TriEx, a tri-view explainability framework for multi-agent LLMs operating in interactive, partially observable environments. It instruments sequential decision-making with three aligned artifacts: first-person self-reasoning bound to an action, second-person belief states about opponents that are updated over time, and third-person oracle audits grounded in environment-derived reference signals. This design turns explanations from free-form narratives into evidence-anchored objects that can be compared across time and perspectives. Using imperfect-information strategic games as a controlled testbed, TriEx supports analysis of explanation fidelity, belief evolution, and evaluator consistency, and it reveals systematic mismatches between what agents say and what they do.

Key facts

TriEx is a tri-view explainability framework for multi-agent LLMs.
It instruments sequential decision-making with three aligned artifacts.
First-person self-reasoning is bound to an action.
Second-person belief states about opponents are updated over time.
Third-person oracle audits are grounded in environment-derived reference signals.
Explanations become evidence-anchored objects comparable across time and perspectives.
Imperfect-information strategic games are used as a controlled testbed.
The framework reveals systematic mismatches between what agents say and what they do.
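
To make the three aligned artifacts concrete, here is a minimal Python sketch of what a per-step record and a few simple checks over it might look like. It is not the paper's implementation: the schema (StepRecord and its field names), the helper functions, the poker-style actions, and the use of total variation distance to measure belief change are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StepRecord:
    """One decision step with the three views side by side (hypothetical schema)."""
    step: int
    action: str                        # action the agent actually took
    stated_intent: str                 # first-person: action the self-reasoning committed to
    rationale: str                     # first-person: free-text self-reasoning
    opponent_belief: Dict[str, float]  # second-person: probability per opponent hidden state
    oracle_action: str                 # third-person: reference action derived from the environment


def say_do_mismatches(trace: List[StepRecord]) -> List[int]:
    """Return the steps where the stated intent differs from the action taken."""
    return [r.step for r in trace if r.stated_intent != r.action]


def belief_shift(prev: Dict[str, float], curr: Dict[str, float]) -> float:
    """Total variation distance between two belief states (an assumed metric)."""
    keys = set(prev) | set(curr)
    return 0.5 * sum(abs(prev.get(k, 0.0) - curr.get(k, 0.0)) for k in keys)


def oracle_agreement(trace: List[StepRecord]) -> float:
    """Fraction of steps where the action taken matches the oracle reference."""
    if not trace:
        return 0.0
    return sum(r.action == r.oracle_action for r in trace) / len(trace)


# Toy two-step trace with made-up values, purely for illustration.
trace = [
    StepRecord(0, "call", "call", "The pot odds justify a call.",
               {"strong": 0.5, "weak": 0.5}, "call"),
    StepRecord(1, "raise", "fold", "My hand is weak, so I should fold.",
               {"strong": 0.75, "weak": 0.25}, "fold"),
]

mismatched_steps = say_do_mismatches(trace)
shift = belief_shift(trace[0].opponent_belief, trace[1].opponent_belief)
agreement = oracle_agreement(trace)
```

In this toy trace, step 1 is flagged because the agent reasons toward folding but raises. Storing the stated intent, the belief state, and the oracle reference in one record per step is what lets explanations be compared across time and across perspectives, as the key facts describe.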

Entities

—

Sources

arXiv cs.AI — 2026-04-23