Strat-Reasoner: Enhancing LLM Strategic Reasoning in Multi-Agent Games
Strat-Reasoner is a new framework that improves the strategic reasoning of large language models (LLMs) in multi-agent games. Current LLMs struggle in these environments because each outcome depends on the joint strategy of all players, and because the other agents are non-stationary (their behavior shifts as they learn or adapt), which complicates both evaluation and credit assignment. Existing single-agent reinforcement learning (RL) methods and their multi-agent extensions do not incorporate other agents' reasoning. Strat-Reasoner introduces a recursive reasoning paradigm in which an agent's reasoning integrates the reasoning processes of other agents. It also uses a centralized Chain-of-Thought (CoT) to provide reward signals for intermediate reasoning sequences. The framework is described in arXiv paper 2605.04906.
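To make the idea of reasoning about another agent's reasoning concrete, here is a minimal sketch of classic level-k reasoning in rock-paper-scissors. It is not the Strat-Reasoner algorithm and does not come from the paper. It is a textbook stand-in, and the names `PAYOFF`, `level_k_action`, and `reasoning_trace`, along with the choice of "rock" as the level-0 move, are assumptions made for illustration only.

```python
from typing import List

ACTIONS = ["rock", "paper", "scissors"]

# PAYOFF[mine][theirs] is my payoff: +1 win, 0 tie, -1 loss.
# Illustrative toy game, not taken from the paper.
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper    vs rock, paper, scissors
    [-1, 1, 0],   # scissors vs rock, paper, scissors
]


def level_k_action(k: int, level0_action: int = 0) -> int:
    """Return the action index chosen by a level-k reasoner.

    Level 0 plays a fixed default action (an assumption of this sketch).
    Level k models the opponent as a level-(k-1) reasoner and best-responds,
    so each level embeds a model of the other agent's reasoning.
    """
    if k == 0:
        return level0_action
    opponent_action = level_k_action(k - 1, level0_action)
    # Best response: the action with the highest payoff against the
    # predicted opponent move.
    return max(range(len(ACTIONS)), key=lambda a: PAYOFF[a][opponent_action])


def reasoning_trace(k: int) -> List[str]:
    """List the action chosen at each reasoning depth from 0 to k."""
    return [ACTIONS[level_k_action(depth)] for depth in range(k + 1)]


if __name__ == "__main__":
    for depth, action in enumerate(reasoning_trace(3)):
        print(f"level {depth}: {action}")
```

Each extra level changes the chosen move because the agent assumes its opponent reasons one step less deeply than it does. In this toy game the choices cycle, which hints at why evaluating a strategy against adapting opponents is difficult.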
Key facts
- Strat-Reasoner is an RL-based framework for LLMs in multi-agent games.
- It addresses challenges from non-stationary agents and credit assignment.
- Existing single-agent and multi-agent RL approaches do not incorporate other agents' reasoning.
- Strat-Reasoner uses a recursive reasoning paradigm in which an agent's reasoning integrates other agents' reasoning processes.
- It employs a centralized Chain-of-Thought (CoT) for intermediate reward signals.
- The paper is available on arXiv with ID 2605.04906.
Entities
Institutions
- arXiv