Global PSRO: A New Algorithm for Equilibrium Computation in Large Zero-Sum Games

other · 2026-05-28

Researchers propose Global Policy-Space Response Oracles (Global PSRO), an algorithm for computing equilibria in large zero-sum games more efficiently. The standard PSRO framework iteratively expands a restricted strategy set using deep reinforcement learning. Existing variants choose each new strategy as a best response to a meta-strategy computed from restricted-game payoffs, which can expand the set inefficiently. Global PSRO instead uses a two-phase exploration-selection framework that directly minimizes Population Exploitability (PE), a measure of how well the restricted set represents the full game. Because expansion is guided by the quality of the population after a strategy is added, the method is reported to produce more efficient strategy sets under limited computational budgets. The paper is available on arXiv as 2605.28273.

Key facts

arXiv:2605.28273v1
PSRO framework scales equilibrium computation to large zero-sum games
PSRO iteratively expands a restricted strategy set using deep reinforcement learning
Existing PSRO variants expand using best responses to meta-strategies
Global PSRO uses Population Exploitability (PE) to measure restricted set quality
Global PSRO introduces a two-phase exploration-selection framework
Global PSRO explicitly minimizes PE during expansion
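
The selection idea in the list above can be illustrated on a toy matrix game. The following is a minimal sketch, not the paper's implementation: it assumes rock-paper-scissors as the full game, treats PE as the exploitability of the least exploitable mixture over the restricted set (one reading of the summary, not a definition taken from the paper), and approximates that value with a multiplicative-weights (Hedge) loop instead of deep reinforcement learning. The names `population_exploitability` and `select_expansion` are hypothetical.

```python
import math

# Row-player payoffs for rock-paper-scissors (illustrative full game).
RPS = [
    [0, -1, 1],   # rock
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def population_exploitability(payoff, population, steps=4000):
    """Approximate the exploitability of the best mixture over `population`.

    Assumption: PE is taken here as the minimum, over mixtures of the
    restricted set, of the payoff a full-game best responder can obtain.
    The mixture is found approximately with Hedge, so the result is a
    slight overestimate of the exact value.
    """
    k, n = len(population), len(payoff)
    width = max(max(r) for r in payoff) - min(min(r) for r in payoff) or 1.0
    eta = math.sqrt(8.0 * math.log(k) / steps) / width
    gains = [0.0] * k   # cumulative payoff of each population member
    avg = [0.0] * k     # time-averaged mixture over the population
    for _ in range(steps):
        top = max(gains)
        weights = [math.exp(eta * (g - top)) for g in gains]
        total = sum(weights)
        x = [w / total for w in weights]
        # The opponent best-responds from the FULL game, not the restricted set.
        col_values = [
            sum(x[i] * payoff[population[i]][j] for i in range(k))
            for j in range(n)
        ]
        j_star = min(range(n), key=col_values.__getitem__)
        for i in range(k):
            gains[i] += payoff[population[i]][j_star]
            avg[i] += x[i] / steps
    worst = min(
        sum(avg[i] * payoff[population[i]][j] for i in range(k))
        for j in range(n)
    )
    return -worst


def select_expansion(payoff, population, candidates):
    """Selection step: keep the candidate with the lowest post-expansion PE."""
    scored = [
        (population_exploitability(payoff, population + [c]), c)
        for c in candidates
    ]
    return min(scored)[1]


if __name__ == "__main__":
    for pop in ([0], [0, 1], [0, 1, 2]):
        print(pop, round(population_exploitability(RPS, pop), 3))
```

In this sketch each candidate is scored by the population it would produce, rather than by how well it responds to the current meta-strategy. In a real PSRO setting the candidates would come from an exploration phase and the payoffs would be estimated by simulation, so the exact evaluation used here is only feasible because the toy game is tiny.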
