ARTFEED — Contemporary Art Intelligence

Global PSRO: A New Algorithm for Equilibrium Computation in Large Zero-Sum Games

other · 2026-05-28

Researchers propose Global Policy-Space Response Oracles (Global PSRO), an algorithm for computing equilibria in large zero-sum games. The standard PSRO framework iteratively expands a restricted strategy set, using deep reinforcement learning to train each new strategy. Existing variants often expand inefficiently because they add best responses to meta-strategies computed from restricted-game payoffs alone. Global PSRO instead introduces a two-phase exploration-selection framework that directly minimizes Population Exploitability (PE), a measure of how well the restricted set represents the full game. Candidate expansions are judged by the quality of the population after expansion, which the authors report yields more efficient strategy sets under limited computational budgets. The paper is available on arXiv as arXiv:2605.28273.
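
The baseline PSRO loop described above can be sketched on a toy matrix game. This is a minimal illustration under stated assumptions, not the paper's implementation: rock-paper-scissors stands in for a large game, an exact argmax replaces the deep-RL best-response oracle, and fictitious play serves as the meta-solver. The function names (`restricted_meta_strategy`, `best_response`, `psro`) are hypothetical.

```python
# Rock-paper-scissors payoffs for the row player (symmetric zero-sum game).
RPS = [
    [0, -1, 1],   # rock
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def restricted_meta_strategy(payoff, population, iters=2000):
    """Approximate an equilibrium mixture of the restricted game by fictitious play."""
    counts = [0] * len(population)
    counts[0] = 1  # arbitrary starting play
    for _ in range(iters):
        total = sum(counts)
        # Payoff of each population member against the empirical mixture so far.
        values = [
            sum(counts[k] * payoff[i][j] for k, j in enumerate(population)) / total
            for i in population
        ]
        best = max(range(len(population)), key=lambda k: values[k])
        counts[best] += 1
    total = sum(counts)
    return [c / total for c in counts]


def best_response(payoff, population, meta):
    """Exact full-game best response to the meta-strategy (assumed stand-in for deep RL)."""
    def value(i):
        return sum(p * payoff[i][j] for p, j in zip(meta, population))
    return max(range(len(payoff)), key=value)


def psro(payoff, start=0, max_iters=10):
    """Grow the population until the best response is already a member."""
    population = [start]
    for _ in range(max_iters):
        meta = restricted_meta_strategy(payoff, population)
        br = best_response(payoff, population, meta)
        if br in population:
            break
        population.append(br)
    return population


print(psro(RPS))
```

Starting from rock alone, the loop adds paper and then scissors; at that point the best response is already in the population and expansion stops. Note that each expansion here depends only on the meta-strategy of the restricted game, which is the behavior the paper argues can be inefficient.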

Key facts

  • arXiv:2605.28273v1
  • The PSRO framework scales equilibrium computation to large zero-sum games
  • PSRO iteratively expands a restricted strategy set using deep reinforcement learning
  • Existing PSRO variants expand using best responses to meta-strategies
  • Global PSRO uses Population Exploitability (PE) to measure restricted set quality
  • Global PSRO introduces a two-phase exploration-selection framework
  • Global PSRO explicitly minimizes PE during expansion
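
The idea of choosing an expansion by its post-expansion PE can be illustrated on a small matrix game. The sketch below rests on several assumptions that do not come from the article: PE is taken to be the exploitability of the least-exploitable mixture over the population in a symmetric zero-sum game, it is estimated by a brute-force grid over mixtures (viable only for toy games), and the exploration phase is replaced by simply listing the unused pure strategies. The paper's actual definition, estimator, and candidate generation may differ, and all function names are hypothetical.

```python
from itertools import product

# Weighted rock-paper-scissors (row player's payoff); rock beating scissors pays 2.
GAME = [
    [0, -1, 2],   # rock
    [1, 0, -1],   # paper
    [-2, 1, 0],   # scissors
]


def mixtures(size, steps):
    """Yield every mixture over `size` strategies with weights in multiples of 1/steps."""
    for combo in product(range(steps + 1), repeat=size - 1):
        if sum(combo) <= steps:
            yield [c / steps for c in combo] + [(steps - sum(combo)) / steps]


def exploitability(payoff, population, mixture):
    """Best payoff any full-game pure strategy achieves against the mixture."""
    return max(
        sum(w * payoff[i][j] for w, j in zip(mixture, population))
        for i in range(len(payoff))
    )


def population_exploitability(payoff, population, steps=12):
    """Grid-search estimate of PE (assumed definition): lowest exploitability over mixtures."""
    return min(
        exploitability(payoff, population, m)
        for m in mixtures(len(population), steps)
    )


def select_expansion(payoff, population, candidates):
    """Selection phase: keep the candidate whose addition gives the lowest PE."""
    scored = {
        c: population_exploitability(payoff, population + [c]) for c in candidates
    }
    return min(scored, key=scored.get), scored


population = [0]      # start with rock only
candidates = [1, 2]   # placeholder for an exploration phase: every unused pure strategy
chosen, scores = select_expansion(GAME, population, candidates)
print(chosen, scores)
```

In this toy game, adding paper leaves a less exploitable population than adding scissors, so the selection step keeps paper. A practical version would replace the grid search with a linear program or a learned estimate, since the grid grows combinatorially with population size.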

Entities

Institutions

  • arXiv

Sources