Adversarial Action Removal in Self-Play Reinforcement Learning
A recent arXiv preprint introduces adversarial action masking in self-play reinforcement learning, in which an attacker strategically removes selected legal actions from a victim's action set. Unlike traditional perturbation attacks, masking removes options before the agent makes a decision. Experiments on poker games ranging from 6 to 5,531 information states, along with two non-poker scenarios, show that learned masking does significantly more harm than either random masking or learned perturbation baselines. The attack is effective against a range of victims, including Q-learning, PPO, NFSP, neural NFSP, and DQN. It also transfers between agents and is amplified by self-play, and victims show no sign of recovery under prolonged masked training. The adversary concentrates on high-value decision points, as measured by reach-weighted contingent action capacity (CAC_w) and its value-weighted refinement (CAC_v). The authors conclude that action availability is a distinct robustness surface in self-play RL.
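To make the mechanism concrete, the following is a minimal Python sketch of action masking applied before a greedy victim acts. It is an illustration only, not the study's method: the paper's adversary is learned, whereas `worst_case_mask` here is a hand-written heuristic that removes the victim's highest-valued actions. All function names, the Q-values, the `budget` parameter, and the rule that at least one action must remain are assumptions made for this example.

```python
import random


def legal_after_mask(legal_actions, masked):
    """Return the actions still available once the attacker's mask is applied."""
    remaining = [a for a in legal_actions if a not in masked]
    # Assumption: the attacker may never remove every legal action.
    if not remaining:
        raise ValueError("mask must leave at least one legal action")
    return remaining


def greedy_action(q_values, available):
    """Victim policy: pick the highest-valued action among those still available."""
    return max(available, key=lambda a: q_values[a])


def random_mask(legal_actions, budget, rng):
    """Baseline attacker: remove `budget` actions uniformly at random."""
    budget = min(budget, len(legal_actions) - 1)
    return set(rng.sample(legal_actions, budget))


def worst_case_mask(q_values, legal_actions, budget):
    """Heuristic stand-in for a learned attacker: remove the victim's best actions."""
    budget = min(budget, len(legal_actions) - 1)
    ranked = sorted(legal_actions, key=lambda a: q_values[a], reverse=True)
    return set(ranked[:budget])


# Hypothetical Q-values at a single poker-like decision point.
q = {"fold": -1.0, "call": 0.2, "raise": 0.5}
legal = list(q)

unmasked_choice = greedy_action(q, legal)
targeted = worst_case_mask(q, legal, budget=1)
masked_choice = greedy_action(q, legal_after_mask(legal, targeted))
baseline = random_mask(legal, budget=1, rng=random.Random(0))

print(unmasked_choice, masked_choice, q[unmasked_choice] - q[masked_choice])
```

In this toy case the targeted mask removes `raise`, so the victim falls back to `call` and gives up the value gap between the two. A random mask of the same size costs the victim anything only when it happens to hit the best action, which is one intuition for why a targeted attacker could outperform the random-masking baseline.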
Key facts
- Adversarial action masking removes legal actions before the agent acts.
- Experiments used poker games with 6 to 5,531 information states.
- Learned masking outperforms random masking and perturbation baselines.
- Attack persists across Q-learning, PPO, NFSP, neural NFSP, and DQN.
- Attack transfers across agents and is amplified by self-play.
- No recovery observed under extended masked training.
- Adversary targets high-value decision points measured by CAC_w and CAC_v.
- Action availability is identified as a distinct robustness surface.
Entities
Institutions
- arXiv