Mahjax: GPU-Accelerated Mahjong Simulator for RL Research
Mahjax is a fully vectorized Riichi Mahjong environment implemented in JAX, designed to enable large-scale rollout parallelization on GPUs for reinforcement learning research. Riichi Mahjong is a multi-player, imperfect-information game with stochasticity and high-dimensional state spaces, presenting challenges that mirror real-world decision-making problems. The environment supports tabula rasa learning, meaning algorithms can learn from scratch without relying on human play logs. Mahjax achieves throughputs of up to 2 million game steps per second. It also includes a high-quality visualization tool for debugging and interaction with trained agents. The project aims to facilitate research into algorithms capable of learning complex games without supervised pre-training, following the AlphaZero lineage.
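To illustrate what "fully vectorized" rollouts mean in JAX, the following is a minimal sketch that uses a toy stand-in for a game, not Mahjax's actual API. The names `init`, `step`, `make_batched_rollout`, and the state layout are all hypothetical; the only Mahjong-specific detail is the 136-tile wall. Each environment shuffles its own wall and draws tiles in sequence, and `jax.vmap` together with `jax.jit` runs many such environments in one compiled call.

```python
import jax
import jax.numpy as jnp

NUM_TILES = 136  # size of a standard Riichi Mahjong tile set


def init(key):
    """Toy state (hypothetical, not Mahjax's real state): a shuffled wall and a draw pointer."""
    wall = jax.random.permutation(key, NUM_TILES)
    return {"wall": wall, "pos": jnp.int32(0)}


def step(state, _):
    """Draw the next tile from the wall; a stand-in for a real game step."""
    tile = state["wall"][state["pos"]]
    new_state = {"wall": state["wall"], "pos": state["pos"] + 1}
    return new_state, tile


def make_batched_rollout(num_steps):
    """Build a jitted function that rolls out one toy game per PRNG key."""

    def rollout(key):
        state = init(key)
        # lax.scan unrolls the step function num_steps times inside one traced loop.
        _, tiles = jax.lax.scan(step, state, None, length=num_steps)
        return tiles

    # vmap adds a leading batch axis over keys; jit compiles the whole batch.
    return jax.jit(jax.vmap(rollout))


batch_size, num_steps = 1024, 14
keys = jax.random.split(jax.random.PRNGKey(0), batch_size)
tiles = make_batched_rollout(num_steps)(keys)  # shape: (batch_size, num_steps)
```

Because the step function contains no Python-level branching on state, the same pattern scales to larger batch sizes simply by passing more keys; a real environment would replace `step` with full game logic written in the same branch-free style.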
Key facts
- Mahjax is a GPU-accelerated Riichi Mahjong environment in JAX
- It supports tabula rasa reinforcement learning from scratch
- It achieves throughput of up to 2 million game steps per second
- It includes a visualization tool for debugging and for interacting with trained agents
- Riichi Mahjong is a multi-player imperfect-information game
- The game features stochasticity and high-dimensional state spaces
- Prior research relied on supervised learning from human play logs
- Mahjax enables large-scale rollout parallelization on GPUs