Mahjax: GPU-Accelerated Mahjong Simulator for RL Research

ai-technology · 2026-05-22

Mahjax is a fully vectorized Riichi Mahjong environment implemented in JAX, designed to enable large-scale rollout parallelization on GPUs for reinforcement learning research. Riichi Mahjong is a multi-player, imperfect-information game with stochasticity and high-dimensional state spaces, presenting challenges that mirror real-world decision-making problems. The environment supports tabula rasa learning, meaning algorithms can learn from scratch without relying on human play logs. Mahjax achieves throughputs of up to 2 million game steps per second. It also includes a high-quality visualization tool for debugging and interaction with trained agents. The project aims to facilitate research into algorithms capable of learning complex games without supervised pre-training, following the AlphaZero lineage.
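
To make "fully vectorized" concrete, here is a minimal JAX sketch of batched rollouts. It does not use the Mahjax API: the `init`, `step`, and `rollout` functions and the dictionary state are hypothetical stand-ins, and the "game" only draws tiles from a shuffled 136-tile wall. The point is the pattern: write the logic for one game as pure functions, then let `jax.vmap` and `jax.jit` run many games in parallel on the accelerator.

```python
from functools import partial

import jax
import jax.numpy as jnp

NUM_TILES = 136  # a standard Riichi Mahjong set has 136 tiles


def init(key):
    """Toy state (hypothetical, not the Mahjax API): shuffled wall + pointer."""
    wall = jax.random.permutation(key, NUM_TILES)
    return {"wall": wall, "pos": jnp.int32(0), "last_draw": jnp.int32(-1)}


def step(state):
    """Draw the next tile from the wall; a stand-in for a real game step."""
    tile = state["wall"][state["pos"]]
    return {"wall": state["wall"], "pos": state["pos"] + 1, "last_draw": tile}


def rollout(key, num_steps):
    """Play num_steps draws of a single game using lax.scan."""
    def body(state, _):
        state = step(state)
        return state, state["last_draw"]

    _, draws = jax.lax.scan(body, init(key), None, length=num_steps)
    return draws


@partial(jax.jit, static_argnames="num_steps")
def batched_rollout(keys, num_steps):
    """vmap turns the single-game rollout into one batched computation."""
    return jax.vmap(lambda k: rollout(k, num_steps))(keys)


# One PRNG key per game, so every game gets its own shuffled wall.
keys = jax.random.split(jax.random.PRNGKey(0), 1024)
draws = batched_rollout(keys, num_steps=14)
print(draws.shape)  # (1024, 14): 1024 games, 14 draws each
```

Because the randomness is carried by explicit PRNG keys and the step function has no side effects, the same code runs unchanged for one game or for thousands, which is what makes large-scale rollout parallelization practical.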

Key facts

Mahjax is a GPU-accelerated Riichi Mahjong environment in JAX
It supports tabula rasa reinforcement learning from scratch
It achieves throughput of up to 2 million game steps per second
It includes a visualization tool for debugging and for interacting with trained agents
Riichi Mahjong is a multi-player imperfect-information game
The game features stochasticity and high-dimensional state spaces
Prior research relied on supervised learning from human play logs
Mahjax enables large-scale rollout parallelization on GPUs
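
A steps-per-second figure such as the one above is typically obtained by timing a compiled, batched rollout and dividing the total number of environment steps by the elapsed wall-clock time. The sketch below illustrates that measurement under stated assumptions: `toy_step` is a hypothetical placeholder transition, the batch size and step count are arbitrary, and the resulting number says nothing about Mahjax itself. It only shows the two details that matter when timing JAX code: a warm-up call to exclude compilation, and `block_until_ready()` to wait for asynchronous execution to finish.

```python
import time

import jax
import jax.numpy as jnp

BATCH, STEPS = 4096, 100  # illustrative sizes, not Mahjax settings


def toy_step(state, _):
    # Hypothetical placeholder transition: a cheap integer update per game.
    return (state * 5 + 1) % 136, None


@jax.jit
def run(states):
    # Advance every environment STEPS times; vmap batches the scan.
    return jax.vmap(
        lambda s: jax.lax.scan(toy_step, s, None, length=STEPS)[0]
    )(states)


states = jnp.arange(BATCH, dtype=jnp.int32) % 136
run(states).block_until_ready()  # warm-up call so compilation is not timed

start = time.perf_counter()
final = run(states).block_until_ready()  # wait for the device to finish
elapsed = time.perf_counter() - start

steps_per_second = BATCH * STEPS / elapsed
print(f"{steps_per_second:,.0f} toy steps per second")
```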

Sources

arXiv cs.AI — 2026-05-21