ZAYA1-8B: A 700M Active Parameter Reasoning MoE Model
ZAYA1-8B is a mixture-of-experts (MoE) reasoning model from Zyphra with 700 million active parameters out of 8 billion total, built on their MoE++ framework. It was trained from scratch on an AMD compute platform, with reasoning data incorporated from the start via an answer-preserving trimming method. On math and coding benchmarks it matches or exceeds DeepSeek-R1-0528, and it remains a strong contender against larger open-weight reasoning models. Post-training is a four-stage reinforcement learning cascade: a reasoning warmup on math and puzzles, a 400-task RLVE-Gym curriculum, math and code RL with test-time compute traces, and behavioral RL for chat and instruction following.
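The 700M-active/8B-total split reflects the standard sparse-MoE arrangement: a router picks a few experts per token, so only that subset of weights participates in each forward pass. Below is a minimal PyTorch sketch of generic top-k routing; the layer sizes, expert count, and top-k value are illustrative and are not ZAYA1's MoE++ internals.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    """Illustrative sparse MoE layer (not Zyphra's MoE++): each token is
    routed to k of n_experts, so the per-token active parameter count is a
    small fraction of the layer's total."""

    def __init__(self, d_model=512, d_ff=2048, n_experts=16, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        gates = F.softmax(self.router(x), dim=-1)   # (tokens, n_experts)
        weights, idx = gates.topk(self.k, dim=-1)   # route each token to k experts
        weights = weights / weights.sum(-1, keepdim=True)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e            # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

layer = TopKMoELayer()
per_expert = sum(p.numel() for p in layer.experts[0].parameters())
total = sum(p.numel() for p in layer.parameters())
active = total - (16 - layer.k) * per_expert  # router + the k routed experts
print(f"total={total:,}  active per token={active:,}")
```

With these toy dimensions, roughly an eighth of the layer's parameters run per token, the same qualitative ratio as ZAYA1-8B's 700M active out of 8B total.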
Key facts
- ZAYA1-8B has 700M active and 8B total parameters.
- Built on Zyphra's MoE++ architecture.
- Pretraining, midtraining, and SFT performed on an AMD compute platform.
- Matches or exceeds DeepSeek-R1-0528 on math and coding benchmarks.
- Trained from scratch for reasoning with answer-preserving trimming (see the sketch after this list).
- Post-training uses a four-stage RL cascade.
- Includes 400-task RLVE-Gym curriculum.
- Uses synthetic code environments from competitive programming references.
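The source does not spell out the answer-preserving trimming procedure, only that it shapes the reasoning data. Below is a minimal sketch of the general idea the name suggests, assuming trimming means shortening a reasoning trace to a length budget while always keeping the final answer verbatim; the function name and the head/tail step-budget heuristic are hypothetical.

```python
def trim_trace(steps, answer, max_steps):
    """Hypothetical answer-preserving trim: shorten a chain-of-thought to a
    step budget, keeping the earliest steps (problem setup) and the latest
    steps (conclusion). The final answer is passed through untouched."""
    if len(steps) <= max_steps:
        return steps, answer
    head = max_steps // 2
    tail = max_steps - head
    return steps[:head] + steps[-tail:], answer  # the answer is never trimmed

steps = [f"step {i}" for i in range(1, 11)]
kept, ans = trim_trace(steps, answer="42", max_steps=4)
print(kept, ans)  # ['step 1', 'step 2', 'step 9', 'step 10'] 42
```

Whatever the actual heuristic, the defining property is the invariant in the last return: the trace may shrink, but the answer the trace supports is preserved exactly.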
Entities
Institutions
- Zyphra
- AMD