Humanoid Robots Master Five Distinct Gaits Through Selective Adversarial Motion Prior

ai-technology · 2026-04-22

A unified reinforcement learning framework enables a humanoid robot to master five distinct gaits: walking, goose-stepping, running, stair climbing, and jumping. The approach applies an Adversarial Motion Prior (AMP) selectively. For periodic, stability-critical gaits such as walking, goose-stepping, and stair climbing, AMP is used to accelerate convergence and suppress erratic behavior. For highly dynamic gaits such as running and jumping, AMP is deliberately omitted to avoid over-constraining the motion. Policies are trained with Proximal Policy Optimization (PPO) and domain randomization in simulation, then deployed on a physical 12-degree-of-freedom humanoid robot. The research addresses the conflict between stability and dynamic expressiveness across locomotion skills, and demonstrates that a consistent policy structure, action space, and reward formulation can handle diverse gait requirements. The work is documented in arXiv preprint 2604.19102v1, announced as a cross-disciplinary abstract. The selective AMP strategy is the key contribution: it shows how targeted regularization can be matched to the characteristics of each gait.
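
The selective use of AMP can be pictured as a per-gait switch on the style term of the reward. The sketch below is illustrative only and is not taken from the preprint: the gait-to-flag table follows the split listed in the key facts, while the names `AMP_ENABLED`, `RewardWeights`, `amp_style_reward`, and `total_reward`, the weight values, and the choice of the least-squares style reward from the original AMP formulation are all assumptions.

```python
from dataclasses import dataclass

# AMP on/off per gait, following the split described in this summary.
AMP_ENABLED = {
    "walking": True,
    "goose_stepping": True,
    "stair_climbing": True,
    "running": False,
    "jumping": False,
}


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical mixing weights; the preprint's values are not given here."""
    task: float = 1.0
    style: float = 0.5


def amp_style_reward(disc_score: float) -> float:
    """Least-squares AMP style reward: max(0, 1 - 0.25 * (d - 1)^2).

    `disc_score` is the discriminator output for a state transition.
    Using this particular form is an assumption of the sketch.
    """
    return max(0.0, 1.0 - 0.25 * (disc_score - 1.0) ** 2)


def total_reward(gait: str, task_reward: float, disc_score: float,
                 weights: RewardWeights = RewardWeights()) -> float:
    """Combine task and style rewards, skipping the style term when AMP is off."""
    if gait not in AMP_ENABLED:
        raise KeyError(f"unknown gait: {gait}")
    reward = weights.task * task_reward
    if AMP_ENABLED[gait]:
        reward += weights.style * amp_style_reward(disc_score)
    return reward


if __name__ == "__main__":
    for gait in AMP_ENABLED:
        print(gait, total_reward(gait, task_reward=1.0, disc_score=1.0))
```

With this layout the policy structure, action space, and task reward stay the same for every gait, and only the style term is toggled, which matches the consistent-formulation claim in the summary.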

Key facts

Humanoid robot masters five distinct gaits: walking, goose-stepping, running, stair climbing, and jumping
Uses a selective Adversarial Motion Prior (AMP) strategy
AMP applied to periodic, stability-critical gaits (walking, goose-stepping, stair climbing)
AMP omitted for highly dynamic gaits (running, jumping)
Policies trained via PPO with domain randomization in simulation
Deployed on physical 12-DOF humanoid robot
Addresses conflict between stability and dynamic expressiveness across gaits
Documented in arXiv preprint 2604.19102v1
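
Domain randomization, mentioned in the facts above, typically means resampling simulator parameters for each training episode so that the policy does not overfit to one set of physics. A minimal sketch follows, assuming uniform sampling; the parameter names and ranges in `RANDOMIZATION_RANGES` and the helper `sample_domain` are hypothetical and are not the values used in the preprint.

```python
import random

# Hypothetical parameter ranges (low, high); the preprint's ranges are not given here.
RANDOMIZATION_RANGES = {
    "ground_friction": (0.5, 1.25),
    "added_base_mass_kg": (-1.0, 2.0),
    "motor_strength_scale": (0.9, 1.1),
}


def sample_domain(rng: random.Random) -> dict:
    """Draw one set of simulator parameters, uniformly within each range."""
    return {
        name: rng.uniform(low, high)
        for name, (low, high) in RANDOMIZATION_RANGES.items()
    }


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so that runs are reproducible
    for episode in range(3):
        print(episode, sample_domain(rng))
```

Each sampled dictionary would be applied to the simulator at episode reset, before the policy collects experience for the PPO update.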

Sources

arXiv cs.AI — 2026-04-22