Reinforcement Learning Optimizes Humanoid Locomotion for Short-Range Targets

other · 2026-04-24

A new reinforcement learning approach trains humanoid robots to reach short-range SE(2) target poses (a planar position plus a heading) directly, rather than tracking commanded velocities as existing methods do. The method uses a constellation-based reward function to encourage natural and efficient movement. An accompanying benchmarking framework measures energy consumption, time-to-target, and footstep count. On a distribution of SE(2) goals, the approach consistently outperforms standard approaches.
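The summary does not define the constellation-based reward, so the following is only a minimal sketch of one plausible reading: a small set of points is rigidly attached to the robot base, the same points are attached to the target pose, and the reward shrinks with the mean distance between corresponding points. Under this reading a single scalar couples position error and heading error. The point layout, the exponential shaping, and the names `CONSTELLATION`, `constellation_error`, and `constellation_reward` are all assumptions made for illustration, not the authors' implementation.

```python
import math

# Hypothetical constellation: four points on a square (half-width 0.5 m)
# expressed in the robot's base frame. The real layout is not given in the source.
CONSTELLATION = [(0.5, 0.5), (0.5, -0.5), (-0.5, -0.5), (-0.5, 0.5)]


def se2_transform(pose, point):
    """Map a body-frame point into the world frame for pose = (x, y, yaw)."""
    x, y, yaw = pose
    px, py = point
    c, s = math.cos(yaw), math.sin(yaw)
    return (x + c * px - s * py, y + s * px + c * py)


def constellation_error(current, target, points=CONSTELLATION):
    """Mean distance between matching points attached to the two poses."""
    total = 0.0
    for p in points:
        cx, cy = se2_transform(current, p)
        tx, ty = se2_transform(target, p)
        total += math.hypot(cx - tx, cy - ty)
    return total / len(points)


def constellation_reward(current, target, scale=1.0):
    """Assumed shaping: 1.0 at the target, decaying exponentially with error."""
    return math.exp(-constellation_error(current, target) / scale)
```

With this layout, a pure 1 m translation gives an error of 1 m, while turning in place by 180 degrees gives an error of about 1.41 m, so a heading mistake is penalized even when the base position is already correct.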

Key facts

Humanoids must execute task-driven short-range movements to SE(2) target poses.
Existing methods optimize for velocity-tracking, not direct pose reaching.
The proposed approach uses reinforcement learning with a constellation-based reward function.
A benchmarking framework measures energy, time-to-target, and footstep count.
The method outperforms standard approaches on a distribution of SE(2) goals.
The work is published on arXiv with ID 2508.14098v2.
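The three benchmark quantities named above can be computed from a logged rollout. The sketch below assumes a hypothetical log format (`Sample`), approximates energy as the time integral of absolute mechanical joint power, counts a footstep at each foot touchdown, and uses illustrative tolerances for deciding that the target was reached. None of these choices is taken from the source, which does not give the metric definitions.

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    """One logged timestep (hypothetical format)."""
    t: float            # time in seconds
    pose: tuple         # base pose (x, y, yaw)
    torques: tuple      # joint torques in N*m
    joint_vels: tuple   # joint velocities in rad/s
    contacts: tuple     # (left_foot, right_foot) contact flags


def wrap_angle(a):
    """Wrap an angle to [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))


def reached(pose, target, pos_tol=0.05, yaw_tol=0.1):
    """True when pose is within the (assumed) position and heading tolerances."""
    dist = math.hypot(pose[0] - target[0], pose[1] - target[1])
    return dist <= pos_tol and abs(wrap_angle(pose[2] - target[2])) <= yaw_tol


def evaluate(log, target):
    """Return energy (J), time-to-target (s or None), and footstep count."""
    energy = 0.0
    footsteps = 0
    time_to_target = 0.0 if log and reached(log[0].pose, target) else None
    for prev, cur in zip(log, log[1:]):
        dt = cur.t - prev.t
        # Absolute mechanical power summed over joints, held constant over dt.
        power = sum(abs(tau * qd) for tau, qd in zip(cur.torques, cur.joint_vels))
        energy += power * dt
        # A footstep is a foot going from no contact to contact.
        footsteps += sum(1 for was, now in zip(prev.contacts, cur.contacts)
                         if now and not was)
        if time_to_target is None and reached(cur.pose, target):
            time_to_target = cur.t - log[0].t
    return {"energy": energy, "time_to_target": time_to_target,
            "footsteps": footsteps}
```

Energy and footsteps are accumulated over the whole log here; truncating the log at the first sample where `reached` is true would restrict all three metrics to the approach phase.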
