Semi-Supervised Learning Boosts Reward Shaping in Sparse RL

ai-technology · 2026-05-18

A new reinforcement learning method uses semi-supervised learning and data augmentation to shape rewards from zero-reward transitions. It outperforms supervised approaches on Atari and robotic manipulation tasks, reaching up to double the peak scores in sparse-reward environments.

Key facts

Proposed approach uses semi-supervised learning for reward shaping
Novel double entropy data augmentation enhances performance
Outperforms supervised methods in reward inference
Achieves up to twice the peak scores in sparse-reward environments
Tested on Atari games and robotic manipulation tasks
Addresses challenge of sparse reward signals in real-world scenarios
Learns trajectory space representations from zero-reward transitions
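
The summary gives no implementation details, so the following is only a generic illustration of the idea of treating zero-reward transitions as unlabeled data. It uses plain pseudo-labeling (self-training) with a nearest-centroid estimator, which is a stand-in technique and not the paper's method, and it does not attempt the double entropy augmentation. All names and parameters (`fit_reward_estimator`, `predict_reward`, `shaped_reward`, `margin`, `beta`) and the toy feature vectors are hypothetical.

```python
from statistics import fmean


def centroid(points):
    """Per-dimension mean of a list of equal-length feature vectors."""
    return [fmean(dim) for dim in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_reward_estimator(labeled, unlabeled, margin=0.5, rounds=3):
    """Self-training sketch (assumption: at least two distinct reward labels).

    labeled:   list of (features, reward) for transitions with non-zero reward
    unlabeled: list of features for zero-reward transitions
    Returns a dict mapping each reward label to its centroid.
    """
    by_label = {}
    for feats, reward in labeled:
        by_label.setdefault(reward, []).append(feats)

    pool = list(unlabeled)
    for _ in range(rounds):
        centroids = {r: centroid(pts) for r, pts in by_label.items()}
        still_unlabeled = []
        for feats in pool:
            ranked = sorted(centroids, key=lambda r: sq_dist(feats, centroids[r]))
            best, runner_up = ranked[0], ranked[1]
            gap = sq_dist(feats, centroids[runner_up]) - sq_dist(feats, centroids[best])
            if gap >= margin:
                # Confident enough: adopt the pseudo-label for the next round.
                by_label[best].append(feats)
            else:
                still_unlabeled.append(feats)
        pool = still_unlabeled

    return {r: centroid(pts) for r, pts in by_label.items()}


def predict_reward(centroids, feats):
    """Return the reward label whose centroid is closest to feats."""
    return min(centroids, key=lambda r: sq_dist(feats, centroids[r]))


def shaped_reward(env_reward, feats, centroids, beta=0.1):
    """Keep real rewards; fill zero-reward steps with a scaled estimate."""
    if env_reward != 0:
        return env_reward
    return beta * predict_reward(centroids, feats)


# Toy data: two labeled transitions and three zero-reward transitions.
labeled = [([0.0, 0.0], -1.0), ([4.0, 4.0], 1.0)]
unlabeled = [[0.5, 0.0], [3.5, 4.0], [2.0, 2.0]]
centroids = fit_reward_estimator(labeled, unlabeled)
```

In this toy run the first two unlabeled points are confidently pseudo-labeled and pull the centroids toward them, while the equidistant point `[2.0, 2.0]` never clears the `margin` and stays unlabeled. A zero-reward step near the positive cluster then receives a small shaped bonus of `beta` times the estimated reward instead of 0.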

Entities

—

Sources

arXiv cs.AI — 2026-05-18