Semi-Supervised Learning Boosts Reward Shaping in Sparse RL
A new reinforcement learning method uses semi-supervised learning and data augmentation to shape rewards from zero-reward transitions. It outperforms supervised approaches on Atari and robotic manipulation tasks, reaching up to double the peak scores in sparse-reward environments.
Key facts
- Proposed approach uses semi-supervised learning for reward shaping
- Novel double entropy data augmentation enhances performance
- Outperforms supervised methods in reward inference
- Achieves up to twice the peak scores in sparse-reward environments
- Tested on Atari games and robotic manipulation tasks
- Addresses challenge of sparse reward signals in real-world scenarios
- Learns trajectory space representations from zero-reward transitions
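
To make the idea of shaping rewards from zero-reward transitions concrete, here is a minimal Python sketch of one generic semi-supervised recipe: confidence-thresholded pseudo-labeling. It is an illustration under stated assumptions, not the method from the work summarized here. The function names (`centroid`, `positive_confidence`, `shape_rewards`), the nearest-centroid scoring, and the `threshold` and `bonus` values are all hypothetical, and the sketch does not attempt to reproduce the double entropy data augmentation.

```python
import math


def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def positive_confidence(x, pos_c, neg_c):
    """Soft score in (0, 1): how much closer x is to the positive centroid."""
    d_pos = math.dist(x, pos_c)
    d_neg = math.dist(x, neg_c)
    return math.exp(-d_pos) / (math.exp(-d_pos) + math.exp(-d_neg))


def shape_rewards(labeled, unlabeled, threshold=0.8, bonus=0.1):
    """Assign a small shaped reward to zero-reward transitions.

    labeled:   list of (features, reward) pairs with nonzero reward
    unlabeled: list of feature vectors from zero-reward transitions
    Only confident pseudo-labels receive a bonus; the rest stay at 0.0.
    (threshold and bonus are illustrative values, not from the source.)
    """
    pos_c = centroid([x for x, r in labeled if r > 0])
    neg_c = centroid([x for x, r in labeled if r < 0])
    shaped = []
    for x in unlabeled:
        p = positive_confidence(x, pos_c, neg_c)
        if p >= threshold:
            shaped.append(bonus)       # confidently resembles rewarded transitions
        elif 1.0 - p >= threshold:
            shaped.append(-bonus)      # confidently resembles penalized transitions
        else:
            shaped.append(0.0)         # ambiguous: leave the reward untouched
    return shaped


# Toy data: two penalized and two rewarded transitions in a 2-D feature space.
labeled = [([0.0, 0.0], -1.0), ([0.2, 0.0], -1.0),
           ([5.0, 5.0], 1.0), ([5.2, 5.0], 1.0)]
unlabeled = [[5.1, 4.9], [0.1, 0.1], [2.6, 2.5]]
print(shape_rewards(labeled, unlabeled))
```

In this toy setup, the zero-reward transition near the rewarded cluster gets a small positive bonus, the one near the penalized cluster gets a small negative one, and the ambiguous midpoint is left at zero. A practical system would replace the centroid scoring with a learned model over trajectory representations.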