Preference-based Constrained Reinforcement Learning for Safety
Preference-based Constrained Reinforcement Learning (PbCRL) is a recently introduced method for inferring safety constraints in reinforcement learning from human preferences. Conventional Bradley-Terry models fail to capture the asymmetric, heavy-tailed nature of safety costs, which leads them to underestimate risk. PbCRL offers a more efficient alternative that avoids restrictive assumptions and does not require extensive expert demonstrations, which makes it more practical for real-world applications. The paper (arXiv:2603.23565) emphasizes cost-effective, reliable learning of safety constraints that are complex, subjective, and difficult to specify.
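For readers unfamiliar with the baseline, the following is a minimal sketch of the standard Bradley-Terry preference model applied to trajectory costs. It is not PbCRL's method, and the function names and the use of a scalar cumulative cost per trajectory are assumptions made for illustration. It shows that the preference probability depends only on the cost difference through a symmetric logistic link, which is the property the summary identifies as a poor fit for asymmetric, heavy-tailed safety costs.

```python
import math


def bt_safer_probability(cost_a: float, cost_b: float) -> float:
    """Probability that trajectory A is judged safer than trajectory B.

    Standard Bradley-Terry form with a logistic link. Lower cost is
    assumed to mean safer, so the probability rises as cost_b - cost_a grows.
    (Hypothetical helper; names are illustrative.)
    """
    return 1.0 / (1.0 + math.exp(-(cost_b - cost_a)))


def bt_negative_log_likelihood(pairs: list[tuple[float, float]]) -> float:
    """Negative log-likelihood of a set of preference labels.

    Each pair is (cost_of_preferred, cost_of_rejected), where "preferred"
    means the annotator judged that trajectory to be the safer one.
    """
    return -sum(
        math.log(bt_safer_probability(preferred, rejected))
        for preferred, rejected in pairs
    )


if __name__ == "__main__":
    # Equal costs give an even split between the two trajectories.
    print(bt_safer_probability(1.0, 1.0))
    # Swapping the arguments mirrors the probability around 0.5.
    print(bt_safer_probability(0.0, 2.0), bt_safer_probability(2.0, 0.0))
```

Because the logistic link is symmetric, a cost gap of a given size is treated the same way whether it favors the safer or the riskier trajectory.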
Key facts
- PbCRL is a novel approach for safe reinforcement learning.
- It infers safety constraints from human preferences.
- Conventional Bradley-Terry models underestimate risk because they fail to capture the asymmetric, heavy-tailed nature of safety costs.
- The method does not require extensive expert demonstrations.
- The paper is available on arXiv with ID 2603.23565.
- It addresses the challenge of learning safety constraints that are complex, subjective, and difficult to specify.
Entities
Institutions
- arXiv