ARTFEED — Contemporary Art Intelligence

Preference-based Constrained Reinforcement Learning for Safety

other · 2026-05-25

Preference-based Constrained Reinforcement Learning (PbCRL) is a recently introduced method that infers safety constraints for reinforcement learning from human preferences. Conventional Bradley-Terry preference models fail to capture the asymmetric, heavy-tailed nature of safety costs and therefore underestimate risk. PbCRL avoids restrictive assumptions and the need for extensive expert demonstrations, which makes it better suited to practical applications. The work, available on arXiv (2603.23565), targets low-cost, reliable learning of safety constraints that are complex, subjective, and hard to specify.
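For readers unfamiliar with the baseline being criticized, the following is a minimal sketch of a standard Bradley-Terry preference model applied to safety costs. It is not PbCRL's method, which this summary does not describe. The function names (`bradley_terry_prob`, `preference_nll`) and the convention that a lower cost means a safer trajectory are assumptions made for illustration.

```python
import math


def _sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bradley_terry_prob(cost_a: float, cost_b: float) -> float:
    """Probability that trajectory A is judged safer than trajectory B.

    Assumption: lower cost means safer, so the preference depends only on
    the difference cost_b - cost_a (standard Bradley-Terry form).
    """
    return _sigmoid(cost_b - cost_a)


def preference_nll(pairs: list[tuple[float, float]]) -> float:
    """Average negative log-likelihood of observed preferences.

    Each pair is (cost_of_preferred, cost_of_rejected), hypothetical data.
    """
    total = 0.0
    for cost_pref, cost_rej in pairs:
        total -= math.log(bradley_terry_prob(cost_pref, cost_rej))
    return total / len(pairs)


# The model depends only on the cost difference: a gap of 1 between two
# mild costs is scored the same as a gap of 1 between two extreme costs.
mild = bradley_terry_prob(0.0, 1.0)
extreme = bradley_terry_prob(100.0, 101.0)
```

In this sketch `mild` and `extreme` are equal, and swapping the arguments gives the complementary probability. That symmetry is one way to see why a plain Bradley-Terry fit can be a poor match for asymmetric, heavy-tailed safety costs.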

Key facts

  • PbCRL is a novel approach for safe reinforcement learning.
  • It infers safety constraints from human preferences.
  • Conventional Bradley-Terry models underestimate risk because they do not capture asymmetric, heavy-tailed safety costs.
  • The method does not require extensive expert demonstrations.
  • The paper is available on arXiv with ID 2603.23565.
  • It addresses the challenge of specifying complex real-world constraints.

Entities

Institutions

  • arXiv
