ARTFEED — Contemporary Art Intelligence

SENIOR: Efficient Query Selection and Preference-Guided Exploration in PbRL

other · 2026-05-23

A new method called SENIOR improves feedback efficiency and sample efficiency in preference-based reinforcement learning (PbRL). Its Motion-Distinction-based Selection scheme (MDS) picks pairs of behavior segments that show apparent motion and move in different directions, which makes them easier for humans to label. Its preference-guided exploration method (PGE) speeds up policy learning by adding intrinsic rewards. Together, these target two bottlenecks in PbRL applications: the amount of human feedback required and the number of environment samples needed.
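To make the idea of preference-guided intrinsic rewards concrete, here is a minimal Python sketch. It is not the paper's formulation: the grid discretization, the count-based novelty term, the `beta` weight, and the class name `PreferenceGuidedBonus` are all assumptions chosen for illustration. The sketch adds a bonus to the learned reward only for states that fall in regions seen in human-preferred segments, and the bonus shrinks as a region is revisited.

```python
import math
from collections import Counter


def cell(state, size=1.0):
    """Map a continuous state to a coarse grid cell (hypothetical discretization)."""
    return tuple(math.floor(c / size) for c in state)


class PreferenceGuidedBonus:
    """Illustrative intrinsic reward: novelty bonus restricted to preferred regions."""

    def __init__(self, preferred_states, beta=0.1, size=1.0):
        # Cells covered by states taken from segments a human preferred (assumed input).
        self.preferred = {cell(s, size) for s in preferred_states}
        self.visits = Counter()
        self.beta = beta
        self.size = size

    def intrinsic(self, state):
        key = cell(state, self.size)
        self.visits[key] += 1
        # Count-based novelty: decays as 1 / sqrt(visit count).
        novelty = 1.0 / math.sqrt(self.visits[key])
        # Only preferred regions earn a bonus in this sketch.
        weight = 1.0 if key in self.preferred else 0.0
        return weight * novelty

    def shaped(self, state, learned_reward):
        """Learned (preference-based) reward plus the scaled intrinsic bonus."""
        return learned_reward + self.beta * self.intrinsic(state)


bonus = PreferenceGuidedBonus(preferred_states=[(2.3, 0.1)], beta=0.5)
first = bonus.shaped((2.7, 0.4), learned_reward=1.0)   # preferred cell, first visit
second = bonus.shaped((2.7, 0.4), learned_reward=1.0)  # same cell, bonus has decayed
other = bonus.shaped((5.0, 5.0), learned_reward=1.0)   # non-preferred cell, no bonus
```

In this sketch the bonus fades with repeated visits, so the policy is nudged toward preferred regions early on and falls back to the learned reward later.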

Key facts

  • SENIOR is a method for preference-based reinforcement learning.
  • It improves human feedback efficiency and sample efficiency.
  • MDS selects segment pairs with apparent motion and different directions.
  • MDS uses kernel density estimation of states.
  • PGE is a preference-guided exploration method.
  • PGE encourages exploration using intrinsic rewards.
  • The paper is available as arXiv:2506.14648v2.
  • The method avoids reward engineering.
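
The MDS facts above (kernel density estimation of states, apparent motion, different directions) can be illustrated with a small Python sketch. This is one plausible reading, not the paper's algorithm: it assumes a segment is a list of 2D positions, treats a low average Gaussian-KDE density as a proxy for apparent motion, and compares net displacement directions by cosine similarity. The function names, the `keep` and `bandwidth` parameters, and the toy segments are hypothetical.

```python
import math
from itertools import combinations


def mean_kde_density(segment, bandwidth=0.5):
    """Average Gaussian-KDE density of a segment's states under its own KDE.

    Clustered states give a high value; spread-out states give a low value.
    """
    n = len(segment)
    d = len(segment[0])
    norm = n * (bandwidth * math.sqrt(2 * math.pi)) ** d
    total = 0.0
    for p in segment:
        for q in segment:
            sq = sum((a - b) ** 2 for a, b in zip(p, q))
            total += math.exp(-sq / (2 * bandwidth ** 2))
    return total / (n * norm)


def direction(segment):
    """Unit vector of net displacement; zero vector if the segment does not move."""
    delta = [b - a for a, b in zip(segment[0], segment[-1])]
    length = math.sqrt(sum(c * c for c in delta))
    return [c / length for c in delta] if length > 0 else delta


def select_pair(segments, keep=3, bandwidth=0.5):
    """Pick the pair with the most distinct directions among the most mobile segments."""
    # Step 1: keep the segments with the lowest density (most apparent motion).
    ranked = sorted(range(len(segments)),
                    key=lambda i: mean_kde_density(segments[i], bandwidth))
    moving = sorted(ranked[:keep])

    # Step 2: among those, choose the pair with the lowest cosine similarity.
    def cosine(i, j):
        return sum(a * b for a, b in zip(direction(segments[i]),
                                         direction(segments[j])))

    return min(combinations(moving, 2), key=lambda ij: cosine(*ij))


segments = [
    [(0, 0), (0.01, 0), (0, 0.01), (0, 0)],   # 0: almost stationary
    [(0, 0), (1, 0), (2, 0), (3, 0)],         # 1: moves right
    [(0, 0), (-1, 0), (-2, 0), (-3, 0)],      # 2: moves left
    [(0, 0), (1, 1), (2, 2), (3, 3)],         # 3: moves up and right
]
best = select_pair(segments)  # indices of the pair to show the human labeler
```

Here the near-stationary segment is filtered out first, and the right-moving and left-moving segments are chosen because their directions are opposite, which is the kind of pair a labeler can compare at a glance.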

Entities

Institutions

  • arXiv

Sources