ARTFEED — Contemporary Art Intelligence

Offline Policy Evaluation via Discounted Liveness Formulation

other · 2026-05-13

A new framework for offline policy evaluation in robotic manipulation addresses the challenges of sparse rewards and finite-horizon truncation bias. The method uses a liveness-based Bellman operator to recast evaluation as a task-completion problem, yielding a conservative fixed-point value function that is robust to truncation. The theoretical analysis includes contraction guarantees for the operator. The work is published on arXiv (2605.11479).

Key facts

  • Policy evaluation is fundamental for robotic policy development.
  • Sparse rewards and non-monotonic task progression challenge evaluation.
  • Finite-length rollouts introduce truncation bias.
  • Proposed framework uses a liveness-based Bellman operator.
  • Formulation yields a conservative fixed-point value function.
  • Theoretical properties include contraction guarantees.
  • Published on arXiv with ID 2605.11479.
  • arXiv announcement type: cross-list.
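The ideas above can be illustrated with a toy sketch. The paper's actual formulation is not reproduced in this summary, so everything here is an illustrative assumption: a small chain MDP under a fixed policy, an operator that treats the success state as a liveness event worth 1, a conservative all-zeros initialization (so truncated, unfinished rollouts are valued pessimistically), and iteration to the fixed point, which converges because the discounted operator is a sup-norm contraction.

```python
import numpy as np

# Hypothetical toy model (illustrative assumptions, not the paper's setup):
# states 0..4 form a chain; state 4 is "task complete" (the liveness event).
# P[s] is the next-state distribution under the fixed policy being evaluated.
n, goal, gamma = 5, 4, 0.9
P = np.zeros((n, n))
for s in range(n - 1):
    P[s, s + 1] = 0.8   # make progress toward completion
    P[s, s] = 0.2       # stall in place
P[goal, goal] = 1.0     # success state is absorbing

def liveness_backup(V):
    """One application of a discounted liveness-style Bellman operator:
    the success state is worth 1; elsewhere, the discounted expected value
    of the successor state under the policy."""
    out = gamma * P @ V
    out[goal] = 1.0
    return out

# Conservative initialization: V = 0 everywhere, so states whose rollouts
# were truncated before completion are under- rather than over-valued.
V = np.zeros(n)
for _ in range(1000):
    V_next = liveness_backup(V)
    # gamma-contraction in the sup norm guarantees this loop converges
    if np.max(np.abs(V_next - V)) < 1e-10:
        V = V_next
        break
    V = V_next
```

At the fixed point, values increase monotonically along the chain toward 1 at the goal, and the contraction property can be checked directly: for any two value vectors, one backup shrinks their sup-norm distance by at least the factor gamma.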

Entities

Institutions

  • arXiv

Sources