Temporal Logic Value Functions for Optimal Policies and Safety Filters
A recent arXiv paper (2605.01051) investigates value functions for temporal logic (TL) specifications in the undiscounted, infinite-horizon setting. The authors identify a pathology: greedily maximizing the Q-function can yield policies that postpone task completion indefinitely in reach-avoid (Until) problems, even when the value function itself is optimal. Building on recent work that decomposes a TL value function into a graph of simpler value functions, they construct non-Markovian policies that condition on state history to avoid this procrastination. They further prove optimality for nested Until, Globally, and Globally-Until specifications under a quantitative robustness score. Finally, they show that the Q-function can serve as a safety filter for complex TL specifications, extending prior results beyond simple avoidance tasks.
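The procrastination pathology can be illustrated with a toy sketch. Everything here is hypothetical (the chain MDP, the `q_value` function, the action names) and is not taken from the paper; it only shows why an undiscounted reach objective lets a greedy policy stall:

```python
# Hypothetical 3-state chain: 0 -> 1 -> 2 (goal). Actions: "stay", "forward".
# Undiscounted "eventually reach the goal" value: 1 wherever the goal is
# still reachable. Both actions preserve reachability, so Q ties everywhere
# and a greedy argmax is free to pick "stay" at every step.

GOAL = 2

def q_value(state: int, action: str) -> float:
    # Under no discounting, any policy that keeps the goal reachable
    # attains value 1, so "stay" and "forward" tie at every state.
    return 1.0

def greedy_action(state: int) -> str:
    # Python's max returns the first maximizer on ties -- here, "stay".
    return max(["stay", "forward"], key=lambda a: q_value(state, a))

state, steps = 0, 0
while state != GOAL and steps < 10:
    action = greedy_action(state)
    state = state + 1 if action == "forward" else state
    steps += 1

print(state)  # -> 0: the greedy policy procrastinates and never reaches the goal
```

A discount factor would break the tie in favor of reaching the goal sooner; without it, resolving the tie requires extra structure, such as the history-dependent policies the paper constructs.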
Key facts
- Paper ID: arXiv:2605.01051
- Published on arXiv
- Addresses value functions for temporal logic specifications
- Identifies pathology in greedy Q-function maximization for reach-avoid problems
- Constructs non-Markovian policies based on state history
- Proves optimality for nested Until, Globally, and Globally-Until specifications
- Uses quantitative robustness score
- Extends Q-function safety filtering to complex TL specifications
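The safety-filter idea from the last bullet can be sketched minimally. The `safety_q` robustness score, the action set, and the threshold-at-zero convention below are all illustrative assumptions, not the paper's construction:

```python
# Hypothetical safety filter: a safety Q-function assigns each (state, action)
# a robustness score, negative when the action leads toward the unsafe set.
# The filter passes the nominal action through when its score is nonnegative
# and otherwise overrides it with the highest-scoring available action.

def safety_q(state: float, action: float) -> float:
    # Toy robustness score: signed margin to an unsafe set at x >= 1
    # after one step of the dynamics x' = x + action.
    return 1.0 - (state + action)

def filter_action(state: float, nominal: float,
                  actions=(-0.5, 0.0, 0.5)) -> float:
    if safety_q(state, nominal) >= 0.0:
        return nominal  # nominal action certified safe, pass it through
    # Otherwise fall back to the safest action in the available set.
    return max(actions, key=lambda a: safety_q(state, a))

print(filter_action(0.2, 0.5))  # -> 0.5 (certified: score 0.3 >= 0)
print(filter_action(0.8, 0.5))  # -> -0.5 (nominal unsafe, overridden)
```

The same wrapper shape generalizes from plain avoidance to richer TL specifications once the robustness score encodes the full specification rather than a single unsafe set, which is the extension the paper claims.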
Entities
Institutions
- arXiv