Temporal Logic Value Functions for Optimal Policies and Safety Filters
A recent arXiv paper (2605.01051) investigates value functions for temporal logic (TL) specifications in the undiscounted, infinite-horizon setting. The authors identify a pathology: greedily maximizing the Q-function can yield policies that postpone task completion indefinitely in reach-avoid (Until) problems, even when the value function itself is optimal. Building on recent work that decomposes a TL value function into a graph of simpler value functions, they construct non-Markovian policies that condition on state history to avoid this procrastination. They further prove optimality for nested Until, Globally, and Globally-Until specifications under a quantitative robustness score. Finally, they show that the Q-function can serve as a safety filter for complex TL specifications, extending prior results beyond simple avoidance tasks.
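The procrastination pathology can be illustrated with a toy sketch. Everything here is hypothetical (the chain MDP, the `q_value` function, the action names) and is not taken from the paper; it only shows why an undiscounted reach objective lets a greedy policy stall:

```python
# Hypothetical 3-state chain: 0 -> 1 -> 2 (goal). Actions: "stay", "forward".
# Undiscounted "eventually reach the goal" value: 1 wherever the goal is
# still reachable. Both actions preserve reachability, so Q ties everywhere
# and a greedy argmax is free to pick "stay" at every step.

GOAL = 2

def q_value(state: int, action: str) -> float:
    # Under no discounting, any policy that keeps the goal reachable
    # attains value 1, so "stay" and "forward" tie at every state.
    return 1.0

def greedy_action(state: int) -> str:
    # Python's max returns the first maximizer on ties -- here, "stay".
    return max(["stay", "forward"], key=lambda a: q_value(state, a))

state, steps = 0, 0
while state != GOAL and steps < 10:
    action = greedy_action(state)
    state = state + 1 if action == "forward" else state
    steps += 1

print(state)  # -> 0: the greedy policy procrastinates and never reaches the goal
```

A discount factor would break the tie in favor of reaching the goal sooner; without it, resolving the tie requires extra structure, such as the history-dependent policies the paper constructs.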
Key facts
- Paper ID: arXiv:2605.01051
- Published on arXiv
- Addresses value functions for temporal logic specifications
- Identifies pathology in greedy Q-function maximization for reach-avoid problems
- Constructs non-Markovian policies based on state history
- Proves optimality for nested Until, Globally, and Globally-Until specifications
- Uses quantitative robustness score
- Extends Q-function safety filtering to complex TL specifications
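The safety-filter idea from the last bullet can be sketched minimally. The `safety_q` robustness score, the action set, and the threshold-at-zero convention below are all illustrative assumptions, not the paper's construction:

```python
# Hypothetical safety filter: a safety Q-function assigns each (state, action)
# a robustness score, negative when the action leads toward the unsafe set.
# The filter passes the nominal action through when its score is nonnegative
# and otherwise overrides it with the highest-scoring available action.

def safety_q(state: float, action: float) -> float:
    # Toy robustness score: signed margin to an unsafe set at x >= 1
    # after one step of the dynamics x' = x + action.
    return 1.0 - (state + action)

def filter_action(state: float, nominal: float,
                  actions=(-0.5, 0.0, 0.5)) -> float:
    if safety_q(state, nominal) >= 0.0:
        return nominal  # nominal action certified safe, pass it through
    # Otherwise fall back to the safest action in the available set.
    return max(actions, key=lambda a: safety_q(state, a))

print(filter_action(0.2, 0.5))  # -> 0.5 (certified: score 0.3 >= 0)
print(filter_action(0.8, 0.5))  # -> -0.5 (nominal unsafe, overridden)
```

The same wrapper shape generalizes from plain avoidance to richer TL specifications once the robustness score encodes the full specification rather than a single unsafe set, which is the extension the paper claims.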
Entities
Institutions
- arXiv