Stabilized Neural HJB Solvers: Error Analysis for Model-Based RL
A new arXiv preprint (2605.07116) develops an error theory for a hybrid regime of physics-informed neural solvers for Hamilton-Jacobi-Bellman (HJB) equations in model-based reinforcement learning. In this regime, a neural network represents the value function, finite-difference HJB policy-evaluation operators are evaluated by querying the network at shifted points, and the resulting residuals are minimized by random continuous collocation. The scheme thus retains the stabilizing structure of finite-difference policy evaluation without introducing grid-based value unknowns. The authors prove a population L2 stability estimate for a single policy-evaluation step with learned dynamics, with an error bound that separates residual error, initial error, and exterior error. The work bridges classical grid-based methods and continuous-PDE PINNs, providing a theoretical foundation for practical implementations.
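The hybrid regime described above can be illustrated with a minimal sketch. Everything problem-specific here is an assumption for illustration only (a 1-D closed-loop drift, a quadratic running cost, a tiny fixed-weight MLP, and an upwind first-order difference); the preprint's actual scheme, network, and operators may differ. The point is the structure: the finite-difference policy-evaluation residual is formed purely from network queries at shifted states, and it is evaluated at randomly sampled collocation points rather than grid unknowns.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny fixed-weight MLP standing in for the neural value function V_theta.
W1 = rng.normal(size=(1, 32)) * 0.5
b1 = np.zeros(32)
w2 = rng.normal(size=32) * 0.5

def value(x):
    """V_theta(x) for a batch of scalar states x."""
    h = np.tanh(x[:, None] @ W1 + b1)
    return h @ w2

# Hypothetical 1-D problem data (illustrative assumptions, not from the paper):
f = lambda x: -x        # closed-loop drift under a fixed policy
ell = lambda x: x**2    # running cost
rho = 0.1               # discount rate
h = 1e-2                # finite-difference step

def hjb_residual(x):
    """Upwind finite-difference policy-evaluation residual at states x,
    evaluated via network queries at the shifted points x + h or x - h
    (no grid-based value unknowns anywhere)."""
    drift = f(x)
    # Upwind: difference in the direction the dynamics move the state.
    shift = np.where(drift >= 0.0, h, -h)
    dV = (value(x + shift) - value(x)) / shift
    return rho * value(x) - ell(x) - drift * dV

# Random continuous collocation: sample states from the domain and form
# the mean-squared residual that training would minimize.
x_colloc = rng.uniform(-1.0, 1.0, size=256)
loss = np.mean(hjb_residual(x_colloc) ** 2)
```

In a full implementation the network weights would be trained by minimizing `loss` (e.g. with autodiff); the sketch stops at forming the residual, which is the object the paper's stability analysis controls.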
Key facts
- arXiv preprint 2605.07116
- Hybrid regime for neural HJB solvers
- Value function represented by neural network
- Finite-difference policy-evaluation operators evaluated at shifted points
- Residuals minimized by random continuous collocation
- Population L2 stability estimate proven
- Error bound separates residual, initial, and exterior errors
- Bridges grid methods and continuous-PDE PINNs
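The three-way error split in the key facts above can be written schematically. The symbols and constants here are placeholders, not the preprint's notation: $\widehat V$ is the learned value after one policy-evaluation step, $V^\pi$ the true policy value, $r$ the collocation residual, $e_{\mathrm{init}}$ the error in the starting value, and $e_{\mathrm{ext}}$ the error at shifted query points falling outside the domain $\Omega$. Schematically, a bound of this shape reads:

$$
\big\| \widehat V - V^{\pi} \big\|_{L^2(\Omega)}
\;\le\;
C_1 \,\| r \|_{L^2(\Omega)}
\;+\;
C_2 \,\| e_{\mathrm{init}} \|_{L^2(\Omega)}
\;+\;
C_3 \,\| e_{\mathrm{ext}} \|_{L^2(\Omega^c)} .
$$

The precise norms, constants, and dependence on the learned dynamics are given in the paper; the sketch only shows how the residual, initial, and exterior contributions enter additively.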