FP-IRL: Physics-Constrained Inverse Reinforcement Learning for Unknown Dynamics
Fokker-Planck inverse reinforcement learning (FP-IRL) is a new physics-constrained approach for systems governed by Fokker-Planck dynamics. Unlike conventional inverse reinforcement learning methods, which require the transition function to be prescribed or estimated beforehand, FP-IRL infers both the reward and transition functions simultaneously from trajectory data alone, without requiring access to sampled transitions. The technique exploits a mathematical connection between Markov decision processes (MDPs) and the Fokker-Planck equation, which relates reward maximization in MDPs to free energy minimization. This makes the method particularly useful when the underlying dynamics are unknown or unobservable. The work is available on arXiv under the identifier 2306.10407v3.
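To make the Fokker-Planck connection concrete, here is a minimal 1-D toy sketch, not the paper's actual algorithm: for overdamped Langevin dynamics dx = -V'(x) dt + sqrt(2D) dW, the Fokker-Planck equation has stationary density rho*(x) proportional to exp(-V(x)/D). The potential V (which plays the role of a negative reward in this illustration) can therefore be read off from trajectory data alone, with no sampled one-step transitions of a known model. The quadratic potential, diffusion coefficient, and histogram-based estimator below are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 0.5                       # diffusion coefficient (assumed known here)
dt, n_steps = 0.01, 200_000   # Euler-Maruyama step size and trajectory length

def grad_V(x):
    """Gradient of the illustrative potential V(x) = x^2 (hidden from the learner)."""
    return 2.0 * x

# Forward step: simulate one long trajectory of the Langevin dynamics.
x = np.empty(n_steps)
x[0] = 0.0
for t in range(n_steps - 1):
    x[t + 1] = x[t] - grad_V(x[t]) * dt + np.sqrt(2 * D * dt) * rng.standard_normal()

# "Inverse" step: estimate V up to an additive constant from the empirical
# stationary density, using V(x) = -D * log(rho*(x)) + const.
samples = x[n_steps // 2:]                       # discard burn-in
hist, edges = np.histogram(samples, bins=40, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
mask = hist > 0.1                                # keep well-sampled bins only
V_est = -D * np.log(hist[mask])
V_est -= V_est.min()                             # fix the arbitrary constant

# Compare against the true potential on the same bins.
V_true = centers[mask] ** 2
V_true -= V_true.min()
err = np.mean(np.abs(V_est - V_true))
print(f"mean abs error of recovered potential: {err:.3f}")
```

This only recovers the potential of fixed (uncontrolled) dynamics from density information; FP-IRL itself goes further, jointly inferring reward and transition functions for an MDP via the free-energy correspondence.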
Key facts
- FP-IRL stands for Fokker-Planck inverse reinforcement learning.
- It is a physics-constrained IRL framework.
- It infers reward and transition functions simultaneously from trajectory data.
- No access to sampled transitions is required.
- It applies to systems described by Fokker-Planck dynamics.
- It links reward maximization in MDPs with free energy minimization.
- Conventional IRL methods need the transition function prescribed or estimated a priori.
- The paper is available on arXiv with ID 2306.10407v3.
Entities
Institutions
- arXiv