Koopman Operator Enables New Reinforcement Learning Algorithms
A new paper introduces two reinforcement learning algorithms based on the Koopman operator, which lifts nonlinear systems into coordinates where the dynamics evolve approximately linearly and can be approximated directly from trajectory data. This lifting addresses the intractability of the Bellman and Hamilton-Jacobi-Bellman equations for high-dimensional or nonlinear systems. By parameterizing the Koopman operator with control actions, the authors construct a "controlled Koopman tensor" and use it to estimate the optimal value function. The resulting algorithms reformulate soft value iteration and soft actor-critic, two maximum-entropy RL methods. The paper is available on arXiv as 2403.02290.
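To make the lifting idea concrete, the following is a minimal sketch of extended dynamic mode decomposition (EDMD), a standard data-driven way to estimate a finite-dimensional Koopman approximation from snapshot pairs. The toy dynamics, the dictionary of observables, and all names here are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def lift(x):
    # Illustrative dictionary of observables: [x1, x2, x1^2, x1*x2, x2^2].
    x1, x2 = x
    return np.array([x1, x2, x1**2, x1 * x2, x2**2])

def step(x):
    # Assumed toy nonlinear system, used only to generate data.
    return np.array([0.9 * x[0], 0.8 * x[1] + 0.1 * x[0] ** 2])

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(500, 2))   # snapshot states x_k
Y = np.array([step(x) for x in X])          # successor states x_{k+1}

# Lift both snapshot sets and solve the least-squares problem
# min_K || Phi(X) K - Phi(Y) ||_F, so that phi(x_{k+1}) ≈ K^T phi(x_k):
PhiX = np.array([lift(x) for x in X])
PhiY = np.array([lift(y) for y in Y])
K, *_ = np.linalg.lstsq(PhiX, PhiY, rcond=None)

x0 = np.array([0.5, -0.3])
print(lift(step(x0)))   # true lifted next state
print(lift(x0) @ K)     # approximately linear Koopman prediction
```

The residual between the two printed vectors shows the "approximately" in approximately linear: observables whose evolution leaves the span of the dictionary are captured only up to a projection error.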
Key facts
- Paper develops two new reinforcement learning algorithms based on the Koopman operator.
- Koopman operator lifts nonlinear systems into coordinates with approximately linear dynamics.
- Approach addresses intractability of Bellman and Hamilton-Jacobi-Bellman equations for high-dimensional or nonlinear systems.
- A 'controlled Koopman tensor' is constructed by parameterizing the Koopman operator with control actions (see the first sketch after this list).
- Algorithms reformulate soft value iteration and soft actor-critic; a soft value iteration sketch appears after this list.
- Paper is on arXiv with ID 2403.02290.
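One plausible reading of the controlled Koopman tensor is a three-way array that, contracted with a feature map of the action, yields an action-conditioned linear operator on the lifted state. The sketch below fits such a tensor by least squares on Kronecker features; the dictionaries phi and psi, the solver, and every name here are assumptions for illustration, not the paper's exact construction.

```python
import numpy as np

def phi(x):
    # Assumed state dictionary (illustrative).
    return np.array([1.0, x[0], x[1], x[0] * x[1]])

def psi(u):
    # Assumed action dictionary (illustrative).
    return np.array([1.0, u, u**2])

D_X, D_U = 4, 3  # dictionary sizes for phi and psi

def K_of_u(K_tensor, u):
    # Contract the (D_U, D_X, D_X) tensor with psi(u) to get the
    # action-conditioned operator K(u) = sum_j psi_j(u) * K_tensor[j].
    return np.tensordot(psi(u), K_tensor, axes=1)

def fit_koopman_tensor(X, U, Xn):
    # Regress lifted next states onto Kronecker features psi(u) ⊗ phi(x),
    # so that phi(x') ≈ K(u) @ phi(x) for transition data (x, u, x').
    Z = np.array([np.kron(psi(u), phi(x)) for x, u in zip(X, U)])  # (N, D_U*D_X)
    Yl = np.array([phi(xn) for xn in Xn])                          # (N, D_X)
    W, *_ = np.linalg.lstsq(Z, Yl, rcond=None)                     # (D_U*D_X, D_X)
    return W.reshape(D_U, D_X, D_X).transpose(0, 2, 1)             # (D_U, D_X, D_X)
```

With a fitted tensor, the lifted one-step prediction for a state-action pair is `K_of_u(K_tensor, u) @ phi(x)`, which is what makes Bellman backups cheap in the lifted coordinates.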
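Given such a tensor, here is a minimal sketch of soft (maximum-entropy) value iteration in the lifted space, reusing `phi` and `K_of_u` from the previous sketch. The value function is assumed linear in the dictionary, V(x) = w·phi(x), so the expected next-state value under action u becomes w·(K(u)phi(x)) with no rollout of the dynamics. The reward, discount, temperature, and discretized action set are illustrative assumptions, not the paper's specification.

```python
import numpy as np

GAMMA, BETA = 0.95, 5.0              # assumed discount / inverse temperature
ACTIONS = np.linspace(-1.0, 1.0, 9)  # assumed discretized action set

def reward(x, u):
    # Assumed quadratic reward (illustrative).
    return -(x @ x) - 0.1 * u**2

def soft_value_iteration(K_tensor, states, iters=100):
    # Represent V(x) = w @ phi(x) and iterate the soft Bellman backup
    #   V(x) <- (1/BETA) * log sum_u exp(BETA * (r(x,u) + GAMMA * E[V(x')])),
    # with E[phi(x')] ≈ K(u) @ phi(x) supplied by the Koopman tensor.
    Phi = np.array([phi(x) for x in states])
    w = np.zeros(Phi.shape[1])
    for _ in range(iters):
        targets = []
        for x, p in zip(states, Phi):
            q = np.array([reward(x, u) + GAMMA * w @ (K_of_u(K_tensor, u) @ p)
                          for u in ACTIONS])
            m = q.max()  # stabilized log-sum-exp
            targets.append(m + np.log(np.exp(BETA * (q - m)).sum()) / BETA)
        # Project the backed-up values onto the dictionary by least squares.
        w, *_ = np.linalg.lstsq(Phi, np.array(targets), rcond=None)
    return w
```

A soft actor-critic variant would follow the same pattern, with the critic linear in the dictionary; that reformulation is not sketched here.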
Entities
Institutions
- arXiv