LLM-Guided Evolution Discovers RL Task Interfaces from Raw Simulator State
A new framework called LIMEN uses large language models to guide the evolutionary discovery of reinforcement learning task interfaces, including both observation mappings and reward functions, directly from raw simulator state. Unlike prior work that only automated reward design with fixed observations, LIMEN synthesizes complete interfaces by generating candidate programs and iteratively refining them based on policy training feedback. The approach was tested on discrete gridworld tasks and continuous control domains for locomotion and manipulation, demonstrating that joint evolution of observations and rewards can yield effective interfaces. The code is available on GitHub.
Key facts
- LIMEN is an LLM-guided evolutionary framework for RL task interface discovery.
- It synthesizes both observation mappings and reward functions from raw simulator state.
- Candidate interfaces are generated as executable programs.
- Interfaces are iteratively refined using policy training feedback.
- Tested on discrete gridworld tasks and continuous control domains.
- Domains include locomotion and manipulation tasks.
- Code available at https://github.com/Lossfunk/LIMEN.
- arXiv paper ID: 2605.03408.
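The generate-train-refine loop described above can be sketched roughly as follows. This is a minimal illustration, not LIMEN's actual implementation: `propose_interface` stands in for the LLM call that emits an executable (observation mapping, reward function) pair, and `train_and_score` stands in for an RL training run under the candidate interface; all names and parameters are hypothetical.

```python
import random

def propose_interface(feedback, rng):
    """Stand-in for an LLM emitting a candidate interface as executable
    code; here each 'program' is just a pair of numeric parameters."""
    bias = feedback.get("best_score", 0.0)
    return {"obs_scale": rng.uniform(0.5, 1.5),
            "reward_shift": bias + rng.uniform(-1.0, 1.0)}

def train_and_score(interface):
    """Stand-in for training a policy under the candidate interface and
    returning its evaluation return (higher is better)."""
    return interface["reward_shift"] - abs(interface["obs_scale"] - 1.0)

def evolve(generations=5, pop_size=4, seed=0):
    """LLM-guided evolutionary loop: propose candidates, evaluate them via
    policy training, and feed the best result back into the next round."""
    rng = random.Random(seed)
    feedback, best = {}, None
    for _ in range(generations):
        candidates = [propose_interface(feedback, rng) for _ in range(pop_size)]
        scored = [(train_and_score(c), c) for c in candidates]
        top_score, top = max(scored, key=lambda sc: sc[0])
        if best is None or top_score > best[0]:
            best = (top_score, top)
        feedback = {"best_score": best[0]}  # training feedback guides proposals
    return best

score, interface = evolve()
```

The key property the sketch captures is that the same loop evolves the observation mapping and the reward function jointly, rather than fixing one and searching only over the other.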