EvoNav: LLM-Driven Reward Design for Robot Navigation
EvoNav is an evolutionary framework that uses large language models (LLMs) to automate the design of reward functions for robot navigation. It addresses the sensitivity of reinforcement learning (RL) policy quality to hand-crafted rewards, which require domain expertise and embed hard-to-audit biases. Because fully training a policy for every candidate reward is prohibitively expensive, EvoNav instead evaluates candidate reward proposals with a progressive three-stage warm-up-boost procedure: first analytical proxies and low-cost surrogates (small datasets, analytic rules), then lightweight rollouts, and finally full policy training reserved for the strongest candidates. The framework is detailed in a paper on arXiv (2605.11859).
Key facts
- EvoNav automates reward function design for robot navigation using LLMs.
- Reinforcement Learning policy quality is sensitive to reward specification.
- Hand-crafted rewards require domain expertise and embed biases.
- EvoNav uses a three-stage warm-up-boost evaluation procedure.
- Stages: analytical proxies, lightweight rollouts, full policy training.
- The framework reduces the cost of evaluating candidate reward proposals.
- Paper published on arXiv with ID 2605.11859.
- EvoNav targets robot navigation in dynamic human environments.
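The three-stage evaluation listed above can be sketched as a progressive filter: each stage scores the surviving candidates, keeps only a fraction, and passes them to the next, more expensive stage. The function names, keep fractions, and scoring signatures below are illustrative assumptions, not EvoNav's actual API.

```python
def progressive_evaluate(candidates, proxy_score, rollout_score, train_score,
                         keep_frac=(0.5, 0.3)):
    """Sketch of a three-stage warm-up-boost evaluation funnel.

    Stage 1 (cheap analytical proxies) and stage 2 (lightweight rollouts)
    prune the candidate pool so that stage 3 (full policy training) only
    runs on a handful of finalists. The scoring callables and keep
    fractions are hypothetical placeholders.
    """
    # Stage 1: rank by analytical proxies / low-cost surrogates.
    ranked = sorted(candidates, key=proxy_score, reverse=True)
    survivors = ranked[:max(1, int(len(ranked) * keep_frac[0]))]

    # Stage 2: rank the survivors by lightweight rollout performance.
    ranked = sorted(survivors, key=rollout_score, reverse=True)
    survivors = ranked[:max(1, int(len(ranked) * keep_frac[1]))]

    # Stage 3: full policy training, only for the remaining finalists.
    return max(survivors, key=train_score)
```

With default fractions, a pool of 10 candidates shrinks to 5 after the proxy stage and to 1 finalist before full training, so only a small fraction of proposals ever incurs the full training cost.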
Entities
Institutions
- arXiv