Rule-Based High-Level Coaching for UAVs in Search-and-Rescue Under Limited Simulation

other · 2026-04-30

This paper introduces a decision-making framework for unmanned aerial vehicles (UAVs) in search-and-rescue (SAR) operations. It pairs a high-level advisor built from fixed rules with a low-level, goal-conditioned reinforcement learning (RL) controller that learns online. The framework targets settings with limited simulation training and enforces a strict no-pretraining deployment regime. The high-level advisor is defined offline from a structured task specification and compiled into deterministic rules. These rules give interpretable guidance focused on mission and safety: recommended actions, avoided actions, and regime-dependent arbitration weights. The low-level controller learns from dense, task-defined rewards and uses a mode-aware prioritized replay mechanism augmented with rule-derived metadata. The evaluation covers two tasks: battery-aware multi-goal delivery and mo (truncated). The paper is available on arXiv under ID 2604.26833.
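To make the advisor idea concrete, here is a minimal Python sketch of deterministic rules that return recommended actions, avoided actions, and an arbitration weight per regime, followed by a simple blend with the learner's action values. Everything specific in it is an assumption for illustration: the regime names, the battery and distance thresholds, the action names, the weights, and the linear blending rule are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Advice:
    regime: str                  # hypothetical regime label
    recommended: FrozenSet[str]  # actions the advisor favors
    avoided: FrozenSet[str]      # actions the advisor discourages
    weight: float                # arbitration weight in [0, 1]


def advise(battery: float, dist_to_base: float) -> Advice:
    """Deterministic rules; all thresholds and weights are illustrative."""
    if battery < 0.2:
        return Advice("low_battery", frozenset({"return_to_base"}),
                      frozenset({"explore"}), 0.9)
    if battery < 0.5 and dist_to_base > 50.0:
        return Advice("caution", frozenset({"go_to_nearest_goal"}),
                      frozenset({"explore"}), 0.5)
    return Advice("nominal", frozenset(), frozenset(), 0.1)


def arbitrate(q_values: Dict[str, float], advice: Advice,
              bonus: float = 1.0) -> str:
    """Blend learner scores with advisor preferences and pick the best action.

    This linear blend is an assumed arbitration scheme, not the paper's.
    """
    def score(action: str) -> float:
        if action in advice.recommended:
            pref = bonus
        elif action in advice.avoided:
            pref = -bonus
        else:
            pref = 0.0
        return (1.0 - advice.weight) * q_values[action] + advice.weight * pref

    return max(q_values, key=score)


q = {"explore": 1.0, "return_to_base": 0.2, "go_to_nearest_goal": 0.5}
print(arbitrate(q, advise(battery=0.1, dist_to_base=80.0)))
print(arbitrate(q, advise(battery=0.9, dist_to_base=10.0)))
```

In this sketch a high arbitration weight lets the rules override the learner in a safety-critical regime, while a low weight leaves the decision mostly to the learned values.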

Key facts

The framework combines a rule-based high-level advisor with an online, goal-conditioned low-level RL controller.
It is designed for UAV search-and-rescue missions under limited simulation training.
It enforces a strict no-pretraining deployment regime.
The high-level advisor is defined offline from a structured task specification.
The advisor provides interpretable guidance via recommended and avoided actions and arbitration weights.
The low-level controller learns online from dense rewards with mode-aware prioritized replay.
The framework is evaluated on battery-aware multi-goal delivery and mo (truncated).
The paper is published on arXiv with ID 2604.26833.
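As a rough illustration of mode-aware prioritized replay, the Python sketch below stores each transition with a priority and a mode tag supplied by the rules, then samples in proportion to the priority multiplied by a per-mode boost. The class name, the mode labels, the boost values, and the use of absolute TD error as the priority are assumptions made for this example; the summary does not specify how the paper defines any of them.

```python
import random
from typing import Any, Dict, List, Tuple


class ModeAwareReplay:
    """Toy replay buffer: sampling weight = priority * per-mode boost."""

    def __init__(self, capacity: int, mode_boost: Dict[str, float],
                 seed: int = 0) -> None:
        self.capacity = capacity
        self.mode_boost = mode_boost  # hypothetical rule-derived metadata
        self.items: List[Tuple[Any, float, str]] = []
        self.rng = random.Random(seed)

    def add(self, transition: Any, td_error: float, mode: str) -> None:
        # Drop the oldest transition once the buffer is full.
        if len(self.items) >= self.capacity:
            self.items.pop(0)
        # A small constant keeps every stored transition sampleable.
        self.items.append((transition, abs(td_error) + 1e-3, mode))

    def probabilities(self) -> List[float]:
        if not self.items:
            raise ValueError("buffer is empty")
        weights = [p * self.mode_boost.get(m, 1.0) for _, p, m in self.items]
        total = sum(weights)
        return [w / total for w in weights]

    def sample(self, k: int) -> List[Any]:
        picks = self.rng.choices(self.items, weights=self.probabilities(), k=k)
        return [transition for transition, _, _ in picks]


buf = ModeAwareReplay(capacity=2, mode_boost={"low_battery": 3.0})
buf.add("a", td_error=1.0, mode="nominal")
buf.add("b", td_error=1.0, mode="low_battery")
print(buf.probabilities())
```

With equal priorities, the boosted mode is drawn three times as often as the unboosted one, which is one simple way rule-derived metadata could steer what the learner replays.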
