DORA: First Agentic Benchmark for End-to-End Disaster Response
Researchers have introduced DORA (Disaster Operational Response Agent benchmark), the first agentic benchmark for end-to-end disaster response. It comprises 515 expert-authored tasks covering 45 real-world disaster events across 10 disaster types, together with 3,500 expert-verified, replayable tool-call steps. The tasks span five dimensions of disaster response operations: disaster perception, spatial relational analysis, rescue and evacuation planning, temporal evolution reasoning, and multi-modal report synthesis. DORA targets a gap left by prior work, which either isolated remote-sensing perception or evaluated generic tool use and did not cover complete emergency-operations workflows. Agents must integrate multi-sensor signals, reason over road networks, populations, and key facilities, plan evacuations, and produce actionable reports.
Key facts
- DORA is the first agentic benchmark for end-to-end disaster response.
- It includes 515 expert-authored tasks.
- Tasks cover 45 real-world disaster events across 10 types.
- The benchmark provides 3,500 expert-verified, replayable tool-call steps.
- Tasks span five dimensions: disaster perception, spatial relational analysis, rescue and evacuation planning, temporal evolution reasoning, and multi-modal report synthesis.
- Prior work isolated remote-sensing perception or evaluated generic tool use.
- DORA requires integration of multi-sensor signals and reasoning over road networks, populations, and key facilities.
- The benchmark aims to evaluate end-to-end workflows in emergency operations.
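To make "replayable tool-call steps" concrete, here is a minimal Python sketch of how one task and its recorded steps could be represented and replayed. It is an illustration only: the summary does not describe DORA's actual data schema, tool set, or scoring, so the class names (`ToolCallStep`, `Task`), the `replay` function, the two toy tools, and the exact-match scoring rule are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

# The five task dimensions named in the summary.
DIMENSIONS = (
    "disaster perception",
    "spatial relational analysis",
    "rescue and evacuation planning",
    "temporal evolution reasoning",
    "multi-modal report synthesis",
)


@dataclass
class ToolCallStep:
    """One recorded tool call (hypothetical schema, not DORA's)."""
    tool: str               # name of the tool to invoke
    args: dict[str, Any]    # keyword arguments passed to the tool
    expected: Any           # output an expert verified for this call


@dataclass
class Task:
    """One benchmark task with its ordered tool-call steps (hypothetical)."""
    task_id: str
    dimension: str
    steps: list[ToolCallStep] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {self.dimension}")


def replay(task: Task, tools: dict[str, Callable[..., Any]]) -> float:
    """Re-run every recorded step and return the fraction whose output
    exactly matches the recorded expected value (an assumed scoring rule)."""
    if not task.steps:
        return 0.0
    matched = sum(
        1 for step in task.steps
        if tools[step.tool](**step.args) == step.expected
    )
    return matched / len(task.steps)


# Toy stand-ins for real geospatial tools (assumptions for illustration).
def count_blocked_roads(segments: list[dict[str, Any]]) -> int:
    return sum(1 for seg in segments if seg["blocked"])


def total_population(zones: dict[str, int]) -> int:
    return sum(zones.values())


TOOLS = {
    "count_blocked_roads": count_blocked_roads,
    "total_population": total_population,
}

example_task = Task(
    task_id="demo-001",
    dimension="rescue and evacuation planning",
    steps=[
        ToolCallStep(
            tool="count_blocked_roads",
            args={"segments": [{"id": "r1", "blocked": True},
                               {"id": "r2", "blocked": False}]},
            expected=1,
        ),
        ToolCallStep(
            tool="total_population",
            args={"zones": {"north": 800, "south": 400}},
            expected=1200,
        ),
    ],
)

score = replay(example_task, TOOLS)
```

Storing the expected output next to each call is what makes a step replayable in this sketch: the same call can be re-executed later and checked without an expert in the loop. A real benchmark would likely need tolerance-based or structural comparison for geospatial outputs rather than strict equality.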