AsyncTool Benchmark Evaluates LLM Agents on Asynchronous Tool Calling

ai-technology · 2026-05-28

Researchers have introduced AsyncTool, a benchmark designed to evaluate the asynchronous function calling capabilities of large language model (LLM)-based agents in multi-task environments. Existing evaluations typically overlook tool response latency and are limited to single-task settings. AsyncTool addresses this by simulating realistic tool feedback delays while presenting multiple heterogeneous tasks concurrently. The benchmark uses a hybrid data evolution strategy to construct a diverse dataset covering multiple scenarios. The work is detailed in arXiv preprint 2605.27995.
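The effect of tool response latency across concurrent tasks can be illustrated with a small sketch. This is not AsyncTool's harness or data: the tool names, the delays, and the helper functions (`call_tool`, `run_sequential`, `run_concurrent`, `timed`) are all hypothetical, and delays are simulated with `asyncio.sleep`. It only shows why an agent that waits on each tool call in turn spends longer than one that issues pending calls together.

```python
import asyncio
import time

# Hypothetical tool calls as (name, simulated latency in seconds).
CALLS = [("search", 0.2), ("weather", 0.1), ("calendar", 0.15)]

async def call_tool(name: str, delay: float) -> str:
    """Stand-in for a tool whose response arrives after `delay` seconds."""
    await asyncio.sleep(delay)
    return f"{name}:done"

async def run_sequential(calls):
    """Blocking style: wait for each tool response before issuing the next call."""
    return [await call_tool(name, delay) for name, delay in calls]

async def run_concurrent(calls):
    """Asynchronous style: issue all calls, then collect responses in call order."""
    return await asyncio.gather(*(call_tool(name, delay) for name, delay in calls))

def timed(runner, calls):
    """Run one strategy and return (results, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    results = asyncio.run(runner(calls))
    return results, time.perf_counter() - start

if __name__ == "__main__":
    for runner in (run_sequential, run_concurrent):
        results, elapsed = timed(runner, CALLS)
        print(f"{runner.__name__}: {results} in {elapsed:.2f}s")
```

Under these assumed delays, the sequential run takes roughly the sum of the latencies, while the concurrent run takes roughly the longest single latency.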

Key facts

AsyncTool is a benchmark for evaluating asynchronous tool calling in LLM agents.
It addresses the temporal dimension of tool use, specifically response latency.
The benchmark simulates multi-task environments with delayed tool feedback.
A hybrid data evolution strategy was used to create the dataset.
The research is available on arXiv as preprint 2605.27995.
Existing evaluations often ignore tool response latency.
AsyncTool presents multiple heterogeneous tasks simultaneously.
The work focuses on real-world concurrent task execution.
