ARTFEED — Contemporary Art Intelligence

LLM Agents Struggle with Unclear Instructions; New Framework Prompts Clarification

ai-technology · 2026-04-30

A new study posted to arXiv (2409.00557) reveals that large language models (LLMs) equipped with function-calling capabilities struggle when user instructions are imprecise. The researchers analyzed real-world user queries, identified recurring error patterns, and built Noisy ToolBench (NoisyToolBench), a benchmark for evaluating LLM tool use under imperfect instructions. They found that, because of next-token-prediction training, LLMs tend to arbitrarily fill in missing arguments rather than flag them, producing hallucinated tool calls and potentially risky executions. To address this, the team proposed Ask-when-Needed (AwN), a framework that prompts the LLM to ask the user clarifying questions when an instruction is unclear, rather than guessing.
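
The core of AwN is a simple decision: before executing a tool call, check whether every required argument is actually grounded in the user's instruction, and if not, ask rather than guess. The sketch below illustrates that decision with a rule-based stand-in; the paper's actual framework prompts the LLM itself to make this judgment, and the Tool schema and plan_tool_call helper here are hypothetical.

from dataclasses import dataclass

@dataclass
class Tool:
    """Hypothetical tool schema: a name plus its required argument names."""
    name: str
    required: list

def plan_tool_call(tool: Tool, grounded_args: dict) -> dict:
    """Return either a tool call or a clarifying question, AwN-style.

    grounded_args holds only the arguments the agent could extract from the
    user's instruction; anything absent is treated as unknown instead of
    being filled with a guessed value, the hallucination mode the paper
    attributes to next-token-prediction training.
    """
    missing = [arg for arg in tool.required if arg not in grounded_args]
    if missing:
        # Ask-when-Needed: surface the gap to the user instead of inventing values.
        return {
            "action": "ask_user",
            "about": missing,
            "question": f"To call {tool.name} I still need: {', '.join(missing)}.",
        }
    return {"action": "call_tool", "tool": tool.name, "arguments": grounded_args}

if __name__ == "__main__":
    book_flight = Tool(name="book_flight", required=["origin", "destination", "date"])
    # Imprecise instruction: "Book me a flight to Tokyo" leaves origin and date unstated.
    print(plan_tool_call(book_flight, {"destination": "Tokyo"}))

Run against an instruction like "Book me a flight to Tokyo", the check returns a question about the origin and date instead of fabricating values for them.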

Key facts

  • arXiv:2409.00557v4
  • NoisyToolBench benchmark created
  • LLMs arbitrarily generate missing arguments due to next-token prediction (illustrated in the scoring sketch after this list)
  • Ask-when-Needed (AwN) framework proposed
  • Study focuses on LLM tool-use under imperfect instructions
  • Real-world user instructions were examined
  • Error patterns in LLM tool execution analyzed
  • AwN prompts LLMs to ask users for clarification
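
How a benchmark like this scores behavior can be sketched in a few lines: under an under-specified instruction, a turn that asks about a genuinely missing argument counts as a clarification, while a tool call that supplies a value the user never gave counts as a hallucinated argument. The turn format below (action, about, arguments) is an illustrative assumption, not NoisyToolBench's actual schema.

def score_turn(agent_turn: dict, missing_args: set) -> str:
    """Classify one agent turn against the set of genuinely unknown arguments."""
    if agent_turn["action"] == "ask_user":
        # Correct only if the question targets at least one truly missing argument.
        asked = set(agent_turn.get("about", []))
        return "clarified" if asked & missing_args else "irrelevant_question"
    if agent_turn["action"] == "call_tool":
        # Any supplied value for an argument the user never specified is a guess.
        fabricated = missing_args & set(agent_turn.get("arguments", {}))
        return "hallucinated_args" if fabricated else "executed"
    return "other"

if __name__ == "__main__":
    missing = {"origin", "date"}
    print(score_turn({"action": "ask_user", "about": ["origin"]}, missing))
    print(score_turn({"action": "call_tool",
                      "arguments": {"destination": "Tokyo", "origin": "JFK"}}, missing))

The first turn is scored "clarified"; the second, which invents an origin airport, is scored "hallucinated_args".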

Entities

Institutions

  • arXiv
