AgroVG Benchmark for Agricultural Visual Grounding

ai-technology · 2026-05-23

AgroVG is a new multi-source benchmark for evaluating visual grounding in agricultural AI. Visual grounding means localizing the objects that a natural-language expression describes, a capability that is essential for selective weeding, disease monitoring, and targeted harvesting. The benchmark addresses what makes agricultural scenes difficult: targets that are small, repetitive, occluded, or irregularly shaped, and instructions that may refer to one object, several, or none. It formulates agricultural grounding as generalized set prediction: given an image and a referring expression, a model must return all matching targets or indicate that none exist. AgroVG contains 10,071 annotation-grounded images, and the paper is available on arXiv under ID 2605.22034.
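To make the "one, several, or none" formulation concrete, here is a minimal Python sketch of how a single prediction could be scored as a set. It is not AgroVG's evaluation protocol, which this summary does not describe. The axis-aligned box format, the 0.5 IoU threshold, the greedy one-to-one matching, and the function names `iou`, `score_sample`, and `f1` are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixels, axis-aligned.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def score_sample(pred: List[Box], gt: List[Box], thr: float = 0.5) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives).

    Each predicted box is greedily matched to the unmatched ground-truth
    box it overlaps most, provided the IoU reaches `thr` (assumed value).
    """
    unmatched = list(range(len(gt)))
    tp = 0
    for p in pred:
        best, best_iou = None, thr
        for g in unmatched:
            v = iou(p, gt[g])
            if v >= best_iou:
                best, best_iou = g, v
        if best is not None:
            unmatched.remove(best)
            tp += 1
    return tp, len(pred) - tp, len(unmatched)


def f1(tp: int, fp: int, fn: int) -> float:
    """Per-sample F1; an empty prediction for an empty target set scores 1."""
    if tp + fp + fn == 0:
        return 1.0
    return 2 * tp / (2 * tp + fp + fn)


if __name__ == "__main__":
    plant = (0.0, 0.0, 10.0, 10.0)
    weed = (20.0, 20.0, 30.0, 30.0)
    print(f1(*score_sample([plant], [plant, weed])))  # one of two targets found
    print(f1(*score_sample([], [])))                  # no-target case handled correctly
    print(f1(*score_sample([plant], [])))             # hallucinated target
```

Under this scoring, the no-target case is handled by the same function as the single-target and multi-target cases: returning an empty set when nothing matches the expression earns full credit, while any box predicted for an absent target counts as a false positive.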

Key facts

AgroVG is a multi-source benchmark for agricultural visual grounding.
Visual grounding localizes objects described by natural-language expressions.
Applications include selective weeding, disease monitoring, and targeted harvesting.
Agricultural targets are often small, repetitive, occluded, or irregularly shaped.
Instructions may refer to one, many, or no objects in an image.
AgroVG formulates grounding as generalized set prediction.
The benchmark contains 10,071 annotation-grounded images.
The paper is available on arXiv with ID 2605.22034.
