SGR-Bench: New Benchmark for State-Gated Retrieval Tasks
Researchers have introduced SGR-Bench, a benchmark for evaluating search agents on state-gated retrieval (SGR) tasks. In these tasks, an answer becomes reachable only after the agent sets site-specific retrieval conditions such as filters, views, hierarchies, or scopes. SGR-Bench contains 100 expert-curated tasks drawn from six source families across 12 public data ecosystems. Each task requires identifying the right website and configuring its retrieval state to obtain a structured answer. The benchmark pairs constraint-guided and goal-oriented formulations, which allows direct comparison between explicit and implicit guidance. It targets specialized data-retrieval websites, an area that has grown in relevance with recent progress in large language models and tool-using agents.
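To make the idea of a state-gated task concrete, here is a minimal Python sketch of how one such task could be represented. It is not the benchmark's actual schema: the class names, field names, example site, and prompts are all hypothetical, and treating "answer reachable" as an exact match on the retrieval state is a simplifying assumption.

```python
from dataclasses import dataclass, field


@dataclass
class RetrievalState:
    """Site configuration that gates the answer (all field names are hypothetical)."""
    filters: dict = field(default_factory=dict)  # e.g. {"year": "2020"}
    view: str = ""                               # e.g. "table"
    hierarchy: tuple = ()                        # path through a category tree
    scope: str = ""                              # e.g. "national"


@dataclass
class SGRTask:
    """One illustrative task with two phrasings of the same question."""
    site: str                       # website the agent must identify
    required_state: RetrievalState  # configuration that exposes the answer
    constraint_guided: str          # phrasing that spells out the conditions
    goal_oriented: str              # phrasing that states only the goal


def reaches_answer(task: SGRTask, site: str, state: RetrievalState) -> bool:
    """Assumption: the answer is visible only on the right site in the exact state."""
    return site == task.site and state == task.required_state


# Invented example data, for illustration only.
example_task = SGRTask(
    site="stats.example.org",
    required_state=RetrievalState(
        filters={"year": "2020", "region": "EU"},
        view="table",
        hierarchy=("Energy", "Electricity"),
        scope="national",
    ),
    constraint_guided="Set year=2020 and region=EU in the Energy > Electricity table, then read the total.",
    goal_oriented="What was total EU electricity output in 2020?",
)

default_state = RetrievalState()
print(reaches_answer(example_task, example_task.site, default_state))                # False
print(reaches_answer(example_task, example_task.site, example_task.required_state))  # True
```

In this sketch, an agent that finds the right site but leaves it in its default state does not reach the answer, which is the property that distinguishes state-gated retrieval from ordinary page lookup.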
Key facts
- SGR-Bench is a benchmark for state-gated retrieval (SGR).
- It contains 100 expert-curated tasks.
- Tasks span six source families and 12 public data ecosystems.
- Each task requires configuring site-specific retrieval states.
- The benchmark pairs constraint-guided and goal-oriented formulations.
- It evaluates search agents on specialized data-retrieval websites.
- The work is published on arXiv with ID 2605.22219.
- State-gated retrieval involves filters, views, hierarchies, or scopes.
Entities
Institutions
- arXiv