OpenSeeker-v2 Achieves SOTA on BrowseComp with Simple SFT
Researchers have released OpenSeeker-v2, a search agent that reaches state-of-the-art results on four benchmarks using only supervised fine-tuning (SFT) on 10.6k examples. The agent is a 30B-parameter LLM operating in the ReAct paradigm and scores 46.0% on BrowseComp. Its data synthesis pipeline introduces three modifications: scaling up the knowledge graph to enable deeper exploration, expanding the tool set to broaden coverage, and applying strict low-step filtering to discard easy trajectories. This challenges the resource-heavy industry pipeline of pre-training, continual pre-training (CPT), SFT, and reinforcement learning (RL): the findings indicate that sufficiently difficult and informative trajectories can make simple SFT remarkably effective for building state-of-the-art search agents.
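The paper's exact data schema is not reproduced here, but as a minimal sketch of what a ReAct-style SFT example looks like (assuming a thought/action/observation loop; all field names and tool names below are hypothetical):

```python
# Minimal sketch of a ReAct-style SFT trajectory; field and tool
# names are illustrative assumptions, not the paper's actual schema.
import json

trajectory = {
    "question": "Which year did the author of X win prize Y?",
    "steps": [
        {
            "thought": "I need to identify the author of X first.",
            "action": {"tool": "search", "args": {"query": "author of X"}},
            "observation": "X was written by A. Person.",
        },
        {
            "thought": "Now find when A. Person won prize Y.",
            "action": {"tool": "search", "args": {"query": "A. Person prize Y year"}},
            "observation": "A. Person won prize Y in 1998.",
        },
    ],
    "answer": "1998",
}

# For SFT, such a trajectory is typically flattened into a single
# target sequence that the model learns to generate token by token.
print(json.dumps(trajectory, indent=2))
```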
Key facts
- OpenSeeker-v2 achieves state-of-the-art performance on 4 benchmarks.
- Trained on only 10.6k data points using SFT.
- Built on a 30B-parameter LLM using the ReAct paradigm.
- Scores 46.0% on BrowseComp.
- Uses three data synthesis modifications: scaling knowledge graph size, expanding tool set size, and strict low-step filtering (sketched after this list).
- Challenges resource-intensive industry pipeline of pre-training, CPT, SFT, and RL.
- Demonstrates the power of informative, high-difficulty trajectories for SFT.
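Of the three modifications, low-step filtering is the simplest to illustrate. A minimal sketch, assuming each synthesized trajectory records its tool-call steps; the threshold and field names are illustrative, not taken from the paper:

```python
# Minimal sketch of strict low-step filtering: drop trajectories that
# solve the question in too few tool calls, keeping only hard examples.
MIN_STEPS = 5  # hypothetical cutoff; the paper's exact value is not given here


def filter_low_step(trajectories, min_steps=MIN_STEPS):
    """Keep only trajectories with at least `min_steps` tool calls,
    discarding easy questions answerable in a few steps."""
    return [t for t in trajectories if len(t["steps"]) >= min_steps]


# Example: a 2-step trajectory is dropped, a 6-step one is kept.
data = [{"steps": [{}] * 2}, {"steps": [{}] * 6}]
hard_only = filter_low_step(data)
assert len(hard_only) == 1
```

The design intuition is that trajectories requiring many steps force the model to learn multi-hop exploration rather than shallow lookup behavior.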
- Published on arXiv with ID 2605.04036.