ARTFEED — Contemporary Art Intelligence

DR-Venus: 4B Deep Research Agent Trained on 10K Open Data

ai-technology · 2026-04-24

Researchers introduce DR-Venus, a 4B-parameter deep research agent built on a small language model and designed for edge-scale deployment. It is trained entirely on open data and achieves strong performance with only 10K trajectories. The training recipe has two stages: agentic supervised fine-tuning (SFT), with strict data cleaning and resampling of long-horizon trajectories, followed by agentic reinforcement learning (RL) to improve execution reliability. To make RL more effective, the authors build on IGPO and design turn-level rewards based on information gain. The work is motivated by the cost, latency, and privacy advantages of edge-scale agents.
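
The summary does not give the reward formula. As a rough illustration only, the sketch below assumes one plausible reading of an information-gain reward: each turn is scored by how much it raises the policy's probability of producing the ground-truth answer. The function name turn_level_rewards and the input answer_probs are hypothetical and are not taken from DR-Venus or IGPO code.

```python
from typing import List


def turn_level_rewards(answer_probs: List[float]) -> List[float]:
    """Assumed information-gain reward, one value per turn.

    answer_probs[0] is the probability of the ground-truth answer before
    any turn; answer_probs[t] is the same probability after turn t.
    The reward for turn t is the change between consecutive entries.
    """
    if len(answer_probs) < 2:
        return []  # no completed turns, so nothing to score
    return [after - before
            for before, after in zip(answer_probs, answer_probs[1:])]


# Hypothetical three-turn trajectory: the second turn loses some ground.
rewards = turn_level_rewards([0.1, 0.3, 0.25, 0.7])
```

Under this assumption the per-turn rewards sum to the total change in answer probability over the trajectory, so a long trajectory receives a dense signal at every turn rather than a single outcome reward at the end.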

Key facts

  • DR-Venus is a 4B parameter deep research agent.
  • Trained on only 10K open-data trajectories.
  • Designed for edge-scale deployment.
  • Two-stage training: agentic SFT then agentic RL.
  • SFT includes strict data cleaning and resampling.
  • RL improves execution reliability on long-horizon tasks.
  • RL uses IGPO and turn-level rewards based on information gain.
  • Focuses on cost, latency, and privacy benefits.
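
The SFT data step listed above (cleaning, then resampling long-horizon trajectories) could look roughly like the sketch below. Everything specific here is an assumption made for illustration: the Trajectory fields, the cleaning criteria, the min_turns threshold, and the boost weight are hypothetical, and the actual DR-Venus filters are not described in this summary.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Trajectory:
    turns: int            # number of agent turns (hypothetical field)
    answer_correct: bool  # final answer matched the reference (hypothetical)
    well_formed: bool     # every tool call parsed cleanly (hypothetical)


def clean(trajs: List[Trajectory]) -> List[Trajectory]:
    """Keep only trajectories that pass both assumed quality checks."""
    return [t for t in trajs if t.answer_correct and t.well_formed]


def resample_long_horizon(trajs: List[Trajectory], n: int,
                          min_turns: int = 10, boost: float = 3.0,
                          seed: int = 0) -> List[Trajectory]:
    """Draw n trajectories with replacement, upweighting long ones."""
    if not trajs:
        return []
    rng = random.Random(seed)  # seeded for reproducible sampling
    weights = [boost if t.turns >= min_turns else 1.0 for t in trajs]
    return rng.choices(trajs, weights=weights, k=n)


raw = [
    Trajectory(turns=4, answer_correct=True, well_formed=True),
    Trajectory(turns=25, answer_correct=True, well_formed=True),
    Trajectory(turns=30, answer_correct=False, well_formed=True),
    Trajectory(turns=12, answer_correct=True, well_formed=False),
]
cleaned = clean(raw)
sft_set = resample_long_horizon(cleaned, n=8)
```

Sampling with replacement keeps the dataset size fixed while shifting its mix toward longer trajectories. Other schemes, such as stratified sampling by turn count, would serve the same stated goal.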
