Curated AI platform outperforms frontier LLMs in pharma asset discovery
A new study, available on arXiv (2605.04908), assesses Gosset, an AI platform with a chat interface backed by curated drug-asset annotations. The study compared Gosset with four frontier large language model (LLM) systems with web access: Claude Opus 4.7, GPT 5.5, Gemini 3.1 Pro, and Perplexity sonar-pro, on ten niche oncology and immunology targets. Gosset returned 3.2 times more verified drugs per query than the best frontier system, while achieving perfect precision and 100% recall against the cross-system union of verified drugs. The curated index is also exposed as a Gosset MCP server that any frontier model can call as a tool, which could help frontier models close this gap in drug-asset discovery.
Key facts
- Gosset is an AI platform with a chat interface backed by curated target-, modality-, and indication-level drug-asset annotations.
- Four frontier systems with web access were benchmarked: Claude Opus 4.7, GPT 5.5, Gemini 3.1 Pro, Perplexity sonar-pro.
- The benchmark focused on ten niche oncology/immunology targets where most of the pipeline lies in the long tail of preclinical and Asian-developed assets.
- All five systems received the same natural-language query and the same JSON output schema.
- Gosset returned 3.2x more verified drugs per query than the best frontier system.
- Gosset achieved perfect precision and 100% recall against the cross-system union of verified drugs.
- The curated index is exposed as a Gosset MCP server that any frontier model can call as a tool.
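To make the precision and recall figures above concrete, here is a minimal sketch of how scores against a cross-system union of verified drugs could be computed. It is an illustration under stated assumptions, not the study's evaluation code: the function name `score_against_union`, the system labels, the drug names, and the exact metric definitions (precision as verified hits over everything a system returned, recall as verified hits over the union of verified drugs returned by any system) are all hypothetical.

```python
from typing import Dict, Set, Tuple


def score_against_union(
    returned: Dict[str, Set[str]],
    verified: Set[str],
) -> Dict[str, Tuple[float, float]]:
    """Return (precision, recall) for each system.

    Assumption: the reference set is the union, across all systems,
    of the returned drugs that passed verification.
    """
    # Build the cross-system union of verified drugs.
    reference: Set[str] = set()
    for drugs in returned.values():
        reference |= drugs & verified

    scores: Dict[str, Tuple[float, float]] = {}
    for system, drugs in returned.items():
        hits = drugs & reference
        # Precision: share of a system's answers that were verified.
        precision = len(hits) / len(drugs) if drugs else 0.0
        # Recall: share of the verified union that the system found.
        recall = len(hits) / len(reference) if reference else 0.0
        scores[system] = (precision, recall)
    return scores


# Hypothetical data for a single target; none of it comes from the study.
returned = {
    "curated": {"drug_a", "drug_b", "drug_c", "drug_d"},
    "llm_1": {"drug_a", "drug_x"},
    "llm_2": {"drug_b", "drug_c", "drug_y"},
}
verified = {"drug_a", "drug_b", "drug_c", "drug_d"}

scores = score_against_union(returned, verified)
for system, (precision, recall) in scores.items():
    print(f"{system}: precision={precision:.2f} recall={recall:.2f}")
```

Under this definition, a system reaches 100% recall only if it returns every verified drug that any of the compared systems found, so the score is relative to the pooled results and not to an external ground truth.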
Entities
Institutions
- arXiv
- Gosset
- Claude Opus 4.7
- GPT 5.5
- Gemini 3.1 Pro
- Perplexity sonar-pro