AI Scientists Face Fundamental Hurdles for Autonomous Discovery
A new position paper on arXiv (2605.08956) argues that current agentic AI scientists, although they can serve as co-scientists, are not built for autonomous scientific discovery. The authors identify four key challenges: problem selection suffers from the McNamara fallacy (optimizing for what is easy to measure while ignoring what is not); large language models (LLMs) lack the tacit procedural and failure knowledge that comes from laboratory practice; preference optimization during post-training compresses output diversity toward consensus; and scientific benchmarks evaluate single-turn prediction without feedback from physical experiments. Addressing these issues, the authors argue, requires revisiting fundamental design choices, not just more scaling or scaffolding.
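The diversity-compression claim lends itself to a quick illustration. The sketch below is not from the paper; it shows one common way the effect could be quantified, using a distinct-n ratio over repeated samples for the same prompt. The sample strings and the choice of metric are hypothetical placeholders; in practice the samples would come from repeated generation by a base model and a preference-tuned model at a fixed temperature.

```python
# Illustrative only: a distinct-n diversity metric applied to hypothetical
# model samples. Higher values mean more distinct n-grams across samples,
# i.e. more diverse outputs; preference-tuned models are claimed to score lower.

def distinct_n(samples: list[str], n: int = 2) -> float:
    """Fraction of unique n-grams across all samples (1.0 = no repetition)."""
    ngrams = []
    for text in samples:
        tokens = text.split()
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

# Hypothetical completions for the same prompt ("propose a next experiment").
base_model_samples = [
    "test the alloy at cryogenic temperatures",
    "vary the dopant concentration and remeasure conductivity",
    "replicate the anomaly on a different substrate",
]
tuned_model_samples = [
    "run a systematic literature review first",
    "run a systematic literature review before any experiments",
    "start with a systematic literature review",
]

print(f"base model distinct-2:  {distinct_n(base_model_samples):.2f}")
print(f"tuned model distinct-2: {distinct_n(tuned_model_samples):.2f}")
```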
Key facts
- Paper argues agentic AI scientists are not built for autonomous discovery
- Four challenges identified: McNamara fallacy, missing tacit knowledge, diversity compression, lack of experimental feedback
- LLMs lack tacit procedural and failure knowledge from lab practice
- Post-training preference optimization compresses output diversity
- Benchmarks test single-turn prediction without feedback from physical experiments (see the sketch after this list)
- Challenges require revisiting design choices, not just scaling
- Paper is a position paper posted on arXiv
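The benchmarking point above can be made concrete with a structural contrast. The following is a minimal sketch, not the paper's benchmark or any existing harness: it contrasts single-turn evaluation (one prediction, one score) with a closed-loop protocol in which each proposal receives feedback from an experiment. Here `propose` and `run_experiment` are hypothetical toy stand-ins for an agent and a laboratory measurement.

```python
# Illustrative only: single-turn scoring vs. a closed feedback loop.
# `propose` and `run_experiment` are toy stand-ins, not real components.
import random

def propose(history: list[tuple[float, float]]) -> float:
    """Toy 'agent': guess a parameter, nudging toward the best result so far."""
    if not history:
        return random.uniform(0.0, 1.0)
    best_guess, _ = min(history, key=lambda h: h[1])
    return best_guess + random.uniform(-0.1, 0.1)

def run_experiment(guess: float, optimum: float = 0.42) -> float:
    """Toy 'experiment': noisy measurement of distance from a hidden optimum."""
    return abs(guess - optimum) + random.gauss(0.0, 0.01)

# Single-turn evaluation: one prediction scored once, no feedback.
single_turn_error = run_experiment(propose([]))

# Closed-loop evaluation: each measurement informs the next proposal.
history: list[tuple[float, float]] = []
for _ in range(10):
    guess = propose(history)
    history.append((guess, run_experiment(guess)))

print(f"single-turn error:      {single_turn_error:.3f}")
print(f"closed-loop best error: {min(err for _, err in history):.3f}")
```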