OpenAI's o1 model outperforms ER doctors in diagnostic accuracy, Harvard study finds

ai-technology · 2026-05-03

A study published in Science by researchers at Harvard Medical School and Beth Israel Deaconess Medical Center found that OpenAI's o1 model produced more accurate diagnoses than human physicians in emergency room cases. The researchers compared diagnoses from two attending physicians with those from OpenAI's o1 and 4o models across 76 patients. The o1 model gave the exact or a very close diagnosis in 67% of triage cases, versus 55% and 50% for the two physicians. The study emphasized that the AI models were given the same raw data from electronic medical records, without preprocessing. Arjun Manrai, one of the lead authors, said the AI "eclipsed both prior models and our physician baselines." However, the researchers cautioned that AI is not ready to make real-world clinical decisions and called for prospective trials. Adam Rodman, also a lead author, noted that there is no formal accountability framework for AI diagnoses. The study tested only text-based inputs, and the researchers acknowledged that the models may be limited with nontext data.

Key facts

Study published in Science by Harvard Medical School and Beth Israel Deaconess Medical Center.
OpenAI's o1 model outperformed two attending physicians in diagnostic accuracy.
o1 achieved exact or close diagnosis in 67% of triage cases vs. 55% and 50% for physicians.
AI models were given the same raw data from electronic medical records without preprocessing.
Arjun Manrai, one of the lead authors, said the AI "eclipsed both prior models and our physician baselines."
Researchers emphasized need for prospective trials before clinical use.
Adam Rodman, also a lead author, noted the lack of a formal accountability framework for AI diagnoses.
Study tested only text-based inputs; the AI models may be limited with nontext data.

Entities

Institutions

Harvard Medical School
Beth Israel Deaconess Medical Center
OpenAI
Science
Guardian

Locations

Beth Israel
Boston
United States

Sources

TechCrunch AI — 2026-05-03