15 AI Chatbots Evaluated for Psychiatric Triage Accuracy

ai-technology · 2026-04-30

A recent study published on arXiv (2604.25415) evaluated 15 frontier AI chatbots on psychiatric triage using 112 clinical vignettes. Each vignette presented a realistic single-message disclosure and carried one of four triage labels: A (routine), B (assessment within 1 week), C (assessment within 24–48 hours), or D (emergency care now). The vignettes spanned 9 clusters of psychiatric presentations and 9 focal risk dimensions, combined into 28 presentation-by-risk groups of 4 distinct vignettes each (28 × 4 = 112). The chatbots were asked to assign the appropriate triage label to each vignette. The research underscores the difficulty of AI-driven psychiatric triage, where urgency must be inferred from subjective indicators rather than concrete evidence.

Key facts

Study evaluated 15 frontier AI chatbots
Used 112 clinical vignettes
Four triage labels: A (routine), B (within 1 week), C (24-48 hours), D (emergency now)
Vignettes covered 9 psychiatric presentation clusters
Vignettes covered 9 focal risk dimensions
28 presentation-by-risk groups
Each group had 4 distinct vignettes
Chatbots tasked with assigning triage label
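
To make the labeling task concrete, here is a minimal Python sketch of how predicted labels could be scored against reference labels. This summary does not describe the study's actual metrics, so everything here is an illustrative assumption: the function name `score_triage`, the ordinal ranking of A through D by urgency, the under-triage and over-triage rates, and the toy data.

```python
from collections import Counter

# The four triage labels from the article, ranked by increasing urgency.
# Treating them as an ordinal scale is an assumption made for this sketch.
TRIAGE_LEVELS = {"A": 0, "B": 1, "C": 2, "D": 3}


def score_triage(gold, predicted):
    """Compare predicted labels with reference labels, one pair per vignette.

    Hypothetical helper: the study's real scoring method is not given here.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("need at least one vignette")

    counts = Counter()
    for g, p in zip(gold, predicted):
        diff = TRIAGE_LEVELS[p] - TRIAGE_LEVELS[g]
        if diff == 0:
            counts["exact"] += 1   # predicted label matches the reference
        elif diff < 0:
            counts["under"] += 1   # predicted less urgent than the reference
        else:
            counts["over"] += 1    # predicted more urgent than the reference

    n = len(gold)
    return {
        "n": n,
        "accuracy": counts["exact"] / n,
        "under_triage_rate": counts["under"] / n,
        "over_triage_rate": counts["over"] / n,
    }


# Toy data for illustration only, not taken from the study.
gold = ["A", "B", "C", "D"]
predicted = ["A", "C", "B", "D"]
result = score_triage(gold, predicted)
print(result)
```

Separating under-triage from over-triage is a design choice for this sketch: in a triage setting, rating a case as less urgent than it is usually carries a different cost than rating it as more urgent, so a single accuracy number can hide the direction of the errors.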
