AI System Outperforms Standard LLMs in Clinical Trial Protocol Extraction

ai-technology · 2026-04-20

Researchers have developed an AI system that combines generative large language models with Retrieval-Augmented Generation (RAG) to automate information extraction from clinical trial protocols. Evaluated against publicly available standalone LLMs, the clinical-trial-specific RAG process reached 89.0% extraction accuracy, compared with 62.6% for standalone LLMs using fine-tuned prompts, a gap of 26.4 percentage points. The work responds to the increasing complexity of clinical trial protocols and their amendments, which place a significant burden on trial teams and create knowledge management challenges in trial workflows. The study also assessed operational impact on simulated Clinical Research Coordinator workflows and found that AI-assisted tasks improved efficiency. Structuring protocol content into standard formats has the potential to improve documentation quality and strengthen compliance. The research was published on arXiv as arXiv:2602.00052v2.

Key facts

AI system uses generative LLMs with Retrieval-Augmented Generation (RAG)
Extraction accuracy of 89.0% for clinical-trial-specific RAG process
Standalone LLMs achieved 62.6% accuracy with fine-tuned prompts
Addresses increasing clinical trial protocol complexity and amendments
Assessed operational impact on simulated Clinical Research Coordinator workflows
Research published on arXiv with identifier arXiv:2602.00052v2
Structuring protocol content into standard formats improves efficiency
AI-assisted tasks show potential to improve documentation quality and compliance
