Self-Prompting Small Language Models for Clinical Entity Extraction
A novel framework enables small language models to autonomously generate, verify, refine, and evaluate prompts for clinical named entity recognition from dental progress notes, addressing the challenges posed by unstructured, domain-specific, and privacy-sensitive clinical documentation. Researchers evaluated candidate open-weight models on 1,200 annotated notes using multi-prompt ensemble inference, then adapted selected models with QLoRA-based supervised fine-tuning and direct preference optimization (DPO). Qwen2.5-14B-Instruct achieved the strongest baseline performance. After DPO, Qwen2.5-14B-Instruct and Llama-3.1-8B-Instruct reached micro/macro F1 scores of 0.864/0.837 and 0.806/0.797, respectively. The findings reveal significant performance gaps among models, underscoring the need for task-specific evaluation rather than reliance on generic benchmarks.
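The summary mentions multi-prompt ensemble inference, where the same note is processed under several self-generated prompts and the predictions are merged. The paper's exact aggregation rule is not described here; a minimal sketch, assuming a simple majority vote over (span, type) predictions, could look like this (entity names and the `min_votes` parameter are illustrative, not from the paper):

```python
from collections import Counter


def ensemble_entities(predictions, min_votes=2):
    """Merge entity predictions produced under different prompts.

    predictions: list of sets of (span_text, entity_type) tuples,
    one set per prompt. Keeps entities predicted by at least
    `min_votes` prompts (hypothetical aggregation rule).
    """
    votes = Counter(e for pred in predictions for e in set(pred))
    return {e for e, n in votes.items() if n >= min_votes}


# Three prompts applied to the same (fictional) dental note:
preds = [
    {("amoxicillin", "MEDICATION"), ("extraction", "PROCEDURE")},
    {("amoxicillin", "MEDICATION")},
    {("amoxicillin", "MEDICATION"), ("crown", "PROCEDURE")},
]
merged = ensemble_entities(preds, min_votes=2)
# Only ("amoxicillin", "MEDICATION") clears the vote threshold
```

Voting across prompts trades recall for precision: spans hallucinated under a single prompt are filtered out, at the cost of dropping entities only one prompt catches.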
Key facts
- Framework enables small language models to self-generate, verify, refine, and evaluate entity-specific prompts
- Applied to clinical named entity recognition from dental progress notes
- 1,200 annotated notes used for evaluation
- Candidate open-weight models evaluated via multi-prompt ensemble inference
- Selected models adapted using QLoRA-based supervised fine-tuning and direct preference optimization
- Qwen2.5-14B-Instruct achieved strongest baseline performance
- After DPO, Qwen2.5-14B-Instruct micro/macro F1: 0.864/0.837
- After DPO, Llama-3.1-8B-Instruct micro/macro F1: 0.806/0.797
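The results above report both micro and macro F1, which weight entity types differently: micro F1 pools true/false positives and false negatives across all types, while macro F1 averages per-type F1 scores, so rare entity types count equally. A self-contained sketch with illustrative counts (not from the paper) shows how the two can diverge:

```python
def f1(tp, fp, fn):
    """F1 from true-positive, false-positive, and false-negative counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0


def micro_macro_f1(counts):
    """counts: {entity_type: (tp, fp, fn)} -> (micro F1, macro F1)."""
    macro = sum(f1(*c) for c in counts.values()) / len(counts)
    tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))
    return f1(tp, fp, fn), macro


# Hypothetical counts: a frequent type scored well, a rare type poorly.
counts = {
    "PROCEDURE": (90, 10, 10),   # per-type F1 = 0.90
    "MEDICATION": (2, 2, 6),     # per-type F1 = 0.33
}
micro, macro = micro_macro_f1(counts)
# micro ≈ 0.868, macro ≈ 0.617
```

Because the frequent type dominates the pooled counts, micro F1 exceeds macro F1 here; reporting both, as the paper does, exposes weakness on rare entity types that micro F1 alone would hide.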
Entities
Institutions
- arXiv