Self-Prompting Small Language Models for Clinical Entity Extraction
A novel framework enables small language models to autonomously generate, verify, refine, and evaluate prompts for clinical named entity recognition from dental progress notes, addressing the challenges posed by unstructured, domain-specific, and privacy-sensitive clinical documentation. Researchers evaluated candidate open-weight models on 1,200 annotated notes using multi-prompt ensemble inference, then adapted selected models with QLoRA-based supervised fine-tuning and direct preference optimization (DPO). Qwen2.5-14B-Instruct achieved the strongest baseline performance. After DPO, Qwen2.5-14B-Instruct and Llama-3.1-8B-Instruct reached micro/macro F1 scores of 0.864/0.837 and 0.806/0.797, respectively. The findings reveal significant performance gaps among models, underscoring the need for task-specific evaluation rather than reliance on generic benchmarks.
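The summary mentions multi-prompt ensemble inference, where the same note is processed under several self-generated prompts and the predictions are merged. The paper's exact aggregation rule is not described here; a minimal sketch, assuming a simple majority vote over (span, type) predictions, could look like this (entity names and the `min_votes` parameter are illustrative, not from the paper):

```python
from collections import Counter


def ensemble_entities(predictions, min_votes=2):
    """Merge entity predictions produced under different prompts.

    predictions: list of sets of (span_text, entity_type) tuples,
    one set per prompt. Keeps entities predicted by at least
    `min_votes` prompts (hypothetical aggregation rule).
    """
    votes = Counter(e for pred in predictions for e in set(pred))
    return {e for e, n in votes.items() if n >= min_votes}


# Three prompts applied to the same (fictional) dental note:
preds = [
    {("amoxicillin", "MEDICATION"), ("extraction", "PROCEDURE")},
    {("amoxicillin", "MEDICATION")},
    {("amoxicillin", "MEDICATION"), ("crown", "PROCEDURE")},
]
merged = ensemble_entities(preds, min_votes=2)
# Only ("amoxicillin", "MEDICATION") clears the vote threshold
```

Voting across prompts trades recall for precision: spans hallucinated under a single prompt are filtered out, at the cost of dropping entities only one prompt catches.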
Key facts
- Framework enables small language models to self-generate, verify, refine, and evaluate entity-specific prompts
- Applied to clinical named entity recognition from dental progress notes
- 1,200 annotated notes used for evaluation
- Candidate open-weight models evaluated via multi-prompt ensemble inference
- Selected models adapted using QLoRA-based supervised fine-tuning and direct preference optimization
- Qwen2.5-14B-Instruct achieved strongest baseline performance
- After DPO, Qwen2.5-14B-Instruct micro/macro F1: 0.864/0.837
- After DPO, Llama-3.1-8B-Instruct micro/macro F1: 0.806/0.797
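The results above report both micro and macro F1, which weight entity types differently: micro F1 pools true/false positives and false negatives across all types, while macro F1 averages per-type F1 scores, so rare entity types count equally. A self-contained sketch with illustrative counts (not from the paper) shows how the two can diverge:

```python
def f1(tp, fp, fn):
    """F1 from true-positive, false-positive, and false-negative counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0


def micro_macro_f1(counts):
    """counts: {entity_type: (tp, fp, fn)} -> (micro F1, macro F1)."""
    macro = sum(f1(*c) for c in counts.values()) / len(counts)
    tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))
    return f1(tp, fp, fn), macro


# Hypothetical counts: a frequent type scored well, a rare type poorly.
counts = {
    "PROCEDURE": (90, 10, 10),   # per-type F1 = 0.90
    "MEDICATION": (2, 2, 6),     # per-type F1 = 0.33
}
micro, macro = micro_macro_f1(counts)
# micro ≈ 0.868, macro ≈ 0.617
```

Because the frequent type dominates the pooled counts, micro F1 exceeds macro F1 here; reporting both, as the paper does, exposes weakness on rare entity types that micro F1 alone would hide.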
Entities
Institutions
- arXiv