Hybrid Neural-Symbolic Pipeline Extracts Clinical Follow-Up Instructions
A hybrid neural-symbolic pipeline extracts (action, date) pairs from outpatient notes. The system uses BioBERT for entity recognition, a biaffine linker for relation extraction, a 28-action ontology for canonicalization, and deterministic time normalization. On a synthetic corpus of 2,000 notes with action-disjoint splits, it reached a Test-Time Pair F1 of 0.997 on seen actions and 0.986 on out-of-vocabulary (OOV) actions, with a mean absolute error (MAE) of 0.00 days, outperforming zero-shot GPT-4o-mini and LoRA-fine-tuned LLaMA-3 8B. The work addresses a limitation of generative extractors: their difficulty linking actions to dates.
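To illustrate what deterministic time normalization can look like, here is a minimal sketch that resolves a relative expression such as "in 2 weeks" against the note date. The function name `normalize_relative`, the regular expression, and the restriction to days and weeks are assumptions made for illustration; the source does not describe the actual rules used by the pipeline.

```python
import re
from datetime import date, timedelta
from typing import Optional

# Hypothetical unit table. Months and years need calendar-aware
# arithmetic and are deliberately left out of this sketch.
_UNIT_DAYS = {"day": 1, "week": 7}

# Matches phrases like "in 2 weeks" or "in 10 days" (assumed pattern).
_PATTERN = re.compile(r"\bin\s+(\d+)\s+(day|week)s?\b", re.IGNORECASE)


def normalize_relative(expression: str, anchor: date) -> Optional[date]:
    """Resolve a relative time expression against the note date.

    Returns None when the expression is not covered by the rule,
    so the caller can fall back to another rule or flag the span.
    """
    match = _PATTERN.search(expression)
    if match is None:
        return None
    count = int(match.group(1))
    unit = match.group(2).lower()
    return anchor + timedelta(days=count * _UNIT_DAYS[unit])


note_date = date(2024, 3, 1)
resolved = normalize_relative("repeat CBC in 2 weeks", note_date)
```

Because the rule is a pure function of the expression and the anchor date, the same input always yields the same output, which is the property that makes a 0.00-day error achievable when the rule covers the expression.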
Key facts
- Pipeline combines BioBERT, biaffine linker, ontology, and deterministic normalization.
- Evaluated on 2,000-note synthetic outpatient corpus with action-disjoint splits.
- Test-Time Pair F1: 0.997 (seen) and 0.986 (OOV).
- Mean absolute error (MAE) of 0.00 days on normalized dates.
- Outperforms zero-shot GPT-4o-mini and LoRA-fine-tuned LLaMA-3 8B.
- Defines the TestSpecification and TimeSpecification entity types and the ScheduledFor relation.
- 28-action ontology for canonicalization.
- Addresses generative models' failure to link actions with dates.
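The two headline metrics can be sketched as follows. This is an illustrative reading, not the paper's evaluation code: it assumes a predicted pair counts as correct only when both the canonical action and the normalized date match exactly, and that MAE is computed over actions present in both gold and prediction. The names `pair_f1` and `date_mae_days` are hypothetical.

```python
from datetime import date
from typing import Dict, Set, Tuple

Pair = Tuple[str, date]  # (canonical action, normalized date)


def pair_f1(gold: Set[Pair], pred: Set[Pair]) -> float:
    """F1 over exact-match (action, date) pairs (assumed criterion)."""
    if not gold and not pred:
        return 1.0
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def date_mae_days(gold: Dict[str, date], pred: Dict[str, date]) -> float:
    """Mean absolute date error in days over actions found in both maps."""
    shared = gold.keys() & pred.keys()
    if not shared:
        return float("nan")
    total = sum(abs((pred[action] - gold[action]).days) for action in shared)
    return total / len(shared)


gold_dates = {"cbc": date(2024, 3, 15), "mri": date(2024, 4, 1)}
pred_dates = {"cbc": date(2024, 3, 15), "mri": date(2024, 4, 2)}

f1 = pair_f1(set(gold_dates.items()), set(pred_dates.items()))
mae = date_mae_days(gold_dates, pred_dates)
```

In this toy example one of two pairs matches exactly, so precision and recall are both 0.5, and the single one-day miss averages to 0.5 days across the two shared actions.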
Entities
Models
- BioBERT
- GPT-4o-mini
- LLaMA-3 8B