Automated ICD Classification of Psychiatric Diagnoses Using NLP and LLMs
This study aims to automate the coding of psychiatric diagnoses by mapping free-text descriptions to International Classification of Diseases (ICD) codes with Natural Language Processing (NLP) and Machine Learning (ML). The researchers worked with a dataset of 145,513 psychiatric descriptions written in Spanish. They compared text representation methods ranging from frequency-based models such as Bag of Words (BoW) and TF-IDF to Large Language Models (LLMs) such as e5_large, BioLORD, and Llama-3-8B. Transformer-based embeddings outperformed the traditional methods, capturing the nuances of medical language more effectively. After extensive fine-tuning, the e5_large model achieved the highest F1_micro score, 0.866.
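To make the frequency-based end of that comparison concrete, here is a minimal sketch of a TF-IDF baseline in plain Python. It is not the study's pipeline: the five Spanish descriptions are invented for illustration, the nearest-neighbour rule stands in for whatever classifier the authors trained, and the function names (`fit_idf`, `tfidf`, `predict`) are hypothetical.

```python
import math
from collections import Counter

# Toy examples invented for illustration; they are NOT from the study's dataset.
# Labels are ICD-10 categories: F32 depressive episode, F41 other anxiety
# disorders, F20 schizophrenia.
TRAIN = [
    ("episodio depresivo moderado", "F32"),
    ("episodio depresivo grave sin sintomas psicoticos", "F32"),
    ("trastorno de ansiedad generalizada", "F41"),
    ("trastorno de panico con ansiedad", "F41"),
    ("esquizofrenia paranoide", "F20"),
]

def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems do more)."""
    return text.lower().split()

def fit_idf(docs):
    """Smoothed inverse document frequency for every token seen in docs."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}

def tfidf(text, idf):
    """Sparse TF-IDF vector as a dict; tokens unseen in training are dropped."""
    counts = Counter(tok for tok in tokenize(text) if tok in idf)
    return {tok: count * idf[tok] for tok, count in counts.items()}

def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(tok, 0.0) for tok, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

IDF = fit_idf([text for text, _ in TRAIN])
TRAIN_VECS = [(tfidf(text, IDF), code) for text, code in TRAIN]

def predict(text):
    """Return the ICD code of the most similar training description."""
    query = tfidf(text, IDF)
    _, code = max(TRAIN_VECS, key=lambda item: cosine(query, item[0]))
    return code

print(predict("episodio depresivo leve"))  # F32
```

Because this representation only matches exact tokens, a description that paraphrases a diagnosis with unseen vocabulary yields an empty vector and an arbitrary prediction. That limitation is the kind of gap the transformer-based embeddings in the study are meant to close.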
Key facts
- Dataset of 145,513 Spanish psychiatric descriptions used
- Models evaluated: BoW, TF-IDF, e5_large, BioLORD, Llama-3-8B
- Transformer-based embeddings outperformed traditional methods
- e5_large achieved highest F1_micro score of 0.866
- Study addresses administrative burden in coding clinical diagnoses
- Focus on mapping free-text to ICD codes
- Research demonstrates potential of LLMs for automating psychiatric diagnosis coding
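The F1_micro score cited above pools true positives, false positives, and false negatives across all ICD codes before computing F1, so frequent codes weigh more than rare ones. The sketch below shows that calculation under stated assumptions: the summary does not say whether each description carries one code or several, so the hypothetical `f1_micro` function accepts a set of codes per description, and the labels are made up for the example.

```python
def f1_micro(y_true, y_pred):
    """Micro-averaged F1 over per-sample sets of labels.

    Counts are pooled over every sample and label:
    F1_micro = 2*TP / (2*TP + FP + FN).
    """
    tp = fp = fn = 0
    for true_codes, pred_codes in zip(y_true, y_pred):
        true_codes, pred_codes = set(true_codes), set(pred_codes)
        tp += len(true_codes & pred_codes)   # codes correctly predicted
        fp += len(pred_codes - true_codes)   # predicted but not annotated
        fn += len(true_codes - pred_codes)   # annotated but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Illustrative labels only, not results from the study.
y_true = [{"F32"}, {"F41"}, {"F20"}, {"F32", "F41"}]
y_pred = [{"F32"}, {"F32"}, {"F20"}, {"F32"}]

# TP = 3, FP = 1, FN = 2, so F1_micro = 6 / 9.
score = f1_micro(y_true, y_pred)
print(round(score, 3))  # 0.667
```

When every description has exactly one true code and one predicted code, each error counts as one false positive and one false negative, and micro-F1 reduces to plain accuracy.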