Predicting relations between SOAP note sections: The value of incorporating a clinical information model

2023-04-14 · Source: Explosion · Developer tools and consulting for AI, Machine Learning and NLP - Explosion.ai · Field: Health & Wellbeing — Clinical Care & Medical Practice, Medical Devices & Health Technology, Health & Medical Research · Depth: Intermediate, quick

Summary

To support human annotation for predicting relations between SOAP note sections, 100 "Assessment and Plan" subsections were first annotated by hand in Prodigy. Those annotations were then used to fine-tune a spacy-transformers RoBERTa-base model, pretrained on OntoNotes 5, for named entity recognition (NER) on both the "Assessment" and "Plan" sections. The aim was to make subsequent human annotation faster and more accurate by supplying automated entity tags.

Key takeaway

For NLP engineers who need to annotate clinical notes more efficiently, consider an iterative approach that combines manual annotation with model fine-tuning. Start by manually annotating a small, targeted dataset, such as 100 SOAP note subsections, in a tool such as Prodigy. Then fine-tune a general-domain model such as RoBERTa-base on that clinical data to automate NER for sections like "Assessment" and "Plan", which can reduce the manual effort needed for later annotation.

Key insights

Fine-tuning a RoBERTa-base model on 100 manually annotated "Assessment and Plan" subsections adapts general-domain NER to SOAP note sections.

Principles

Manual annotation seeds model training.
Domain-specific fine-tuning enhances NER.
Pretrained models adapt to clinical text.

Method

Manually annotate 100 "Assessment and Plan" subsections with Prodigy. Fine-tune a spacy-transformers RoBERTa-base model (pretrained on OntoNotes 5) for NER tagging on "Assessment" and "Plan" sections.
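
The hand-off from annotation to fine-tuning can be sketched as follows. This is a minimal illustration, not the article's pipeline: it assumes the annotations are exported as JSONL records with "text", "spans" (character offsets plus a label) and "answer" fields, the shape Prodigy uses for NER tasks. The sample text, the helper name load_accepted_examples, and the labels PROBLEM and ACTION are hypothetical; the source does not list its label set.

```python
import json

# Hypothetical export: one JSON object per line, spans given as character offsets.
# PROBLEM and ACTION are invented labels used only for illustration.
SAMPLE_JSONL = """\
{"text": "Hypertension, poorly controlled. Increase lisinopril to 20 mg.", "spans": [{"start": 0, "end": 12, "label": "PROBLEM"}, {"start": 33, "end": 52, "label": "ACTION"}], "answer": "accept"}
{"text": "Follow up in 2 weeks.", "spans": [], "answer": "reject"}
"""


def load_accepted_examples(jsonl_text):
    """Return (text, [(start, end, label), ...]) pairs for accepted records only."""
    examples = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Skip anything the annotator rejected or ignored.
        if record.get("answer") != "accept":
            continue
        text = record["text"]
        entities = []
        for span in record.get("spans", []):
            start, end, label = span["start"], span["end"], span["label"]
            # Guard against offsets that fall outside the text.
            if not (0 <= start < end <= len(text)):
                raise ValueError(f"bad span offsets {start}-{end} for text {text!r}")
            entities.append((start, end, label))
        examples.append((text, sorted(entities)))
    return examples


examples = load_accepted_examples(SAMPLE_JSONL)
for text, entities in examples:
    for start, end, label in entities:
        print(label, repr(text[start:end]))
```

In a real project, the conversion to spaCy's binary training format is usually left to Prodigy's data-to-spacy command, and the fine-tuning itself to spaCy's train command with a transformer-based config; the sketch only makes the intermediate data shape explicit.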

In practice

Use Prodigy for initial clinical text annotation.
Apply RoBERTa-base for clinical NER tasks.
Fine-tune general models on specific subsections.

Topics

Clinical NLP
Named Entity Recognition
RoBERTa-base
SOAP Notes
Prodigy Annotation
Model Fine-tuning

Best for: AI Scientist, Machine Learning Engineer, NLP Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Explosion · Developer tools and consulting for AI, Machine Learning and NLP - Explosion.ai.