Detecting HIV-Related Stigma in Clinical Narratives Using Large Language Models

Source: cs.CL updates on arXiv.org · Field: Science & Research (Health & Medical Research, Artificial Intelligence & Machine Learning) · Depth: Expert, quick

Summary

A study developed a large language model (LLM)-based tool to identify HIV-related stigma in clinical narratives of people living with HIV (PLWH) at University of Florida Health between 2012 and 2022. Researchers identified candidate sentences using expert-curated keywords and clinical word embeddings, then manually annotated 1,332 sentences across four stigma subscales: Concern with Public Attitudes, Disclosure Concerns, Negative Self-Image, and Personalized Stigma. The study compared encoder-based models such as GatorTron-large and BERT with generative LLMs including GPT-OSS-20B, LLaMA-8B, and MedGemma-27B. GatorTron-large achieved the highest overall performance, with a Micro-F1 score of 0.62. Few-shot prompting significantly improved generative model performance: 5-shot GPT-OSS-20B and LLaMA-8B reached Micro-F1 scores of 0.57 and 0.59, respectively. Negative Self-Image was the most predictable subscale, while Personalized Stigma proved the most challenging.
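
The scores above are micro-averaged F1 values. As a minimal sketch of how such a score can be computed, the function below pools true positives, false positives, and false negatives over all sentences and subscales before taking F1. It assumes each sentence may carry zero or more subscale labels; the summary does not say whether the study treated the task as multi-label or single-label, and the names `SUBSCALES` and `micro_f1` are illustrative, not taken from the paper.

```python
from typing import Sequence, Set

# The four subscales named in the summary.
SUBSCALES = (
    "Concern with Public Attitudes",
    "Disclosure Concerns",
    "Negative Self-Image",
    "Personalized Stigma",
)


def micro_f1(gold: Sequence[Set[str]], pred: Sequence[Set[str]]) -> float:
    """Micro-averaged F1: pool TP/FP/FN over every sentence and label.

    Assumption: each sentence has a (possibly empty) set of subscale labels.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels predicted and present in gold
        fp += len(p - g)   # labels predicted but absent from gold
        fn += len(g - p)   # gold labels the model missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

Because the counts are pooled, frequent subscales weigh more heavily than rare ones, which is worth keeping in mind when the subscales differ in how predictable they are.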

Key takeaway

For NLP engineers developing tools for sensitive clinical data, this research indicates that fine-tuned encoder models such as GatorTron-large outperform generative LLMs on this stigma detection task: the gap is widest in zero-shot settings, and even the 5-shot generative results reported here (Micro-F1 0.57 and 0.59) trail GatorTron-large (0.62). If you opt for a generative architecture, use few-shot prompting to improve accuracy, and expect predictability to vary across stigma categories.
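
A few-shot prompt of the kind recommended here can be assembled roughly as follows. This is a sketch under stated assumptions: the instruction wording, the `Sentence:`/`Label:` layout, and the function name `build_prompt` are hypothetical and do not reproduce the study's actual prompts, and the demonstration sentences are invented placeholders rather than clinical data.

```python
from typing import List, Tuple

SUBSCALES = [
    "Concern with Public Attitudes",
    "Disclosure Concerns",
    "Negative Self-Image",
    "Personalized Stigma",
]


def build_prompt(sentence: str, examples: List[Tuple[str, str]], k: int = 5) -> str:
    """Build a k-shot classification prompt; k=0 yields a zero-shot prompt.

    The wording is hypothetical, not the prompt used in the study.
    """
    lines = ["Classify the sentence into one of these HIV-related stigma subscales:"]
    lines += [f"- {name}" for name in SUBSCALES]
    lines.append("")
    for text, label in examples[:k]:  # at most k labeled demonstrations
        lines += [f"Sentence: {text}", f"Label: {label}", ""]
    lines += [f"Sentence: {sentence}", "Label:"]
    return "\n".join(lines)


# Invented demonstrations for illustration only.
DEMOS = [
    ("Patient worries coworkers will learn of the diagnosis.", "Disclosure Concerns"),
    ("Patient states feeling ashamed of the diagnosis.", "Negative Self-Image"),
]

prompt = build_prompt("Patient has not told family members.", DEMOS, k=2)
```

Keeping `k` as a parameter makes it straightforward to compare zero-shot and few-shot behavior with the same template.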

Key insights

LLMs can detect HIV-related stigma in clinical notes with moderate accuracy (best Micro-F1 of 0.62). Encoder models outperformed generative models in zero-shot settings, and few-shot prompting narrowed the gap without closing it.

Method

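Candidate sentences were identified via expert-curated keywords and clinical word embeddings, then manually annotated (1,332 sentences in total) on four stigma subscales. Encoder models were fine-tuned on the annotated sentences, while generative models were evaluated with zero-shot and few-shot prompting.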
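
One plausible reading of the candidate-identification step is that embeddings expand a seed keyword list, which then filters sentences. The sketch below illustrates that reading only; the summary does not describe the exact procedure. The toy two-dimensional vectors, the 0.8 similarity threshold, and the names `expand_keywords` and `candidate_sentences` are all assumptions made for illustration.

```python
import math
import re
from typing import Dict, List, Set


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def expand_keywords(seeds: Set[str], embeddings: Dict[str, List[float]],
                    threshold: float = 0.8) -> Set[str]:
    """Add every vocabulary word whose vector is close to a seed keyword."""
    expanded = set(seeds)
    for seed in seeds:
        if seed not in embeddings:
            continue  # seed has no vector; keep it as a plain keyword
        for word, vec in embeddings.items():
            if cosine(embeddings[seed], vec) >= threshold:
                expanded.add(word)
    return expanded


def candidate_sentences(sentences: List[str], keywords: Set[str]) -> List[str]:
    """Keep sentences containing any keyword as a whole word, case-insensitively."""
    if not keywords:
        return []
    alternation = "|".join(re.escape(k) for k in sorted(keywords))
    pattern = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


# Toy vectors, invented for illustration; real clinical embeddings are far larger.
TOY_EMBEDDINGS = {"stigma": [1.0, 0.0], "shame": [0.9, 0.1], "dosage": [0.0, 1.0]}
keywords = expand_keywords({"stigma"}, TOY_EMBEDDINGS)
```

In this reading, the filtered sentences would then go to human annotators, so the threshold trades annotation workload against recall.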

Best for: AI Scientist, NLP Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.