Prompt, Plan, Extract: Zero-Shot Agentic LLMs Workflows for Lung Pathology Extraction from Clinical Narratives

Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Health & Medical Research · Depth: Expert, quick

Summary

A study developed and evaluated a zero-shot, agentic workflow using five open-source generative large language models (LLMs) to extract lung pathology information from clinical narratives. The goal was to populate 13 College of American Pathologists (CAP) synoptic fields from lung resection pathology reports, a task that traditionally requires labor-intensive manual abstraction or expensive supervised natural language processing (NLP) pipelines. The researchers compared the LLM-based approach against a supervised GatorTron NER-RE (named entity recognition and relation extraction) baseline using a novel, registry-aligned evaluation framework. The baseline achieved a Micro-F1 of 0.960. The best zero-shot model, GPT-OSS-20B, reached a Micro-F1 of 0.893 with a recall of 0.949 and extracted complex relations such as Pathologic Stage without task-specific training. These findings suggest that open-source, zero-shot agentic LLMs are a low-cost option for this task, though the best model still trails the supervised baseline by about 0.07 Micro-F1.
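For readers less familiar with the metric, Micro-F1 pools true positives, false positives, and false negatives across all fields before computing precision and recall. The sketch below illustrates that calculation and also solves the F1 formula for the precision implied by the reported figures. The `FieldCounts` type, the function names, and the toy counts are assumptions made for illustration; they are not taken from the study, whose registry-aligned scoring rules are not described in this summary.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class FieldCounts:
    tp: int  # extracted values that match the gold annotation
    fp: int  # extracted values with no matching gold annotation
    fn: int  # gold values the system missed


def micro_prf(counts: Iterable[FieldCounts]) -> Tuple[float, float, float]:
    """Pool counts over all fields, then compute precision, recall, and F1."""
    counts = list(counts)
    tp = sum(c.tp for c in counts)
    fp = sum(c.fp for c in counts)
    fn = sum(c.fn for c in counts)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, given F1 and R."""
    return f1 * recall / (2 * recall - f1)


# Hypothetical per-field counts, for illustration only (not the study's data).
toy = [FieldCounts(tp=90, fp=10, fn=5), FieldCounts(tp=40, fp=20, fn=0)]
p, r, f1 = micro_prf(toy)

# Precision implied by the reported GPT-OSS-20B figures (F1 0.893, recall 0.949),
# assuming both are micro-averaged over the same set of predictions.
p_reported = implied_precision(0.893, 0.949)
```

Under that assumption, the reported recall and Micro-F1 imply a precision of roughly 0.84, which would mean the zero-shot model's gap to the baseline comes more from spurious extractions than from missed ones.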

Key takeaway

NLP engineers and research scientists building medical information extraction systems should consider zero-shot agentic LLMs. Open-source models such as GPT-OSS-20B offer a lower-cost alternative to supervised pipelines for tasks such as populating College of American Pathologists synoptic fields, and they can extract complex relations like Pathologic Stage without task-specific training. The best model reached an overall recall of 0.949 and a Micro-F1 of 0.893, against 0.960 for the supervised baseline, so the saving in annotation effort comes with a measurable accuracy gap.

Key insights

Zero-shot agentic LLMs can extract complex lung pathology data from clinical narratives without task-specific training, approaching but not matching a supervised baseline (Micro-F1 of 0.893 versus 0.960).

Method

The study developed a zero-shot, agentic workflow and evaluated five open-source generative LLMs on populating 13 College of American Pathologists (CAP) synoptic fields from lung resection pathology reports, comparing the results against a supervised GatorTron NER-RE baseline.
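To make the idea of a zero-shot agentic workflow concrete, here is a minimal sketch of a plan-then-extract loop. It is not the authors' implementation: the prompts, the `plan`, `extract`, and `run_workflow` functions, and the field subset are all hypothetical, and `stub_llm` stands in for a real model call so the example runs offline. Only Pathologic Stage is named in the source; the other two field names are illustrative.

```python
import json
from typing import Callable, Dict, List, Optional

# Illustrative subset; the source names only Pathologic Stage among the 13 fields.
FIELDS: List[str] = ["Histologic Type", "Tumor Size", "Pathologic Stage"]

LLM = Callable[[str], str]  # any function mapping a prompt to a completion


def plan(report: str, llm: LLM) -> List[str]:
    """Ask the model which target fields the report appears to state."""
    prompt = (
        "List, as a JSON array, which of these fields are stated in the report.\n"
        f"Fields: {json.dumps(FIELDS)}\nReport:\n{report}"
    )
    try:
        proposed = json.loads(llm(prompt))
    except json.JSONDecodeError:
        return list(FIELDS)  # unparseable plan: fall back to trying every field
    if not isinstance(proposed, list):
        return list(FIELDS)
    return [f for f in FIELDS if f in proposed]


def extract(report: str, field: str, llm: LLM) -> Optional[str]:
    """Zero-shot extraction of one field; no examples are given in the prompt."""
    prompt = (
        f"Extract the value of '{field}' from the report. "
        "Answer with the value only, or NONE if absent.\n"
        f"Report:\n{report}"
    )
    answer = llm(prompt).strip()
    return None if not answer or answer.upper() == "NONE" else answer


def run_workflow(report: str, llm: LLM) -> Dict[str, Optional[str]]:
    """Plan which fields to fill, then extract each planned field."""
    result: Dict[str, Optional[str]] = {f: None for f in FIELDS}
    for field in plan(report, llm):
        result[field] = extract(report, field, llm)
    return result


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call (hypothetical canned answers)."""
    if prompt.startswith("List"):
        return '["Pathologic Stage", "Tumor Size"]'
    if "'Pathologic Stage'" in prompt:
        return "pT2a pN0"
    if "'Tumor Size'" in prompt:
        return "3.2 cm"
    return "NONE"


report = "Right upper lobe lobectomy. Adenocarcinoma, 3.2 cm. pT2a pN0."
fields = run_workflow(report, stub_llm)
```

Passing the model in as a plain callable keeps the sketch independent of any particular inference library; swapping `stub_llm` for a wrapper around a locally hosted open-source model would leave the rest unchanged.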

Best for: AI Scientist, NLP Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.