Prompt, Plan, Extract: Zero-Shot Agentic LLMs Workflows for Lung Pathology Extraction from Clinical Narratives

Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Natural Language Processing · Depth: Expert, quick

Summary

A recent study developed and evaluated a zero-shot, agentic workflow that uses open-source generative large language models (LLMs) to extract lung pathology information from clinical narratives, comparing five such models. The task was to populate 13 College of American Pathologists (CAP) synoptic fields from lung resection pathology reports, work that traditionally requires labor-intensive manual extraction or expensive supervised NLP pipelines. A supervised GatorTron NER-RE baseline reached a Micro-F1 of 0.960. The best zero-shot model, GPT-OSS-20B, reached a Micro-F1 of 0.893 with a recall of 0.949, and it extracted complex relations such as Pathologic Stage without task-specific training. The findings suggest that open-source, zero-shot agentic LLMs are a low-cost option for this extraction task, trailing the supervised baseline by about 0.07 Micro-F1. Results were scored with a novel, registry-aligned evaluation framework.
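The reported scores can be made concrete with a small sketch of micro-averaged scoring, in which true positives, false positives, and false negatives are pooled across all fields before precision, recall, and F1 are computed. The class and function names and the per-field counts below are illustrative assumptions, not the study's evaluation code or data. The second helper inverts the F1 formula: if the reported recall of 0.949 is micro-averaged like the F1 of 0.893, the implied precision is about 0.84. That figure is derived here and is not reported in the summary.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Counts:
    """Per-field tallies of true positives, false positives, false negatives."""
    tp: int
    fp: int
    fn: int


def micro_prf(per_field: Dict[str, Counts]) -> Tuple[float, float, float]:
    """Pool counts over all fields, then compute precision, recall, and F1."""
    tp = sum(c.tp for c in per_field.values())
    fp = sum(c.fp for c in per_field.values())
    fn = sum(c.fn for c in per_field.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, giving P = F1 * R / (2R - F1)."""
    return f1 * recall / (2 * recall - f1)


# Hypothetical counts for two fields, chosen only to illustrate the pooling.
example = {
    "Pathologic Stage": Counts(tp=8, fp=2, fn=0),
    "Histologic Type": Counts(tp=9, fp=1, fn=1),
}
precision, recall, f1 = micro_prf(example)

# Precision implied by the reported Micro-F1 and recall (an assumption-laden estimate).
estimated_precision = implied_precision(0.893, 0.949)
```

Because counts are pooled, fields that occur more often in the reports weigh more heavily in a micro-averaged score than rare ones.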

Key takeaway

NLP engineers and AI scientists working on clinical data extraction should consider zero-shot agentic LLMs as an alternative to supervised pipelines. For tasks such as populating pathology report fields, a model like GPT-OSS-20B reached a Micro-F1 of 0.893 without task-specific training, compared with 0.960 for the supervised GatorTron baseline. Where that gap is acceptable, skipping manual annotation and model training can cut development time and cost and shorten the path to deployment.

Key insights

Zero-shot agentic LLMs can extract complex lung pathology data from clinical narratives without task-specific training, with Micro-F1 approaching that of a supervised baseline (0.893 versus 0.960).

Principles

Method

The workflow combines prompting, planning, and extraction steps, run by agentic LLMs, to populate 13 synoptic fields from pathology reports. Its output is evaluated against the supervised GatorTron NER-RE baseline.
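One way to picture a prompt, plan, extract loop is the minimal sketch below. It is an illustration under stated assumptions, not the study's implementation: the summary does not describe the actual prompts, the agent design, or the full field list, so the field names (apart from Pathologic Stage), the prompt wording, the JSON answer format, and the `fake_llm` stub are all hypothetical. Any real model call would replace the stub.

```python
import json
from typing import Callable, Dict, List, Optional

# Hypothetical subset of synoptic fields; only "Pathologic Stage" is named in the summary.
FIELDS: List[str] = ["Pathologic Stage", "Histologic Type", "Tumor Size"]

# An LLM is modeled as a plain function: prompt text in, raw response text out.
LLM = Callable[[str], str]


def plan(report: str, llm: LLM) -> List[str]:
    """Planning step: ask the model which fields the report appears to mention."""
    prompt = (
        "List, as a JSON array, which of these fields the report mentions: "
        f"{json.dumps(FIELDS)}\n\nReport:\n{report}"
    )
    try:
        proposed = json.loads(llm(prompt))
    except json.JSONDecodeError:
        return list(FIELDS)  # unparseable plan: fall back to trying every field
    if not isinstance(proposed, list):
        return list(FIELDS)
    return [field for field in FIELDS if field in proposed]


def extract(report: str, field: str, llm: LLM) -> Optional[str]:
    """Extraction step: one zero-shot prompt per planned field."""
    prompt = (
        f"Extract the value of '{field}' from the report. "
        'Answer as JSON: {"value": <string or null>}\n\nReport:\n' + report
    )
    try:
        parsed = json.loads(llm(prompt))
    except json.JSONDecodeError:
        return None  # treat malformed output as "not found"
    return parsed.get("value") if isinstance(parsed, dict) else None


def run_workflow(report: str, llm: LLM) -> Dict[str, Optional[str]]:
    """Plan first, then extract each planned field; unplanned fields stay None."""
    result: Dict[str, Optional[str]] = {field: None for field in FIELDS}
    for field in plan(report, llm):
        result[field] = extract(report, field, llm)
    return result


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline."""
    if prompt.startswith("List"):
        return json.dumps(["Pathologic Stage"])
    if "'Pathologic Stage'" in prompt:
        return json.dumps({"value": "pT1b pN0"})
    return json.dumps({"value": None})


synoptic = run_workflow("Final diagnosis: adenocarcinoma, pT1b pN0.", fake_llm)
```

Keeping the model behind a plain callable makes it straightforward to swap among several open-source models, as a comparison of five models would require, and to test the control flow without a GPU.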

Best for: AI Engineer, Research Scientist, AI Scientist, NLP Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.