sebis at CRF Filling 2026: A Two-Stage Local LLM Pipeline for Medical CRF Filling

2026-06-11 · Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Advanced, quick

Summary

sebis developed a two-stage local LLM pipeline for the CL4Health 2026 Case Report Form (CRF) filling task, which requires extracting structured clinical information from EHR notes. The pipeline runs the MedGemma-27B model fully locally to mitigate the privacy risks, inference costs, and hallucination tendencies common with larger LLMs. Its architecture separates binary presence classification from value extraction, which keeps outputs deterministic and tied strictly to textual evidence, including for negated, uncertain, or unknown states. Using item-specific, few-shot in-context learning with no external API calls or fine-tuning, the approach achieved a macro-F1 score of 0.55 on the official English test track, placing second among all locally hosted, open-source submissions. The result supports the viability of privacy-preserving, on-premise LLM solutions for clinical NLP.

Key takeaway

Machine learning engineers building clinical NLP systems who are concerned about the data privacy, inference cost, and hallucination risks of proprietary LLMs should consider a local, two-stage pipeline like the one presented. With a model such as MedGemma-27B, this approach reached competitive performance (macro-F1 0.55) while keeping full data sovereignty and strict evidence adherence, with no external API dependencies or fine-tuning.

Key insights

A two-stage local LLM pipeline using MedGemma-27B achieves competitive, privacy-preserving medical CRF filling with strict evidence adherence.

Principles

Separate classification from extraction.
Enforce strict textual evidence adherence.
Utilize few-shot in-context learning.
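
One way to picture the evidence-adherence principle is a post-hoc grounding check, sketched below. The summary does not say how the pipeline enforces adherence, so everything here is an assumption for illustration: the helper names `normalize` and `grounded_value`, the substring test, and the "unknown" fallback label are all hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line breaks do not defeat matching."""
    return re.sub(r"\s+", " ", text).strip().lower()


def grounded_value(value: str, note: str, fallback: str = "unknown") -> str:
    """Keep `value` only if it appears verbatim (after normalization) in `note`.

    Hypothetical check: a plain substring test is an assumption, not the
    mechanism described in the source.
    """
    if value and normalize(value) in normalize(note):
        return value
    return fallback
```

A check like this rejects any extracted value that cannot be found in the note, at the cost of also rejecting legitimate paraphrases or normalized values.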

Method

A two-stage architecture first performs binary presence classification, then value extraction. It uses item-specific, few-shot in-context learning with MedGemma-27B, operating fully locally without fine-tuning or external APIs.
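
The two stages described above might be wired together roughly as in the following sketch. It is not the authors' implementation: the prompt layout, the function names `build_prompt` and `fill_item`, and the yes/no answer convention are assumptions, and `stub_llm` is a hypothetical stand-in for a locally served model so that the example runs without any model or API.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical type: any function mapping a prompt string to the model's reply.
LLM = Callable[[str], str]

PRESENCE_TASK = "Is this item documented in the note? Answer yes or no."
VALUE_TASK = "Extract the value of this item exactly as written in the note."


def build_prompt(task: str, item: str,
                 examples: List[Dict[str, str]], note: str) -> str:
    """Assemble an item-specific few-shot prompt (layout is an assumption)."""
    lines = [task, f"CRF item: {item}", ""]
    for ex in examples:  # few-shot demonstrations chosen for this item
        lines += [f"Note: {ex['note']}", f"Answer: {ex['answer']}", ""]
    lines += [f"Note: {note}", "Answer:"]
    return "\n".join(lines)


def fill_item(llm: LLM, item: str, note: str,
              presence_examples: List[Dict[str, str]],
              value_examples: List[Dict[str, str]]) -> Optional[str]:
    """Stage 1 decides whether the item is present; stage 2 extracts its value."""
    reply = llm(build_prompt(PRESENCE_TASK, item, presence_examples, note))
    if reply.strip().lower() != "yes":
        return None  # treated as absent, so stage 2 is skipped
    value = llm(build_prompt(VALUE_TASK, item, value_examples, note))
    return value.strip()


def stub_llm(prompt: str) -> str:
    """Stand-in for a local model; replace with a real local inference call."""
    final_note = prompt.rsplit("Note: ", 1)[1]  # text of the note being filled
    if prompt.startswith(PRESENCE_TASK):
        return "yes" if "fever" in final_note.lower() else "no"
    return "38.9 C"


note = "Pt febrile, fever 38.9 C on admission."
print(fill_item(stub_llm, "fever", note, [], []))  # prints: 38.9 C
```

Keeping the presence decision separate means the extraction prompt is only issued for items the first stage accepted, so an absent item never reaches a step that could invent a value for it.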

In practice

Deploy on-premise LLMs for clinical NLP.
Reduce privacy risks in healthcare informatics.
Achieve data sovereignty for sensitive data.

Topics

Clinical NLP
Case Report Form Filling
Local LLMs
MedGemma-27B
Data Privacy
Few-shot In-context Learning

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.