RASC+: Retrieval-Constrained LLM Adjudication for Clinical Value Set Authoring

2026-06-22 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Medical Devices & Health Technology · Depth: Expert, quick

Summary

RASC+ addresses the authoring of clinical value sets, which define the standardized terminology codes used for quality measurement and clinical decision support. Zero-shot large language model (LLM) generation performs poorly on this task because clinical code systems are vast and version-controlled. RASC+ takes a stage-wise approach instead: it builds the candidate pool to maximize recall, then uses a constrained LLM to adjudicate which candidates belong in the value set. On the full 3,744-value-set RASC test split, Qwen3-based retrieval with vocabulary-aware expansion and code-display rescue raised candidate-pool recall from 0.553 to 0.730. Replacing the original SapBERT cross-encoder with blinded GPT-5 adjudication over this expanded pool raised full-test macro F1 from 0.287 to 0.549, while every output code still comes from the retrieved candidate pool.
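The two headline metrics above can be made concrete with a minimal sketch. It assumes each value set is scored as a set of code identifiers and that macro F1 is the unweighted mean of per-value-set F1; the summary does not state the exact averaging used in the paper, and the function names here are illustrative only. Because the adjudicator can only select from the pool, candidate-pool recall is an upper bound on the recall the final value set can reach.

```python
from typing import Iterable, Set, Tuple


def pool_recall(gold: Set[str], pool: Set[str]) -> float:
    """Fraction of gold codes that appear in the candidate pool."""
    if not gold:
        return 1.0  # assumption: an empty gold set is trivially covered
    return len(gold & pool) / len(gold)


def set_f1(gold: Set[str], predicted: Set[str]) -> float:
    """F1 between a gold value set and a predicted value set."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def macro_f1(pairs: Iterable[Tuple[Set[str], Set[str]]]) -> float:
    """Unweighted mean of per-value-set F1 over (gold, predicted) pairs."""
    scores = [set_f1(gold, predicted) for gold, predicted in pairs]
    return sum(scores) / len(scores) if scores else 0.0
```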

Key takeaway

AI scientists building clinical value sets, or doing similar knowledge-base construction, should consider a retrieval-constrained LLM adjudication framework. As demonstrated by RASC+, this approach improves accuracy (full-test macro F1 from 0.287 to 0.549) and ensures that every output code comes from an auditable candidate pool, which matters for safety and compliance. Prioritize high-recall retrieval first, then pair it with a strong, constrained LLM adjudicator such as GPT-5.

Key insights

Retrieval-constrained LLM adjudication significantly improves clinical value set authoring by combining high-recall candidate generation with precise selection.

Principles

Direct LLM memorization is insufficient for large, version-controlled clinical code systems.
Stage-wise approaches can enhance LLM performance on specialized, constrained tasks.
Safety constraints necessitate auditable candidate pools for clinical code generation.

Method

RASC+ employs Qwen3-based retrieval with vocabulary-aware expansion and code-display rescue to build a high-recall candidate pool, followed by blinded GPT-5 adjudication for constrained candidate selection.
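The constrained-selection step can be sketched as follows. This is an illustration under stated assumptions, not the RASC+ implementation: the `Candidate` type, the index-based selection protocol, and the `keyword_adjudicator` stub (a stand-in for the LLM call) are all hypothetical, and the codes are placeholders. The point is that the adjudicator answers with pool indices rather than free-text codes, so anything outside the pool is dropped by construction.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Candidate:
    system: str   # code system name (placeholder values below)
    code: str     # code identifier (placeholder values below)
    display: str  # human-readable display text


# Hypothetical adjudicator interface: (title, indexed pool) -> chosen indices.
Adjudicator = Callable[[str, Dict[int, Candidate]], List[int]]


def adjudicate_constrained(
    title: str, pool: List[Candidate], adjudicator: Adjudicator
) -> List[Candidate]:
    """Return the adjudicator's picks, keeping only valid pool indices."""
    indexed = dict(enumerate(pool))
    selected: List[Candidate] = []
    seen = set()
    for index in adjudicator(title, indexed):
        # Discard indices outside the pool and repeated picks.
        if index in indexed and index not in seen:
            seen.add(index)
            selected.append(indexed[index])
    return selected


def keyword_adjudicator(title: str, indexed: Dict[int, Candidate]) -> List[int]:
    """Stub standing in for an LLM: pick displays sharing a word with the title."""
    words = set(title.lower().split())
    return [i for i, c in indexed.items() if words & set(c.display.lower().split())]


pool = [
    Candidate("DEMO", "X1", "Type 2 diabetes mellitus"),
    Candidate("DEMO", "X2", "Essential hypertension"),
]
picked = adjudicate_constrained("Diabetes", pool, keyword_adjudicator)
```

Selecting by index keeps the output space equal to the candidate pool, which is one simple way to enforce the constraint described above; other designs, such as validating returned codes against the pool, would serve the same purpose.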

In practice

Use Qwen3-based retrieval to build a high-recall candidate pool for clinical code tasks.
Use GPT-5 as an adjudicator restricted to selecting codes from that pool.
Ensure every output code is traceable to an auditable source pool.
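One way to make the traceability requirement above checkable is to record, for each pooled code, which retrieval stage surfaced it, and to flag any selected code with no pool entry. The sketch below is an assumption about how such an audit might look; the stage labels, function names, and codes are hypothetical placeholders rather than details from RASC+.

```python
from typing import Dict, Iterable, List, Tuple

Code = Tuple[str, str]  # (code system, code identifier)


def build_pool(stage_outputs: Dict[str, Iterable[Code]]) -> Dict[Code, List[str]]:
    """Merge per-stage candidates into one pool, recording provenance."""
    pool: Dict[Code, List[str]] = {}
    for stage, codes in stage_outputs.items():
        for code in codes:
            stages = pool.setdefault(code, [])
            if stage not in stages:
                stages.append(stage)
    return pool


def untraceable(selected: Iterable[Code], pool: Dict[Code, List[str]]) -> List[Code]:
    """Return selected codes that have no entry in the candidate pool."""
    return [code for code in selected if code not in pool]


# Placeholder stage labels and codes, for illustration only.
pool = build_pool({
    "dense_retrieval": [("DEMO", "X1"), ("DEMO", "X2")],
    "vocabulary_expansion": [("DEMO", "X2"), ("DEMO", "X3")],
})
violations = untraceable([("DEMO", "X2"), ("DEMO", "X9")], pool)
```

An empty `violations` list means every selected code can be traced back to at least one retrieval stage; a non-empty list identifies codes to reject before the value set is published.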

Topics

Clinical Value Sets
Large Language Models
Retrieval-Augmented Generation
Clinical NLP
GPT-5
Qwen3

Best for: NLP Engineer, AI Scientist, Research Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.