LLM-Guided Evolution for Medical Decision Pipelines

Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, AI in Clinical Decision Support · Depth: Expert, extended

Summary

LLM-guided MAP-Elites evolution is an inference-time method for optimizing medical decision pipelines without costly fine-tuning or manual prompt engineering. The researchers applied it to three clinical tasks: urgency triage, interactive consultation, and medical image classification. In triage, evolved programs raised Semigran accuracy from 77.3% to 87.1% and emergency recall from 0.60 to 0.97; on MIMIC-ESI, exact accuracy rose from 56.7% to 62.0% and severe undertriage fell from 3.6% to 1.2%. In interactive consultation, evolved policies improved the accuracy–cost frontier across Llama-3, Qwen-3.5, and Gemma-4 models; Llama-3-8B, for example, gained 3.1 percentage points of accuracy while using 89.6% fewer tokens. Prompt-only evolution also improved frozen MedGemma VLMs on PneumoniaMNIST classification, particularly at lower resolutions. Qualitative analysis attributes the gains to interpretable program-level mechanisms, such as calibrated triage boundaries and targeted evidence acquisition, rather than to prompt rewording alone.

Key takeaway

Machine learning engineers adapting LLMs for clinical applications should consider LLM-guided MAP-Elites evolution as an inference-time optimization method. It discovers and refines decision strategies, improving accuracy and safety-relevant behavior without costly model fine-tuning. Pair it with safety-weighted objectives and structured evaluation so that gains in areas such as triage or interactive consultation are robust and interpretable.
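One way to picture a safety-weighted objective for triage is the sketch below. It is an illustration only, not the paper's scoring function: the 0 to 3 urgency scale, the two-level threshold for "severe" undertriage, and the weights `w_recall` and `w_under` are all assumptions chosen for the example.

```python
from typing import Sequence

# Assumed urgency scale for illustration: 0 = self-care ... 3 = emergency.
EMERGENCY = 3


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of cases whose predicted urgency matches the label exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def emergency_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of true emergencies that were predicted as emergencies."""
    hits = [p == EMERGENCY for t, p in zip(y_true, y_pred) if t == EMERGENCY]
    # Convention assumed here: with no emergencies present, nothing was missed.
    return sum(hits) / len(hits) if hits else 1.0


def severe_undertriage_rate(
    y_true: Sequence[int], y_pred: Sequence[int], gap: int = 2
) -> float:
    """Fraction of cases predicted at least `gap` levels below the label.

    The two-level default is an assumed definition of "severe".
    """
    return sum(t - p >= gap for t, p in zip(y_true, y_pred)) / len(y_true)


def safety_weighted_score(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    w_recall: float = 1.0,  # hypothetical weight rewarding emergency recall
    w_under: float = 2.0,   # hypothetical weight penalizing severe undertriage
) -> float:
    """Accuracy plus a recall bonus minus an undertriage penalty."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    return (
        accuracy(y_true, y_pred)
        + w_recall * emergency_recall(y_true, y_pred)
        - w_under * severe_undertriage_rate(y_true, y_pred)
    )


labels = [3, 3, 2, 1, 0]
preds = [3, 1, 2, 1, 0]  # one emergency is undertriaged by two levels
score = safety_weighted_score(labels, preds)
```

Weighting undertriage more heavily than plain accuracy pushes a search toward candidates that err on the side of caution, which is the kind of safety-relevant behavior the takeaway describes.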

Key insights

LLM-guided evolution optimizes medical decision pipelines at inference time and outperforms manual baselines through interpretable program-level changes.

Principles

Method

LLM-guided MAP-Elites optimization uses a frozen LLM (gpt-oss-120b) to mutate executable artifacts (programs, policies, prompts). Task-specific evaluators score candidates, updating an archive of high-performing, diverse solutions.
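The loop described above can be sketched as a generic MAP-Elites archive update. This is a minimal sketch under stated assumptions, not the authors' implementation: `mutate` is a hypothetical stand-in for the call to the frozen LLM, `evaluate` is a hypothetical task evaluator returning a fitness and a behavior cell, and the toy functions at the bottom exist only to make the example runnable.

```python
import random
from typing import Callable, Dict, Hashable, Optional, Tuple

Candidate = str                        # a program, policy, or prompt as text
Evaluation = Tuple[float, Hashable]    # (fitness, behavior cell)
Archive = Dict[Hashable, Tuple[float, Candidate]]


def map_elites(
    seed: Candidate,
    mutate: Callable[[Candidate], Candidate],   # hypothetical LLM mutation
    evaluate: Callable[[Candidate], Evaluation],  # hypothetical evaluator
    iterations: int,
    rng: Optional[random.Random] = None,
) -> Archive:
    """Keep the best candidate found so far in each behavior cell."""
    rng = rng or random.Random(0)
    archive: Archive = {}

    def consider(candidate: Candidate) -> None:
        fitness, cell = evaluate(candidate)
        incumbent = archive.get(cell)
        # A candidate enters the archive if its cell is empty or it beats
        # the current elite of that cell.
        if incumbent is None or fitness > incumbent[0]:
            archive[cell] = (fitness, candidate)

    consider(seed)
    for _ in range(iterations):
        # Sample a parent uniformly from the current elites, then mutate it.
        _, parent = rng.choice(list(archive.values()))
        consider(mutate(parent))
    return archive


# Toy stand-ins so the sketch runs without any model or dataset.
_toy_rng = random.Random(1)


def toy_mutate(candidate: Candidate) -> Candidate:
    """Append a random character; a real system would prompt the LLM here."""
    return candidate + _toy_rng.choice("ab")


def toy_evaluate(candidate: Candidate) -> Evaluation:
    """Fitness counts 'a' characters; the cell buckets candidates by length."""
    return float(candidate.count("a")), min(len(candidate), 5)


archive = map_elites("a", toy_mutate, toy_evaluate, iterations=50)
```

Because each cell holds its own elite, the archive retains diverse strategies rather than a single best candidate, which matches the "high-performing, diverse solutions" described in the method.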

Best for: AI Scientist, Research Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.