MARCH: Multi-Agent Radiology Clinical Hierarchy for CT Report Generation

Source: cs.AI updates on arXiv.org · Field: Health & Wellbeing (Artificial Intelligence & Machine Learning, Medical Devices & Health Technology, Clinical Care & Medical Practice) · Depth: Expert, extended

Summary

MARCH (Multi-Agent Radiology Clinical Hierarchy) is a multi-agent framework for automated 3D radiology report generation. It targets two weaknesses of existing Vision-Language Models (VLMs): clinical hallucinations and the absence of iterative verification. The framework emulates the professional hierarchy of a radiology department by assigning specialized roles to distinct agents: a Resident Agent drafts the initial report using multi-scale CT feature extraction, multiple Fellow Agents revise it with retrieval augmentation, and an Attending Agent orchestrates an iterative, stance-based consensus discourse to resolve diagnostic discrepancies. On the RadGenome-ChestCT dataset, MARCH is reported to significantly outperform state-of-the-art baselines in both clinical fidelity and linguistic accuracy, which suggests that human-like organizational structures can enhance AI reliability in high-stakes medical domains.

Key takeaway

For computer vision engineers developing medical AI, MARCH indicates that structuring agents to mirror a human clinical hierarchy can improve report accuracy and reduce hallucinations. Consider multi-agent frameworks that combine retrieval-augmented revision with iterative consensus to improve the reliability and clinical fidelity of automated diagnostic systems, especially for complex 3D imaging data such as CT scans, where trustworthy and clinically coherent output matters most.

Key insights

Modeling human clinical hierarchies with multi-agent AI improves radiology report accuracy and reduces hallucinations.

Principles

Method

MARCH employs a three-stage process: initial drafting by a Resident Agent, retrieval-augmented revision by Fellow Agents, and consensus-driven finalization orchestrated by an Attending Agent through iterative discourse.
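
The three stages can be sketched as a simple control loop. This is a minimal illustration, not the authors' implementation: the summary does not specify the agents' interfaces, how stances are encoded, or the stopping rule, so every name below (`Revision`, `run_hierarchy`, the `"agree"`/`"disagree"` labels, `max_rounds`) is an assumption, and the agents are toy stand-ins for the real VLM and retrieval components.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Revision:
    """A Fellow's output: a revised report plus a stance on the current draft.

    Hypothetical structure; the stance labels are assumed, not from the paper.
    """
    report: str
    stance: str  # "agree" or "disagree"


# Hypothetical agent signatures (assumptions for illustration only).
Resident = Callable[[Any], str]                    # CT volume -> initial draft
Fellow = Callable[[Any, str], Revision]            # (CT volume, draft) -> revision
Attending = Callable[[str, List[Revision]], str]   # (draft, revisions) -> new draft


def run_hierarchy(ct_volume: Any,
                  resident: Resident,
                  fellows: List[Fellow],
                  attending: Attending,
                  max_rounds: int = 3) -> str:
    """Draft, revise, and iterate until all Fellows agree or rounds run out."""
    draft = resident(ct_volume)                     # stage 1: initial draft
    for _ in range(max_rounds):
        # Stage 2: each Fellow reviews the current draft.
        revisions = [fellow(ct_volume, draft) for fellow in fellows]
        if all(r.stance == "agree" for r in revisions):
            break                                   # consensus reached
        # Stage 3: the Attending reconciles the disagreeing revisions.
        draft = attending(draft, revisions)
    return draft


# Toy agents, standing in for model calls.
def toy_resident(ct_volume: Any) -> str:
    return "Lungs are clear."


def toy_fellow(ct_volume: Any, draft: str) -> Revision:
    # Pretend retrieved reference reports indicate a nodule should be mentioned.
    if "nodule" in draft:
        return Revision(report=draft, stance="agree")
    return Revision(report=draft + " Small nodule noted.", stance="disagree")


def toy_attending(draft: str, revisions: List[Revision]) -> str:
    # Adopt the first dissenting revision; a real Attending would weigh stances.
    dissenting = [r for r in revisions if r.stance == "disagree"]
    return dissenting[0].report if dissenting else draft


final_report = run_hierarchy(None, toy_resident, [toy_fellow], toy_attending)
```

The loop keeps the Attending as the only agent that changes the draft between rounds, which mirrors the orchestration role described above; bounding the rounds with `max_rounds` is a design choice added here so the sketch always terminates.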

Best for: Computer Vision Engineer, AI Scientist, Research Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.