Ablation Study of a Fairness Auditing Agentic System for Bias Mitigation in Early-Onset Colorectal Cancer Detection
Summary
A study by Cedars-Sinai Medical Center researchers evaluated an agentic AI system designed to audit biomedical machine learning models for fairness in early-onset colorectal cancer (EO-CRC) detection. The system uses a two-agent architecture: a Domain Expert Agent that synthesizes literature on EO-CRC disparities, and a Fairness Consultant Agent that recommends sensitive attributes and fairness metrics. An ablation study compared three large language models served through Ollama (Llama 3.1 8B, GPT-OSS 20B, and GPT-OSS 120B) across three configurations: pretrained LLM-only, Agent without Retrieval-Augmented Generation (RAG), and Agent with RAG. The Agent with RAG configuration consistently achieved the highest semantic similarity to expert-derived reference statements, particularly for disparity identification, suggesting that agentic systems with retrieval can enhance fairness auditing in clinical AI.
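The ablation design described above is a three-by-three grid of models and configurations. A minimal sketch of how such a grid could be enumerated follows; the labels come from this summary, while the function name and dictionary layout are illustrative assumptions and not the authors' code.

```python
from itertools import product

# Labels taken from the summary; the exact Ollama model tags are not stated there.
MODELS = ["Llama 3.1 8B", "GPT-OSS 20B", "GPT-OSS 120B"]
CONFIGS = ["LLM-only", "Agent without RAG", "Agent with RAG"]


def ablation_grid():
    """Return every (model, configuration) pair to be evaluated."""
    return [{"model": m, "config": c} for m, c in product(MODELS, CONFIGS)]


if __name__ == "__main__":
    for run in ablation_grid():
        print(f'{run["model"]:<14} | {run["config"]}')
```

Each of the nine runs would then be scored against the same expert-derived reference statements so that results are comparable across models and configurations.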
Key takeaway
If you develop or deploy clinical AI, consider agentic systems that incorporate Retrieval-Augmented Generation (RAG) for fairness auditing, especially for tasks that depend on deep domain knowledge, such as disparity identification. In the study, this configuration aligned audit recommendations most closely with expert-derived reference statements, which can help mitigate algorithmic bias in high-stakes applications like early-onset colorectal cancer detection. Weigh the computational resources available: smaller models such as Llama 3.1 8B can be effective when paired with robust RAG.
Key insights
Agentic AI systems with RAG improve fairness auditing in clinical AI by grounding responses in external knowledge; in the study, they consistently scored highest on semantic similarity to expert-derived reference statements.
Principles
- RAG enhances LLM performance for knowledge-intensive tasks.
- Agentic systems can automate complex, context-dependent auditing.
- Model scale and task type influence RAG's effectiveness.
Method
A two-agent LLM system (Domain Expert, Fairness Consultant) with RAG was evaluated using semantic similarity against expert ground truth for fairness auditing in early-onset colorectal cancer.
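To make the evaluation step concrete, here is a minimal sketch of scoring an agent output against reference statements with cosine similarity. This summary does not say which embedding model the study used, so `embed` below is a toy bag-of-words stand-in (an assumption) that would be swapped for a real sentence-embedding model in practice. The function names and example strings are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence-embedding model: lowercase word counts.

    Assumption: the study's actual embedding model is not given in this summary.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match_score(output: str, references: list[str]) -> float:
    """Score one agent output by its closest reference statement."""
    out_vec = embed(output)
    return max((cosine_similarity(out_vec, embed(r)) for r in references), default=0.0)


if __name__ == "__main__":
    # Hypothetical placeholder strings, not statements from the study.
    references = ["audit performance separately for each demographic group"]
    output = "report performance separately for each demographic group"
    print(round(best_match_score(output, references), 3))
```

Averaging such scores per task (for example, disparity identification) and per configuration gives one number per cell of the ablation grid.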
In practice
- Use RAG for LLM-driven clinical fairness auditing.
- Consider smaller LLMs (e.g., Llama 3.1 8B) with RAG for resource-constrained settings.
- Tailor agent architecture to task-specific knowledge needs.
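As a rough illustration of the points above, the sketch below wires a Domain Expert step into a Fairness Consultant step, with retrieval as an optional component. It is a sketch under stated assumptions, not the study's implementation: `LLM` and `Retriever` are hypothetical callables, the prompts are invented for illustration, and no specific Ollama or vector-store API is assumed.

```python
from typing import Callable, Optional, Sequence

# Hypothetical interfaces: any function with these shapes can be plugged in.
LLM = Callable[[str], str]                   # prompt in, generated text out
Retriever = Callable[[str], Sequence[str]]   # query in, retrieved passages out


def run_audit(task: str, llm: LLM, retriever: Optional[Retriever] = None) -> dict:
    """Run the two-agent chain; pass retriever=None for the no-RAG variant."""
    passages = list(retriever(task)) if retriever else []

    # Domain Expert step: summarize disparities, grounded in passages if any.
    expert_prompt = f"Summarize known disparities relevant to: {task}"
    if passages:
        context = "\n".join(f"- {p}" for p in passages)
        expert_prompt += f"\nUse these retrieved passages:\n{context}"
    domain_summary = llm(expert_prompt)

    # Fairness Consultant step: turn the summary into audit recommendations.
    consultant_prompt = (
        "Given this domain summary, recommend sensitive attributes "
        f"and fairness metrics.\nSummary:\n{domain_summary}"
    )
    recommendations = llm(consultant_prompt)

    return {
        "passages": passages,
        "domain_summary": domain_summary,
        "recommendations": recommendations,
    }
```

Calling `run_audit` with and without a retriever on the same model corresponds to the Agent with RAG and Agent without RAG configurations; the LLM-only baseline would be a single direct call to `llm` with the task.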
Topics
- Agentic AI Systems
- Algorithmic Bias Mitigation
- Retrieval-Augmented Generation
- Clinical AI Fairness
- Early-Onset Colorectal Cancer
Best for: AI Scientist, Research Scientist, AI Researcher, Machine Learning Engineer, AI Ethicist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.MA updates on arXiv.org.