JustDiag!: A Diagnostic Justification Engine for Accountable Root Cause Analysis
Summary
JustDiag is a diagnostic justification engine for accountable Root Cause Analysis (RCA) in high-stakes operational systems. Developed by researchers from Peking University, the University of Edinburgh, and Beijing University of Posts and Telecommunications, it addresses a limitation of large language models: they produce fluent but unverified RCA reports. The system maintains an explicit process state that tracks evidence, findings, competing hypotheses, conflicts, and subsequent checks. Evaluated on 66 real-world incidents, JustDiag improved Outcome Score from 51.0 to 57.7 and Process Score from 44.0 to 50.5 relative to a matched control without diagnostic justification. Terminal completion fell slightly, from 65/66 to 62/66, which reflects a preference for calibrated non-closure over premature resolution. The system uses Gemini 3 Flash for diagnosis and GPT-5.4 for evaluation.
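To make the trade-off in these numbers easier to see, the following short sketch recomputes the deltas from the figures reported above. The variable names are illustrative, and the summary does not state the scale of the scores, so the gains are given simply as points.

```python
# Figures taken from the summary above; names are illustrative only.
TOTAL_INCIDENTS = 66
control = {"outcome": 51.0, "process": 44.0, "completed": 65}
justdiag = {"outcome": 57.7, "process": 50.5, "completed": 62}

# Score gains in points (rounded to one decimal to avoid float noise).
outcome_gain = round(justdiag["outcome"] - control["outcome"], 1)
process_gain = round(justdiag["process"] - control["process"], 1)

# Terminal completion as a percentage of the 66 incidents.
control_rate = round(100 * control["completed"] / TOTAL_INCIDENTS, 1)
justdiag_rate = round(100 * justdiag["completed"] / TOTAL_INCIDENTS, 1)

print(f"Outcome Score gain: +{outcome_gain}")   # +6.7
print(f"Process Score gain: +{process_gain}")   # +6.5
print(f"Completion: {control_rate}% -> {justdiag_rate}%")  # 98.5% -> 93.9%
```

In other words, the reported gains of roughly 6.5 to 6.7 points come with three fewer incidents brought to terminal completion.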
Key takeaway
If you are an MLOps engineer deploying LLM-based Root Cause Analysis in high-stakes operational systems, prioritize solutions that provide explicit diagnostic justification. Your systems should expose evidence, competing hypotheses, and unresolved uncertainties rather than only a final answer. This approach supports accountability and lets human operators trust, contest, or defer diagnoses responsibly, especially when the evidence does not fully support clean closure.
Key insights
Accountable RCA requires explicit diagnostic justification artifacts and process-aware evaluation, not just fluent final answers.
Principles
- Evidence must ground narratives.
- Hypotheses compete via claim adjudication.
- Preserve conflicts and uncertainties.
Method
JustDiag orchestrates domain experts to decompose each hypothesis into verification claims, adjudicates each claim as supports, contradicts, or insufficient, and exports a structured diagnosis state rather than only a final report.
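As a rough illustration of what such a structured diagnosis state could look like, here is a minimal Python sketch. It is not JustDiag's actual schema: the class names, field names, and the sample incident are all assumptions made for illustration. Only the three verdict labels and the tracked categories (evidence, hypotheses, conflicts, subsequent checks) come from the summary.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum


class Verdict(str, Enum):
    """The three adjudication outcomes named in the summary."""
    SUPPORTS = "supports"
    CONTRADICTS = "contradicts"
    INSUFFICIENT = "insufficient"


@dataclass
class Claim:
    """A verification claim tied to the evidence that was checked (hypothetical)."""
    text: str
    evidence_ids: list[str] = field(default_factory=list)
    verdict: Verdict = Verdict.INSUFFICIENT


@dataclass
class Hypothesis:
    """A candidate root cause, decomposed into claims (hypothetical)."""
    name: str
    claims: list[Claim] = field(default_factory=list)


@dataclass
class DiagnosisState:
    """Exportable process state, not just a final report (hypothetical)."""
    evidence: dict[str, str] = field(default_factory=dict)
    hypotheses: list[Hypothesis] = field(default_factory=list)
    conflicts: list[str] = field(default_factory=list)
    next_checks: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        # Verdict subclasses str, so it serializes as its plain string value.
        return json.dumps(asdict(self), indent=2)


# Invented example incident, for illustration only.
state = DiagnosisState(
    evidence={"e1": "connection pool at 100% from 10:02", "e2": "no deploy in last 24h"},
    hypotheses=[
        Hypothesis("db_pool_exhaustion", [Claim("pool saturated before errors", ["e1"], Verdict.SUPPORTS)]),
        Hypothesis("bad_deploy", [Claim("a deploy preceded the incident", ["e2"], Verdict.CONTRADICTS)]),
    ],
    next_checks=["confirm which client held the connections"],
)
print(state.to_json())
```

Keeping each verdict attached to its claim and evidence identifiers is what lets a reviewer contest a single step without rereading a whole narrative report.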
In practice
- Implement explicit evidence linking.
- Use claim-level hypothesis adjudication.
- Design for calibrated non-closure.
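One way to read "calibrated non-closure" is as a closure rule that declines to name a root cause unless the claim-level verdicts warrant it. The sketch below shows one such rule. It is an assumption for illustration, not the rule JustDiag uses: the function name, the input shape, and the closure criterion are all hypothetical.

```python
def decide_closure(verdicts: dict[str, list[str]]) -> dict:
    """Decide whether a diagnosis may be closed (hypothetical rule).

    `verdicts` maps each hypothesis name to the verdicts of its claims,
    each one of "supports", "contradicts", or "insufficient".
    """
    # A hypothesis stays viable as long as none of its claims is contradicted.
    viable = [h for h, vs in verdicts.items() if "contradicts" not in vs]
    # A hypothesis is confirmed only if it has claims and all are supported.
    confirmed = [h for h in viable if verdicts[h] and all(v == "supports" for v in verdicts[h])]

    # Close only when exactly one hypothesis survives and it is fully supported.
    if len(viable) == 1 and confirmed == viable:
        return {"status": "closed", "root_cause": viable[0], "open_hypotheses": []}
    # Otherwise stay open and report what remains unresolved.
    return {"status": "open", "root_cause": None, "open_hypotheses": viable}


closed = decide_closure({
    "db_pool_exhaustion": ["supports", "supports"],
    "bad_deploy": ["contradicts", "supports"],
})
still_open = decide_closure({
    "db_pool_exhaustion": ["supports", "insufficient"],
    "bad_deploy": ["contradicts"],
})
print(closed["status"], still_open["status"])
```

Under this rule, a single surviving hypothesis with an unverified claim still yields an open diagnosis, which mirrors the trade-off of accepting fewer terminal completions in exchange for better-supported ones.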
Topics
- Root Cause Analysis
- Large Language Models
- Diagnostic Justification
- Accountability in AI
- Incident Response
- Process-Aware Evaluation
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Scientist, MLOps Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.