JustDiag!: A Diagnostic Justification Engine for Accountable Root Cause Analysis

2026-06-19 · Source: cs.SE updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Cloud Computing & IT Infrastructure · Depth: Expert, extended

Summary

JustDiag is a diagnostic justification engine designed for accountable Root Cause Analysis (RCA) in high-stakes operational systems. Developed by researchers from Peking University, the University of Edinburgh, and Beijing University of Posts and Telecommunications, it addresses a limitation of large language models: they produce fluent but unverified RCA reports. The system maintains an explicit process state that tracks evidence, findings, competing hypotheses, conflicts, and follow-up checks. Evaluated on 66 real-world incidents, JustDiag improved Outcome Score from 51.0 to 57.7 and Process Score from 44.0 to 50.5 relative to a matched control without diagnostic justification. Terminal completion fell slightly, from 65/66 to 62/66, a trade-off that reflects a preference for calibrated non-closure over premature resolution. The system uses Gemini 3 Flash for diagnosis and GPT-5.4 for evaluation.

Key takeaway

MLOps engineers deploying LLM-based Root Cause Analysis in high-stakes operational systems should prioritize solutions that provide explicit diagnostic justification. Such systems should expose evidence, competing hypotheses, and unresolved uncertainties rather than just a final answer. This supports accountability and lets human operators trust, contest, or defer a diagnosis responsibly, especially when the evidence does not fully support a clean closure.

Key insights

Accountable RCA requires explicit diagnostic justification artifacts and process-aware evaluation, not just fluent final answers.

Principles

Evidence must ground narratives.
Hypotheses compete via claim adjudication.
Preserve conflicts and uncertainties.

Method

JustDiag orchestrates domain experts to decompose each hypothesis into verification claims, adjudicates each claim as supports, contradicts, or insufficient, and exports a structured diagnosis state rather than only a final report.
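To make the idea of a structured diagnosis state more concrete, here is a minimal Python sketch. It is not JustDiag's actual schema or code: the class names (Verdict, Claim, Hypothesis, DiagnosisState), the field names, and the sample incident data are all hypothetical, chosen only to mirror the three verdicts and the tracked items (evidence, hypotheses, follow-up checks) described above.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import Dict, List


class Verdict(str, Enum):
    # The three adjudication outcomes named in the summary.
    SUPPORTS = "supports"
    CONTRADICTS = "contradicts"
    INSUFFICIENT = "insufficient"


@dataclass
class Claim:
    # Hypothetical shape: one verifiable statement tied to evidence ids.
    text: str
    verdict: Verdict
    evidence_ids: List[str] = field(default_factory=list)


@dataclass
class Hypothesis:
    name: str
    claims: List[Claim] = field(default_factory=list)

    def tally(self) -> Dict[str, int]:
        # Count claims per verdict for this hypothesis.
        counts = {v.value: 0 for v in Verdict}
        for claim in self.claims:
            counts[claim.verdict.value] += 1
        return counts


@dataclass
class DiagnosisState:
    evidence: Dict[str, str] = field(default_factory=dict)
    hypotheses: List[Hypothesis] = field(default_factory=list)
    next_checks: List[str] = field(default_factory=list)

    def ungrounded_claims(self) -> List[str]:
        # Decisive verdicts (supports/contradicts) that cite no evidence.
        return [
            c.text
            for h in self.hypotheses
            for c in h.claims
            if c.verdict is not Verdict.INSUFFICIENT and not c.evidence_ids
        ]

    def export(self) -> str:
        # Export the whole state, not just a final answer.
        return json.dumps(asdict(self), indent=2)


# Illustrative data only; not taken from the paper's incidents.
state = DiagnosisState(
    evidence={"e1": "p99 latency rose at 10:02", "e2": "deploy finished at 10:15"},
    hypotheses=[
        Hypothesis("bad deploy", [
            Claim("deploy preceded the latency rise", Verdict.CONTRADICTS, ["e1", "e2"]),
            Claim("error logs mention the new build", Verdict.INSUFFICIENT),
        ]),
        Hypothesis("db saturation", [
            Claim("latency rise is visible in metrics", Verdict.SUPPORTS, ["e1"]),
        ]),
    ],
    next_checks=["pull db connection pool metrics for 10:00-10:10"],
)
print(state.export())
```

Keeping verdicts at the claim level, rather than scoring a hypothesis as a whole, lets a reviewer see which specific statement ruled a hypothesis out and which evidence it relied on.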

In practice

Implement explicit evidence linking.
Use claim-level hypothesis adjudication.
Design for calibrated non-closure.
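One way to read "calibrated non-closure" is as a closure rule that refuses to name a root cause while hypotheses remain unverified or in competition. The Python sketch below illustrates that reading. The rule, the function name decide_closure, and the Decision type are assumptions made for illustration; the summary does not describe how JustDiag itself decides when to close.

```python
from typing import Dict, List, NamedTuple, Optional


class Decision(NamedTuple):
    status: str                 # "closed" or "open"
    root_cause: Optional[str]   # set only when status is "closed"
    reasons: List[str]          # why the diagnosis stays open


def decide_closure(tallies: Dict[str, Dict[str, int]]) -> Decision:
    """Close only when exactly one hypothesis is fully verified.

    `tallies` maps a hypothesis name to counts of claim verdicts
    ("supports", "contradicts", "insufficient"). Assumed rule, not
    the paper's: any contradicting claim rules a hypothesis out, and
    any unverified claim keeps the diagnosis open.
    """
    confirmed: List[str] = []
    reasons: List[str] = []
    for name, t in tallies.items():
        if t.get("contradicts", 0) > 0:
            continue  # ruled out
        if t.get("supports", 0) > 0 and t.get("insufficient", 0) == 0:
            confirmed.append(name)
        else:
            reasons.append(f"{name}: not yet verified")
    if len(confirmed) > 1:
        reasons.append("competing hypotheses remain: " + ", ".join(confirmed))
    if len(confirmed) == 1 and not reasons:
        return Decision("closed", confirmed[0], [])
    if not confirmed and not reasons:
        reasons.append("every hypothesis was contradicted")
    return Decision("open", None, reasons)


# Illustrative tallies only.
print(decide_closure({
    "db saturation": {"supports": 2, "contradicts": 0, "insufficient": 0},
    "bad deploy": {"supports": 0, "contradicts": 1, "insufficient": 0},
}))
print(decide_closure({
    "db saturation": {"supports": 1, "contradicts": 0, "insufficient": 1},
}))
```

Returning the reasons alongside an "open" status gives operators something to act on or contest, instead of a forced answer.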

Topics

Root Cause Analysis
Large Language Models
Diagnostic Justification
Accountability in AI
Incident Response
Process-Aware Evaluation

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Scientist, MLOps Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.