Hierarchical Fault Detection and Diagnosis for Transformer Architectures

Source: cs.SE updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Software Development & Engineering) · Depth: Expert, extended

Summary

DEFault++, a hierarchical learning-based diagnostic technique, automates fault detection, categorization, and root-cause diagnosis for Transformer architectures. It identifies whether a fault is present, classifies it into one of 12 transformer-specific fault categories, and pinpoints the underlying root cause from up to 45 mechanisms. To facilitate training and evaluation, the researchers developed DEForm, a mutation technique, and constructed DEFault-bench, a benchmark of 3,739 labeled instances across seven transformer models and nine downstream tasks. DEFault++ measures runtime behavior at the component level, organizes data via a Fault Propagation Graph (FPG), and uses prototype matching with supervised contrastive learning. It achieves an AUROC over 0.96 for detection and a Macro-F1 over 0.85 for categorization and diagnosis. A developer study showed repair action accuracy increased from 57.1% to 83.3% with DEFault++ assistance.

Key takeaway

For machine learning engineers debugging Transformer models, DEFault++ is a tool for identifying elusive, silent faults. Teams can adopt its hierarchical diagnostic approach to move beyond generic DNN fault detection and pinpoint transformer-specific component issues such as QKV projection or masking faults. In the developer study, repair action accuracy rose by 26.2 percentage points (from 57.1% to 83.3%) with DEFault++ assistance, which may reduce debugging time and improve model reliability.
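The 26.2 figure is the absolute difference between the two accuracies reported in the summary (57.1% and 83.3%), so it is a gain in percentage points rather than a relative increase. The short calculation below shows both readings; the variable names are illustrative only.

```python
# Repair action accuracies reported in the summary's developer study (percent).
baseline_accuracy = 57.1   # without DEFault++ assistance
assisted_accuracy = 83.3   # with DEFault++ assistance

# Absolute gain, measured in percentage points.
absolute_gain_points = assisted_accuracy - baseline_accuracy

# Relative gain, measured as a percentage of the baseline accuracy.
relative_gain_percent = 100.0 * absolute_gain_points / baseline_accuracy

print(f"absolute gain: {absolute_gain_points:.1f} percentage points")
print(f"relative gain: {relative_gain_percent:.1f}%")
```

Relative to the 57.1% baseline, the same improvement works out to roughly 45.9%.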

Key insights

DEFault++ offers hierarchical, component-level fault diagnosis for Transformers using a Fault Propagation Graph.
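To make the idea of a Fault Propagation Graph more concrete, here is a minimal sketch. It assumes the FPG is a directed graph whose edges follow the data flow between Transformer components, so that an anomaly in one component can explain anomalies further downstream. The component names, the edges, and the helper `upstream_most_anomalies` are all hypothetical; the summary does not describe how DEFault++ actually builds or uses its FPG.

```python
from collections import deque

# Hypothetical component names and edges (assumption: not taken from the paper).
FPG_EDGES = {
    "embedding": ["qkv_projection"],
    "qkv_projection": ["attention_scores"],
    "attention_mask": ["attention_scores"],
    "attention_scores": ["attention_output"],
    "attention_output": ["feed_forward"],
    "feed_forward": ["layer_norm"],
    "layer_norm": [],
}

def downstream(graph, start):
    """Return every component reachable from `start`, excluding `start` itself."""
    seen = set()
    queue = deque(graph.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph.get(node, []))
    return seen

def upstream_most_anomalies(graph, anomalous):
    """Keep anomalous components that are not downstream of another anomalous one."""
    anomalous = set(anomalous)
    explained = set()
    for node in anomalous:
        # Anomalies reachable from `node` could be symptoms of a fault at `node`.
        explained |= downstream(graph, node) & anomalous
    return sorted(anomalous - explained)

# Components whose runtime measurements look abnormal (toy input).
flagged = {"qkv_projection", "attention_scores", "feed_forward"}
print(upstream_most_anomalies(FPG_EDGES, flagged))
```

In this toy graph the two downstream anomalies are reachable from `qkv_projection`, so only that component is kept as a candidate origin. A learned diagnoser would use the graph structure in a richer way than this reachability filter.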

Method

DEFault++ performs three-level diagnosis: fault detection, categorization into 12 types, and root-cause identification from 45 mechanisms. It uses component-level runtime measurements, structured by an FPG, and employs prototype matching with supervised contrastive learning.
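The three-level flow can be sketched as a cascade: detect first, then match the instance against category prototypes, then against the root-cause prototypes of the chosen category. The sketch below is a simplified illustration under stated assumptions, not the authors' implementation. It assumes each model instance has already been encoded into a fixed-length embedding (for example by an encoder trained with supervised contrastive learning, which is not shown), uses cosine similarity for matching, and stands in a toy norm threshold for the detector. The root-cause labels and all vectors are invented.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def nearest_prototype(embedding, prototypes):
    """Return the label whose prototype is most similar to `embedding`."""
    return max(prototypes, key=lambda label: cosine(embedding, prototypes[label]))

def diagnose(embedding, is_faulty, category_prototypes, root_cause_prototypes):
    # Level 1: fault detection.
    if not is_faulty(embedding):
        return {"faulty": False}
    # Level 2: fault categorization by prototype matching.
    category = nearest_prototype(embedding, category_prototypes)
    # Level 3: root-cause diagnosis, restricted to the chosen category.
    root_cause = nearest_prototype(embedding, root_cause_prototypes[category])
    return {"faulty": True, "category": category, "root_cause": root_cause}

# Toy prototypes (assumption: two of the 12 categories, invented root causes).
category_prototypes = {
    "qkv_projection": [1.0, 0.0, 0.0],
    "masking": [0.0, 1.0, 0.0],
}
root_cause_prototypes = {
    "qkv_projection": {"hypothetical_cause_a": [1.0, 0.2, 0.0],
                       "hypothetical_cause_b": [1.0, -0.2, 0.0]},
    "masking": {"hypothetical_cause_c": [0.1, 1.0, 0.0],
                "hypothetical_cause_d": [-0.1, 1.0, 0.0]},
}

def toy_detector(embedding):
    """Stand-in detector: flags embeddings with a large norm (assumption)."""
    return math.sqrt(sum(x * x for x in embedding)) > 0.5

print(diagnose([0.9, 0.3, 0.0], toy_detector, category_prototypes, root_cause_prototypes))
print(diagnose([0.1, 0.1, 0.0], toy_detector, category_prototypes, root_cause_prototypes))
```

Restricting the third level to the prototypes of the predicted category is what makes the cascade hierarchical: a root cause is only considered if its parent category was selected.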

Best for: AI Engineer, NLP Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, Software Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.