SemiFA: An Agentic Multi-Modal Framework for Autonomous Semiconductor Failure Analysis Report Generation

Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, extended

Summary

SemiFA is an agentic multi-modal framework that autonomously generates structured semiconductor failure analysis (FA) reports from inspection images in under one minute. It runs a four-agent LangGraph pipeline: a DefectDescriber built on DINOv2 embeddings and LLaVA-1.6, a RootCauseAnalyzer that combines SECS/GEM equipment telemetry with historical defect records held in Qdrant, a SeverityClassifier, and a RecipeAdvisor. A fifth node assembles the final PDF report. The authors also introduce SemiFA-930, a new dataset of 930 annotated semiconductor defect images across nine classes with structured FA narratives. On 140 validation images from this dataset, the DINOv2-based classifier reaches 92.1% accuracy (macro F1 = 0.917). The full pipeline generates a report in 48 seconds on an NVIDIA A100-SXM4-40GB GPU, a 150-300x speedup over manual processes. Multi-modal fusion, particularly equipment telemetry, improves root cause reasoning quality by 0.86 composite points (1-5 scale) over an image-only baseline, as scored by a GPT-4o judge.
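
As a quick sanity check on the speedup figure, the arithmetic below back-computes the manual effort that a 150-300x speedup would imply. This is only a sketch: it assumes the speedup is measured against the 48-second pipeline time quoted above, and the function name is illustrative, not taken from the paper.

```python
# Figures quoted in the summary above.
PIPELINE_SECONDS = 48
SPEEDUP_RANGE = (150, 300)


def implied_manual_hours(pipeline_seconds: float, speedup: float) -> float:
    """Manual time (in hours) implied by a given speedup over the pipeline time."""
    return pipeline_seconds * speedup / 3600


# Assumption: the speedup is relative to the 48 s end-to-end pipeline time.
low, high = (implied_manual_hours(PIPELINE_SECONDS, s) for s in SPEEDUP_RANGE)
print(f"Implied manual effort: {low:.1f} to {high:.1f} hours per report")
# prints: Implied manual effort: 2.0 to 4.0 hours per report
```

Under that assumption, the quoted range corresponds to roughly two to four hours of manual work per report.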

Key takeaway

For Computer Vision Engineers developing industrial inspection systems, SemiFA shows that fusing multi-modal data, especially real-time equipment telemetry via SECS/GEM, measurably improves root cause analysis quality (by 0.86 points on a 1-5 scale over an image-only baseline), while an automated pipeline produces reports in under a minute. Consider adopting agentic architectures for complex, multi-step reasoning tasks, and prioritize collecting domain-specific datasets with rich, structured annotations to enable effective VLM fine-tuning.

Key insights

SemiFA automates semiconductor failure analysis report generation using a multi-modal agentic framework and a new dataset.

Method

SemiFA uses a five-node LangGraph pipeline: DefectDescriber (DINOv2, LLaVA-1.6), RootCauseAnalyzer (SECS/GEM, Qdrant), SeverityClassifier, RecipeAdvisor, and ReportGenerator, all sharing a LLaVA-1.6 model via a singleton registry.
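
The node ordering and the shared-model idea can be sketched in plain Python. This is a dependency-free illustration, not SemiFA's code and not LangGraph's API: the `ModelRegistry` class, the node functions, the state keys, and the placeholder model are all hypothetical names chosen for the example, and the "report" is plain text rather than a PDF.

```python
from typing import Callable, Dict, List

State = Dict[str, object]


class ModelRegistry:
    """Singleton-style registry: each named model is loaded once, then reused."""

    _models: Dict[str, Callable[[str], str]] = {}
    load_count = 0  # counts real loads, to show that sharing works

    @classmethod
    def get(cls, name: str, loader: Callable[[], Callable[[str], str]]):
        if name not in cls._models:
            cls._models[name] = loader()
            cls.load_count += 1
        return cls._models[name]


def load_placeholder_vlm() -> Callable[[str], str]:
    # Hypothetical stand-in for LLaVA-1.6: it just echoes the prompt.
    return lambda prompt: f"[vlm] {prompt}"


def shared_vlm() -> Callable[[str], str]:
    return ModelRegistry.get("llava-1.6", load_placeholder_vlm)


def defect_describer(state: State) -> State:
    return {"description": shared_vlm()(f"describe defect in {state['image_path']}")}


def root_cause_analyzer(state: State) -> State:
    # Telemetry and historical records would be fused into the prompt here.
    prompt = f"root cause for: {state['description']} | telemetry={state['telemetry']}"
    return {"root_cause": shared_vlm()(prompt)}


def severity_classifier(state: State) -> State:
    return {"severity": shared_vlm()(f"severity of: {state['root_cause']}")}


def recipe_advisor(state: State) -> State:
    return {"recipe_advice": shared_vlm()(f"recipe change for: {state['root_cause']}")}


def report_generator(state: State) -> State:
    # Assembles the earlier outputs; this sketch emits text, not a PDF.
    keys = ["description", "root_cause", "severity", "recipe_advice"]
    return {"report": "\n".join(f"{k}: {state[k]}" for k in keys)}


PIPELINE: List[Callable[[State], State]] = [
    defect_describer,
    root_cause_analyzer,
    severity_classifier,
    recipe_advisor,
    report_generator,
]


def run_pipeline(initial: State) -> State:
    """Run the nodes in order, merging each node's partial update into the state."""
    state = dict(initial)
    for node in PIPELINE:
        state.update(node(state))
    return state


result = run_pipeline({"image_path": "wafer_001.png", "telemetry": {"chamber_pressure": 1.2}})
print(result["report"])
print("model loads:", ModelRegistry.load_count)
```

Each node returns only the keys it produces and the runner merges them into a shared state, which mirrors the general pattern of graph-based agent pipelines. The registry ensures the placeholder model is constructed once even though four nodes request it.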

Best for: Computer Vision Engineer, AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.