SemiFA: An Agentic Multi-Modal Framework for Autonomous Semiconductor Failure Analysis Report Generation
Summary
SemiFA is an agentic multi-modal framework that autonomously generates structured semiconductor failure analysis (FA) reports from inspection images in under one minute. The system runs a four-agent LangGraph pipeline: a DefectDescriber using DINOv2 embeddings and LLaVA-1.6, a RootCauseAnalyzer that integrates SECS/GEM equipment telemetry and historical defect records stored in Qdrant, a SeverityClassifier, and a RecipeAdvisor. A fifth node assembles the final PDF report. The authors also introduce SemiFA-930, a new dataset of 930 annotated semiconductor defect images across nine classes, each with a structured FA narrative. On this dataset, the DINOv2-based classifier achieved 92.1% accuracy (macro F1 = 0.917) on 140 validation images. The full pipeline generates a report in 48 seconds on an NVIDIA A100-SXM4-40GB GPU, a 150-300x speedup over manual processes. Multi-modal fusion, particularly the addition of equipment telemetry, improved root cause reasoning quality by +0.86 composite points (on a 1-5 scale) over an image-only baseline, as scored by a GPT-4o judge.
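The speedup figure can be turned into an implied manual turnaround with simple arithmetic. The sketch below assumes the 150-300x range is measured against the 48-second pipeline time stated above; the function name is illustrative and does not come from the paper.

```python
# Back-of-the-envelope check: what manual turnaround does a 150-300x speedup imply?
# Assumption: the speedup is relative to the reported 48 s end-to-end pipeline time.
PIPELINE_SECONDS = 48        # reported report-generation time on the A100
SPEEDUP_RANGE = (150, 300)   # reported speedup range over manual processes


def implied_manual_hours(pipeline_seconds: float, speedup: float) -> float:
    """Return the manual time in hours implied by a given speedup factor."""
    return pipeline_seconds * speedup / 3600


low, high = (implied_manual_hours(PIPELINE_SECONDS, s) for s in SPEEDUP_RANGE)
print(f"Implied manual effort: {low:.1f} to {high:.1f} hours per report")
```

Under that assumption, the stated numbers correspond to roughly two to four hours of manual work per report.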
Key takeaway
For computer vision engineers building industrial inspection systems, SemiFA shows that fusing multi-modal data, especially real-time equipment telemetry delivered over SECS/GEM, improves root cause analysis quality (+0.86 composite points over an image-only baseline) while cutting report generation to under a minute. Consider agentic architectures for complex, multi-step reasoning tasks, and prioritize collecting domain-specific datasets with rich, structured annotations to enable effective VLM fine-tuning.
Key insights
SemiFA automates semiconductor failure analysis report generation using a multi-modal agentic framework and a new dataset.
Principles
- Decompose complex tasks into specialized, auditable sub-agents.
- Fuse diverse data modalities for robust reasoning.
- Build self-improving knowledge bases so that system accuracy increases over time.
Method
SemiFA uses a five-node LangGraph pipeline: DefectDescriber (DINOv2, LLaVA-1.6), RootCauseAnalyzer (SECS/GEM, Qdrant), SeverityClassifier, RecipeAdvisor, and ReportGenerator, all sharing a LLaVA-1.6 model via a singleton registry.
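The node layout and the shared-model idea can be illustrated with a minimal plain-Python sketch. This is not the authors' code and it deliberately avoids the LangGraph API: `ModelRegistry`, `StubVLM`, `FAState`, the prompts, and the sample inputs are all hypothetical names invented for illustration, and the stub stands in for the shared LLaVA-1.6 model.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


class ModelRegistry:
    """Singleton-style registry: each named model is built once, then reused."""
    _instances: Dict[str, Any] = {}

    @classmethod
    def get(cls, name: str, loader: Callable[[], Any]) -> Any:
        if name not in cls._instances:
            cls._instances[name] = loader()  # load only on first request
        return cls._instances[name]


class StubVLM:
    """Hypothetical stand-in for the shared vision-language model."""
    def generate(self, prompt: str) -> str:
        return f"[stub answer to: {prompt}]"


def shared_vlm() -> Any:
    return ModelRegistry.get("vlm", StubVLM)


@dataclass
class FAState:
    """State object passed from node to node (field names are assumptions)."""
    image_path: str
    telemetry: Dict[str, float] = field(default_factory=dict)
    outputs: Dict[str, str] = field(default_factory=dict)
    trace: List[str] = field(default_factory=list)


def make_node(name: str, prompt: str) -> Callable[[FAState], FAState]:
    """Build an agent node that queries the shared model and records its output."""
    def node(state: FAState) -> FAState:
        state.outputs[name] = shared_vlm().generate(prompt)
        state.trace.append(name)
        return state
    return node


def report_generator(state: FAState) -> FAState:
    """Fifth node: assemble the agents' outputs (plain text here, not a PDF)."""
    body = "\n".join(f"{k}: {v}" for k, v in state.outputs.items())
    state.outputs["ReportGenerator"] = body
    state.trace.append("ReportGenerator")
    return state


PIPELINE = [
    make_node("DefectDescriber", "Describe the defect."),
    make_node("RootCauseAnalyzer", "Propose a root cause."),
    make_node("SeverityClassifier", "Rate the severity."),
    make_node("RecipeAdvisor", "Suggest a recipe change."),
    report_generator,
]


def run_pipeline(state: FAState) -> FAState:
    for node in PIPELINE:  # fixed linear order, one node after another
        state = node(state)
    return state


final = run_pipeline(FAState("wafer_001.png", {"chamber_pressure": 1.2}))
print(final.trace)
```

Keeping each agent as a separate function over one shared state makes every intermediate output inspectable, and the registry ensures that the four agents reuse one loaded model instead of loading it four times.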
In practice
- Utilize DINOv2 for efficient visual feature extraction.
- Integrate SECS/GEM for critical equipment state data.
- Employ vector databases like Qdrant for historical defect retrieval.
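The retrieval step behind these suggestions can be sketched as nearest-neighbor search over embeddings. The example below is an in-memory stand-in, not the Qdrant client API and not the authors' implementation: the three-dimensional vectors and the `case-A` style labels are toy data invented for illustration, where a real system would store image embeddings such as DINOv2 features in the vector database.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy "historical defect records": (label, embedding). Values are made up.
HISTORY: List[Tuple[str, List[float]]] = [
    ("case-A", [0.9, 0.1, 0.0]),
    ("case-B", [0.1, 0.9, 0.1]),
    ("case-C", [0.0, 0.2, 0.9]),
]


def top_k(query: Sequence[float],
          records: List[Tuple[str, List[float]]],
          k: int = 2) -> List[Tuple[float, str]]:
    """Return the k records most similar to the query, best match first."""
    scored = [(cosine_similarity(query, emb), label) for label, emb in records]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


query = [0.8, 0.2, 0.1]  # embedding of a new inspection image (toy values)
matches = top_k(query, HISTORY)
for score, label in matches:
    print(f"{score:.3f}  {label}")
```

In a deployed system the retrieved records would be handed to the root cause agent as context; a vector database replaces the linear scan here with an index that scales to large histories.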
Topics
- Semiconductor Failure Analysis
- Agentic AI Frameworks
- Multi-modal Fusion
- SECS/GEM Telemetry
- SemiFA-930 Dataset
Code references
- Shivamckaushik/SemiFA
- langchain-ai/langgraph
- crewaiinc/crewAI
- Junliangwangdhu/WaferMap
- langchain-ai/langchain
Best for: Computer Vision Engineer, AI Scientist, Machine Learning Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.