Hybrid Quantum-MambaVision: A Quantum-enhanced State Space Model for Calibrated Mixed-type Wafer Defect Detection

Source: cs.CV updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, long

Summary

Hybrid Quantum-MambaVision is an architecture for efficient multi-label wafer defect detection in semiconductor manufacturing. It targets two problems: extreme class imbalance and the computational cost of conventional models. The design combines a linear-complexity State-Space Model (SSM) backbone with a 4-qubit Parameterized Quantum Context Adapter (QCA) and Low-Rank Adaptation (LoRA). The Mamba backbone captures long-range spatial dependencies with $O(N)$ complexity, while the QCA maps compressed latent features into a high-dimensional Hilbert space to disentangle complex, overlapping defect signatures. On the highly imbalanced MixedWM38 dataset, the model achieved a mean Average Precision (mAP) of 0.99727 and reduced Maximum Calibration Error (MCE) to 0.5553, lowering expected false-positive costs relative to classical baselines such as ResNet-50 and Vision Transformers.
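
To make the calibration metric concrete, here is a minimal sketch of how Maximum Calibration Error can be computed for per-label probabilities. It assumes equal-width probability bins and treats each (wafer, defect label) pair as one binary prediction; the bin count, the binning scheme, and the function name `max_calibration_error` are illustrative assumptions, not details taken from the paper.

```python
def max_calibration_error(probs, labels, n_bins=10):
    """Largest gap between observed positive rate and mean predicted
    probability over all non-empty, equal-width probability bins.

    probs  : predicted probabilities in [0, 1], one per (sample, label) pair
    labels : matching ground-truth values, 0 or 1
    """
    # Each bin accumulates: [sum of probabilities, number of positives, count]
    bins = [[0.0, 0, 0] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx][0] += p
        bins[idx][1] += y
        bins[idx][2] += 1

    gaps = [
        abs(positives / count - prob_sum / count)
        for prob_sum, positives, count in bins
        if count > 0
    ]
    return max(gaps) if gaps else 0.0


if __name__ == "__main__":
    # Toy data: confident predictions that are slightly over/under-confident.
    print(max_calibration_error([0.9, 0.9, 0.1, 0.1], [1, 1, 0, 0]))
```

Because MCE reports the worst bin rather than an average, a single poorly calibrated confidence range dominates the score, which is why it is a natural fit when false positives in any range are costly.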

Key takeaway

For research scientists developing real-time anomaly detection systems in semiconductor manufacturing, Hybrid Quantum-MambaVision offers a scalable, well-calibrated approach. Consider integrating linear-time State-Space Models with quantum context adapters to overcome computational bottlenecks and improve model calibration, especially when dealing with highly imbalanced, multi-label datasets. This approach can reduce false-positive costs and improve the reliability of defect classification.

Key insights

A quantum-enhanced State-Space Model efficiently detects complex wafer defects and calibrates uncertainty in industrial vision.
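
The linear-complexity property of a State-Space Model can be illustrated with a small recurrence. The sketch below is a simplified, time-invariant diagonal SSM, $h_t = a \odot h_{t-1} + b\,x_t$, $y_t = c^\top h_t$, written in plain Python. It is not the MambaVision implementation: Mamba-style models use input-dependent (selective) parameters and hardware-aware scans, and the function name `ssm_scan` and its parameters are hypothetical.

```python
def ssm_scan(xs, a, b, c):
    """Run a diagonal linear state-space recurrence over a 1-D sequence.

    xs      : input sequence of floats (length N)
    a, b, c : per-state decay, input, and output coefficients (same length)

    Each step costs O(len(a)), so the whole scan is O(N) in sequence length,
    in contrast to the O(N^2) pairwise cost of full self-attention.
    """
    h = [0.0] * len(a)  # hidden state, one value per state dimension
    ys = []
    for x in xs:
        # State update: decay the previous state and mix in the new input.
        h = [a_i * h_i + b_i * x for a_i, h_i, b_i in zip(a, h, b)]
        # Readout: weighted sum of the hidden state.
        ys.append(sum(c_i * h_i for c_i, h_i in zip(c, h)))
    return ys


if __name__ == "__main__":
    # An impulse at t=0 decays geometrically through the state.
    print(ssm_scan([1.0, 0.0, 0.0], a=[0.5], b=[1.0], c=[1.0]))
```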

Method

The Hybrid Quantum-MambaVision architecture uses a MambaVision-T-1K backbone with LoRA for fine-tuning, and a 4-qubit QCA inserted between Stage 3 and Stage 4 to process compressed latent features.
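
As a rough illustration of what a 4-qubit parameterized adapter computes, the sketch below simulates a small circuit with a plain-Python state vector. The circuit layout is an assumption, since the summary does not specify it: four latent features are angle-encoded with RY rotations, a ring of CNOT gates entangles the qubits, a trainable RY layer follows, and Pauli-Z expectation values are read out. All function names, including `quantum_context_adapter`, are hypothetical, and a real implementation would more likely use a quantum simulation library with automatic differentiation.

```python
import math

N_QUBITS = 4


def apply_ry(state, qubit, theta):
    """Apply RY(theta) to one qubit of a real-valued state vector."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    out = list(state)
    mask = 1 << qubit
    for i in range(len(state)):
        if not i & mask:  # i has the target qubit in |0>
            j = i | mask  # same basis state with the qubit in |1>
            a0, a1 = state[i], state[j]
            out[i] = c * a0 - s * a1
            out[j] = s * a0 + c * a1
    return out


def apply_cnot(state, control, target):
    """Flip the target qubit wherever the control qubit is |1>."""
    out = list(state)
    cmask, tmask = 1 << control, 1 << target
    for i in range(len(state)):
        if i & cmask and not i & tmask:
            j = i | tmask
            out[i], out[j] = state[j], state[i]
    return out


def expect_z(state, qubit):
    """Pauli-Z expectation value of one qubit, in [-1, 1]."""
    mask = 1 << qubit
    return sum((-1 if i & mask else 1) * amp * amp for i, amp in enumerate(state))


def quantum_context_adapter(features, weights):
    """Map 4 compressed features to 4 expectation values (assumed layout)."""
    state = [0.0] * (2 ** N_QUBITS)
    state[0] = 1.0  # start in |0000>
    for q in range(N_QUBITS):  # angle encoding of the latent features
        state = apply_ry(state, q, features[q])
    for q in range(N_QUBITS):  # ring of CNOTs for entanglement
        state = apply_cnot(state, q, (q + 1) % N_QUBITS)
    for q in range(N_QUBITS):  # trainable rotation layer
        state = apply_ry(state, q, weights[q])
    return [expect_z(state, q) for q in range(N_QUBITS)]


if __name__ == "__main__":
    print(quantum_context_adapter([0.3, 1.2, -0.7, 2.0], [0.1, 0.2, 0.3, 0.4]))
```

In a hybrid model of this kind, the four outputs would typically be projected back to the backbone's channel width and combined with the classical features before the next stage; that wiring is likewise an assumption here.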

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.