Bridging the Sim-to-Real Gap in Semiconductor Visual Program Synthesis via Input Binarization
Summary
The paper proposes a visual program synthesis framework for semiconductor inspection, where real training data is scarce and generated data must be geometrically accurate at the nanometer scale. A Vision-Language Model (VLM) converts inspection images into editable Domain-Specific Language (DSL) code, which gives precise control over the generated training data. Because the VLM is trained on synthetic DSL-rendered images, it faces a domain gap when it processes real Scanning Electron Microscope (SEM) images. To close that gap, the authors binarize the input, stripping SEM-specific texture and noise so the model can focus on geometric structure. On the MIIC dataset, binarized inputs raised the mean Dice coefficient from 0.4393 to 0.5256 over a raw-input baseline, indicating that simple texture abstraction measurably narrows the sim-to-real gap.
Key takeaway
If you build vision systems for semiconductor metrology and your model degrades on real Scanning Electron Microscope (SEM) images after training on synthetic data, try binarizing the inputs before they reach the Vision-Language Model. On the MIIC dataset this raised the mean Dice coefficient from 0.4393 to 0.5256 over a raw-input baseline, which suggests that removing SEM texture and noise lets the model focus on geometric structure and narrows the sim-to-real domain gap.
Key insights
Input binarization effectively bridges the sim-to-real gap for VLMs in semiconductor visual program synthesis by abstracting texture.
Principles
- Nanometer-scale accuracy is critical for metrology.
- Synthetic data generation requires geometric control.
- Texture abstraction mitigates sim-to-real gaps.
Method
A Vision-Language Model (VLM) converts inspection images into Domain-Specific Language (DSL) code. Input binarization strips SEM-specific texture and noise before VLM processing.
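As a rough illustration of the binarization step, the sketch below thresholds an 8-bit grayscale image with Otsu's method, implemented in plain NumPy. The summary does not say which thresholding rule the authors use, so Otsu's method, the function names `otsu_threshold` and `binarize`, and the synthetic striped test image are all assumptions made for this example.

```python
import numpy as np


def otsu_threshold(image: np.ndarray) -> int:
    """Return the gray level that maximizes between-class variance.

    Assumes `image` is an 8-bit grayscale array (dtype uint8).
    Otsu's method is an assumption here, not the paper's stated choice.
    """
    hist = np.bincount(image.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)

    weight_low = np.cumsum(prob)          # fraction of pixels <= t
    weight_high = 1.0 - weight_low        # fraction of pixels > t
    mean_cum = np.cumsum(prob * levels)   # cumulative first moment
    mean_total = mean_cum[-1]

    # Between-class variance for every candidate threshold t.
    with np.errstate(divide="ignore", invalid="ignore"):
        variance = (mean_total * weight_low - mean_cum) ** 2 / (
            weight_low * weight_high
        )
    # Thresholds that leave one class empty are not valid candidates.
    variance = np.nan_to_num(variance, nan=0.0, posinf=0.0, neginf=0.0)
    return int(np.argmax(variance))


def binarize(image: np.ndarray) -> np.ndarray:
    """Map an 8-bit grayscale image to a {0, 1} mask."""
    threshold = otsu_threshold(image)
    return (image > threshold).astype(np.uint8)


# Synthetic stand-in for an SEM image: bright lines on a dark
# background with additive Gaussian noise (hypothetical values).
rng = np.random.default_rng(0)
mask = np.zeros((64, 64), dtype=np.uint8)
mask[:, 16:32] = 1
mask[:, 48:64] = 1
noisy = np.where(mask == 1, 180.0, 60.0) + rng.normal(0.0, 10.0, mask.shape)
sem_like = np.clip(noisy, 0, 255).astype(np.uint8)

binary = binarize(sem_like)
```

A global threshold is the simplest option. Real SEM images often show uneven brightness and bright edge effects, so a local or adaptive threshold may be a better fit in practice.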
In practice
- Apply input binarization to SEM images.
- Use DSL for precise circuit geometry control.
- Evaluate with Dice coefficient for segmentation tasks.
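For reference, the Dice coefficient between a predicted mask P and a ground-truth mask G is 2|P ∩ G| / (|P| + |G|). The sketch below is a minimal NumPy version. The function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions for this example, and the summary does not describe how the paper rasterizes DSL output before scoring.

```python
import numpy as np


def dice_coefficient(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice overlap between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")

    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention).
        return 1.0
    return float(2.0 * intersection / total)


# Toy example: prediction covers two pixels, ground truth covers one.
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
score = dice_coefficient(pred, target)  # 2 * 1 / (2 + 1)
```

On this scale, the reported change from 0.4393 to 0.5256 is an absolute gain of 0.0863, or roughly 19.6% relative to the raw-input baseline.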
Topics
- Semiconductor Metrology
- Visual Program Synthesis
- Vision-Language Models
- Sim-to-Real Gap
- Input Binarization
- SEM Image Processing
Best for: AI Scientist, Research Scientist, Computer Vision Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.