Bridging the Sim-to-Real Gap in Semiconductor Visual Program Synthesis via Input Binarization

2026-06-01 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Quality Control & Standards · Depth: Expert, quick

Summary

The paper proposes a visual program synthesis framework for semiconductor inspection, where real training data is scarce and generated data must be geometrically accurate at the nanometer scale. A Vision-Language Model (VLM) converts inspection images into editable Domain-Specific Language (DSL) code, which gives precise control over the training data that is generated. Because the VLM is trained on synthetic, DSL-rendered images, it faces a domain gap when it processes real Scanning Electron Microscope (SEM) images. The authors address this with input binarization, which strips SEM-specific texture and noise so the model can focus on geometric structure. On the MIIC dataset, binarized inputs raised the mean Dice coefficient from 0.4393 to 0.5256 over a raw-input baseline, an absolute gain of about 0.086, which suggests that simple texture abstraction mitigates a meaningful part of the sim-to-real gap.

Key takeaway

If you build vision systems for semiconductor metrology and your models, trained on synthetic data, struggle with the texture and noise of real Scanning Electron Microscope (SEM) images, consider binarizing the inputs. On the MIIC dataset this raised the mean Dice coefficient from 0.4393 to 0.5256 over a raw-input baseline. Removing texture lets the Vision-Language Model attend to the geometric structure that metrology depends on, which narrows the sim-to-real domain gap.

Key insights

Input binarization effectively bridges the sim-to-real gap for VLMs in semiconductor visual program synthesis by abstracting texture.

Principles

Nanometer-scale accuracy is critical for metrology.
Synthetic data generation requires geometric control.
Texture abstraction mitigates sim-to-real gaps.

Method

A Vision-Language Model (VLM) converts inspection images into Domain-Specific Language (DSL) code. Input binarization strips SEM-specific texture and noise before VLM processing.
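The summary does not say which thresholding rule the authors use, so the following is only a minimal sketch of what a binarization step could look like. It assumes an 8-bit grayscale image stored as a 2-D list and a single global Otsu threshold; both the choice of Otsu and the function names `otsu_threshold` and `binarize` are illustrative assumptions, not details from the paper.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0-255) that maximizes between-class variance.

    Assumption: Otsu's method stands in for whatever rule the paper uses.
    """
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of gray levels at or below the candidate threshold
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the variance is undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(image):
    """Map a 2-D list of 8-bit gray levels to a 0/1 mask of the same shape."""
    flat = [value for row in image for value in row]
    t = otsu_threshold(flat)
    return [[1 if value > t else 0 for value in row] for row in image]
```

The resulting mask keeps only the foreground/background layout, which is the texture-free view the VLM would receive in place of the raw SEM image. The sketch uses plain lists to stay dependency-free; an array library would be the usual choice for full-size images.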

In practice

Apply input binarization to SEM images.
Use DSL for precise circuit geometry control.
Evaluate with the Dice coefficient for segmentation tasks.
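For reference, the Dice coefficient between a predicted mask P and a ground-truth mask G is 2|P ∩ G| / (|P| + |G|). A minimal sketch follows, assuming both masks are 2-D lists of 0/1 values with the same shape. The function names and the convention of returning 1.0 when both masks are empty are assumptions made here, not details from the source.

```python
def dice_coefficient(pred, target):
    """Dice overlap between two binary masks given as 2-D lists of 0/1."""
    pred_flat = [bool(v) for row in pred for v in row]
    target_flat = [bool(v) for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must contain the same number of pixels")

    intersection = sum(p and t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


def mean_dice(pairs):
    """Average the Dice coefficient over (pred, target) mask pairs."""
    scores = [dice_coefficient(pred, target) for pred, target in pairs]
    return sum(scores) / len(scores)
```

For example, a prediction of `[[1, 1, 0, 0]]` against a target of `[[1, 0, 0, 0]]` shares one foreground pixel out of three in total, giving 2 · 1 / 3 ≈ 0.667.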

Topics

Semiconductor Metrology
Visual Program Synthesis
Vision-Language Models
Sim-to-Real Gap
Input Binarization
SEM Image Processing

Best for: AI Scientist, Research Scientist, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.