Structured Testbench Generation for LLM-Driven HDL Design and Verification-Oriented Data Curation

2026-06-12 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Expert, extended

Summary

Structured Testbench Generation (STG) addresses a bottleneck in LLM-driven Register Transfer Level (RTL) workflows: automated testbench generation. Rather than prompting an LLM to write a testbench, STG exploits the structure of the hardware design to generate testbenches deterministically. As a direct verification tool, STG runs 720× faster than iterative LLM-based generation, achieves higher coverage, and produces fewer false-pass verdicts on incorrect Designs Under Test (DUTs). As a data curation engine, it runs 10.6× faster on a single CPU core and uses 127× less energy than LLM-based filtering, which enables the training of state-of-the-art distilled models. In test-time scaling scenarios, STG reduces search node count by 14-47%. Evaluated on 156 VerilogEval problems, the framework also identified and corrected a systematic race condition in the benchmark's testbenches. Models are available at https://huggingface.co/collections/AS-SiliconMind/siliconmind-v12.

Key takeaway

AI architects and ML engineers building LLM-driven hardware design workflows should integrate structured verification tools like STG. The framework generates testbenches 720× faster than iterative LLM-based generation and uses 127× less energy than LLM-based filtering for data curation. Adopting it can reduce false-pass verdicts, accelerate iterative design refinement, and produce stronger distilled RTL generation models with simpler training pipelines.

Key insights

Structured testbench generation significantly outperforms stochastic LLM-based methods for RTL verification and data curation.

Principles

Hardware verification benefits from structured, deterministic approaches.
Stochastic LLM testbenches can yield unreliable pass/fail verdicts.
Verification quality is critical for LLM-driven design iteration.
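
To make the false-pass risk concrete, here is a minimal Python sketch, not taken from the paper. It models a 4-bit adder as a plain function, injects a hypothetical bug that appears for only one input pair, and compares a sparse vector list (standing in for an under-covering testbench) with exhaustive enumeration. The names `reference_adder`, `buggy_adder`, and `passes`, and the bug itself, are illustrative assumptions.

```python
from itertools import product


def reference_adder(a, b):
    """Golden model: the full sum of two 4-bit inputs (fits in 5 bits)."""
    return a + b


def buggy_adder(a, b):
    """Hypothetical faulty DUT: drops the carry only when both inputs are 15."""
    if a == 15 and b == 15:
        return (a + b) & 0xF
    return a + b


def passes(dut, vectors):
    """Return True if the DUT matches the golden model on every vector."""
    return all(dut(a, b) == reference_adder(a, b) for a, b in vectors)


# A sparse testbench that happens to miss the failing corner case.
sparse_vectors = [(0, 0), (1, 1), (7, 8), (15, 0)]

# Exhaustive stimulus: all 256 combinations of two 4-bit inputs.
exhaustive_vectors = list(product(range(16), repeat=2))

print(passes(buggy_adder, sparse_vectors))      # True  (a false pass)
print(passes(buggy_adder, exhaustive_vectors))  # False (bug detected)
```

The sparse run reports a pass on an incorrect design. In an LLM-driven refinement loop, a verdict like that would end iteration early or admit a bad sample into a training set.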

Method

STG analyzes HDL structure, classifies designs (combinational, sequential, FSM-dominated), and uses template-based rendering with specific stimulus strategies (exhaustive, two-pass, FSM traversal) to generate deterministic testbenches.
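
The classify-then-select flow described above can be sketched roughly as follows. This is an illustrative Python approximation, not STG's implementation: the regex heuristics, the `STRATEGY` table, and the function names `classify` and `exhaustive_vectors` are all assumptions, and a real tool would presumably work from a parsed design rather than from text patterns.

```python
import re
from itertools import product

# Assumed mapping from design class to stimulus strategy, using the three
# classes and three strategies named in the summary.
STRATEGY = {
    "combinational": "exhaustive",
    "sequential": "two-pass",
    "fsm": "fsm-traversal",
}


def classify(verilog_src):
    """Crude regex stand-in for structural analysis (an assumption)."""
    clocked = (
        re.search(r"always\s*@\s*\(\s*(posedge|negedge)\b", verilog_src)
        or re.search(r"\balways_ff\b", verilog_src)
    )
    if not clocked:
        return "combinational"
    # Heuristic: a case statement over a signal whose name contains "state"
    # is treated as a marker of an FSM-dominated design.
    if re.search(r"\bcase\s*\(\s*\w*state\w*\s*\)", verilog_src):
        return "fsm"
    return "sequential"


def exhaustive_vectors(input_widths):
    """Yield every input assignment for a combinational design.

    input_widths maps each input port name to its bit width.
    """
    names = list(input_widths)
    ranges = [range(1 << input_widths[name]) for name in names]
    for values in product(*ranges):
        yield dict(zip(names, values))


mux = "module mux(input a, b, sel, output y); assign y = sel ? b : a; endmodule"
counter = (
    "module cnt(input clk, output reg [3:0] q); "
    "always @(posedge clk) q <= q + 1; endmodule"
)

print(classify(mux), STRATEGY[classify(mux)])
print(classify(counter), STRATEGY[classify(counter)])
print(len(list(exhaustive_vectors({"a": 1, "b": 1, "sel": 1}))))
```

Because every step is a pure function of the source text, the same design always yields the same stimulus. That determinism is the property the summary contrasts with stochastic, prompt-based generation.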

In practice

Use STG to identify and fix benchmark testbench flaws.
Filter large RTL datasets for model distillation efficiently.
Improve LLM-guided design search efficiency.

Topics

Structured Testbench Generation
LLM-driven HDL Design
RTL Verification
Data Curation
Hardware Functional Verification
VerilogEval Benchmark

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.