Structured Testbench Generation for LLM-Driven HDL Design and Verification-Oriented Data Curation
Summary
The STG (Structured Testbench Generation) framework addresses a bottleneck in LLM-driven Register Transfer Level (RTL) workflows: automated testbench generation. Rather than prompting an LLM, STG exploits the structure of the hardware design to generate testbenches deterministically. As a direct verification tool, STG runs 720× faster than iterative LLM-based generation, achieves higher coverage, and produces fewer false-pass verdicts on incorrect Designs Under Test (DUTs). As a data curation engine, it runs 10.6× faster than LLM-based filtering on a single CPU core while using 127× less energy, enabling the training of state-of-the-art distilled models. In test-time scaling scenarios, STG reduces the search node count by 14–47%. An evaluation on 156 VerilogEval problems also identified a systematic race condition in the benchmark's testbenches, which was corrected. Models are available at https://huggingface.co/collections/AS-SiliconMind/siliconmind-v12.
Key takeaway
AI Architects and ML Engineers building LLM-driven hardware design workflows should integrate structured verification tools like STG. The framework generates testbenches 720× faster than iterative LLM-based generation and uses 127× less energy for data curation than LLM-based filtering. Adopting STG can reduce false-pass verdicts, accelerate iterative design refinement, and produce stronger distilled RTL generation models with simpler training pipelines.
Key insights
Structured testbench generation significantly outperforms stochastic LLM-based methods for RTL verification and data curation.
Principles
- Hardware verification benefits from structured, deterministic approaches.
- Stochastic LLM testbenches can yield unreliable pass/fail verdicts.
- Verification quality is critical for LLM-driven design iteration.
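To make the false-pass risk concrete, here is a minimal Python sketch. It is not taken from the paper: the 4-bit adder, the injected bug, and the hand-picked sample vectors are all hypothetical, and Python functions stand in for a simulated DUT. It shows how a small sampled stimulus set can pass an incorrect design that exhaustive stimulus rejects.

```python
from itertools import product

def reference_adder(a: int, b: int) -> int:
    """Golden model: 4-bit sum with the carry dropped."""
    return (a + b) & 0xF

def buggy_adder(a: int, b: int) -> int:
    """Hypothetical faulty DUT: wrong only when both operands have bit 3 set."""
    if (a & 0x8) and (b & 0x8):
        return (a + b + 1) & 0xF
    return (a + b) & 0xF

def passes(dut, vectors) -> bool:
    """A testbench verdict: True if the DUT matches the reference on every vector."""
    return all(dut(a, b) == reference_adder(a, b) for a, b in vectors)

# A small sampled stimulus set (hand-picked here; it never sets bit 3 on both operands).
sampled = [(0, 0), (1, 2), (3, 4), (7, 7), (9, 5)]

# Exhaustive stimulus: all 16 x 16 input combinations of a 4-bit adder.
exhaustive = list(product(range(16), repeat=2))

false_pass = passes(buggy_adder, sampled)        # the bug goes unnoticed
caught = not passes(buggy_adder, exhaustive)     # the bug is exposed
```

With these vectors, the sampled set never exercises the faulty corner, so the incorrect design is accepted, while the exhaustive sweep reaches inputs such as (8, 8) and rejects it.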
Method
STG analyzes HDL structure, classifies designs (combinational, sequential, FSM-dominated), and uses template-based rendering with specific stimulus strategies (exhaustive, two-pass, FSM traversal) to generate deterministic testbenches.
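The classify-then-select flow described above might be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not STG's implementation: the regular expressions are a crude stand-in for real HDL structure analysis, the names `classify_design`, `STRATEGY`, and `exhaustive_vectors` are hypothetical, and the pairing of each design class with a stimulus strategy simply follows the order in which the summary lists them.

```python
import re
from itertools import product

def classify_design(verilog_src: str) -> str:
    """Heuristically classify a Verilog module (assumed heuristic, not STG's analysis)."""
    # An edge-triggered always block suggests sequential logic.
    clocked = re.search(
        r"always(_ff)?\s*@\s*\(\s*(posedge|negedge)", verilog_src
    ) is not None
    # A case statement over a signal whose name contains "state" suggests an FSM.
    state_case = re.search(
        r"\bcase\s*\(\s*\w*state\w*\s*\)", verilog_src
    ) is not None
    if clocked and state_case:
        return "fsm"
    if clocked:
        return "sequential"
    return "combinational"

# Assumed one-to-one pairing of design class and stimulus strategy.
STRATEGY = {
    "combinational": "exhaustive",
    "sequential": "two-pass",
    "fsm": "fsm-traversal",
}

def exhaustive_vectors(input_widths: dict[str, int]) -> list[dict[str, int]]:
    """Enumerate every input combination for the given port widths, in a fixed order."""
    names = list(input_widths)
    ranges = [range(2 ** input_widths[name]) for name in names]
    return [dict(zip(names, values)) for values in product(*ranges)]

src = "module m(input a, input [1:0] b, output y); assign y = a & b[0]; endmodule"
kind = classify_design(src)                      # "combinational"
strategy = STRATEGY[kind]                        # "exhaustive"
vectors = exhaustive_vectors({"a": 1, "b": 2})   # 2 * 4 = 8 input combinations
```

Because the vectors are enumerated in a fixed order rather than sampled, the same design always yields the same stimulus, which is the property that makes the resulting verdicts reproducible. Exhaustive enumeration only scales to narrow input widths, which is presumably why other strategies are used for sequential and FSM-dominated designs.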
In practice
- Use STG to identify and fix benchmark testbench flaws.
- Filter large RTL datasets for model distillation efficiently.
- Improve LLM-guided design search efficiency.
Topics
- Structured Testbench Generation
- LLM-driven HDL Design
- RTL Verification
- Data Curation
- Hardware Functional Verification
- VerilogEval Benchmark
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.