Library-Aware Doubles and Iterative Repair for Large Language Model-Generated Unit Tests in OpenSIL Firmware
Summary
An automated unit test (UT) authoring workflow for the Advanced Micro Devices (AMD) Open-Source Silicon Initialization Library (openSIL) firmware reduces the manual effort of validating low-level C firmware changes. The LLM-guided multi-agent pipeline generates test scaffolds, creates or reuses library-aware stubs, mocks, and fakes, and runs an iterative compile-dispatch repair loop driven by build logs and line-coverage feedback. Evaluated on 76 functions under test (FUTs), the workflow generated compilable UTs for 73 of them (96.1%). On a 48-function subset, mean line coverage reached 98.8% with line-coverage guidance alone (LCA-only) and 94.7% when LCA was combined with vector-database (VDB) retrieval. The system uses GPT-4.1-mini, o4-mini, and o3 for different stages.
Key takeaway
If you are automating unit test generation for firmware in constrained C environments, prioritize iterative repair loops and coverage-guided refinement over single-pass LLM generation. Use retrieval-augmented generation to reuse existing test doubles, and enforce strict build constraints. In the reported evaluation, this approach yielded compilable tests for 96.1% of the functions under test and mean line coverage above 94%, which reduces manual debugging and helps keep the test suite consistent.
Key insights
Iterative LLM-guided repair and coverage feedback are crucial for generating robust unit tests in constrained firmware environments.
Principles
- Buildability precedes test logic in firmware UTs.
- Library reuse reduces linker errors and improves consistency.
- Coverage feedback guides targeted test refinement.
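As a rough illustration of the third principle, the sketch below turns an LCOV tracefile into a list of unexecuted lines that could be fed back into a refinement prompt. It relies only on the standard `SF:`, `DA:<line>,<hits>`, and `end_of_record` records of the tracefile format. The function names, the sample path `demo/fut.c`, and the hint wording are assumptions made for illustration, not the paper's implementation.

```python
def uncovered_lines(tracefile_text):
    """Map each source file in an LCOV tracefile to its zero-hit line numbers."""
    missing = {}
    current = None
    for raw in tracefile_text.splitlines():
        line = raw.strip()
        if line.startswith("SF:"):
            # Start of a per-file record.
            current = line[3:]
            missing.setdefault(current, [])
        elif line.startswith("DA:") and current is not None:
            # DA:<line number>,<execution count>[,<checksum>]
            fields = line[3:].split(",")
            lineno, hits = int(fields[0]), int(fields[1])
            if hits == 0:
                missing[current].append(lineno)
        elif line == "end_of_record":
            current = None
    return {path: sorted(nums) for path, nums in missing.items()}


def coverage_feedback(tracefile_text):
    """Render uncovered lines as short hints (the wording is illustrative)."""
    hints = []
    for path, lines in uncovered_lines(tracefile_text).items():
        if lines:
            joined = ", ".join(str(n) for n in lines)
            hints.append(f"{path}: add cases reaching lines {joined}")
    return hints


# Hypothetical tracefile for a single source file.
SAMPLE = """TN:
SF:demo/fut.c
DA:10,3
DA:11,0
DA:14,0
LF:3
LH:1
end_of_record
"""

hints = coverage_feedback(SAMPLE)
```

Reporting exact line numbers, rather than a single percentage, is what lets a follow-up generation step target the branches that existing tests miss.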
Method
The workflow uses a multi-agent LLM pipeline with retrieval-augmented generation, iterative compile-dispatch repair, and coverage-guided refinement to produce and improve unit tests.
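The control flow of such a loop can be sketched as follows. This is a minimal outline under stated assumptions, not the paper's pipeline: `build` stands in for compiling and running the test, `revise` stands in for the LLM agents, and `BuildResult`, the 0.95 coverage target, and the five-round limit are all hypothetical choices. The toy stand-ins at the bottom exist only to make the sketch runnable.

```python
from dataclasses import dataclass


@dataclass
class BuildResult:
    ok: bool          # did the test compile and link?
    log: str          # compiler or linker output
    coverage: float   # line coverage in [0, 1]; 0.0 when the build failed


def repair_loop(source, build, revise, target_coverage=0.95, max_rounds=5):
    """Rebuild and revise until the test builds and meets the coverage target.

    Returns (final_source, last_result, rounds_used).
    """
    result = build(source)
    rounds = 0
    while rounds < max_rounds:
        if result.ok and result.coverage >= target_coverage:
            break
        # Build errors take priority; coverage is only meaningful once it builds.
        if not result.ok:
            feedback = result.log
        else:
            feedback = f"coverage {result.coverage:.1%} below target"
        source = revise(source, feedback)
        result = build(source)
        rounds += 1
    return source, result, rounds


# Toy stand-ins (assumptions): a "build" that fails without an include and
# scores 50% coverage per TEST marker, and a "reviser" that reacts to feedback.
def toy_build(src):
    if "#include" not in src:
        return BuildResult(False, "error: missing include", 0.0)
    return BuildResult(True, "", min(1.0, 0.5 * src.count("TEST")))


def toy_revise(src, feedback):
    if "missing include" in feedback:
        return "#include <x.h>\n" + src
    return src + "\nTEST"


final_src, final_result, rounds_used = repair_loop("TEST", toy_build, toy_revise)
```

Handling build failures before coverage mirrors the principle that buildability precedes test logic, and the round limit keeps a non-converging test from looping indefinitely.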
In practice
- Implement a "do-not-redefine" list for symbols.
- Confine LLM edits to template-approved code blocks.
- Use LCOV for line-by-line coverage feedback.
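The first two practices can be enforced with simple pre-build checks. The sketch below is a heuristic illustration, not the paper's tooling: the regex only approximates C function definitions, and the marker comments, function names, and the `DemoReadReg`/`DemoWriteReg` symbols are hypothetical.

```python
import re

# Hypothetical markers; a real template would define its own.
BEGIN = "/* LLM-EDIT-BEGIN */"
END = "/* LLM-EDIT-END */"


def redefined_symbols(c_source, do_not_redefine):
    """Return deny-listed names that the C source appears to define.

    Heuristic: a definition starts a line with a return type, then the name,
    a parameter list, and an opening brace. Prototypes and calls are ignored.
    """
    if not do_not_redefine:
        return set()
    names = "|".join(re.escape(n) for n in sorted(do_not_redefine))
    pattern = re.compile(
        r"^\s*[A-Za-z_][\w\s\*]*?\b(" + names + r")\s*\([^;{}]*\)\s*\{",
        re.MULTILINE,
    )
    return {m.group(1) for m in pattern.finditer(c_source)}


def _skeleton(text):
    """Blank out everything between marker pairs, keeping the markers."""
    region = re.escape(BEGIN) + r".*?" + re.escape(END)
    return re.sub(region, lambda _m: BEGIN + END, text, flags=re.DOTALL)


def edits_confined(template, edited):
    """True if the edited file differs from the template only inside markers."""
    return _skeleton(template) == _skeleton(edited)


# Hypothetical generated test: one prototype, one call, one forbidden definition.
SAMPLE_C = """
int DemoReadReg(int addr);

void test_case(void) { DemoReadReg(4); }

int DemoWriteReg(int addr, int v)
{
    return 0;
}
"""

TEMPLATE = '#include "fut.h"\n' + BEGIN + "\n" + END + "\nint main(void) { return 0; }\n"
GOOD_EDIT = TEMPLATE.replace(BEGIN + "\n", BEGIN + "\nvoid test_a(void) {}\n")
BAD_EDIT = GOOD_EDIT.replace("fut.h", "other.h")

violations = redefined_symbols(SAMPLE_C, {"DemoReadReg", "DemoWriteReg"})
```

Rejecting a candidate before it reaches the compiler is cheaper than waiting for a duplicate-symbol linker error, and the skeleton comparison catches edits that stray outside the approved regions.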
Topics
- Automated Test Generation
- LLM Code Generation
- Firmware Testing
- openSIL
- Unit Testing
- Coverage-Guided Testing
Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, Software Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.