ChipMATE: Multi-Agent Training via Reinforcement Learning for Enhanced RTL Generation
Summary
ChipMATE is the first self-trained multi-agent framework for Register-Transfer Level (RTL) code generation, designed to close gaps between existing systems and industrial practice. Unlike existing API-based systems, ChipMATE needs no golden testbench at generation time, does not rely on closed-source APIs, and can be trained on proprietary RTL codebases. It pairs a Verilog agent with a Python reference-model agent; the two verify each other's outputs, mimicking industrial cross-comparison. A backtrack-based inference workflow prevents error propagation, and a two-stage training pipeline first trains each agent individually to maximize code-generation capability, then trains the team jointly for effective collaboration. To support training, a hybrid data-generation framework produced 64.4K high-quality reference-model training samples. ChipMATE achieves 75.0% and 80.1% pass@1 on VerilogEval V2 with 4B and 9B base models, respectively, surpassing all existing self-trained models and even the 1600B DeepSeek V4.
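The summary does not say how the pass@1 figures were computed. As a point of reference, a common convention for code-generation benchmarks is the unbiased pass@k estimator, which for k = 1 reduces to the fraction of correct samples per problem, averaged over problems. The sketch below assumes that convention; the function names are illustrative and not taken from the paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that passed, k: evaluation budget.
    Assumed convention, not confirmed by the summary.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the chance that a random
    # size-k subset of the n samples contains no passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int = 1) -> float:
    """Average pass@k over problems; results holds one (n, c) pair per problem."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With k = 1, `pass_at_k(n, c, 1)` is simply c / n, so a benchmark score is the mean per-problem success rate.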
Key takeaway
For research scientists developing RTL code generation systems, ChipMATE shows one way to work within industrial constraints: no golden testbench, no closed-source APIs, and training on proprietary data. Consider multi-agent architectures with mutual verification and a two-stage training pipeline to improve code correctness and adaptability to proprietary data, potentially outperforming larger, single-turn models.
Key insights
ChipMATE is a self-trained multi-agent framework for RTL generation using mutual verification without a golden oracle.
Principles
- Mutual verification enhances correctness.
- Backtracking prevents error propagation.
- Two-stage training optimizes agent collaboration.
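The mutual-verification principle can be illustrated with a minimal sketch: run the same stimuli through a reference model and a design under test, then report disagreements. Everything here is an assumption for illustration. The 8-bit adder, the function names, and the use of a plain Python callable as a stand-in for simulated RTL output are hypothetical and not ChipMATE's implementation.

```python
import random
from typing import Callable, Sequence

Stimulus = tuple[int, int]
Model = Callable[[int, int], int]
Mismatch = tuple[int, int, int, int]  # (a, b, dut_output, ref_output)


def ref_adder(a: int, b: int) -> int:
    """Hypothetical Python reference model: 8-bit adder that wraps on overflow."""
    return (a + b) & 0xFF


def dut_adder_buggy(a: int, b: int) -> int:
    """Stand-in for simulated RTL output; it saturates instead of wrapping."""
    return min(a + b, 0xFF)


def random_stimuli(count: int, seed: int = 0) -> list[Stimulus]:
    """Reproducible random 8-bit input pairs."""
    rng = random.Random(seed)
    return [(rng.randrange(256), rng.randrange(256)) for _ in range(count)]


def cross_check(dut: Model, ref: Model, stimuli: Sequence[Stimulus]) -> list[Mismatch]:
    """Return every stimulus on which the two models disagree."""
    mismatches: list[Mismatch] = []
    for a, b in stimuli:
        got, expected = dut(a, b), ref(a, b)
        if got != expected:
            mismatches.append((a, b, got, expected))
    return mismatches
```

A mismatch only says that the two sides disagree. Neither side is a golden oracle, so either the design or the reference model may be the one at fault.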
Method
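ChipMATE employs a backtrack-based inference workflow and a two-stage training pipeline: each agent is first trained individually to maximize its code-generation capability, then the team is trained jointly for collaboration. Training is supported by 64.4K reference-model samples produced by a hybrid data-generation framework.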
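The summary does not describe how backtracking is implemented. One plausible reading, sketched below under that assumption, is a staged pipeline in which a stage that keeps failing its check causes the previous stage's artifact to be discarded and regenerated, so an early mistake is not carried forward. The stage and check signatures, the attempt budget, and the demo stages are all hypothetical.

```python
from typing import Callable, Optional

# A stage maps (accepted artifacts so far, attempt number) to a new artifact.
Stage = Callable[[list[str], int], str]
# A check validates the chain of artifacts, including the newest candidate.
Check = Callable[[list[str]], bool]


def run_with_backtracking(
    stages: list[Stage], check: Check, max_attempts: int = 3
) -> Optional[list[str]]:
    """Run stages in order; on repeated failure, step back one stage."""
    accepted: list[str] = []
    attempts = [0] * len(stages)
    i = 0
    while i < len(stages):
        if attempts[i] >= max_attempts:
            if i == 0:
                return None  # the first stage ran out of budget
            # Backtrack: give this stage a fresh budget and
            # regenerate the previous stage's artifact.
            attempts[i] = 0
            accepted.pop()
            i -= 1
            continue
        candidate = stages[i](accepted, attempts[i])
        attempts[i] += 1
        if check(accepted + [candidate]):
            accepted.append(candidate)
            i += 1
    return accepted


def demo_ref_stage(accepted: list[str], attempt: int) -> str:
    """Hypothetical reference-model stage: each attempt yields a new version."""
    return f"ref_v{attempt}"


def demo_rtl_stage(accepted: list[str], attempt: int) -> str:
    """Hypothetical RTL stage: its output depends on the accepted reference."""
    return f"rtl_for_{accepted[0]}"


def demo_check(chain: list[str]) -> bool:
    """Accept any reference alone, but only RTL built on ref_v1."""
    return len(chain) == 1 or chain[0] == "ref_v1"
```

In the demo, the first reference version passes its own check but no RTL built on it can pass, so the workflow steps back, regenerates the reference, and then succeeds. The per-stage budget bounds the total work, and the function returns `None` if the first stage exhausts its attempts.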
In practice
- Use multi-agent systems for complex code generation.
- Implement mutual verification for self-correction.
- Adopt two-stage training for agentic systems.
Topics
- RTL Code Generation
- Multi-Agent Frameworks
- Reinforcement Learning
- Mutual Verification
- VerilogEval V2
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Hardware Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.