Graph Reinforcement Learning for Calibration-Aware Quantum Circuit Routing
Summary
A calibration-aware graph reinforcement-learning router has been developed to improve quantum circuit fidelity on noisy intermediate-scale quantum (NISQ) processors. It targets cases where a route that looks efficient by standard metrics still loses fidelity because it passes through poorly calibrated couplers. The router uses same-day IBM Heron r2 calibration data to choose hardware-edge SWAPs. The policy is trained with proximal policy optimization (PPO) and evaluated by exact simulated fidelity across nine Munich Quantum Toolkit (MQT) Bench circuits and three calibration snapshots. The method achieved a pooled mean exact fidelity of 0.727, compared with 0.440 for SABRE-best20 and 0.481 for target-aware SABRE. The gains were concentrated in the 5q and 8q circuit families and came with higher routed two-qubit gate counts; the 10q families favored SABRE-best20 under a fixed tree action graph. These results indicate that calibration-aware learned routing can improve fidelity beyond what gate-count-driven compilation achieves.
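The trade-off described above, where a shorter route can still lose fidelity, can be illustrated with a small sketch. It is not the article's evaluation method (which uses exact simulated fidelity); it is a simple product-of-gate-fidelities proxy. The coupler error rates, the qubit layout, and the function name `route_success_estimate` are all hypothetical, and each SWAP is assumed to cost three two-qubit gates.

```python
from math import prod

# Hypothetical two-qubit gate error rates per coupler.
# Illustrative numbers only, not real IBM Heron r2 calibration data.
edge_error = {
    (0, 1): 0.003,
    (1, 2): 0.080,  # poorly calibrated coupler
    (0, 3): 0.003,
    (3, 4): 0.004,
    (4, 2): 0.003,
}

def route_success_estimate(edges, edge_error, gates_per_swap=3):
    """Rough proxy: multiply (1 - error) for every two-qubit gate on the route.

    Assumes each SWAP decomposes into `gates_per_swap` two-qubit gates and
    that gate errors are independent.
    """
    return prod((1.0 - edge_error[e]) ** gates_per_swap for e in edges)

short_route = [(0, 1), (1, 2)]         # 2 SWAPs, crosses the bad coupler
long_route = [(0, 3), (3, 4), (4, 2)]  # 3 SWAPs, avoids it

short_f = route_success_estimate(short_route, edge_error)
long_f = route_success_estimate(long_route, edge_error)

print(f"short route: {len(short_route)} SWAPs, estimate {short_f:.3f}")
print(f"long route:  {len(long_route)} SWAPs, estimate {long_f:.3f}")
```

Under these assumed numbers the longer route scores higher on the proxy despite using more two-qubit gates, which mirrors the pattern of higher routed two-qubit counts alongside higher fidelity.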
Key takeaway
Quantum software engineers optimizing circuit compilation should consider integrating calibration-aware routing into their workflows. The approach combines same-day hardware calibration data with reinforcement learning and reached a pooled mean exact fidelity of 0.727, compared with 0.440 for SABRE-best20. It is most promising for 5q and 8q circuits; 10q circuits favored SABRE-best20 under a fixed tree action graph. The broader shift is from purely gate-count optimization toward fidelity-driven, hardware-aware strategies.
Key insights
Calibration-aware graph reinforcement learning improves quantum circuit fidelity by choosing SWAP operations based on same-day hardware calibration data.
Principles
- Hardware calibration data is crucial for quantum circuit routing fidelity.
- Reinforcement learning can optimize routing beyond gate-count metrics.
- Fidelity gains depend on circuit size: they were concentrated in 5q and 8q families, while 10q families favored SABRE-best20.
Method
Train a graph reinforcement learning policy using proximal policy optimization, leveraging same-day hardware calibration data to select hardware-edge SWAPs for quantum circuit routing.
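Two standard ingredients of such a training setup can be sketched as follows: a masked softmax that restricts the policy to valid hardware-edge SWAP actions, and the PPO clipped surrogate objective. This is a minimal, dependency-free sketch and not the article's implementation; the function names, the masking scheme, and the clipping value `eps=0.2` are assumptions chosen for illustration.

```python
import math

def masked_log_softmax(logits, mask):
    """Log-probabilities over candidate SWAP edges; masked-out edges get -inf.

    `mask[i]` is True when edge i is a valid action (assumed: at least one is).
    """
    kept = [x for x, m in zip(logits, mask) if m]
    top = max(kept)  # subtract the max for numerical stability
    log_z = top + math.log(sum(math.exp(x - top) for x in kept))
    return [x - log_z if m else -math.inf for x, m in zip(logits, mask)]

def ppo_clip_objective(new_logp, old_logp, advantages, eps=0.2):
    """PPO clipped surrogate: mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    total = 0.0
    for n, o, a in zip(new_logp, old_logp, advantages):
        ratio = math.exp(n - o)  # probability ratio new / old
        clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
        total += min(ratio * a, clipped * a)
    return total / len(advantages)

# Three candidate SWAP edges; the second is masked out as invalid.
logp = masked_log_softmax([1.0, 2.0, 0.5], [True, False, True])

# Two sampled actions: probability ratios 2.0 and 0.5, advantages +1 and -1.
objective = ppo_clip_objective(
    new_logp=[math.log(2.0), math.log(0.5)],
    old_logp=[0.0, 0.0],
    advantages=[1.0, -1.0],
)
```

In a calibration-aware router, the logits would come from a graph network that sees per-coupler calibration features, and the advantage would be derived from a fidelity-based reward; both of those parts are omitted here.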
In practice
- Feed current (same-day) calibration data into quantum compilers.
- Use PPO to train the routing policy that selects hardware-edge SWAPs.
- Focus calibration-aware routing on 5q and 8q circuits for best gains.
Topics
- Quantum Circuit Routing
- Reinforcement Learning
- Quantum Compilation
- IBM Heron r2
- Quantum Fidelity
- Proximal Policy Optimization
Best for: AI Scientist, Research Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.