Graph Reinforcement Learning for Calibration-Aware Quantum Circuit Routing

· Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Quantum Computing · Depth: Expert, quick

Summary

A calibration-aware graph reinforcement-learning router has been developed to improve quantum circuit fidelity on noisy intermediate-scale quantum (NISQ) processors. It targets a specific failure mode: routes that look efficient by standard metrics such as gate count can still lose fidelity when they pass through poorly calibrated couplers. The router uses same-day IBM Heron r2 calibration data to choose hardware-edge SWAPs. The policy is trained with proximal policy optimization (PPO) and evaluated by exact simulated fidelity on nine Munich Quantum Toolkit (MQT) Bench circuits across three calibration snapshots. The method achieved a pooled mean exact fidelity of 0.727, compared with 0.440 for SABRE-best20 and 0.481 for target-aware SABRE. The gains were concentrated in the 5q and 8q circuit families and came at the cost of higher routed two-qubit gate counts; the 10q families favored SABRE-best20 under a fixed tree action graph. The results indicate that calibration-aware learned routing can improve fidelity beyond gate-count-driven compilation, at least for the smaller circuit families tested.
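To illustrate why a route with more two-qubit gates can still come out ahead, here is a minimal Python sketch. It does not reproduce the study's evaluation, which uses exact simulated fidelity; it uses a simpler proxy, the product of per-gate success probabilities. The four-qubit ring, the error rates, and the names `edge_error` and `route_success` are hypothetical and chosen only for illustration.

```python
from math import prod

# Hypothetical two-qubit gate error per coupler on a 4-qubit ring.
# Illustrative numbers only, not taken from any real calibration snapshot.
edge_error = {
    (0, 1): 0.004,
    (1, 2): 0.005,
    (2, 3): 0.004,
    (0, 3): 0.080,  # a poorly calibrated coupler
}

def error_on(a, b):
    """Look up the error rate of an undirected coupler."""
    return edge_error[(min(a, b), max(a, b))]

def route_success(gates):
    """Proxy fidelity: product of (1 - error) over every two-qubit gate."""
    return prod(1.0 - error_on(a, b) for a, b in gates)

# Goal: one two-qubit gate between the logical qubits starting on 0 and 3.
# Option A: use the noisy coupler directly (1 two-qubit gate).
direct = [(0, 3)]

# Option B: move the qubit around the ring with two SWAPs, assuming each
# SWAP decomposes into 3 two-qubit gates, then apply the gate on (2, 3).
detour = [(0, 1)] * 3 + [(1, 2)] * 3 + [(2, 3)]

print(len(direct), round(route_success(direct), 3))  # 1 0.92
print(len(detour), round(route_success(detour), 3))  # 7 0.969
```

Under these assumed numbers the detour uses seven two-qubit gates instead of one yet scores higher on the proxy, which is the kind of trade-off a gate-count-driven router would not see.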

Key takeaway

Quantum software engineers optimizing circuit compilation should consider adding calibration-aware routing to their workflows. In this study, a reinforcement-learning router driven by same-day hardware data reached a pooled mean exact fidelity of 0.727, versus 0.440 for SABRE-best20. The gains were concentrated in 5q and 8q circuits; 10q circuits still favored SABRE-best20 under a fixed tree action graph. The broader lesson is to optimize for fidelity on the actual hardware rather than for gate count alone.
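The size of the reported gap is easier to judge as absolute and relative gains. The short sketch below does only that arithmetic on the three pooled means quoted in this summary; the helper name `gain_over` is hypothetical, and the figures are pooled averages, so they say nothing about any individual circuit.

```python
# Pooled mean exact fidelities as reported in the summary.
reported = {
    "calibration-aware RL": 0.727,
    "SABRE-best20": 0.440,
    "target-aware SABRE": 0.481,
}

def gain_over(baseline, method="calibration-aware RL"):
    """Return (absolute gain, relative gain) of `method` over `baseline`."""
    diff = reported[method] - reported[baseline]
    return diff, diff / reported[baseline]

for name in ("SABRE-best20", "target-aware SABRE"):
    absolute, relative = gain_over(name)
    print(f"vs {name}: +{absolute:.3f} absolute, +{relative:.1%} relative")
```

This works out to roughly +0.287 (about 65%) over SABRE-best20 and +0.246 (about 51%) over target-aware SABRE.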

Key insights

Calibration-aware graph reinforcement learning improves simulated circuit fidelity by choosing SWAP operations according to same-day hardware calibration data rather than gate count alone.

Principles

Method

Train a graph reinforcement-learning policy with proximal policy optimization (PPO), using same-day hardware calibration data to select hardware-edge SWAPs during quantum circuit routing.
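The two ingredients named above can be sketched in a few lines of Python: an action that applies a SWAP on a hardware edge, and the standard PPO clipped surrogate objective for a single sample. This is a generic illustration, not the authors' implementation. The layout representation (physical qubit to logical qubit), the function names, and the clip range `eps=0.2` are assumptions; the summary does not describe the policy network, reward, or hyperparameters.

```python
import math

def apply_swap(layout, edge):
    """Apply a SWAP on a hardware edge.

    `layout` maps physical qubit -> logical qubit (an assumed representation).
    Returns a new layout with the two logical qubits exchanged.
    """
    a, b = edge
    updated = dict(layout)
    updated[a], updated[b] = layout[b], layout[a]
    return updated

def ppo_clip_objective(new_logp, old_logp, advantage, eps=0.2):
    """Standard PPO clipped surrogate for one (state, action) sample.

    ratio = pi_new(a|s) / pi_old(a|s); the objective takes the smaller of
    the unclipped and clipped terms, which limits the policy update size.
    """
    ratio = math.exp(new_logp - old_logp)
    clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped * advantage)

# Usage: one routing action on a hypothetical 3-qubit line 0-1-2.
layout = {0: "q0", 1: "q1", 2: "q2"}
layout = apply_swap(layout, (0, 1))
print(layout)  # {0: 'q1', 1: 'q0', 2: 'q2'}

# A ratio of 1.5 with positive advantage is clipped to 1 + eps.
print(round(ppo_clip_objective(math.log(1.5), 0.0, 1.0), 3))  # 1.2
```

In a calibration-aware setting the advantage would be derived from a fidelity-based reward, so SWAPs on well-calibrated couplers are reinforced; how the study defines that reward is not stated in this summary.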

Best for: AI Scientist, Research Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.