HQF-Net: A Hybrid Quantum-Classical Multi-Scale Fusion Network for Remote Sensing Image Segmentation

2026-04-21 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation, Remote Sensing · Depth: Expert, extended

Summary

HQF-Net is a hybrid quantum-classical multi-scale fusion network for semantic segmentation of remote sensing imagery. The architecture combines multi-scale semantic guidance from a frozen DINOv3 ViT-L/16 backbone with a customized U-Net. Its key components are a Deformable Multiscale Cross-Attention Fusion (DMCAF) module for aligning features, Quantum-enhanced Skip connections (QSkip) for feature refinement, and a Quantum bottleneck with Mixture-of-Experts (QMoE) that adaptively combines local, global, and directional quantum circuits. HQF-Net reports consistent improvements across three remote sensing benchmarks: 85.68% mIoU (given as 0.8568) and 96.87% overall accuracy on LandCover.ai, 71.82% mIoU on OpenEarthMap, and 55.28% mIoU with 99.37% overall accuracy on SeasoNet. An ablation study supports the individual contribution of each major component.
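For readers less familiar with the two metrics quoted above, the following is a minimal sketch of how mIoU and overall accuracy are commonly computed from a confusion matrix. The function names and the toy labels are illustrative assumptions, and the choice to skip classes that never occur is one common convention; the paper's exact evaluation protocol is not described in this summary.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def overall_accuracy(cm):
    """Fraction of all pixels whose predicted class matches the true class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN).

    Assumption: classes absent from both labels and predictions are skipped.
    """
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp
        fn = sum(cm[c]) - tp
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Toy example with three classes and six "pixels" (hypothetical data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(overall_accuracy(cm), mean_iou(cm))
```

Overall accuracy can stay high while mIoU is much lower when a few classes dominate the pixel count, which is consistent with the gap between the two SeasoNet figures.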

Key takeaway

For research scientists developing remote sensing segmentation models, HQF-Net suggests that hybrid quantum-classical components can improve segmentation performance on the three benchmarks reported. Consider incorporating multi-scale semantic guidance and quantum-enhanced feature refinement modules, particularly a Mixture-of-Experts bottleneck, to improve accuracy and boundary delineation in complex imagery, even under current NISQ device constraints.

Key insights

Hybrid quantum-classical networks can consistently improve remote sensing semantic segmentation on the reported benchmarks by combining multi-scale feature fusion with quantum-enhanced refinement.

Principles

Multi-scale semantic guidance improves segmentation.
Quantum circuits enhance feature interactions.
Adaptive routing of quantum experts is effective.

Method

HQF-Net integrates a DINOv3 ViT-L/16 backbone with a U-Net via DMCAF, QSkip, and a QMoE bottleneck, using parameterized quantum circuits for feature enrichment and adaptive routing.
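The adaptive routing idea behind a Mixture-of-Experts bottleneck can be sketched as a softmax-weighted combination of expert outputs. This is only an illustration under stated assumptions: the three experts below are simple classical placeholders standing in for the local, global, and directional quantum circuits, the gate logits are supplied by hand, and none of the names come from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax that turns gate logits into routing weights."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


# Hypothetical placeholder experts (NOT quantum circuits, NOT the paper's design).
def local_expert(x):
    return list(x)  # keeps each feature as-is


def global_expert(x):
    mean = sum(x) / len(x)
    return [mean] * len(x)  # broadcasts a global summary


def directional_expert(x):
    return list(reversed(x))  # crude stand-in for an orientation-aware view


def mixture_of_experts(features, experts, gate_logits):
    """Blend expert outputs with softmax routing weights."""
    weights = softmax(gate_logits)
    outputs = [expert(features) for expert in experts]
    mixed = [
        sum(w * out[i] for w, out in zip(weights, outputs))
        for i in range(len(features))
    ]
    return mixed, weights


experts = [local_expert, global_expert, directional_expert]
mixed, weights = mixture_of_experts([1.0, 2.0, 3.0], experts, [0.0, 0.0, 0.0])
print(mixed, weights)
```

In a trained model the gate logits would typically be predicted from the input features, so that different inputs favor different experts; how HQF-Net produces them is not specified in this summary.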

In practice

Use DINOv3 for robust visual representations.
Implement deformable cross-attention for feature alignment.
Employ quantum-enhanced skip connections for refinement.
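As a rough intuition for quantum-enhanced feature refinement, the sketch below simulates a single-qubit parameterized circuit in plain Python and adds its measurement result back to the input as a residual. Everything here is an assumption made for illustration: the one-qubit RY encoding, the `scale` factor, and the function names are hypothetical and do not reproduce the paper's QSkip circuits.

```python
import math


def ry(angle):
    """2x2 matrix of the RY rotation gate (real-valued)."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return [[c, -s], [s, c]]


def apply(gate, state):
    """Apply a 2x2 gate to a single-qubit state vector."""
    return [
        gate[0][0] * state[0] + gate[0][1] * state[1],
        gate[1][0] * state[0] + gate[1][1] * state[1],
    ]


def expect_z(state):
    """Expectation of Pauli-Z; amplitudes stay real because only RY is used."""
    return state[0] ** 2 - state[1] ** 2


def pqc_expectation(x, theta):
    """Angle-encode feature x, apply a trainable rotation theta, measure <Z>."""
    state = [1.0, 0.0]             # start in |0>
    state = apply(ry(x), state)     # data encoding
    state = apply(ry(theta), state) # trainable parameter
    return expect_z(state)          # equals cos(x + theta) for this circuit


def quantum_skip(features, theta, scale=0.1):
    """Residual refinement: original feature plus a scaled circuit output."""
    return [x + scale * pqc_expectation(x, theta) for x in features]


print(quantum_skip([0.0, 0.5, 1.0], theta=0.3))
```

The residual form keeps the classical skip path intact, so the circuit only has to learn a small correction; this is a common design choice in hybrid models and is assumed here rather than taken from the paper.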

Topics

HQF-Net
Remote Sensing Segmentation
Hybrid Quantum-Classical Learning
DINOv3 Vision Transformer
Deformable Cross-Attention

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.