GSPan: A Continuous Gaussian Primitive Representation for Arbitrary-Scale Pansharpening

Source: Computer Vision and Pattern Recognition · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Computer Vision) · Depth: Expert, quick

Summary

GSPan is a pansharpening framework that addresses the scale-adaptation limits of existing deep learning methods by adopting 2D Gaussian Splatting (GS). Instead of predicting pixels on a fixed grid, it represents band-wise residual details as continuous, learnable 2D Gaussian primitives. A Dual-Stream Hierarchical Interaction (DSHI) architecture with a Spatial-Spectral Interactive Attention (SSIA) module estimates these primitives from the panchromatic (PAN) and multispectral (MS) observations. The primitives are rendered as a residual detail field, which is then injected into the upsampled MS image. Because the representation is continuous, the fused image can be rendered on any target sampling grid without scale-specific retraining. The same property supports a Scale-Decoupled Asymmetric Inference (SDAI) strategy, which estimates primitives at reduced resolution so that large scenes can be pansharpened efficiently. In experiments on the QuickBird, GaoFen-2, WorldView-3, and WorldView-3-4K datasets, the authors report leading fusion performance, with SDAI speeding up inference while maintaining quality.

Key takeaway

For computer vision engineers building pansharpening pipelines, GSPan replaces fixed-grid pixel prediction with a continuous Gaussian primitive representation. If fixed output scales or the compute cost of large scenes are your bottleneck, the approach is worth evaluating: it renders at arbitrary scales without retraining and supports Scale-Decoupled Asymmetric Inference for faster large-scene processing.

Key insights

GSPan uses continuous Gaussian primitives for scale-adaptive pansharpening, overcoming fixed-grid limitations.

Principles

Method

Estimate band-wise residual details as 2D Gaussian primitives using the DSHI architecture with its SSIA module. Render these primitives into a residual field, then inject that field into the upsampled MS image.
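The render-and-inject step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several things are assumptions made for illustration: the function names (`render_residual`, `upsample_nearest`, `fuse`), the primitive parameterisation (normalised centres, full 2x2 covariances, per-band amplitudes), the additive injection, and the nearest-neighbour upsampler. The primitives here are random stand-ins for what the DSHI network would predict.

```python
import numpy as np

def render_residual(means, covs, amps, height, width):
    """Render K 2D Gaussians into a (bands, height, width) residual field.

    means: (K, 2) centres in normalised [0, 1] coordinates, ordered (x, y)
    covs:  (K, 2, 2) symmetric positive-definite covariances, same units
    amps:  (K, bands) per-band amplitudes (assumed parameterisation)
    """
    # Pixel centres in normalised coordinates, so the grid size is a free choice.
    ys = (np.arange(height) + 0.5) / height
    xs = (np.arange(width) + 0.5) / width
    gx, gy = np.meshgrid(xs, ys)                       # each (height, width)
    pts = np.stack([gx, gy], axis=-1).reshape(-1, 2)   # (H*W, 2)

    diff = pts[None, :, :] - means[:, None, :]         # (K, H*W, 2)
    inv = np.linalg.inv(covs)                          # (K, 2, 2)
    maha = np.einsum("kni,kij,knj->kn", diff, inv, diff)
    weights = np.exp(-0.5 * maha)                      # (K, H*W)
    field = amps.T @ weights                           # (bands, H*W)
    return field.reshape(-1, height, width)

def upsample_nearest(ms, height, width):
    """Nearest-neighbour resize of a (bands, h, w) image; a stand-in upsampler."""
    _, h, w = ms.shape
    rows = np.minimum((np.arange(height) * h) // height, h - 1)
    cols = np.minimum((np.arange(width) * w) // width, w - 1)
    return ms[:, rows[:, None], cols[None, :]]

def fuse(ms, means, covs, amps, height, width):
    """Add the rendered residual field to the upsampled MS image."""
    return (upsample_nearest(ms, height, width)
            + render_residual(means, covs, amps, height, width))

# Hypothetical inputs: a 4-band 16x16 MS image and 32 random primitives.
rng = np.random.default_rng(0)
ms = rng.random((4, 16, 16))
K = 32
means = rng.random((K, 2))
covs = np.tile(np.eye(2) * 0.05**2, (K, 1, 1))
amps = rng.normal(scale=0.1, size=(K, 4))

fused_4x = fuse(ms, means, covs, amps, 64, 64)       # integer ratio (4x)
fused_odd = fuse(ms, means, covs, amps, 100, 100)    # non-integer ratio (6.25x)
```

The same primitive set is evaluated on both target grids, which is the sense in which a continuous representation decouples the output scale from the representation. In the sketch only the sampling coordinates change between the two calls.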

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.