SatSplatDiff: Geometry-preserving generative refinement for high-fidelity satellite Gaussian Splatting

Source: Computer Vision and Pattern Recognition · Field: Technology & Digital, Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, quick

Summary

SatSplatDiff is a method for high-fidelity satellite 3D reconstruction with Gaussian Splatting. Satellite imagery offers only limited, top-down viewpoints, which leaves surface holes in the reconstruction and degrades visual fidelity. Generative refinement can improve visual quality, but because it processes each view independently, it often introduces geometric degradation and hallucinations. SatSplatDiff mitigates these issues by building on photogrammetric DSM (digital surface model) initialization and 2DGS-based shadow casting. It adds monocular depth supervision and multi-scale geometric refinement to establish accurate surface representations, then applies shadow-guided generative refinement, in which geometrically calculated shadow maps keep the generated appearance consistent with the underlying geometry. Evaluated on the IARPA2016 and DFC2019 datasets, SatSplatDiff achieves leading performance, reducing geometric MAE by up to 18% and improving visual fidelity (FID-CLIP) by 28-45% compared to existing baselines. The method delivers up to 5x resolution enhancement with minimal hallucination and strong scalability.

Key takeaway

For computer vision engineers building high-fidelity satellite 3D reconstruction systems, SatSplatDiff offers a way to apply generative refinement without the geometric degradation and hallucinations it usually introduces. Consider integrating monocular depth supervision and shadow-guided generative refinement into your pipeline: the reported gains are up to 5x resolution enhancement, up to 18% lower geometric MAE, and a 28-45% FID-CLIP improvement over existing baselines. The method is also reported to give sensor-consistent appearance and to scale well to large projects.

Key insights

SatSplatDiff integrates monocular depth and shadow-guided generative refinement to achieve geometry-preserving, high-fidelity satellite 3D reconstruction.

Principles

Method

Build on DSM initialization and 2DGS shadow casting, then introduce monocular depth supervision and multi-scale geometric refinement, followed by shadow-guided generative refinement using calculated shadow maps.
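To make "geometrically calculated shadow maps" concrete, here is a minimal sketch of shadow casting on a plain height raster. It is not the paper's 2DGS-based shadow casting; it swaps in a simple ray march over a DSM grid to show the idea that a shadow mask can be derived from geometry and sun angles alone. The function name cast_shadows, the north-up raster layout, the azimuth convention (clockwise from north), and the gsd parameter are all assumptions made for illustration.

```python
import numpy as np


def cast_shadows(dsm, gsd, sun_azimuth_deg, sun_elevation_deg):
    """Return a boolean mask that is True where the sun is occluded.

    Assumptions (not from the source): dsm is a finite, north-up height
    raster in metres, gsd is the pixel size in metres, and azimuth is
    measured clockwise from north.
    """
    if not 0.0 < sun_elevation_deg <= 90.0:
        raise ValueError("sun elevation must be in (0, 90] degrees")
    dsm = np.asarray(dsm, dtype=float)
    h, w = dsm.shape
    az = np.deg2rad(sun_azimuth_deg)
    # Unit step toward the sun in (row, col); rows grow southward.
    d_row, d_col = -np.cos(az), np.sin(az)
    # Height the sun ray gains for each one-pixel horizontal step.
    rise = gsd * np.tan(np.deg2rad(sun_elevation_deg))
    # Beyond this many steps the ray is above the tallest surface point.
    relief = float(dsm.max() - dsm.min())
    max_steps = min(int(np.ceil(relief / rise)), h + w)
    rows, cols = np.mgrid[0:h, 0:w]
    shadow = np.zeros((h, w), dtype=bool)
    for k in range(1, max_steps + 1):
        # Nearest-neighbour sample position k pixels toward the sun.
        r = np.rint(rows + k * d_row).astype(int)
        c = np.rint(cols + k * d_col).astype(int)
        inside = (r >= 0) & (r < h) & (c >= 0) & (c < w)
        if not inside.any():
            break
        # Samples outside the raster never occlude.
        terrain = np.full((h, w), -np.inf)
        terrain[inside] = dsm[r[inside], c[inside]]
        # A pixel is shadowed if the surface rises above its sun ray.
        shadow |= terrain > dsm + k * rise
    return shadow


# Toy scene: a 10 m block on flat ground, sun due east at 45 degrees,
# so the shadow falls to the west of the block.
dsm = np.zeros((12, 20))
dsm[5:7, 10:12] = 10.0
mask = cast_shadows(dsm, gsd=1.0, sun_azimuth_deg=90.0, sun_elevation_deg=45.0)
```

A mask like this depends only on the surface and the sun position, which is why it can serve as a geometry-consistent conditioning signal; how SatSplatDiff feeds its shadow maps into the generative model is not detailed in this summary.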

In practice
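The geometric gains quoted in the summary are expressed as a relative reduction in MAE. As a rough sketch of how such a figure is typically computed when comparing reconstructed DSMs against a reference, the snippet below evaluates the mean absolute height error over valid pixels and the relative reduction between two scores. The helper names dsm_mae and relative_reduction and the NaN-as-nodata convention are assumptions, and the numbers are toy values, not results from the paper.

```python
import numpy as np


def dsm_mae(pred, ref):
    """Mean absolute height error over pixels valid in both rasters.

    Assumption: missing heights are encoded as NaN.
    """
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    valid = np.isfinite(pred) & np.isfinite(ref)
    if not valid.any():
        raise ValueError("no overlapping valid pixels")
    return float(np.mean(np.abs(pred[valid] - ref[valid])))


def relative_reduction(baseline, improved):
    """Fractional reduction of an error metric relative to a baseline."""
    return (baseline - improved) / baseline


# Toy rasters (metres); the NaN pixel is excluded from the average.
reference = [[10.0, 12.0], [np.nan, 8.0]]
predicted = [[11.0, 12.0], [5.0, 6.0]]
mae = dsm_mae(predicted, reference)

# Toy scores: a drop from 2.0 m to 1.5 m is a 25% relative reduction.
gain = relative_reduction(2.0, 1.5)
```

Masking invalid pixels matters for satellite DSMs, where water, occlusions, or nodata regions would otherwise dominate the average.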

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.