WildSplatter: Feed-forward 3D Gaussian Splatting with Appearance Control from Unconstrained Images

Source: cs.CV updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems) · Depth: Expert, extended

Summary

WildSplatter is a feed-forward 3D Gaussian Splatting (3DGS) model for unconstrained images with unknown camera parameters and varying lighting conditions. Traditional 3DGS requires per-scene iterative optimization and assumes consistent lighting across views; WildSplatter instead is trained on diverse photo collections such as MegaScenes and predicts Gaussians in a single forward pass. It jointly learns 3D Gaussians and appearance embeddings, and the embeddings modulate Gaussian colors so that one reconstruction can represent large variations in lighting and appearance. The model reconstructs 3D Gaussians from sparse input views in under one second, after which novel views can be rendered in real time with controllable appearance. On the NeRF-OSR dataset, WildSplatter outperforms existing pose-free 3DGS methods, achieving higher PSNR and lower LPIPS, particularly under diverse illumination.
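
For readers less familiar with the reported metric, the following is a minimal sketch of the standard PSNR definition for images scaled to [0, 1]. The function name `psnr` and the `max_val` parameter are illustrative choices; the evaluation protocol actually used on NeRF-OSR (resolution, masking, averaging) is not described in this summary and is not reproduced here. LPIPS is a learned perceptual metric and is not sketched.

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape.

    Standard definition: 10 * log10(max_val^2 / MSE). Higher is better.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy example: a uniform error of 0.1 gives MSE = 0.01, i.e. about 20 dB.
rendered = np.full((4, 4, 3), 0.1)
reference = np.zeros((4, 4, 3))
print(round(psnr(rendered, reference), 3))
```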

Key takeaway

For research scientists building real-time 3D scene reconstruction and novel view synthesis systems, WildSplatter removes two common constraints at once: known camera poses and consistent lighting. Its feed-forward, pose-free design handles unconstrained image collections without the computational overhead of iterative optimization, and on NeRF-OSR it achieves higher PSNR and lower LPIPS than existing pose-free 3DGS methods while adding appearance control.

Key insights

WildSplatter enables fast, pose-free 3DGS from unconstrained images by disentangling geometry and appearance via learned embeddings.

Method

WildSplatter uses a Vision Transformer backbone and dual-DPT heads to estimate colorless 3D Gaussian geometry and local scene features. Global appearance embeddings are then predicted from target images and combined with local features to determine Gaussian colors.
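
The split between colorless geometry and appearance-dependent color can be illustrated with a small NumPy sketch. Everything here is an assumption made for illustration: the function `gaussian_colors`, the two-layer MLP, the concatenation of features, the dimensions, and the random weights are hypothetical stand-ins, and the summary does not specify how the model actually combines local features with the global embedding.

```python
import numpy as np

def gaussian_colors(local_feats, appearance_emb, params):
    """Map per-Gaussian local features plus one global appearance
    embedding to RGB in [0, 1]. Hypothetical color head, not the paper's."""
    n = local_feats.shape[0]
    # Share the single global embedding across all N Gaussians.
    tiled = np.broadcast_to(appearance_emb, (n, appearance_emb.shape[0]))
    x = np.concatenate([local_feats, tiled], axis=1)       # (N, F + E)
    h = np.maximum(x @ params["W1"] + params["b1"], 0.0)   # ReLU hidden layer
    logits = h @ params["W2"] + params["b2"]               # (N, 3)
    return 1.0 / (1.0 + np.exp(-logits))                   # sigmoid -> [0, 1]

rng = np.random.default_rng(0)
N, F, E, H = 5, 8, 4, 16  # Gaussians, local feature dim, embedding dim, hidden dim
params = {
    "W1": rng.normal(scale=0.5, size=(F + E, H)),
    "b1": np.zeros(H),
    "W2": rng.normal(scale=0.5, size=(H, 3)),
    "b2": np.zeros(3),
}

# Stand-ins for network outputs: the geometry and local features are
# predicted once; only the appearance embedding changes per target image.
local_feats = rng.normal(size=(N, F))
emb_a = rng.normal(size=E)  # e.g. embedding from a target image in one lighting
emb_b = rng.normal(size=E)  # e.g. embedding from a target image in another lighting

colors_a = gaussian_colors(local_feats, emb_a, params)
colors_b = gaussian_colors(local_feats, emb_b, params)
print(colors_a.shape, np.abs(colors_a - colors_b).max() > 0)
```

The point of the sketch is that swapping `emb_a` for `emb_b` changes only the colors, while the Gaussian positions, scales, and opacities (not modeled here) would stay fixed, which is what makes appearance control possible without re-running reconstruction.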

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.