WildSplatter: Feed-forward 3D Gaussian Splatting with Appearance Control from Unconstrained Images

2026-04-24 · Source: cs.CV updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

WildSplatter is a feed-forward 3D Gaussian Splatting (3DGS) model for unconstrained images with unknown camera parameters and varying lighting conditions. Traditional 3DGS requires per-scene iterative optimization and consistent lighting; WildSplatter instead predicts the scene directly, after training on diverse photo collections such as MegaScenes. It jointly learns 3D Gaussians and appearance embeddings, and the embeddings modulate the Gaussian colors so that one geometry can represent large variations in lighting and appearance. The model reconstructs 3D Gaussians from sparse input views in under one second, enabling real-time novel view synthesis and appearance control. On the NeRF-OSR dataset, WildSplatter outperforms existing pose-free 3DGS methods, with higher PSNR and lower LPIPS, particularly under diverse illumination.
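For readers less familiar with the metrics named above, PSNR is a pixel-wise fidelity score derived from mean squared error (higher is better). The sketch below is the standard definition, not code from the WildSplatter repository; the function name `psnr` and the toy images are illustrative assumptions. LPIPS is a learned perceptual metric that needs a pretrained network, so it is not sketched here.

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    diff = pred.astype(np.float64) - target.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy example (hypothetical data): a flat gray image and a uniformly brighter copy.
target = np.full((4, 4, 3), 0.5)
pred = target + 0.1
score = psnr(pred, target)
```

A uniform error of 0.1 on a [0, 1] image gives an MSE of 0.01, which corresponds to roughly 20 dB.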

Key takeaway

For research scientists building real-time 3D scene reconstruction and novel view synthesis systems, WildSplatter's feed-forward, pose-free approach is worth evaluating when the input is an unconstrained image collection with varying lighting. On NeRF-OSR it achieves higher PSNR and lower LPIPS than existing pose-free 3DGS methods, it adds appearance control, and it avoids the computational overhead of per-scene iterative optimization.

Key insights

WildSplatter enables fast, pose-free 3DGS from unconstrained images by disentangling geometry and appearance via learned embeddings.

Principles

Geometry remains invariant under lighting changes.
Appearance variations can be captured in a low-dimensional latent space.

Method

WildSplatter uses a Vision Transformer backbone with dual DPT (Dense Prediction Transformer) heads that estimate colorless 3D Gaussian geometry and local scene features. Global appearance embeddings are then predicted from target images and combined with the local features to determine the Gaussian colors.
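The color step described above can be pictured with a small NumPy sketch. It is a hedged illustration, not the authors' implementation: the concatenation of local and global features, the two-layer MLP, the random weights, and all sizes and names (`decode_colors`, `sunny`, `overcast`) are assumptions made for the example. It only shows how a single global embedding can recolor a fixed set of Gaussians without touching geometry.

```python
import numpy as np

rng = np.random.default_rng(0)

def decode_colors(local_feats, appearance, w1, b1, w2, b2):
    """Map per-Gaussian local features plus one global embedding to RGB."""
    n = local_feats.shape[0]
    # Broadcast the single global embedding to every Gaussian, then concatenate.
    tiled = np.broadcast_to(appearance, (n, appearance.shape[0]))
    x = np.concatenate([local_feats, tiled], axis=1)
    hidden = np.maximum(x @ w1 + b1, 0.0)      # ReLU hidden layer
    logits = hidden @ w2 + b2
    return 1.0 / (1.0 + np.exp(-logits))       # sigmoid keeps RGB in (0, 1)

# Hypothetical sizes: Gaussians, local feature dim, embedding dim, hidden dim.
N, F, A, H = 1000, 32, 16, 64
local_feats = rng.normal(size=(N, F))          # stand-in for predicted local features
w1 = rng.normal(scale=0.1, size=(F + A, H))    # random weights, illustration only
b1 = np.zeros(H)
w2 = rng.normal(scale=0.1, size=(H, 3))
b2 = np.zeros(3)

# Two stand-in appearance embeddings for the same scene geometry.
sunny = rng.normal(size=A)
overcast = rng.normal(size=A)
rgb_sunny = decode_colors(local_feats, sunny, w1, b1, w2, b2)
rgb_overcast = decode_colors(local_feats, overcast, w1, b1, w2, b2)
```

Because the geometry and local features are shared, swapping the embedding changes only the colors, which matches the principle that geometry stays invariant under lighting changes.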

In practice

Reconstruct 3D Gaussians from sparse views in under one second.
Control appearance via embedding interpolation.
Generalize appearance control across different datasets.
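Appearance control via embedding interpolation can be sketched as a blend between two embeddings. The source does not say which interpolation scheme is used, so linear blending is an assumption here, and the embedding values and names (`e_day`, `e_dusk`, `interpolate_embeddings`) are hypothetical.

```python
import numpy as np

def interpolate_embeddings(e_a, e_b, steps):
    """Return `steps` embeddings blended linearly from e_a to e_b, endpoints included."""
    t = np.linspace(0.0, 1.0, steps)[:, None]  # column of blend weights
    return (1.0 - t) * e_a + t * e_b

# Hypothetical 4-dimensional appearance embeddings for two lighting conditions.
e_day = np.array([1.0, 0.0, 2.0, -1.0])
e_dusk = np.array([0.0, 1.0, 0.0, 1.0])
path = interpolate_embeddings(e_day, e_dusk, steps=5)
```

Each row of `path` would be fed to the color decoder in turn, producing a sequence of renderings that transitions between the two appearances while the Gaussians themselves stay fixed.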

Topics

3D Gaussian Splatting
Feed-forward Models
Appearance Embeddings
Pose-free Reconstruction
Unconstrained Images

Code references

yfujimura/WildSplatter

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.