Surflo: Consistent 3D Surface Flow Model with Global State
Summary
Surflo is a 3D surface flow model that addresses limitations of existing feed-forward reconstruction methods by introducing a global state that yields consistent, high-resolution output from a variable number of unposed RGB views. Per-view models generate overlapping pointmaps, and global-latent methods produce fixed, low-resolution outputs. Surflo instead compresses the input images into K latent tokens and decodes oriented 3D surface points by transporting samples from noise onto the surface with flow matching. Because each output point starts as its own noise sample, the output resolution is arbitrary, from a few thousand to a million points in a single forward pass. For consistency, an inference-time guidance term injects a photometric gradient during ODE integration, which correlates nearby points. Surflo achieves performance comparable to or better than feed-forward baselines on surface metrics and is significantly faster than optimization-based techniques that require hundreds of views. It is described as the only feed-forward approach that combines a global latent with arbitrary-resolution decoding.
Key takeaway
For computer vision engineers who need high-fidelity 3D surface reconstruction from sparse, unposed RGB images, Surflo is an alternative worth evaluating. It produces point clouds at arbitrary resolution, from thousands to millions of points, and is significantly faster than optimization-based methods. Combining a global latent state with flow matching gives consistent geometry without a fixed output grid. Consider evaluating Surflo for complex scene reconstruction tasks where speed and output resolution both matter.
Key insights
Surflo uses a global latent state and flow matching to reconstruct arbitrary-resolution 3D surfaces from variable image inputs.
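To make the idea of a fixed-size global state concrete, here is a minimal sketch of one common way to turn a variable number of image tokens into K latent tokens: cross-attention from K query tokens (in the style of Perceiver-like encoders). This summary does not say how Surflo's encoder is built, so the function name `compress_to_latent`, the single-head attention without learned projections, and the random stand-in queries are all assumptions made for illustration.

```python
import numpy as np


def softmax(a, axis=-1):
    """Numerically stable softmax along the given axis."""
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)


def compress_to_latent(image_tokens, queries):
    """Cross-attend K query tokens over all image tokens (hypothetical encoder).

    image_tokens: (M, D) array; M grows with the number of input views.
    queries:      (K, D) array; stands in for K learned query tokens.
    Returns a (K, D) latent whose size does not depend on M.
    """
    d = queries.shape[1]
    scores = queries @ image_tokens.T / np.sqrt(d)  # (K, M) attention logits
    weights = softmax(scores, axis=1)               # each query attends over all tokens
    return weights @ image_tokens                   # (K, D) weighted averages


rng = np.random.default_rng(0)
queries = rng.standard_normal((8, 16))           # K = 8 latent tokens, D = 16 (toy sizes)
two_views = rng.standard_normal((2 * 49, 16))    # e.g. 2 views x 49 patch tokens
five_views = rng.standard_normal((5 * 49, 16))   # e.g. 5 views x 49 patch tokens

latent_a = compress_to_latent(two_views, queries)
latent_b = compress_to_latent(five_views, queries)
print(latent_a.shape, latent_b.shape)
```

In this sketch the latent has the same shape for two views and for five, and it does not change if the image tokens are shuffled, which fits inputs that arrive without poses or a fixed order.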
Principles
- Geometry is viewpoint-invariant, so multiple views of a scene are redundant and can be compressed into a shared encoding.
- Global latent states can unify variable inputs for 3D reconstruction.
- Flow matching can decode surfaces from noise to arbitrary resolution.
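For reference, the flow matching principle can be written out under the common linear (rectified-flow) interpolation path. This summary does not state which path or loss Surflo uses, so the equations below are an assumed, standard formulation, with z denoting the K latent tokens and v_θ a velocity network:

```latex
\begin{aligned}
x_t &= (1 - t)\,x_0 + t\,x_1,
  \qquad x_0 \sim \mathcal{N}(0, I),\quad x_1 \in \text{surface},\quad t \in [0, 1] \\
\mathcal{L}(\theta) &= \mathbb{E}_{t,\,x_0,\,x_1}
  \left\| v_\theta(x_t, t \mid z) - (x_1 - x_0) \right\|^2 \\
\frac{dx_t}{dt} &= v_\theta(x_t, t \mid z), \qquad x_{t=0} = x_0
\end{aligned}
```

At inference, each noise sample x_0 is integrated from t = 0 to t = 1 independently of the others, so the number of decoded points is simply the number of noise samples drawn.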
Method
Surflo compresses a variable number of unposed RGB views into K latent tokens, then decodes 3D surface points by transporting noise samples onto the surface with flow matching. At inference time, a photometric gradient guides the ODE integration for consistency.
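The decoding step can be sketched as a plain Euler integration of the flow ODE. This is a toy under stated assumptions, not Surflo's implementation: `toy_velocity` is a hypothetical stand-in for the learned velocity network, the "latent" is just a sphere's center and radius playing the role of the global state, and `guidance_grad` is a placeholder for the photometric gradient, whose definition is not given in this summary.

```python
import numpy as np


def toy_velocity(x, t, latent):
    """Hypothetical stand-in for a learned velocity network v(x, t | latent).

    latent = [cx, cy, cz, r] describes a sphere that plays the role of the
    scene's global state. The field points from x toward the nearest point
    on that sphere, scaled so a straight path arrives at t = 1.
    """
    center, radius = latent[:3], latent[3]
    offset = x - center
    norm = np.linalg.norm(offset, axis=1, keepdims=True)
    target = center + radius * offset / np.maximum(norm, 1e-12)
    return (target - x) / (1.0 - t)


def decode_points(latent, num_points, num_steps=16,
                  guidance_grad=None, guidance_weight=0.0, seed=0):
    """Transport Gaussian noise onto the surface with Euler steps.

    guidance_grad(x, t) is an assumed placeholder for a photometric
    gradient; when given, it is subtracted from the velocity at each step.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((num_points, 3))  # one noise sample per output point
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays below 1, so the 1 / (1 - t) factor is finite
        v = toy_velocity(x, t, latent)
        if guidance_grad is not None:
            v = v - guidance_weight * guidance_grad(x, t)
        x = x + dt * v
    return x


latent = np.array([0.5, -0.2, 1.0, 2.0])   # toy global state: center and radius
coarse = decode_points(latent, 2_000)      # a few thousand points
dense = decode_points(latent, 200_000)     # same latent, much higher resolution
radii = np.linalg.norm(dense - latent[:3], axis=1)
print(coarse.shape, dense.shape, float(np.abs(radii - 2.0).max()))
```

Both calls use the same latent and differ only in how many noise samples are drawn, which is the sense in which resolution is arbitrary here. With a nonzero guidance term the points are pulled away from the pure flow trajectory, so they are no longer guaranteed to land exactly on the toy sphere.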
In practice
- Reconstruct high-fidelity 3D surfaces from sparse images.
- Generate millions of 3D points from a single latent state.
- Speed up 3D reconstruction relative to optimization-based methods.
Topics
- 3D Surface Reconstruction
- Flow Matching
- Global Latent State
- Point Cloud Generation
- Multi-view 3D
- Arbitrary Resolution
Best for: Research Scientist, AI Scientist, Computer Vision Engineer, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.