City-Mesh3R: Simulation-Ready City-Scale 3D Mesh Reconstruction from Multi-View Images

2026-05-28 · Source: Computer Vision and Pattern Recognition · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, quick

Summary

City-Mesh3R is a scalable framework for reconstructing watertight 3D surface meshes of city-scale scenes directly from large, unordered image collections. Existing methods such as NeRF and Gaussian Splatting often produce incomplete or noisy geometry that is unsuitable for 3D simulation. City-Mesh3R instead takes an end-to-end, images-to-mesh approach built on a divide-and-conquer strategy. The process begins with topological image clustering, followed by independent sparse Structure-from-Motion (SfM) within each cluster and map merging, which eliminates the need for exhaustive image feature matching. The reconstructed sparse city map is then spatially partitioned, and each partition undergoes geometry-aware camera selection, dense surface reconstruction, and surface refinement via curvature-aware remeshing with adaptive vertex density. Finally, the partition meshes are stitched into a global city mesh. Evaluated on city-scale datasets, City-Mesh3R produces high-fidelity, watertight 3D meshes with regular geometry and fine details, and is reported to suit arbitrarily large scenes in a distributed processing environment.
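
To make the saving from avoiding exhaustive matching concrete, here is a minimal arithmetic sketch. It only counts candidate image pairs. The cluster sizes are hypothetical, and the count deliberately ignores whatever cross-cluster work map merging requires, which the summary does not quantify.

```python
def exhaustive_pairs(n_images: int) -> int:
    """Number of unordered image pairs when every image is matched to every other."""
    return n_images * (n_images - 1) // 2


def clustered_pairs(cluster_sizes: list[int]) -> int:
    """Pairs when matching is done only inside each cluster (merging cost excluded)."""
    return sum(exhaustive_pairs(size) for size in cluster_sizes)


# Hypothetical collection: 10,000 images split into 20 equal clusters.
n_images = 10_000
cluster_sizes = [500] * 20

print(exhaustive_pairs(n_images))      # 49995000
print(clustered_pairs(cluster_sizes))  # 2495000
```

In this illustrative setting the within-cluster pair count is roughly 20 times smaller than the exhaustive count. Real clusters will be uneven, so the ratio is only indicative.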

Key takeaway

For computer vision engineers who need simulation-ready 3D city models from large image collections, City-Mesh3R is an end-to-end, distributed framework worth evaluating when existing methods yield incomplete or noisy meshes. Its divide-and-conquer strategy keeps reconstruction tractable at city scale, and its curvature-aware remeshing targets watertight, high-fidelity geometry, which makes the reconstructed assets more usable for urban simulation and digital twin applications.

Key insights

City-Mesh3R reconstructs simulation-ready, watertight city-scale 3D meshes from multi-view images using a scalable, end-to-end divide-and-conquer approach.

Principles

Divide-and-conquer scales complex 3D reconstruction.
Topological clustering reduces feature matching needs.
Curvature-aware remeshing refines surface details.
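
As an illustration of the curvature-aware principle, the sketch below derives a target edge length from local curvature. It is not the paper's formula. It assumes the surface is locally a circular arc of radius 1/curvature and picks the chord length whose maximum deviation from that arc equals a chosen tolerance. The function name and all parameters (tolerance, h_min, h_max) are hypothetical.

```python
import math


def target_edge_length(curvature: float, tolerance: float,
                       h_min: float, h_max: float) -> float:
    """Edge length whose chord deviates from a circular arc by at most `tolerance`.

    For an arc of radius r = 1/|curvature|, a chord of length L has sagitta s
    with (L/2)^2 = 2*r*s - s^2, so L = 2*sqrt(2*s/|curvature| - s^2).
    The result is clamped to [h_min, h_max].
    """
    kappa = abs(curvature)
    if kappa == 0.0:
        return h_max          # flat region: use the coarsest allowed edge
    radius = 1.0 / kappa
    if tolerance >= radius:
        return h_min          # feature smaller than the tolerance: finest edge
    length = 2.0 * math.sqrt(2.0 * tolerance * radius - tolerance ** 2)
    return min(h_max, max(h_min, length))


# Hypothetical values in metres: flat facade, rounded corner, sharp detail.
for kappa in (0.0, 1.0, 100.0):
    print(kappa, target_edge_length(kappa, tolerance=0.01, h_min=0.05, h_max=2.0))
```

Flat regions receive long edges and few vertices, while highly curved regions receive short edges and dense vertices. That is the general behaviour adaptive vertex density aims for.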

Method

The method involves topological image clustering, independent sparse SfM, map merging, spatial partitioning, geometry-aware camera selection, dense surface reconstruction, curvature-aware remeshing, and global mesh stitching.
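
The spatial partitioning stage can be pictured with a small sketch. The summary does not say how City-Mesh3R partitions the sparse map, so this sketch assumes a uniform ground-plane grid with an overlap margin, so that neighbouring partitions share points near their borders. The function name, tile size, and overlap are all hypothetical.

```python
import math
from collections import defaultdict


def partition_points(points, tile_size, overlap=0.0):
    """Map each grid tile (ix, iy) to the indices of the points it contains.

    A tile covers [ix*tile_size, (ix+1)*tile_size) in x (likewise in y),
    expanded on every side by `overlap`. A point near a border is therefore
    assigned to more than one tile. The z coordinate is ignored.
    """
    tiles = defaultdict(list)
    for index, (x, y, _z) in enumerate(points):
        ix_lo = math.floor((x - overlap) / tile_size)
        ix_hi = math.floor((x + overlap) / tile_size)
        iy_lo = math.floor((y - overlap) / tile_size)
        iy_hi = math.floor((y + overlap) / tile_size)
        for ix in range(ix_lo, ix_hi + 1):
            for iy in range(iy_lo, iy_hi + 1):
                tiles[(ix, iy)].append(index)
    return dict(tiles)


# Three hypothetical sparse points; the second lies within 1 unit of a tile border.
sparse_points = [(1.0, 1.0, 0.0), (9.5, 1.0, 0.0), (15.0, 15.0, 2.0)]
print(partition_points(sparse_points, tile_size=10.0, overlap=1.0))
# {(0, 0): [0, 1], (1, 0): [1], (1, 1): [2]}
```

Each tile's point list could then be handed to an independent worker for camera selection and dense reconstruction. The shared border points give a later stitching step common geometry to align on.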

In practice

Generate high-fidelity urban digital twins.
Create assets for 3D city simulations.
Reconstruct large scenes from image datasets.

Topics

City-scale 3D Reconstruction
Multi-View Stereo
Mesh Generation
Structure-from-Motion
Digital Twins
Urban Simulation

Best for: Research Scientist, AI Scientist, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.