Zero-Shot Polygon Matching with Pre-trained Models for Pose Estimation and Polygon Cloud from Challenging Stereo

· Source: cs.CV updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

U(PM)2 is an unsupervised, training-free polygon matching framework for stereo images. It targets three challenges: disparity discontinuity, scale variation, and generalization. The pipeline has several stages. The Segment Anything Model (SAM) produces masks, which are vectorized into polygons. A global matcher combines a bidirectional-pyramid strategy with LoFTR to handle viewpoint and scale changes. A local matcher based on local-joint geometry and multi-feature matching (LoJoGM) then uses the Hungarian algorithm to handle local discontinuities. Benchmarked on ScanNet and SceneFlow, U(PM)2 achieved leading accuracy (87.50% on SceneFlow) at competitive speed. It outperformed MESA, SGAM, and MASA by 28.29% in Matching Precision (MP) when combined with SuperPoint and LightGlue, without any training. It also handles large-format imagery effectively.
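The summary reports results in Matching Precision (MP) without defining it. As a rough illustration only, the sketch below assumes MP is ordinary precision over predicted polygon pairs: correct predicted pairs divided by all predicted pairs. The paper's exact definition may differ, and the function name, pair encoding, and toy data are all hypothetical.

```python
def matching_precision(predicted, ground_truth):
    """Fraction of predicted (left_id, right_id) pairs that are correct.

    Assumption: MP is plain precision; this is not taken from the paper.
    """
    predicted = set(predicted)
    if not predicted:
        return 0.0  # no predictions: report zero rather than divide by zero
    return len(predicted & set(ground_truth)) / len(predicted)


# Toy data: four predicted pairs, three of which appear in the ground truth.
predicted = [(0, 0), (1, 2), (2, 1), (3, 5)]
ground_truth = [(0, 0), (1, 2), (2, 1), (3, 3)]
mp = matching_precision(predicted, ground_truth)
print(f"MP = {mp:.2%}")
```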

Key takeaway

For computer vision engineers building robust stereo matching systems, U(PM)2 offers a training-free, accurate method for polygon matching, which is useful for urban reconstruction and detailed 3D modeling. Its pipeline is modular and builds on pre-trained components such as SAM and LoFTR. It handles scale variation and local discontinuities, and it reaches leading accuracy at competitive speed without any training.

Key insights

Unsupervised polygon matching for stereo images is achievable by integrating pre-trained models with handcrafted features.

Method

U(PM)2 first detects polygons and points. It then matches them globally using a bidirectional-pyramid strategy with LoFTR, and refines the matches locally with LoJoGM, which applies the Hungarian algorithm to geometric and texture correlations.
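To make the local assignment step concrete, here is a minimal sketch of Hungarian-style matching over a cost that mixes a geometric cue with a texture cue. It is not the LoJoGM implementation: the cost terms (relative area difference, mean absolute descriptor difference), the weights, the `max_cost` threshold, and the `pts`/`tex` record layout are all assumptions made for illustration.

```python
from math import inf


def hungarian(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns a list where entry i is the column assigned to row i.
    """
    n, m = len(cost), len(cost[0])
    u, v = [0.0] * (n + 1), [0.0] * (m + 1)  # row and column potentials
    p = [0] * (m + 1)    # p[j]: row matched to column j (1-based, 0 = free)
    way = [0] * (m + 1)  # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [inf] * (m + 1), [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            match[p[j] - 1] = j - 1
    return match


def polygon_area(pts):
    """Shoelace formula for a simple polygon given as (x, y) vertices."""
    s = sum(x1 * y2 - x2 * y1
            for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]))
    return abs(s) / 2.0


def pair_cost(a, b, w_geo=0.5, w_tex=0.5):
    """Hypothetical cost: relative area gap plus mean descriptor gap."""
    area_a, area_b = polygon_area(a["pts"]), polygon_area(b["pts"])
    geo = abs(area_a - area_b) / max(area_a, area_b, 1e-12)
    tex = sum(abs(x - y) for x, y in zip(a["tex"], b["tex"])) / len(a["tex"])
    return w_geo * geo + w_tex * tex


def match_polygons(left, right, max_cost=0.5):
    """Return sorted (left_index, right_index) pairs with cost <= max_cost."""
    if not left or not right:
        return []
    swapped = len(left) > len(right)  # hungarian() needs rows <= columns
    rows, cols = (right, left) if swapped else (left, right)
    cost = [[pair_cost(r, c) for c in cols] for r in rows]
    pairs = []
    for i, j in enumerate(hungarian(cost)):
        if cost[i][j] <= max_cost:
            pairs.append((j, i) if swapped else (i, j))
    return sorted(pairs)


# Toy polygons: two in the left image, three in the right image.
left = [
    {"pts": [(0, 0), (4, 0), (4, 4), (0, 4)], "tex": [0.2, 0.8]},
    {"pts": [(10, 0), (12, 0), (12, 1), (10, 1)], "tex": [0.9, 0.1]},
]
right = [
    {"pts": [(7, 0), (9, 0), (9, 1), (7, 1)], "tex": [0.85, 0.15]},
    {"pts": [(-3, 0), (1, 0), (1, 4), (-3, 4)], "tex": [0.25, 0.75]},
    {"pts": [(20, 20), (21, 20), (21, 21), (20, 21)], "tex": [0.5, 0.5]},
]
print(match_polygons(left, right))
```

The threshold lets a polygon stay unmatched when its best partner is still a poor fit, which matters when one view contains regions the other does not. If SciPy is available, `scipy.optimize.linear_sum_assignment` solves the same assignment problem and could replace the hand-written solver.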

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.