MAPRPose: Mask-Aware Proposal and Amodal Refinement for Multi-Object 6D Pose Estimation

Source: Computer Vision and Pattern Recognition · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Computer Vision & Pattern Recognition · Depth: Expert, quick

Summary

MAPRPose is a two-stage framework for multi-object 6D pose estimation that targets severe occlusion and sensor noise in cluttered scenes. The first stage, Mask-Aware Pose Proposal (MAPP), lifts 2D correspondences into 3D space to establish reliable 3D keypoint matches, generates geometrically consistent pose hypotheses from them, and keeps the top-K candidates ranked by correspondence-level scoring. The second stage refines those candidates with an Amodal Mask Prediction and ROI Re-Alignment (AMPR) module inside a tensorized render-and-compare pipeline. AMPR reconstructs complete object geometry and dynamically adjusts the Region of Interest (ROI), reducing localization error and spatial misalignment under heavy occlusion. GPU-accelerated RGB-XYZ reprojection lets MAPRPose refine multiple pose hypotheses simultaneously. On the BOP benchmark, MAPRPose achieves an Average Recall (AR) of 76.5%, surpassing FoundationPose by 3.1% AR, with a 43x speedup in multi-object inference.
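The proposal stage described above rests on two generic operations: lifting masked 2D matches into 3D, and keeping the K best-scoring hypotheses. The sketch below illustrates both under simple assumptions (a pinhole camera, a metric depth map, a binary object mask). It is not the paper's implementation; the function names `backproject`, `lift_matches`, and `top_k` are hypothetical, and the scores are assumed to be given.

```python
import heapq
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]
Pixel = Tuple[int, int]  # (u, v) = (column, row)


def backproject(u: float, v: float, depth: float,
                fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Pinhole back-projection of pixel (u, v) at the given depth into camera coordinates."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def lift_matches(matches: Sequence[Tuple[Pixel, Point3D]],
                 depth_map: Sequence[Sequence[float]],
                 mask: Sequence[Sequence[bool]],
                 fx: float, fy: float, cx: float, cy: float
                 ) -> List[Tuple[Point3D, Point3D]]:
    """Turn 2D-3D matches (pixel, model point) into 3D-3D matches.

    Matches are dropped when the pixel lies outside the object mask or has no
    valid depth (assumed here to be encoded as a value <= 0).
    """
    lifted = []
    for (u, v), model_point in matches:
        if not mask[v][u]:
            continue
        depth = depth_map[v][u]
        if depth <= 0:
            continue
        lifted.append((backproject(u, v, depth, fx, fy, cx, cy), model_point))
    return lifted


def top_k(hypotheses: Sequence, scores: Sequence[float], k: int) -> List:
    """Return the k hypotheses with the highest scores, best first."""
    best = heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)
    return [hypotheses[i] for i in best]
```

In a full pipeline, the 3D-3D matches returned by `lift_matches` would feed a rigid-alignment solver to produce pose hypotheses, and `top_k` would keep the candidates passed on to refinement. How MAPRPose computes its correspondence-level scores is not specified in this summary.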

Key takeaway

For research scientists building 6D object pose estimation systems, MAPRPose shows that combining mask-aware proposal generation with amodal refinement improves both occlusion handling and inference speed. Consider integrating these two techniques into your pipeline if you work with complex, cluttered scenes: on the BOP benchmark the approach reaches 76.5% AR, 3.1% AR above FoundationPose, with a 43x speedup in multi-object inference.

Key insights

MAPRPose enhances 6D pose estimation by combining mask-aware proposals with amodal refinement for robust occlusion handling.

Method

MAPRPose uses a two-stage process: Mask-Aware Pose Proposal (MAPP) for initial hypotheses, followed by Amodal Mask Prediction and ROI Re-Alignment (AMPR) for robust refinement via render-and-compare.
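To make the ROI re-alignment idea concrete, the sketch below recomputes a crop window from an amodal (complete-object) mask rather than from the visible mask alone. It is a minimal illustration, not the AMPR module itself: the amodal mask is assumed to be already predicted, and `mask_bbox`, `realign_roi`, and the `pad` parameter are hypothetical names.

```python
from typing import Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive pixel bounds


def mask_bbox(mask: Sequence[Sequence[int]]) -> Optional[Box]:
    """Tight bounding box of the nonzero pixels in a 2D mask, or None if it is empty."""
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def realign_roi(amodal_mask: Sequence[Sequence[int]], pad: int = 1) -> Optional[Box]:
    """ROI covering the amodal mask, padded by `pad` pixels and clipped to the image."""
    box = mask_bbox(amodal_mask)
    if box is None:
        return None
    height, width = len(amodal_mask), len(amodal_mask[0])
    x0, y0, x1, y1 = box
    return (max(0, x0 - pad), max(0, y0 - pad),
            min(width - 1, x1 + pad), min(height - 1, y1 + pad))
```

When an object is heavily occluded, a box fitted to the visible mask covers only a fragment of it, so a crop centered on that box is offset from the true object extent. Fitting the ROI to the amodal mask instead is one simple way to reduce that misalignment before render-and-compare.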

Best for: Research Scientist, AI Scientist, Computer Vision Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.