MAPRPose: Mask-Aware Proposal and Amodal Refinement for Multi-Object 6D Pose Estimation
Summary
MAPRPose is a two-stage framework for multi-object 6D pose estimation that targets severe occlusion and sensor noise in cluttered scenes. The first stage, Mask-Aware Pose Proposal (MAPP), lifts 2D correspondences into 3D space to establish reliable 3D keypoint matches, generates geometrically consistent pose hypotheses from them, and selects the top-K candidates by correspondence-level scoring. The refinement stage adds an Amodal Mask Prediction and ROI Re-Alignment (AMPR) module to a tensorized render-and-compare pipeline. AMPR reconstructs complete object geometry and dynamically adjusts the Region-of-Interest to reduce localization errors and spatial misalignment under heavy occlusion. MAPRPose also features GPU-accelerated RGB-XYZ reprojection, which allows multiple pose hypotheses to be refined simultaneously. On the BOP benchmark, MAPRPose achieves an Average Recall (AR) of 76.5%, surpassing FoundationPose by 3.1% AR, and reports a 43x speedup in multi-object inference.
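As a rough illustration of the lifting step in MAPP, the sketch below back-projects a 2D pixel with a depth value into a 3D camera-frame point. The pinhole camera model, the function name `backproject`, and the intrinsics values are assumptions made for illustration; the summary does not specify how MAPRPose performs the lifting.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with metric depth into a 3D camera-frame point.

    Assumes an ideal pinhole camera (no lens distortion); fx, fy are focal
    lengths in pixels and (cx, cy) is the principal point.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def lift_correspondences(pixels, depths, intrinsics):
    """Lift a list of (u, v) pixels, skipping invalid (non-positive) depths."""
    fx, fy, cx, cy = intrinsics
    points = []
    for (u, v), d in zip(pixels, depths):
        if d > 0:  # sensor noise often shows up as zero or negative depth
            points.append(backproject(u, v, d, fx, fy, cx, cy))
    return points


# Hypothetical intrinsics for a 640x480 sensor.
K = (600.0, 600.0, 320.0, 240.0)
lifted = lift_correspondences([(320, 240), (920, 240), (10, 10)], [2.0, 2.0, 0.0], K)
```

Dropping pixels with invalid depth before forming 3D matches is one simple way to keep noisy sensor readings out of the pose hypotheses.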
Key takeaway
For research scientists developing 6D object pose estimation systems, MAPRPose shows that combining mask-aware proposal generation with amodal refinement improves both occlusion handling and inference speed. Consider integrating these techniques into your pipeline when working with cluttered, heavily occluded scenes: on the BOP benchmark the method reports 76.5% AR, 3.1% AR above FoundationPose, along with a 43x speedup in multi-object inference.
Key insights
MAPRPose enhances 6D pose estimation by combining mask-aware proposals with amodal refinement for robust occlusion handling.
Principles
- Leverage mask-aware correspondences for robust pose proposals.
- Reconstruct complete object geometry for amodal refinement.
Method
MAPRPose uses a two-stage process: Mask-Aware Pose Proposal (MAPP) for initial hypotheses, followed by Amodal Mask Prediction and ROI Re-Alignment (AMPR) for robust refinement via render-and-compare.
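The hand-off between the two stages, where proposals are ranked and only the top-K reach refinement, can be sketched as follows. This is a minimal sketch under stated assumptions: the inlier-count score, the tolerance `tol`, and the names `inlier_score` and `select_top_k` are hypothetical stand-ins, since the summary only says that candidates are selected by correspondence-level scoring.

```python
import math


def transform(R, t, p):
    """Apply rotation R (3x3 nested list) and translation t to point p."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def inlier_score(R, t, model_pts, scene_pts, tol=0.01):
    """Count 3D correspondences that the pose (R, t) explains within tol."""
    score = 0
    for m, s in zip(model_pts, scene_pts):
        if math.dist(transform(R, t, m), s) < tol:
            score += 1
    return score


def select_top_k(hypotheses, model_pts, scene_pts, k, tol=0.01):
    """Return [(hypothesis_index, score)] for the k best-scoring poses."""
    scored = [
        (inlier_score(R, t, model_pts, scene_pts, tol), idx)
        for idx, (R, t) in enumerate(hypotheses)
    ]
    # Highest score first; ties are broken by the original hypothesis order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(idx, score) for score, idx in scored[:k]]


I = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
# The scene is the model shifted by 0.5 along x; the last match is an outlier.
scene = [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0), (9.0, 9.0, 9.0)]
hyps = [(I, (0.0, 0.0, 0.0)), (I, (0.5, 0.0, 0.0)), (I, (0.5, 0.0, 0.005))]
best = select_top_k(hyps, model, scene, k=2)
```

Keeping several candidates rather than a single best pose leaves the refinement stage room to recover when the top-scoring proposal is misled by occlusion.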
In practice
- Use 3D keypoint matches to obtain geometrically consistent pose hypotheses.
- Adjust the ROI dynamically to reduce localization errors caused by occlusion.
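To make the ROI adjustment point concrete, the sketch below derives a padded square crop from a binary mask. It assumes the mask is a nested list of 0/1 values and that the ROI is a simple padded bounding square; the names `mask_bbox` and `square_roi` and the padding factor are hypothetical, and the actual AMPR module may re-align the ROI differently.

```python
def mask_bbox(mask):
    """Return inclusive (x_min, y_min, x_max, y_max) of nonzero pixels, or None."""
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # empty mask: no ROI can be derived
    return (min(xs), min(ys), max(xs), max(ys))


def square_roi(mask, pad=0.5):
    """Return (center_x, center_y, side) of a padded square crop around the mask."""
    box = mask_bbox(mask)
    if box is None:
        return None
    x_min, y_min, x_max, y_max = box
    width = x_max - x_min + 1
    height = y_max - y_min + 1
    side = max(width, height) * (1.0 + pad)
    return ((x_min + x_max) / 2, (y_min + y_max) / 2, side)


# Visible mask: the left half of the object is hidden by an occluder.
visible = [[0] * 8 for _ in range(6)]
# Amodal mask: the full object extent, including the occluded part.
amodal = [[0] * 8 for _ in range(6)]
for y in range(1, 4):
    for x in range(4, 6):
        visible[y][x] = 1
    for x in range(2, 6):
        amodal[y][x] = 1

roi_visible = square_roi(visible)
roi_amodal = square_roi(amodal)
```

In this toy case the crop centered on the visible mask is shifted toward the unoccluded side, while the crop from the amodal mask is centered on the full object, which is the kind of misalignment a re-aligned ROI is meant to reduce.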
Topics
- 6D Pose Estimation
- Multi-Object Pose
- MAPRPose
- Mask-Aware Proposal
- Amodal Refinement
Best for: Research Scientist, AI Scientist, Computer Vision Engineer, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.