SAM 3D - AI at Meta

· Source: ai.meta.com via Google News · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Computer Vision · Depth: Intermediate, quick

Summary

Meta has introduced SAM 3D, a pair of models, SAM 3D Objects and SAM 3D Body, that reconstruct 3D content from a single 2D image. SAM 3D Objects reconstructs the detailed 3D geometry and texture of a masked object, stays robust under occlusion, and produces output suitable for manipulation. SAM 3D Body estimates human body shape and pose, even when the person is only partially visible, and its output can likewise be manipulated. Both models can jointly reconstruct multiple objects and people within a shared scene context. SAM 3D Body uses a transformer-based encoder-decoder to predict pose and mesh parameters, while SAM 3D Objects uses two stages of diffusion transformers (DiTs) covering shape, pose, texture, and detail refinement. Meta claims that SAM 3D achieves state-of-the-art performance across various benchmarks and positions it for practical applications such as AR shopping on Facebook Marketplace, physical therapy, and robotics.

Key takeaway

For machine learning engineers building AR/VR or robotics applications, SAM 3D offers robust 3D reconstruction from single images. Consider integrating SAM 3D Objects and SAM 3D Body to build interactive 3D experiences, such as visualizing products in real-world environments or supporting more sophisticated human-robot interaction. Where an application needs to understand a complex scene, the joint reconstruction feature is worth evaluating.
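To make the idea of a shared scene context concrete, here is a minimal sketch of how separately reconstructed objects could be placed into one scene coordinate frame. It assumes each object comes with vertices in its own local frame plus a pose made of a yaw rotation, a uniform scale, and a translation. The names Pose, to_scene, and compose_scene are hypothetical illustrations and are not part of any SAM 3D API; the actual output format of the models is not described in this summary.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Pose:
    """Hypothetical per-object pose (an assumption, not SAM 3D's format)."""
    yaw_rad: float      # rotation about the vertical (y) axis, in radians
    scale: float        # uniform scale factor
    translation: Vec3   # position of the object origin in the scene frame


def to_scene(vertices: List[Vec3], pose: Pose) -> List[Vec3]:
    """Map object-local vertices into the scene frame: rotate, scale, translate."""
    c, s = math.cos(pose.yaw_rad), math.sin(pose.yaw_rad)
    tx, ty, tz = pose.translation
    placed = []
    for x, y, z in vertices:
        # Rotation about the y axis leaves y unchanged.
        xr = c * x + s * z
        zr = -s * x + c * z
        placed.append((pose.scale * xr + tx, pose.scale * y + ty, pose.scale * zr + tz))
    return placed


def compose_scene(objects: Dict[str, Tuple[List[Vec3], Pose]]) -> Dict[str, List[Vec3]]:
    """Place every reconstructed object into the same scene coordinate frame."""
    return {name: to_scene(verts, pose) for name, (verts, pose) in objects.items()}
```

A caller would pass something like compose_scene({"chair": (chair_vertices, chair_pose), "lamp": (lamp_vertices, lamp_pose)}) and render or collide the returned vertex lists together, since they now share one frame.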

Key insights

Meta's SAM 3D models reconstruct detailed 3D objects and human bodies from single 2D images.

Principles

Method

SAM 3D Body uses a transformer encoder-decoder to predict human pose and mesh parameters; SAM 3D Objects uses two diffusion transformer (DiT) stages for object shape, pose, texture, and detail refinement.
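As a rough illustration of the encoder-decoder idea, the sketch below shows the core operation a transformer decoder uses to read encoder output: a query vector attends over encoded image tokens with scaled dot-product cross-attention, and a linear head maps the attended features to a parameter vector. This is a generic, single-head textbook version in plain Python, not Meta's implementation; the function names are hypothetical, and a real model would add learned projections, multiple heads, and many stacked layers.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query: Vector, keys: List[Vector], values: List[Vector]) -> Vector:
    """Single-head scaled dot-product attention of one query over encoder tokens."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # Output is the attention-weighted average of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def linear_head(features: Vector, weight: List[Vector], bias: Vector) -> Vector:
    """Map attended features to output parameters (e.g. pose or shape values)."""
    return [sum(w * f for w, f in zip(row, features)) + b for row, b in zip(weight, bias)]
```

In this picture the keys and values stand in for encoded image tokens, the query stands in for a decoder token that asks for pose information, and the linear head turns the attended feature into the numbers that parameterize a body mesh.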

Best for: Machine Learning Engineer, Research Scientist, AI Scientist, Computer Vision Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by ai.meta.com via Google News.