MonoPhysics: Estimating Geometry, Appearance, and Physical Parameters from Monocular Videos

2026-05-28 · Source: Computer Vision and Pattern Recognition · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Computer Vision, Robotics & Autonomous Systems · Depth: Expert, quick

Summary

MonoPhysics is a framework for monocular inverse physics estimation of deformable objects. It targets the main limitations of single-camera input, namely scale ambiguity and inaccurate geometry. It couples differentiable Material Point Method (MPM) simulation with 3D Gaussian Splatting to jointly optimize an object's geometry, appearance, and physical parameters from a single video. Three "visual-physical bridges" support this optimization: global scale alignment, physics-aware geometry refinement, and a differentiable position map. On the Vid2Sim dataset and a new dataset of elastic and plastic objects, MonoPhysics outperforms existing monocular baselines and is reported to reach results comparable to multi-view methods while using only one camera.

Key takeaway

For computer vision engineers building systems that analyze deformable objects, MonoPhysics shows a way around the usual limits of monocular data. It estimates geometry, appearance, and physical parameters from single-camera video, which removes the need for a multi-view capture rig. If your pipeline must infer physical properties from limited visual input, consider pairing differentiable MPM simulation with 3D Gaussian Splatting, as the paper does.

Key insights

MonoPhysics accurately estimates deformable object physics, geometry, and appearance from monocular video using differentiable simulation and 3D Gaussian Splatting.

Principles

Monocular inverse physics needs visual-physical bridges.
Joint optimization improves accuracy from single views.
Global scale alignment addresses monocular scale ambiguity.
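
To make the scale-ambiguity point concrete: a single camera recovers depth only up to an unknown global factor. The summary does not say how MonoPhysics computes its global scale alignment, so the sketch below is only a generic illustration, assuming a few reference measurements in metric units are available. It fits one scalar by closed-form least squares. The function name and the sample values are hypothetical.

```python
def least_squares_scale(relative, metric):
    """Return the scalar s that minimizes sum((s * r_i - m_i) ** 2).

    relative: up-to-scale values, e.g. depths from a monocular estimator
    metric:   the same quantities in metric units (assumed to be available)
    """
    if len(relative) != len(metric):
        raise ValueError("inputs must have the same length")
    numerator = sum(r * m for r, m in zip(relative, metric))
    denominator = sum(r * r for r in relative)
    if denominator == 0.0:
        raise ValueError("relative values must not all be zero")
    return numerator / denominator


# Hypothetical data: the metric values are exactly 2.5x the relative ones.
relative_depths = [1.0, 2.0, 4.0]
metric_depths = [2.5, 5.0, 10.0]
scale = least_squares_scale(relative_depths, metric_depths)
aligned = [scale * r for r in relative_depths]
print(scale, aligned)
```

A single global factor is the simplest possible alignment model. It cannot correct geometry that is wrong in shape rather than in size, which is presumably why the paper lists geometry refinement as a separate component.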

Method

MonoPhysics jointly optimizes geometry, appearance, and physical parameters by coupling differentiable MPM simulation with 3D Gaussian Splatting. Three components connect the visual and physical sides: global scale alignment, physics-aware geometry refinement, and a differentiable position map.
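
As a rough illustration of inverse physics through a differentiable simulator, the sketch below recovers a single stiffness value from an observed trajectory. It is a toy under stated assumptions, not the MonoPhysics pipeline: a 1D mass-spring system stands in for MPM, simulated positions stand in for video observations, and the gradient is propagated by hand alongside the rollout. All names and constants (`simulate`, `fit_stiffness`, the time step, the learning rate) are hypothetical.

```python
def simulate(k, steps=200, dt=0.01, x0=1.0, v0=0.0, mass=1.0):
    """Roll out a 1D mass-spring with semi-implicit Euler.

    Returns the positions and their sensitivities dx/dk, which are
    propagated through every update (forward-mode differentiation).
    """
    x, v = x0, v0
    dx_dk, dv_dk = 0.0, 0.0
    positions, sensitivities = [], []
    for _ in range(steps):
        # Velocity update v <- v - dt * k * x / m, differentiated w.r.t. k.
        dv_dk = dv_dk - dt * (x + k * dx_dk) / mass
        v = v - dt * k * x / mass
        # Position update x <- x + dt * v, using the new velocity.
        dx_dk = dx_dk + dt * dv_dk
        x = x + dt * v
        positions.append(x)
        sensitivities.append(dx_dk)
    return positions, sensitivities


def loss_and_grad(k, observed):
    """Mean squared position error and its derivative w.r.t. k."""
    xs, dxs = simulate(k, steps=len(observed))
    n = len(observed)
    loss = sum((x - o) ** 2 for x, o in zip(xs, observed)) / n
    grad = sum(2.0 * (x - o) * d for x, o, d in zip(xs, observed, dxs)) / n
    return loss, grad


def fit_stiffness(observed, k_init=2.0, lr=2.0, iters=500):
    """Plain gradient descent on the stiffness parameter."""
    k = k_init
    for _ in range(iters):
        _, grad = loss_and_grad(k, observed)
        k -= lr * grad
    return k


# Synthetic "observation": a rollout with a known stiffness (assumed value).
K_TRUE = 4.0
observed, _ = simulate(K_TRUE)
k_est = fit_stiffness(observed)
final_loss, _ = loss_and_grad(k_est, observed)
print(f"true k = {K_TRUE}, estimated k = {k_est:.4f}, loss = {final_loss:.2e}")
```

The loop has three steps: roll out, compare with observations, and follow the gradient back to the physical parameter. Differentiability is what makes the last step possible. A full system would replace each piece with a much richer counterpart and would optimize geometry and appearance alongside the physical parameters.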

In practice

Apply to elastic and plastic object analysis.
Use single-camera video for physics estimation.
Reconstruct 3D properties from 2D observations.

Topics

MonoPhysics
Inverse Physics
Monocular Video
Deformable Objects
Differentiable Simulation
3D Gaussian Splatting

Best for: Research Scientist, AI Scientist, Computer Vision Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.