It's all about the angle: Your photos, re-composed
Summary
Google DeepMind and Google Platforms & Devices have introduced a new image editing approach, now integrated into the Auto frame feature in Google Photos, as announced on April 22, 2026. This method allows users to re-imagine photos from a new perspective after they have been taken, interpreting a standard 2D photo as a 3D scene. Unlike traditional editing tools, it uses machine learning models to understand the scene's spatial layout and generative AI to create new perspectives, including previously hidden content. The two-stage process involves 3D scene and camera estimation, followed by generative inpainting and retouching using a latent diffusion model. This enables automatic adjustment of camera pose and focal length, and correction of wide-angle lens distortions, particularly beneficial for portraits.
Key takeaway
For AI Product Managers evaluating new photo editing capabilities, this Google Photos update goes beyond traditional cropping: it changes the apparent camera pose and focal length after capture and generates the content that the new view reveals. Your teams should explore integrating 3D scene understanding and generative inpainting to offer users more dynamic post-capture adjustments. Consider how similar two-stage ML pipelines could improve the user experience by correcting common photographic imperfections such as perspective distortion, giving users a single-action fix for "almost perfect" shots.
Key insights
A new Google Photos feature uses ML and generative AI to interpret 2D photos as 3D scenes, so they can be re-composed from a new perspective after capture.
Principles
- Decouple 3D estimation from image formation for faithful manipulation.
- Use classifier guidance to preserve original content during generation.
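The second principle can be illustrated with the textbook form of classifier guidance, in which a scaled gradient of a log-likelihood term is added to the model's score so that samples are pulled toward agreement with a condition. The sketch below is a toy 1-D Gaussian case and not the Auto frame implementation: the standard-normal prior, the `observed` value, the noise level `sigma_obs`, the `scale` parameter, and all function names are illustrative assumptions.

```python
def prior_score(x):
    # Score (gradient of the log density) of a standard normal prior N(0, 1).
    return -x


def likelihood_score(x, observed, sigma_obs):
    # Gradient w.r.t. x of log N(observed | x, sigma_obs^2): pulls x toward
    # the observed value, i.e. toward the content we want to preserve.
    return (observed - x) / sigma_obs ** 2


def guided_score(x, observed, sigma_obs, scale=1.0):
    # Classifier guidance: unconditional score + scale * likelihood gradient.
    return prior_score(x) + scale * likelihood_score(x, observed, sigma_obs)


def ascend(x0, observed, sigma_obs, scale=1.0, step=0.05, iters=2000):
    # Noise-free gradient ascent on the guided log density, used here only
    # to locate its mode; a real sampler would also inject noise.
    x = x0
    for _ in range(iters):
        x = x + step * guided_score(x, observed, sigma_obs, scale)
    return x


weak = ascend(0.0, observed=2.0, sigma_obs=0.5, scale=1.0)
strong = ascend(0.0, observed=2.0, sigma_obs=0.5, scale=4.0)
print(weak, strong)
```

In this toy setting a larger `scale` moves the result closer to the observed value, which mirrors the trade-off between staying faithful to the original pixels and letting the generative prior decide.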
Method
The method involves two stages: (1) 3D scene and camera estimation using a point map model, and (2) generative inpainting and retouching with a latent diffusion model to fill newly revealed areas.
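To make the hand-off between the two stages concrete, here is a minimal NumPy sketch of re-rendering a photo from a virtual camera once per-pixel 3D points are available. It assumes a pinhole camera and simple forward splatting with a depth buffer; the summary does not say how rendering is actually done, and the helper names `backproject` and `reproject`, as well as the flat synthetic depth standing in for a point map model's output, are hypothetical. The returned hole mask marks the newly revealed areas that a generative inpainting model would then have to fill.

```python
import numpy as np


def backproject(depth, K):
    """Turn a depth map (H, W) into a point map (H, W, 3) in camera coordinates."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    x = (u - K[0, 2]) / K[0, 0] * depth
    y = (v - K[1, 2]) / K[1, 1] * depth
    return np.stack([x, y, depth], axis=-1)


def reproject(image, points, K_new, R, t):
    """Forward-splat pixels into a virtual camera.

    R (3, 3) and t (3,) map original-camera coordinates to new-camera
    coordinates; K_new holds the new intrinsics (e.g. a new focal length).
    Returns the re-rendered image and a boolean mask of unfilled pixels.
    """
    H, W, C = image.shape
    pts = points.reshape(-1, 3) @ R.T + t
    colors = image.reshape(-1, C)

    # Keep only points in front of the new camera.
    front = pts[:, 2] > 1e-6
    pts, colors = pts[front], colors[front]

    # Pinhole projection to integer pixel coordinates.
    uvw = pts @ K_new.T
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
    inside = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    flat = v[inside] * W + u[inside]
    z, colors = pts[inside, 2], colors[inside]

    # Depth buffer: for each target pixel keep the nearest source point.
    depth_buf = np.full(H * W, np.inf)
    np.minimum.at(depth_buf, flat, z)
    nearest = z == depth_buf[flat]

    out = np.zeros((H * W, C), dtype=image.dtype)
    out[flat[nearest]] = colors[nearest]
    holes = np.isinf(depth_buf).reshape(H, W)
    return out.reshape(H, W, C), holes


# Tiny synthetic example: a flat scene 2 units away, camera shifted sideways.
H, W = 4, 4
K = np.array([[4.0, 0.0, 1.5], [0.0, 4.0, 1.5], [0.0, 0.0, 1.0]])
image = np.arange(H * W * 3, dtype=np.float64).reshape(H, W, 3)
points = backproject(np.full((H, W), 2.0), K)
shifted, holes = reproject(image, points, K, np.eye(3), np.array([1.0, 0.0, 0.0]))
print(holes.astype(int))
```

Forward splatting is the simplest choice for a sketch; it leaves gaps wherever the new view shows something the original camera never saw, which is exactly the region stage two is described as filling.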
In practice
- Automatically adjust camera viewpoint in portraits.
- Correct wide-angle lens distortions in selfies.
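One way to see why changing camera pose and focal length together can help portraits is plain pinhole geometry: when the camera is close to a face, nearer features are magnified more than farther ones, and moving the virtual camera back while lengthening the focal length keeps the subject the same size but evens out that magnification. The sketch below assumes this is the kind of distortion being corrected; it is not taken from the Auto frame models, and the function names and depth values are made up for illustration.

```python
def compensated_focal_length(focal_length, subject_depth, pull_back):
    # Pinhole image size is proportional to focal_length / depth, so keeping
    # the subject the same size after moving the camera back by `pull_back`
    # requires scaling the focal length by (depth + pull_back) / depth.
    return focal_length * (subject_depth + pull_back) / subject_depth


def near_to_far_magnification(near_depth, far_depth, pull_back=0.0):
    # How much larger a near feature (e.g. the nose) is drawn relative to a
    # far feature (e.g. the ears); 1.0 would mean no perspective exaggeration.
    return (far_depth + pull_back) / (near_depth + pull_back)


# Hypothetical arm's-length selfie: nose at 0.30 m, ears at 0.40 m.
before = near_to_far_magnification(0.30, 0.40)
after = near_to_far_magnification(0.30, 0.40, pull_back=1.0)
new_focal = compensated_focal_length(26.0, subject_depth=0.35, pull_back=1.0)
print(before, after, new_focal)
```

The ratio moves toward 1.0 as the virtual camera pulls back, which is the geometric effect a portrait correction of this kind would aim for.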
Topics
- Google Photos
- Auto frame
- Generative AI
- 3D Scene Reconstruction
- Latent Diffusion Models
Best for: Computer Vision Engineer, Research Scientist, AI Product Manager, AI Scientist, Machine Learning Engineer, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by The latest research from Google.