FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning

Source: Computer Vision and Pattern Recognition · Field: Technology & Digital, Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Advanced, quick

Summary

FaceCam generates video along user-specified camera trajectories from a monocular human portrait video. Existing large video-generation models often produce geometric distortions and visual artifacts when the camera is controlled; FaceCam addresses this with a face-tailored, scale-aware representation of camera transformations, which provides deterministic conditioning without relying on error-prone 3D priors. The video generation model is trained on both multi-view studio captures and in-the-wild monocular videos. To generalize to dynamic camera trajectories, FaceCam uses two data generation strategies: synthetic camera motion and multi-shot stitching. Evaluations on the Ava-256 dataset and on a range of in-the-wild videos indicate superior camera controllability, visual quality, and preservation of identity and motion.

Key takeaway

For computer vision engineers building portrait video applications, FaceCam suggests a concrete recipe for camera control: condition the generator on a scale-aware, face-tailored camera representation rather than on estimated 3D priors, and extend the training data with synthetic camera motion and multi-shot stitching. The reported benefits are fewer geometric distortions and artifacts, with identity and motion better preserved in the generated video.

Key insights

FaceCam conditions portrait video generation on a scale-aware, face-tailored camera representation, which gives deterministic camera control without depending on error-prone 3D priors.

Principles

Method

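FaceCam trains a video generation model on multi-view studio captures and in-the-wild monocular videos. Two data generation strategies, synthetic camera motion and multi-shot stitching, supply varied camera-control training data so that the model can follow dynamic camera trajectories at inference time.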
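
The summary does not say how either data generation strategy is implemented, so the following is only a minimal sketch of one plausible reading, not the authors' pipeline. It assumes that synthetic camera motion can be approximated by a time-varying centred crop of a static-camera clip, and that multi-shot stitching means concatenating time segments from different synchronized views. The function names, parameters, and nearest-neighbour resampling are all hypothetical choices made for illustration.

```python
import numpy as np


def synthetic_camera_motion(frames, start_scale=1.0, end_scale=0.6, out_size=64):
    """Simulate a zoom on a static-camera clip (assumed reading of the strategy).

    frames: array of shape (T, H, W, C).
    Returns the resampled clip and the per-frame crop scale, which could
    serve as a per-frame camera label.
    """
    num_frames, height, width = frames.shape[:3]
    scales = np.linspace(start_scale, end_scale, num_frames)
    out = np.empty((num_frames, out_size, out_size) + frames.shape[3:], dtype=frames.dtype)
    for t, scale in enumerate(scales):
        crop_h = max(1, int(round(height * scale)))
        crop_w = max(1, int(round(width * scale)))
        top = (height - crop_h) // 2
        left = (width - crop_w) // 2
        # Nearest-neighbour resampling keeps the sketch dependency-free.
        rows = top + (np.arange(out_size) * crop_h // out_size)
        cols = left + (np.arange(out_size) * crop_w // out_size)
        out[t] = frames[t][rows[:, None], cols]
    return out, scales


def multi_shot_stitch(views, switch_points):
    """Stitch one clip from several synchronized views (assumed reading).

    views: list of arrays, each of shape (T, H, W, C), all time-aligned.
    switch_points: increasing frame indices at which the viewpoint changes.
    Returns the stitched clip and the view index used for every frame.
    """
    num_frames = views[0].shape[0]
    bounds = [0, *switch_points, num_frames]
    segments, view_ids = [], []
    for i, (start, end) in enumerate(zip(bounds[:-1], bounds[1:])):
        view = i % len(views)  # cycle through the available views
        segments.append(views[view][start:end])
        view_ids.extend([view] * (end - start))
    return np.concatenate(segments, axis=0), np.array(view_ids)


# Toy data: three 8-frame "views", each filled with its own view index.
views = [np.full((8, 16, 16, 3), v, dtype=np.uint8) for v in range(3)]
stitched, view_ids = multi_shot_stitch(views, switch_points=[3, 6])
zoomed, scales = synthetic_camera_motion(views[0], out_size=8)
```

In this toy setup each output frame carries a known label (a crop scale or a view index), which illustrates how such generated clips could pair video frames with a per-frame camera signal for training. A real pipeline would use actual camera parameters and proper image resampling.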

In practice

Best for: Research Scientist, AI Researcher, AI Scientist, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.