TIGER: Taming Identity, Geometry, and Generative Priors for High-Quality Face Video Restoration
Summary
TIGER is a structured tri-prior fusion framework for high-quality Face Video Restoration (FVR). It targets three common problems: identity shift, viewpoint-entangled guidance, and limited perceptual realism. Developed by Wenxue Li, Peng Zhang, Yifei Chen, Fei Wang, Daiguo Zhou, and others, TIGER combines three priors. The Identity Prior injects subject-discriminative embeddings into the latent space to anchor identity. The Geometry Prior lifts 2D reference cues into a disentangled 3D parameter space, providing temporally consistent structural guidance. The Generative Prior comes from a video generation model and is applied through a one-step rectified flow for efficiency and realism. A progressive three-stage training strategy refines structural fidelity, textural reconstruction, and distribution-level realism. The authors also built a new large-scale FVR dataset for robust training and evaluation. They report that extensive experiments show TIGER achieving superior identity fidelity and temporal stability, yielding high-quality, efficient, and identity-consistent face video restoration.
Key takeaway
Machine learning engineers building face video restoration systems should consider TIGER's tri-prior fusion approach as a way to counter identity shift and temporal inconsistency. Separate Identity, Geometry, and Generative Priors, combined with progressive three-stage training, can significantly improve fidelity and realism. Two ideas are worth carrying into your own models: disentangled 3D parameters for structural guidance, and a one-step rectified flow for efficient, high-quality output.
Key insights
TIGER fuses identity, geometry, and generative priors to restore high-fidelity, identity-consistent face videos efficiently.
Principles
- Identity anchoring prevents shifts.
- 3D geometry ensures temporal consistency.
- Progressive training refines realism.
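To make the first principle concrete, here is a minimal sketch of one common way to quantify identity shift: compare a face embedding of each restored frame with the embedding of a reference image. This is an illustration only, not TIGER's loss or evaluation protocol. The function names `cosine_similarity` and `identity_drift` are hypothetical, and the embeddings are assumed to come from some face recognition model that is not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identity_drift(reference_embedding, frame_embeddings):
    """Per-frame drift, defined here as 1 - cosine similarity to the reference.

    Assumption: embeddings are produced by an external face recognition
    model. 0.0 means the frame matches the reference identity direction,
    larger values mean the identity has shifted.
    """
    return [
        1.0 - cosine_similarity(reference_embedding, emb)
        for emb in frame_embeddings
    ]


# Toy 2-D embeddings: the first frame matches the reference, the second is orthogonal.
drift = identity_drift([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

A drift curve that stays near zero across frames is what identity anchoring is meant to achieve. A curve that climbs or oscillates indicates identity shift.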
Method
TIGER establishes Identity, Geometry, and Generative Priors, then fuses them. It uses a progressive three-stage training strategy to optimize structural fidelity, textural reconstruction, and distribution-level realism.
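The efficiency argument behind a one-step rectified flow can be sketched generically. A rectified flow transports a start latent to a target along an ODE dx/dt = v(x, t). When the learned paths are straight, a single Euler step lands at the same point as many small steps. The toy below assumes an idealized straight velocity field toward a fixed target. In a real system the velocity would come from a trained network, and nothing here reflects TIGER's actual architecture. `euler_sample` and `make_straight_velocity` are hypothetical names.

```python
def euler_sample(velocity, x_start, num_steps=1):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x_start)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so 1 - t is never zero
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_straight_velocity(target):
    """Idealized straight-line field: always points at `target`,
    scaled so the path arrives exactly at t=1.

    Assumption: stands in for a trained velocity network.
    """
    def velocity(x, t):
        return [(ti - xi) / (1.0 - t) for ti, xi in zip(target, x)]
    return velocity


target = [0.5, -1.0, 2.0]
start = [0.0, 0.0, 0.0]
velocity = make_straight_velocity(target)

one_step = euler_sample(velocity, start, num_steps=1)
four_step = euler_sample(velocity, start, num_steps=4)
```

Both `one_step` and `four_step` arrive at `target` up to floating-point error. On a straight path the extra steps add cost without adding accuracy, which is why one-step sampling is attractive.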
In practice
- Restore degraded face videos.
- Preserve identity in dynamic videos.
- Enhance realism in video generation.
Topics
- Face Video Restoration
- Identity Preservation
- 3D Facial Priors
- Generative Models
- Temporal Consistency
- Rectified Flow
Best for: Research Scientist, AI Scientist, Computer Vision Engineer, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.