CUPID: Reconstructing UV Texture Maps for Interpretable Person-of-Interest Deepfake Detection
Summary
CUPID is a Person-of-Interest (POI) deepfake detector designed to counter deepfakes of high-profile individuals. It combines UV texture maps, derived from 3D face reconstructions, with the representation learning capabilities of a Masked Autoencoder (MAE). A key innovation is that training requires neither deepfake videos nor the specific POIs: CUPID learns discriminative facial features from real video frames. At test time, it matches embeddings from a query video against pristine references to assess authenticity. Experiments across four deepfake datasets show that CUPID outperforms current methods on most of them, is more robust to strong downscaling and compression, and runs inference substantially faster. The system also offers interpretability by highlighting manipulated facial regions via decoded residual maps.
Key takeaway
For AI security engineers and forensic analysts combating sophisticated deepfakes, CUPID offers a robust and interpretable option. Because it detects Person-of-Interest deepfakes without deepfake training data and without the specific POIs in its training set, deployment is simpler. Its UV-space interpretability lets you pinpoint manipulated facial regions during an investigation, and its authenticity assessments are faster and hold up better under post-processing such as downscaling and compression.
Key insights
CUPID uses UV texture maps and MAE for robust, interpretable POI deepfake detection without deepfake training data.
Principles
- 3D face reconstructions yield robust appearance representations.
- Masked Autoencoders learn discriminative features from real data.
- UV space enables interpretable deepfake detection.
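To make the Masked Autoencoder principle concrete, the sketch below shows the generic MAE-style input step: hide a random subset of patches in a UV texture map so that a model must reconstruct them from the visible context. This is an illustration only. The function names, the patch size, and the 0.75 mask ratio are assumptions chosen for the example, not values taken from CUPID.

```python
import numpy as np

def random_patch_mask(height, width, patch_size, mask_ratio, rng):
    """Return a boolean grid (True = masked) over non-overlapping patches.

    All parameters are illustrative assumptions, not CUPID's settings.
    """
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    grid_h, grid_w = height // patch_size, width // patch_size
    n_patches = grid_h * grid_w
    n_masked = int(round(mask_ratio * n_patches))
    flat = np.zeros(n_patches, dtype=bool)
    # Pick the masked patches uniformly at random, without repeats.
    flat[rng.choice(n_patches, size=n_masked, replace=False)] = True
    return flat.reshape(grid_h, grid_w)

def apply_patch_mask(image, patch_mask, patch_size):
    """Zero out the masked patches of an (H, W, C) image."""
    # Expand the patch-level grid to a pixel-level (H, W) mask.
    pixel_mask = np.repeat(np.repeat(patch_mask, patch_size, axis=0),
                           patch_size, axis=1)
    out = np.array(image, dtype=np.float64, copy=True)
    out[pixel_mask] = 0.0
    return out

rng = np.random.default_rng(0)
uv_texture = np.ones((32, 32, 3))  # stand-in for a real UV texture map
mask = random_patch_mask(32, 32, patch_size=8, mask_ratio=0.75, rng=rng)
masked_texture = apply_patch_mask(uv_texture, mask, patch_size=8)
```

In a real pipeline, the masked texture would be fed to an encoder-decoder trained to recover the hidden patches; that network is omitted here.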
Method
CUPID extracts UV texture maps from real video frames, trains a Masked Autoencoder for context-guided reconstruction, then matches query video embeddings against pristine references to assess authenticity.
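The matching step can be sketched as follows. The summary does not state which similarity measure or decision rule CUPID uses, so this example assumes cosine similarity, a best-match-per-frame aggregation, and a fixed threshold; the function names and the threshold value are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row vector to unit length."""
    x = np.asarray(x, dtype=np.float64)
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)

def authenticity_score(query_embeddings, reference_embeddings):
    """Mean over query frames of the best cosine similarity to any reference.

    Assumed scoring rule for illustration; not confirmed by the source.
    """
    q = l2_normalize(query_embeddings)      # (n_query, d)
    r = l2_normalize(reference_embeddings)  # (n_ref, d)
    sims = q @ r.T                          # pairwise cosine similarities
    return float(sims.max(axis=1).mean())

def is_authentic(query_embeddings, reference_embeddings, threshold=0.5):
    """Flag the query as authentic when its score reaches the threshold."""
    return authenticity_score(query_embeddings, reference_embeddings) >= threshold

# Toy embeddings standing in for encoder outputs.
references = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
query = np.array([[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]])
print(is_authentic(query, references, threshold=0.9))
```

In practice the threshold would be calibrated on held-out real videos of the POI rather than fixed by hand.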
In practice
- Detect deepfakes of high-profile individuals.
- Identify manipulated facial regions.
- Authenticate video content without deepfake training data.
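Identifying manipulated regions through residual maps can be illustrated with a simple sketch: compare a UV texture map with its decoded reconstruction and flag the locations where they disagree most. The per-pixel absolute difference and the quantile threshold are assumptions for this example; the summary does not describe how CUPID computes or thresholds its residuals.

```python
import numpy as np

def residual_map(uv_texture, reconstruction):
    """Per-pixel mean absolute difference between two (H, W, C) maps."""
    uv = np.asarray(uv_texture, dtype=np.float64)
    rec = np.asarray(reconstruction, dtype=np.float64)
    if uv.shape != rec.shape:
        raise ValueError("texture and reconstruction must share a shape")
    return np.abs(uv - rec).mean(axis=-1)

def suspicious_mask(residual, quantile=0.95):
    """Mark pixels whose residual exceeds the given quantile (assumed rule)."""
    return residual > np.quantile(residual, quantile)

# Toy example: the reconstruction disagrees with the input in one 2x2 region.
uv_texture = np.zeros((8, 8, 3))
reconstruction = uv_texture.copy()
reconstruction[2:4, 2:4, :] = 1.0

residual = residual_map(uv_texture, reconstruction)
mask = suspicious_mask(residual)
```

Because UV space places each facial region at a fixed location, a mask like this can be read directly as "which part of the face looks inconsistent", which is what makes the residual useful to an analyst.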
Topics
- Deepfake Detection
- Person-of-Interest
- UV Texture Maps
- Masked Autoencoder
- Facial Reconstruction
- Video Authenticity
- Interpretability
Best for: Research Scientist, AI Scientist, AI Security Engineer, Computer Vision Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.