CUPID: Reconstructing UV Texture Maps for Interpretable Person-of-Interest Deepfake Detection

2026-06-18 · Source: Computer Vision and Pattern Recognition · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy · Depth: Expert, quick

Summary

CUPID is a Person-of-Interest (POI) deepfake detector aimed at deepfakes that target high-profile individuals. It combines UV texture maps, derived from 3D face reconstructions, with the representation learning of a Masked Autoencoder (MAE). A key innovation is that training requires neither deepfake videos nor the specific POIs: the model learns discriminative facial features from real video frames alone. At test time, it matches embeddings from a query video against pristine references of the POI to assess authenticity. Experiments across four deepfake datasets show that CUPID outperforms current methods on most of them, is more robust to strong downscaling and compression, and runs substantially faster at inference. The system is also interpretable: decoded residual maps highlight the manipulated facial regions.

Key takeaway

For AI security engineers and forensic analysts who need to verify videos of specific individuals, CUPID offers a detector that is both robust and interpretable. Because it requires neither deepfake training data nor the target POI in the training set, deployment is simpler: authenticity is assessed at test time by matching against pristine reference footage. Its residual maps in UV space pinpoint manipulated facial regions, which supports investigations, and its assessments are faster and hold up better under post-processing such as downscaling and compression.

Key insights

CUPID combines UV texture maps with a Masked Autoencoder (MAE) for robust, interpretable POI deepfake detection, without any deepfake training data.

Principles

3D face reconstructions yield robust appearance representations.
Masked Autoencoders learn discriminative features from real data.
UV space enables interpretable deepfake detection.
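
The masking step behind the second principle can be sketched as follows. This is a minimal illustration under stated assumptions, not CUPID's implementation: the function name random_patch_mask, the 14x14 patch grid, and the 0.75 mask ratio (the default reported for the original MAE) are all hypothetical here, since the summary does not state CUPID's settings.

```python
import random


def random_patch_mask(grid_h, grid_w, mask_ratio=0.75, seed=None):
    """Return a grid_h x grid_w grid of booleans; True marks a masked patch.

    Assumption: patches are masked uniformly at random, as in a standard
    Masked Autoencoder. The encoder would see only the unmasked patches and
    the decoder would reconstruct the masked ones from that context.
    """
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be in [0, 1]")
    rng = random.Random(seed)
    num_patches = grid_h * grid_w
    num_masked = int(round(num_patches * mask_ratio))
    # Sample patch indices without replacement.
    masked = set(rng.sample(range(num_patches), num_masked))
    return [
        [(row * grid_w + col) in masked for col in range(grid_w)]
        for row in range(grid_h)
    ]


# Hypothetical example: a UV texture map split into a 14x14 patch grid.
mask = random_patch_mask(14, 14, mask_ratio=0.75, seed=0)
num_hidden = sum(cell for row in mask for cell in row)
```

With these assumed values, 147 of the 196 patches are hidden, so the model has to reconstruct most of the texture from the visible remainder.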

Method

CUPID extracts UV texture maps from real video frames, trains a Masked Autoencoder for context-guided reconstruction, then matches query video embeddings against pristine references to assess authenticity.
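
The final matching step can be sketched as below. This is an illustration only: the summary says that query embeddings are matched against pristine references but does not specify how, so the cosine similarity, the nearest-reference rule, the mean over frames, the 0.8 threshold, and all function names are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def authenticity_score(query_embeddings, reference_embeddings):
    """Average, over query frames, of the best similarity to any reference.

    Assumption: one embedding per frame, compared with the closest
    pristine reference embedding of the same person.
    """
    if not query_embeddings or not reference_embeddings:
        raise ValueError("need at least one query and one reference embedding")
    best = [
        max(cosine_similarity(q, r) for r in reference_embeddings)
        for q in query_embeddings
    ]
    return sum(best) / len(best)


def is_likely_fake(query_embeddings, reference_embeddings, threshold=0.8):
    """Flag the query video when it is too dissimilar from the references.

    The threshold is hypothetical and would need calibration on real data.
    """
    return authenticity_score(query_embeddings, reference_embeddings) < threshold


# Toy 3-dimensional embeddings, purely illustrative.
references = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]
consistent_query = [[1.0, 0.0, 0.0]]
inconsistent_query = [[0.0, 1.0, 0.0]]
```

Because only pristine references of the POI are needed at this stage, a design of this kind requires no deepfake examples to decide, which is consistent with the training setup described above.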

In practice

Detect deepfakes of high-profile individuals.
Identify manipulated facial regions.
Authenticate video content without deepfake training data.
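
Locating manipulated regions with a residual map can be sketched as follows. This is a simplified stand-in, not CUPID's decoder pipeline: it assumes two already-decoded, grayscale UV texture maps of equal size (one from the query, one from a pristine reference) stored as nested lists of floats in [0, 1], and the function names and threshold are hypothetical.

```python
def residual_map(decoded_query, decoded_reference):
    """Per-pixel absolute difference between two UV texture maps."""
    if len(decoded_query) != len(decoded_reference):
        raise ValueError("maps must have the same height")
    residual = []
    for q_row, r_row in zip(decoded_query, decoded_reference):
        if len(q_row) != len(r_row):
            raise ValueError("maps must have the same width")
        residual.append([abs(q - r) for q, r in zip(q_row, r_row)])
    return residual


def flag_pixels(residual, threshold):
    """Return (row, col) positions whose residual exceeds the threshold."""
    return [
        (y, x)
        for y, row in enumerate(residual)
        for x, value in enumerate(row)
        if value > threshold
    ]


# Toy 2x2 maps: only the bottom-right texel differs noticeably.
query_map = [[0.5, 0.5], [0.5, 0.9]]
reference_map = [[0.5, 0.5], [0.5, 0.5]]
suspicious = flag_pixels(residual_map(query_map, reference_map), threshold=0.2)
```

Since each UV coordinate corresponds to a fixed location on the face surface, flagged positions can be read as facial regions regardless of head pose in the original frame.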

Topics

Deepfake Detection
Person-of-Interest
UV Texture Maps
Masked Autoencoder
Facial Reconstruction
Video Authenticity
Interpretability

Code references

polimi-ispl/CUPID

Best for: Research Scientist, AI Scientist, AI Security Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.