Multi-Frequency Fusion for Robust Video Face Forgery Detection
Summary
Researchers have developed two face video forgery detectors, LFWS and LFWL, which achieve high accuracy with significantly smaller models than existing methods that rely on wide or dual-stream backbones. Both detectors are built on the Xception baseline, which has 21.9 million parameters. LFWS fuses a low-frequency Wavelet-Denoised Feature (WDF) with a phase-only Spatial-Phase Shallow Learning (SPSL) map, while LFWL fuses WDF with Local Binary Patterns (LBP). The fusion is performed by an additional 1x1 convolution module that adds only 292 parameters, so the total parameter count stays at roughly 21.9 million.
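To put the reported overhead in perspective, the arithmetic can be sketched as below. The two totals come from the summary above; the helper `conv1x1_params` and the 6-to-3 channel example are illustrative assumptions and are not meant to reproduce the 292-parameter figure, since the summary does not give the module's channel layout.

```python
def conv1x1_params(in_channels: int, out_channels: int, bias: bool = True) -> int:
    """Learnable parameters in a 1x1 convolution.

    One weight per (input channel, output channel) pair, plus one bias
    per output channel when bias is enabled.
    """
    return in_channels * out_channels + (out_channels if bias else 0)


BACKBONE_PARAMS = 21_900_000  # Xception baseline, as reported in the summary
FUSION_PARAMS = 292           # added fusion module, as reported in the summary

# Relative size of the fusion module compared to the backbone.
overhead = FUSION_PARAMS / BACKBONE_PARAMS
print(f"fusion overhead: {overhead:.6%}")

# Hypothetical channel counts, chosen only to show how the count scales.
print(conv1x1_params(6, 3))
```

The overhead is on the order of a thousandth of a percent, which is why the total is still reported as 21.9 million after the fusion module is added.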
Key takeaway
If you are developing real-time deepfake detection systems, consider adding a lightweight fusion module that combines handcrafted features: Wavelet-Denoised Features (WDF) with either a Spatial-Phase Shallow Learning (SPSL) map or Local Binary Patterns (LBP). This approach can yield higher accuracy with minimal parameter overhead, so models remain practical on resource-constrained platforms without sacrificing detection performance.
Key insights
Lightweight fusion of handcrafted cues can significantly improve face forgery detection accuracy with minimal model overhead.
Principles
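- A compact single-backbone model can reach high accuracy at a much smaller size than wide or dual-stream architectures.
- Handcrafted features such as WDF, SPSL, and LBP can complement the features a deep network learns.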
Method
Combine Wavelet-Denoised Features (WDF) with either Spatial-Phase Shallow Learning (SPSL) or Local Binary Patterns (LBP) using a 1x1 convolution.
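A minimal sketch of what 1x1 convolutional fusion looks like, written in NumPy so it runs without a deep learning framework. The function name `fuse_1x1`, the channel counts (3 for WDF, 1 for SPSL, 3 outputs), and the random placeholder inputs are all assumptions made for illustration; the summary does not specify the authors' exact layout.

```python
import numpy as np


def fuse_1x1(feature_maps, weight, bias):
    """Fuse per-pixel feature maps with a 1x1 convolution.

    feature_maps: list of arrays shaped (C_i, H, W) with matching H and W
    weight:       array shaped (C_out, C_in), where C_in = sum of all C_i
    bias:         array shaped (C_out,)
    """
    stacked = np.concatenate(feature_maps, axis=0)  # (C_in, H, W)
    # A 1x1 convolution is a linear map over channels applied at every pixel.
    fused = np.einsum("oc,chw->ohw", weight, stacked)
    return fused + bias[:, None, None]


rng = np.random.default_rng(0)
H, W = 8, 8
wdf = rng.random((3, H, W))   # placeholder for a wavelet-denoised map (assumed 3 channels)
spsl = rng.random((1, H, W))  # placeholder for a phase-only map (assumed 1 channel)

c_in, c_out = 4, 3            # hypothetical: 3 outputs to feed an RGB-style backbone
weight = rng.normal(scale=0.1, size=(c_out, c_in))
bias = np.zeros(c_out)

fused = fuse_1x1([wdf, spsl], weight, bias)
print(fused.shape)
```

Because the layer mixes channels only, its parameter count depends on the channel counts and not on image resolution, which is what keeps this kind of fusion cheap.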
In practice
- Fuse WDF with an SPSL phase map (the LFWS configuration) for forgery detection.
- Fuse WDF with LBP (the LFWL configuration) as an alternative.
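For readers unfamiliar with LBP, the sketch below computes the textbook 3x3, 8-neighbor variant in NumPy. It is an illustration of the general technique, not necessarily the variant the original authors use; the function name `lbp_8neighbors` and the neighbor ordering are assumptions.

```python
import numpy as np


def lbp_8neighbors(gray):
    """Basic 3x3 Local Binary Pattern.

    Each interior pixel gets an 8-bit code: bit k is set when the k-th
    neighbor is >= the center pixel. Border pixels are dropped, so the
    output is (H - 2, W - 2).
    """
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    # Clockwise from the top-left neighbor (ordering is a convention).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = gray[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
        codes |= (neighbor >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes


# Usage: a tiny synthetic grayscale patch.
patch = np.arange(25, dtype=np.float64).reshape(5, 5)
print(lbp_8neighbors(patch).shape)
```

The resulting code map has one channel and could be stacked with other per-pixel maps before a fusion layer.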
Topics
- Face Forgery Detection
- Lightweight Models
- Handcrafted Features
- Xception Architecture
- Wavelet Denoising
Best for: AI Scientist, Research Scientist, AI Researcher, Deep Learning Engineer, Computer Vision Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Apple Machine Learning Research.