UAU-Net: Uncertainty-aware Representation Learning and Evidential Classification for Facial Action Unit Detection
Summary
UAU-Net is an uncertainty-aware facial Action Unit (AU) detection framework that explicitly models uncertainty at both the representation stage and the decision stage. At the representation stage it introduces CV-AFE, a conditional VAE-based AU feature extraction module that learns probabilistic AU representations by estimating feature means and variances across multiple spatio-temporal scales. CV-AFE also captures uncertainty arising from inter-AU dependencies by conditioning on AU labels. At the decision stage, UAU-Net employs AB-ENN, an Asymmetric Beta Evidential Neural Network, which parameterizes predictive uncertainty with Beta distributions and mitigates overconfidence through an asymmetric loss designed for highly imbalanced binary labels. In experiments, UAU-Net achieves state-of-the-art average F1-scores of 66.8% on BP4D and 66.6% on DISFA, along with improved robustness and reliability.
Key takeaway
Research scientists developing facial AU detection models should model uncertainty explicitly in their pipelines. Adopting techniques such as UAU-Net's CVAE-based feature extraction and evidential classification with an asymmetric loss can improve robustness to visual noise and label imbalance, yielding more reliable and better-calibrated predictions, especially for challenging or subtle AUs.
Key insights
Explicitly modeling uncertainty in both feature representation and classification improves facial Action Unit detection robustness.
Principles
- Uncertainty is intrinsic to both AU representation and prediction.
- Beta distributions can model predictive uncertainty for binary tasks.
- Asymmetric loss mitigates class imbalance in evidential learning.
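To make the second principle concrete, here is a minimal sketch of how a Beta distribution can carry both a prediction and an uncertainty estimate for one binary AU label. It assumes the common subjective-logic convention (evidence plus one as the Beta parameters, uncertainty equal to 2 divided by their sum); the function name `beta_opinion` is hypothetical, and the exact parameterization used by AB-ENN may differ.

```python
def beta_opinion(pos_evidence: float, neg_evidence: float) -> dict:
    """Map non-negative evidence for 'AU active' / 'AU inactive' to a Beta opinion.

    Assumed convention (not confirmed by the source): alpha = e_pos + 1,
    beta = e_neg + 1, uncertainty = 2 / (alpha + beta).
    """
    if pos_evidence < 0 or neg_evidence < 0:
        raise ValueError("evidence must be non-negative")
    alpha = pos_evidence + 1.0  # pseudo-count for the positive class
    beta = neg_evidence + 1.0   # pseudo-count for the negative class
    strength = alpha + beta     # total evidence plus the prior weight of 2
    return {
        "alpha": alpha,
        "beta": beta,
        "prob": alpha / strength,       # expected probability that the AU is active
        "uncertainty": 2.0 / strength,  # equals 1.0 when no evidence was collected
    }


# Strong positive evidence: high probability, low uncertainty.
confident = beta_opinion(18.0, 0.0)
# No evidence at all: probability 0.5 and maximal uncertainty.
unsure = beta_opinion(0.0, 0.0)
print(confident["prob"], confident["uncertainty"])
print(unsure["prob"], unsure["uncertainty"])
```

Unlike a single sigmoid output, the two cases above are distinguishable even when the expected probability is similar, because the total evidence is kept alongside the prediction.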
Method
UAU-Net uses a CVAE-based module (CV-AFE) for probabilistic AU feature extraction and an Asymmetric Beta Evidential Neural Network (AB-ENN) for multi-label classification, addressing both representation and predictive uncertainty.
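The probabilistic feature extraction described above can be illustrated with the standard CVAE building blocks: a label-conditioned encoder input, reparameterized sampling from a predicted mean and log-variance, and a KL term against a standard normal prior. This is a generic sketch in plain Python, not the CV-AFE architecture; the helper names, the concatenation-based conditioning, and the standard normal prior are all assumptions made for illustration.

```python
import math
import random


def encoder_input(features, au_labels):
    """Condition on AU labels by concatenating them to the features (assumed scheme)."""
    return list(features) + [float(y) for y in au_labels]


def reparameterize(mu, logvar, rng):
    """Draw z = mu + sigma * eps element-wise, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL divergence from N(mu, diag(exp(logvar))) to N(0, I)."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


rng = random.Random(0)
x = encoder_input([0.2, -0.1], au_labels=[1, 0, 1])  # 2 features + 3 AU labels
mu, logvar = [0.5, -0.3], [-1.0, 0.2]                # stand-ins for encoder outputs
z = reparameterize(mu, logvar, rng)                  # one stochastic AU embedding
print(len(x), len(z), kl_to_standard_normal(mu, logvar))
```

A larger predicted variance spreads the sampled embeddings more widely, which is how a noisy or ambiguous input can be represented as a less certain feature rather than a single point.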
In practice
- Use CVAEs to learn probabilistic feature embeddings.
- Employ Beta distributions for binary evidential classification.
- Apply asymmetric loss for imbalanced multi-label tasks.
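As a rough illustration of the last point, the sketch below applies an asymmetric, focal-style weighting to the expected probability of a Beta distribution, so that the many easy negatives typical of AU data contribute little to the loss. This is not the AB-ENN loss from the paper: the function name, the focusing exponents `gamma_pos` and `gamma_neg`, and the choice to use the Beta mean are assumptions chosen only to show the asymmetry.

```python
import math


def asymmetric_loss_on_beta_mean(alpha, beta, label,
                                 gamma_pos=0.0, gamma_neg=2.0, eps=1e-8):
    """Asymmetric focal-style loss on p = alpha / (alpha + beta).

    Hypothetical sketch: negatives are down-weighted by p ** gamma_neg, so
    confidently rejected negatives are almost ignored, while positives keep
    (close to) the full log loss when gamma_pos is small.
    """
    p = alpha / (alpha + beta)
    if label == 1:
        return -((1.0 - p) ** gamma_pos) * math.log(p + eps)
    return -(p ** gamma_neg) * math.log(1.0 - p + eps)


easy_negative = asymmetric_loss_on_beta_mean(1.0, 9.0, label=0)    # p = 0.1
missed_positive = asymmetric_loss_on_beta_mean(2.0, 8.0, label=1)  # p = 0.2
false_positive = asymmetric_loss_on_beta_mean(8.0, 2.0, label=0)   # p = 0.8
print(easy_negative, missed_positive, false_positive)
```

With these defaults, a missed positive is penalized more than an equally wrong negative, which is the behavior one wants when active AUs are rare.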
Topics
- Facial Action Unit Detection
- Uncertainty Modeling
- Evidential Neural Networks
- Conditional VAE
- Asymmetric Beta Loss
Best for: Research Scientist, AI Scientist, Computer Vision Engineer, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.