UAU-Net: Uncertainty-aware Representation Learning and Evidential Classification for Facial Action Unit Detection

Source: cs.CV updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Computer Vision · Depth: Expert, extended

Summary

UAU-Net is an uncertainty-aware framework for facial Action Unit (AU) detection that models uncertainty explicitly at both the representation stage and the decision stage. For representation, it introduces CV-AFE, a conditional VAE (CVAE)-based AU feature extraction module that learns probabilistic AU representations by estimating feature means and variances across multiple spatio-temporal scales. CV-AFE also conditions on AU labels to capture the uncertainty that arises from inter-AU dependencies. For the decision stage, UAU-Net employs AB-ENN, an Asymmetric Beta Evidential Neural Network, which parameterizes predictive uncertainty with Beta distributions and mitigates overconfidence through an asymmetric loss designed for highly imbalanced binary labels. On the BP4D and DISFA datasets, UAU-Net reports state-of-the-art average F1-scores of 66.8% and 66.6%, respectively, along with improved robustness and reliability.
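
To make "estimating feature means and variances" concrete, the following is a minimal sketch of the standard VAE reparameterization step and its KL regularizer, written in plain Python. It is not the paper's CV-AFE: the encoder, the multi-scale spatio-temporal features, and the conditioning on AU labels are all omitted, and every name and number here (`sample_feature`, `kl_to_standard_normal`, `mean_variance`, the example vectors) is a hypothetical chosen for illustration.

```python
import math
import random


def sample_feature(mu, log_var, rng):
    """Reparameterized draw: z_i = mu_i + sigma_i * eps_i, with eps_i ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(sigma^2)) || N(0, I)), summed over feature dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def mean_variance(log_var):
    """Average predicted variance, usable as a scalar uncertainty score."""
    return sum(math.exp(lv) for lv in log_var) / len(log_var)


rng = random.Random(0)

# Hypothetical encoder outputs for one AU feature (not values from the paper).
mu = [0.5, -1.0, 2.0]
confident_log_var = [-20.0, -20.0, -20.0]  # near-zero variance
uncertain_log_var = [0.0, 0.0, 0.0]        # unit variance

z_confident = sample_feature(mu, confident_log_var, rng)  # stays close to mu
z_uncertain = sample_feature(mu, uncertain_log_var, rng)  # spreads around mu
```

In a sketch like this, the predicted variance doubles as a per-feature uncertainty signal: a noisy or ambiguous input can be assigned a wide distribution instead of a single overconfident point estimate.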

Key takeaway

Research scientists developing facial AU detection models should build explicit uncertainty modeling into their pipelines. Techniques such as UAU-Net's CVAE-based feature extraction and evidential neural networks trained with an asymmetric loss can make models more robust to visual noise and label imbalance, yielding more reliable and better-calibrated predictions, especially for challenging or subtle AUs.

Key insights

Explicitly modeling uncertainty in both feature representation and classification improves facial Action Unit detection robustness.

Method

UAU-Net uses a CVAE-based module (CV-AFE) for probabilistic AU feature extraction and an Asymmetric Beta Evidential Neural Network (AB-ENN) for multi-label classification, addressing both representation and predictive uncertainty.
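
As a rough illustration of how a Beta-parameterized evidential head can expose predictive uncertainty for one binary AU label, here is a small sketch in plain Python. It follows the common evidential convention (alpha = positive evidence + 1, beta = negative evidence + 1) and uses a class-weighted expected squared error as a stand-in for an asymmetric loss. The exact AB-ENN formulation is not given in this summary, so the weighting scheme, the function names, and the example numbers are all assumptions.

```python
def beta_opinion(pos_evidence, neg_evidence):
    """Map non-negative evidence to Beta(alpha, beta), its mean, and vacuity."""
    alpha = pos_evidence + 1.0
    beta = neg_evidence + 1.0
    strength = alpha + beta
    prob = alpha / strength    # expected probability that the AU is active
    vacuity = 2.0 / strength   # equals 1.0 when no evidence has been collected
    return alpha, beta, prob, vacuity


def weighted_expected_sq_error(alpha, beta, label, pos_weight=1.0, neg_weight=1.0):
    """E[(label - p)^2] for p ~ Beta(alpha, beta), scaled by a per-class weight.

    The per-class weights are a hypothetical way to make the loss asymmetric;
    they are not taken from the paper.
    """
    strength = alpha + beta
    mean = alpha / strength
    variance = alpha * beta / (strength * strength * (strength + 1.0))
    risk = (label - mean) ** 2 + variance
    return (pos_weight if label == 1 else neg_weight) * risk


# No evidence: maximally uncertain prediction.
a0, b0, p0, u0 = beta_opinion(0.0, 0.0)

# Strong positive evidence: confident "AU active" prediction.
a1, b1, p1, u1 = beta_opinion(8.0, 0.0)

# A missed rare positive costs more than an equally wrong negative.
missed_positive = weighted_expected_sq_error(1.0, 9.0, label=1, pos_weight=4.0)
false_alarm = weighted_expected_sq_error(9.0, 1.0, label=0)
```

The design point is that the network outputs evidence, not a bare probability, so a prediction of 0.5 backed by no evidence can be told apart from a prediction of 0.5 backed by conflicting evidence. Weighting the positive class more heavily is one simple way to counter the rarity of active AUs.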

Best for: Research Scientist, AI Scientist, Computer Vision Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.