UAU-Net: Uncertainty-aware Representation Learning and Evidential Classification for Facial Action Unit Detection

2026-04-24 · Source: cs.CV updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Computer Vision · Depth: Expert, extended

Summary

UAU-Net is an uncertainty-aware framework for facial Action Unit (AU) detection that explicitly models uncertainty at both the representation stage and the decision stage. For representation, it introduces CV-AFE, a conditional VAE-based AU feature extraction module that learns probabilistic AU representations by estimating feature means and variances across multiple spatio-temporal scales. By conditioning on AU labels, CV-AFE also captures uncertainty arising from inter-AU dependencies. For the decision stage, UAU-Net employs AB-ENN, an Asymmetric Beta Evidential Neural Network, which parameterizes predictive uncertainty with Beta distributions and mitigates overconfidence through an asymmetric loss designed for highly imbalanced binary labels. On the BP4D and DISFA datasets, UAU-Net achieves state-of-the-art average F1-scores of 66.8% and 66.6%, respectively, with improved robustness and reliability.

Key takeaway

Research scientists developing facial AU detection models should consider modeling uncertainty explicitly in their pipelines. Techniques such as UAU-Net's CVAE-based feature extraction and evidential neural networks trained with an asymmetric loss can improve robustness to visual noise and label imbalance, yielding more reliable and better-calibrated predictions, especially for subtle or otherwise challenging AUs.

Key insights

Explicitly modeling uncertainty in both feature representation and classification improves facial Action Unit detection robustness.

Principles

Uncertainty is intrinsic to both AU representation and prediction.
Beta distributions can model predictive uncertainty for binary tasks.
An asymmetric loss mitigates the overconfidence that class imbalance induces in evidential learning.
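
To make the second principle concrete, here is a minimal sketch of how a Beta distribution can carry both a prediction and its uncertainty for one binary AU label. It assumes the common evidential-learning convention (alpha = positive evidence + 1, beta = negative evidence + 1) and the subjective-logic uncertainty mass 2 / (alpha + beta); the summary does not state how AB-ENN parameterizes its outputs, so the names `BetaOpinion` and `opinion_from_evidence` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaOpinion:
    """Beta(alpha, beta) belief over one AU's activation probability."""
    alpha: float
    beta: float

    @property
    def prob(self) -> float:
        # Mean of the Beta distribution: expected activation probability.
        return self.alpha / (self.alpha + self.beta)

    @property
    def vacuity(self) -> float:
        # Subjective-logic uncertainty mass for 2 classes: K / S with K = 2.
        # Assumed convention, not confirmed by the source.
        return 2.0 / (self.alpha + self.beta)

    @property
    def variance(self) -> float:
        # Variance of the Beta distribution.
        s = self.alpha + self.beta
        return self.alpha * self.beta / (s * s * (s + 1.0))


def opinion_from_evidence(e_pos: float, e_neg: float) -> BetaOpinion:
    """Map non-negative evidence for 'active' / 'inactive' to a Beta opinion."""
    if e_pos < 0 or e_neg < 0:
        raise ValueError("evidence must be non-negative")
    return BetaOpinion(alpha=e_pos + 1.0, beta=e_neg + 1.0)
```

Under these assumptions, zero evidence yields a probability of 0.5 with an uncertainty mass of 1.0, while eight units of positive evidence yield a probability of 0.9 with an uncertainty mass of 0.2. A plain sigmoid output cannot express that difference between "50/50 because nothing is known" and "50/50 because the evidence conflicts".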

Method

UAU-Net uses a CVAE-based module (CV-AFE) for probabilistic AU feature extraction and an Asymmetric Beta Evidential Neural Network (AB-ENN) for multi-label classification, addressing both representation and predictive uncertainty.
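
The probabilistic feature extraction described above rests on two generic VAE building blocks: sampling a latent feature from an estimated mean and variance, and penalizing the divergence between that posterior and a prior. The sketch below shows both for diagonal Gaussians in plain Python. It is not CV-AFE itself: the function names are hypothetical, and using a label-conditioned prior (`mu_p`, `logvar_p`) is an assumption about how conditioning on AU labels might enter the objective.

```python
import math
import random
from typing import List, Optional, Sequence


def reparameterize(mu: Sequence[float], logvar: Sequence[float],
                   rng: random.Random) -> List[float]:
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), sigma = exp(0.5 * logvar)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def kl_diag_gaussians(mu_q: Sequence[float], logvar_q: Sequence[float],
                      mu_p: Optional[Sequence[float]] = None,
                      logvar_p: Optional[Sequence[float]] = None) -> float:
    """KL(q || p) between diagonal Gaussians; p defaults to a standard normal.

    Passing mu_p / logvar_p produced from the AU labels would give a
    label-conditioned prior (an assumption, not confirmed by the source).
    """
    n = len(mu_q)
    mu_p = [0.0] * n if mu_p is None else mu_p
    logvar_p = [0.0] * n if logvar_p is None else logvar_p
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += 0.5 * (lp - lq
                        + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp)
                        - 1.0)
    return total
```

The predicted log-variance is what makes the representation uncertainty-aware: a large variance for a noisy or occluded face region spreads the sampled features, while the KL term keeps the posterior from collapsing to arbitrarily confident point estimates.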

In practice

Use CVAEs to learn probabilistic feature embeddings.
Employ Beta distributions for binary evidential classification.
Apply asymmetric loss for imbalanced multi-label tasks.
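
As an illustration of the last two recommendations combined, the sketch below scores Beta-parameterized predictions with the expected squared error under the Beta distribution and weights positive labels more heavily than negative ones. This is a deliberately simple stand-in, not the loss used by AB-ENN, whose exact form the summary does not give; the function names and the weights `w_pos` and `w_neg` are hypothetical.

```python
from typing import Sequence


def beta_expected_sq_error(alpha: float, beta: float, y: int) -> float:
    """E[(y - p)^2] for p ~ Beta(alpha, beta): squared bias plus variance."""
    s = alpha + beta
    p_hat = alpha / s
    var = alpha * beta / (s * s * (s + 1.0))
    return (y - p_hat) ** 2 + var


def asymmetric_evidential_loss(alphas: Sequence[float], betas: Sequence[float],
                               labels: Sequence[int],
                               w_pos: float = 2.0, w_neg: float = 1.0) -> float:
    """Mean per-AU loss with a larger weight on the rare positive class.

    w_pos and w_neg are illustrative values; in practice they could be set
    from each AU's occurrence rate in the training data.
    """
    total = 0.0
    for a, b, y in zip(alphas, betas, labels):
        weight = w_pos if y == 1 else w_neg
        total += weight * beta_expected_sq_error(a, b, y)
    return total / len(labels)
```

With the default weights, an uninformed prediction Beta(1, 1) costs about 0.667 on a positive label and about 0.333 on a negative one, so the model is pushed harder to gather evidence for rarely active AUs instead of defaulting to "inactive".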

Topics

Facial Action Unit Detection
Uncertainty Modeling
Evidential Neural Networks
Conditional VAE
Asymmetric Beta Loss

Best for: Research Scientist, AI Scientist, Computer Vision Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.