InfoShield: Privacy-Preserving Speech Representations for Mental Health Screening via Information-Theoretic Optimization

· Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Mental Health & Psychological Support · Depth: Expert, quick

Summary

InfoShield builds privacy-preserving speech representations for mental health screening, addressing user concerns about the exposure of demographic information. Existing defenses such as adversarial training and differential privacy often degrade diagnostic performance or fail against unseen threats. InfoShield instead minimizes the mutual information between speech representations and sensitive attributes while maintaining depression classification accuracy. Its key innovation is TimeAwareMINE, which uses cross-modal attention to align acoustic frames with attribute embeddings, a setting where standard MINE estimators struggle because speech is sequential. On the Androids Corpus, InfoShield reduces gender inference from 92.6% to 55.5% and age inference from 55.7% to 30.3%, with only a 6% F1 reduction. It achieves an F1 score of 0.784, surpassing the prior state of the art's 0.723.

Key takeaway

For AI scientists developing speech-based diagnostic tools, InfoShield offers a framework for mitigating privacy risks without severely impacting diagnostic accuracy. Consider integrating information-theoretic optimization, specifically TimeAwareMINE, to reduce sensitive attribute leakage in sequential data. This approach supports scalable mental health screening while addressing user privacy concerns, potentially accelerating clinical deployment. Evaluate models on both privacy inference rates and diagnostic F1 scores to ensure a balanced solution.
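
A paired privacy and utility evaluation of this kind can be sketched in a few lines. This is a minimal illustration, not InfoShield's evaluation code: the function names (`f1_score`, `privacy_utility_report`), the integer label encoding, and the choice of a majority-class baseline as the reference point for leakage are all assumptions made for this example.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def privacy_utility_report(dep_true, dep_pred, attr_true, attr_pred):
    """Report diagnostic F1 next to an attacker's attribute accuracy.

    Assumption: leakage is summarised as attacker accuracy minus the
    accuracy of always guessing the most frequent attribute value.
    """
    attacker_acc = accuracy(attr_true, attr_pred)
    majority = max(Counter(attr_true).values()) / len(attr_true)
    return {
        "diagnostic_f1": f1_score(dep_true, dep_pred),
        "attacker_accuracy": attacker_acc,
        "majority_baseline": majority,
        "leakage_over_baseline": attacker_acc - majority,
    }


if __name__ == "__main__":
    # Hypothetical toy labels, for illustration only.
    report = privacy_utility_report(
        dep_true=[1, 1, 0, 0], dep_pred=[1, 0, 0, 1],
        attr_true=[0, 0, 1, 1], attr_pred=[0, 1, 1, 0],
    )
    for name, value in report.items():
        print(f"{name}: {value:.3f}")
```

Comparing the attacker against a majority-class baseline, rather than reading raw accuracy alone, makes it easier to judge how much attribute information a representation still carries when classes are imbalanced.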

Key insights

InfoShield creates privacy-preserving speech representations for mental health screening, reducing sensitive attribute leakage at a small cost in utility.

Method

InfoShield employs TimeAwareMINE with cross-modal attention to align acoustic frames with attribute embeddings, minimizing mutual information between speech representations and sensitive attributes while preserving classification accuracy.
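
To make the quantity being minimized concrete, the sketch below computes the Donsker-Varadhan lower bound that MINE-style estimators are built on, using critic scores obtained by attention-pooling frames against an attribute embedding. It is a toy under stated assumptions, not the TimeAwareMINE architecture: the single-query dot-product attention, the fixed untrained critic, the names `attention_pool`, `critic_score`, and `dv_lower_bound`, and the random data are all hypothetical. In an actual MINE setup the critic is a trained network that maximizes this bound, and a privacy-oriented encoder would be trained to lower the resulting estimate.

```python
import math
import random


def attention_pool(frames, query):
    """Softmax dot-product attention of one attribute query over frames.

    frames: list of equal-length feature vectors, one per acoustic frame.
    query:  attribute embedding with the same dimensionality.
    Returns the attention-weighted frame average and the weights.
    """
    scale = math.sqrt(len(query))
    logits = [sum(f * q for f, q in zip(frame, query)) / scale
              for frame in frames]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frames[0])
    pooled = [sum(w * frame[d] for w, frame in zip(weights, frames))
              for d in range(dim)]
    return pooled, weights


def critic_score(frames, attribute_embedding):
    """Toy critic T(x, z): dot product of pooled frames and the attribute."""
    pooled, _ = attention_pool(frames, attribute_embedding)
    return sum(p * a for p, a in zip(pooled, attribute_embedding))


def dv_lower_bound(joint_scores, marginal_scores):
    """Donsker-Varadhan bound: E_joint[T] - log E_marginals[exp(T)]."""
    joint_term = sum(joint_scores) / len(joint_scores)
    top = max(marginal_scores)
    mean_exp = sum(math.exp(s - top) for s in marginal_scores) / len(marginal_scores)
    return joint_term - (top + math.log(mean_exp))


if __name__ == "__main__":
    random.seed(0)
    dim, n_frames, n_utts = 4, 5, 8
    # Hypothetical utterances (frame sequences) and attribute embeddings.
    utterances = [[[random.gauss(0, 1) for _ in range(dim)]
                   for _ in range(n_frames)] for _ in range(n_utts)]
    attributes = [[random.gauss(0, 1) for _ in range(dim)]
                  for _ in range(n_utts)]

    # Joint samples pair each utterance with its own attribute.
    joint = [critic_score(x, z) for x, z in zip(utterances, attributes)]
    # Shuffling attributes approximates the product of marginals.
    shuffled = attributes[:]
    random.shuffle(shuffled)
    marginal = [critic_score(x, z) for x, z in zip(utterances, shuffled)]

    print("DV lower bound estimate:", dv_lower_bound(joint, marginal))
```

Because the bound holds for any critic, an untrained critic like this one gives a valid but loose estimate; the attention step only illustrates how frame-level features can be weighted per attribute before scoring.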

Best for: Research Scientist, AI Scientist, NLP Engineer, AI Ethicist

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.