SamaVaani: Auditing and Debiasing Multilingual Clinical ASR for Indian Languages
Summary
A study introduces SamaVaani, a unified debiasing technique designed to improve both Automatic Speech Recognition (ASR) performance and fairness across demographic groups in multilingual clinical settings. The researchers first conducted a systematic audit of eight state-of-the-art ASR models (IndicWhisper, WhisperLargeV3, Sarvam, GoogleS2T, Gemma3n, OmniLingual, Vaani, and Gemini) on real-world psychiatric interview data in Kannada, Hindi, and Indian English. The audit revealed significant performance variability across models and languages, with some systems excelling in Indian English but failing on regional speech. Fine-tuning Gemma3n and OmniLingual then exposed systematic performance gaps tied to speaker role and gender, which fairness-aware fine-tuning mitigated. SamaVaani targets both problems at once: raising ASR accuracy while narrowing the gaps between demographic groups.
Key takeaway
NLP engineers building ASR systems for diverse healthcare environments should rigorously audit model performance across languages and demographic groups before deployment. The variability and biases observed in the study, particularly by speaker role and gender, call for fairness-aware fine-tuning. Debiasing techniques such as SamaVaani are worth considering so that clinical ASR is both accurate and equitable, and so that transcription errors do not turn into disparities in patient care documentation.
Key insights
Auditing multilingual clinical ASR reveals bias, which can be mitigated by fairness-aware fine-tuning and debiasing techniques like SamaVaani.
Principles
- ASR performance varies significantly across languages and models.
- Speaker role and gender introduce systematic performance gaps.
- Fairness-aware fine-tuning can mitigate ASR demographic biases.
Method
SamaVaani is a unified debiasing technique: it aims to improve ASR performance and fairness across demographic groups in clinical settings at the same time, rather than treating accuracy and fairness as separate steps.
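The summary does not spell out how SamaVaani's training objective is built, so the following is only a generic illustration of what a fairness-aware objective can look like, not the authors' method. It adds a penalty on the spread between the best and worst group's average loss to the usual mean loss. The function names, the `lam` weight, and the toy loss values are all hypothetical.

```python
def group_means(losses, groups):
    """Average per-utterance loss for each group label."""
    totals, counts = {}, {}
    for loss, group in zip(losses, groups):
        totals[group] = totals.get(group, 0.0) + loss
        counts[group] = counts.get(group, 0) + 1
    return {g: totals[g] / counts[g] for g in totals}


def fairness_penalized_loss(losses, groups, lam=1.0):
    """Mean loss plus lam times the gap between worst and best group mean.

    Assumption: this gap penalty is one common style of fairness-aware
    objective, used here for illustration only.
    """
    overall = sum(losses) / len(losses)
    means = group_means(losses, groups)
    gap = max(means.values()) - min(means.values())
    return overall + lam * gap


# Hypothetical per-utterance losses and speaker-role labels.
losses = [1.0, 3.0, 2.0, 2.0]
roles = ["doctor", "patient", "doctor", "patient"]
print(fairness_penalized_loss(losses, roles, lam=0.5))
```

With `lam=0` the function reduces to the ordinary mean loss; larger values push training toward closing the gap between groups, possibly at some cost to average accuracy.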
In practice
- Audit ASR models on diverse real-world clinical data.
- Fine-tune top-performing open-source models for specific contexts.
- Implement fairness-aware fine-tuning to address demographic biases.
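The audit step above can be sketched as a per-group word error rate (WER) comparison. This is a minimal sketch that assumes each transcribed utterance carries a reference transcript, a model hypothesis, and demographic labels; the record fields (`role`, `gender`) and the toy sentences are invented for illustration and do not come from the study's data.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_by_group(utterances, key):
    """Corpus-level WER per group: total edits / total reference words."""
    edits, words = defaultdict(int), defaultdict(int)
    for u in utterances:
        ref = u["reference"].split()
        hyp = u["hypothesis"].split()
        edits[u[key]] += edit_distance(ref, hyp)
        words[u[key]] += len(ref)
    return {g: edits[g] / words[g] for g in words if words[g] > 0}


def max_gap(group_wer):
    """Difference between the worst and best group WER."""
    return max(group_wer.values()) - min(group_wer.values())


# Hypothetical toy records; real audits would use held-out clinical audio.
utterances = [
    {"reference": "how long has this been happening",
     "hypothesis": "how long has this been happening",
     "role": "doctor", "gender": "female"},
    {"reference": "are you taking the medicine",
     "hypothesis": "are you taking medicine",
     "role": "doctor", "gender": "male"},
    {"reference": "i cannot sleep at night",
     "hypothesis": "i cannot sleep night",
     "role": "patient", "gender": "male"},
    {"reference": "i feel tired all day",
     "hypothesis": "i feel tied all they",
     "role": "patient", "gender": "female"},
]

for key in ("role", "gender"):
    scores = wer_by_group(utterances, key)
    print(key, scores, "gap:", round(max_gap(scores), 3))
```

Running the same comparison before and after fine-tuning, and separately per language, shows whether a fairness-aware training run narrowed a gap or only moved the average.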
Topics
- Automatic Speech Recognition
- Clinical ASR
- Multilingual Models
- Bias Mitigation
- Fairness-aware AI
- Indian Languages
Best for: Research Scientist, AI Scientist, NLP Engineer, AI Ethicist
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.