Transcribing speech is never neutral. It shapes power and bias
Summary
Automated language technologies, such as autocorrect and automatic speech recognition (ASR) systems, often fail to process non-standard English or Indigenous languages accurately, and the resulting misinterpretations can be serious. The problem stems from dictionaries and language models built primarily around "prestige dialects", such as the English recorded in the Oxford English Dictionary or BBC English. For instance, "Boorloo" (Perth's Nyungar name) was autocorrected to "Barolo." Research from Cornell and Carnegie Mellon demonstrates that error-prone subtitles lower viewers' perception of a speaker's clarity and knowledge. The consequences are particularly severe for Australian First Nations people: culturally significant elements of communication, such as sustained silences, are misinterpreted, and unique words go unrecognized. In legal, medical, and welfare settings, these transcription errors can produce fabricated diagnoses and incorrect medication records and can affect legal outcomes, which makes systematic mistranscription a justice issue.
Key takeaway
If you lead AI/ML teams deploying language technologies in sensitive domains such as healthcare or legal services, scrutinize the training data and inherent biases of your ASR and transcription systems. Your teams should prioritize models trained on diverse linguistic data, especially non-standard dialects and Indigenous languages, to prevent critical errors and ensure equitable outcomes. Failing to address these biases risks misdiagnoses, legal injustices, and erosion of trust in automated systems.
Key insights
Language technologies trained on prestige dialects often misrepresent non-standard and Indigenous communication, leading to serious consequences.
Principles
- All transcription protocols embed assumptions about standardized speech.
- Transcription quality shapes how audiences perceive both the speaker and the content.
- Systematic misrepresentation of language is a justice issue.
In practice
- Develop more diverse models for automated speech recognition.
- Explicitly state transcription conventions and system limitations.
- Resist normalizing speech to an imagined standard reader.
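
One way to make system limitations explicit is to report error rates per speaker group instead of a single average. The sketch below is a minimal illustration, not a method taken from the original article: it computes word error rate (WER) separately for each group, and the group labels and sample sentences are hypothetical, built around the "Boorloo" to "Barolo" example from the summary.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_by_group(samples):
    """Return {group: WER} for (group, reference, hypothesis) triples.

    WER is total word-level edits divided by total reference words,
    pooled over all samples in the group.
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for group, reference, hypothesis in samples:
        ref = reference.lower().split()
        hyp = hypothesis.lower().split()
        errors[group] += edit_distance(ref, hyp)
        words[group] += len(ref)
    return {g: errors[g] / words[g] for g in words if words[g] > 0}


# Hypothetical evaluation data; group names are illustrative assumptions.
samples = [
    ("standard_place_names", "we flew to perth on monday",
     "we flew to perth on monday"),
    ("nyungar_place_names", "we flew to boorloo on monday",
     "we flew to barolo on monday"),
]

for group, wer in sorted(wer_by_group(samples).items()):
    print(f"{group}: WER = {wer:.2f}")
```

A single averaged figure would hide the gap between the two groups here. WER also says nothing about pauses or silences, so it cannot capture every failure mode the article describes.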
Topics
- Automatic Speech Recognition
- Language Bias
- Indigenous Communication
- AI Scribes
- Transcription Ethics
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Ethicist, Policy Maker, Legal Professional
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial intelligence (AI) – The Conversation.