A Sociolinguistic Analysis of Automatic Speech Recognition Bias in Newcastle English
Summary
A study investigated Automatic Speech Recognition (ASR) bias in Newcastle English, a regional dialect of North-East England, using spontaneous speech from the Diachronic Electronic Corpus of Tyneside English (DECTE). The researchers evaluated a commercial ASR system and analyzed over 3,000 transcription errors, classified by linguistic domain and by social variables such as gender, age, and socioeconomic status. An acoustic case study also examined how phonetic variation contributes to misrecognition. Phonological variation caused most errors, particularly dialect-specific vowel quality and glottalisation; local vocabulary and non-standard grammar also contributed. Error rates were higher for men and for speakers at the extremes of the age spectrum, indicating that ASR errors are socially patterned.
Key takeaway
For NLP engineers developing or evaluating ASR systems, the key point is that errors are socially patterned and tied to dialectal variation. Bring sociolinguistic expertise into system development and evaluation, and make sure training data explicitly covers regional accents and community-based speech, so that ASR technologies become more equitable and accurate.
Key insights
ASR errors are not random but socially patterned, driven by dialectal variation and demographic factors.
Principles
- Dialectal variation challenges ASR performance.
- Sociolinguistic factors influence ASR error rates.
Method
The study evaluated a commercial ASR system on spontaneous dialectal speech, classified over 3,000 transcription errors by linguistic domain and social variables, and complemented this with an acoustic case study.
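As a rough illustration of this kind of classification, the sketch below cross-tabulates annotated errors by linguistic domain and one social variable. The records, the domain labels, and the `crosstab` helper are all hypothetical and chosen for illustration; they are not the study's data or annotation scheme.

```python
from collections import Counter

# Hypothetical annotated errors: (linguistic_domain, speaker_gender, age_group).
# These records are invented for illustration, not taken from the study.
errors = [
    ("phonological", "male", "older"),
    ("phonological", "female", "younger"),
    ("lexical", "male", "older"),
    ("morphosyntactic", "male", "middle"),
    ("phonological", "male", "younger"),
]


def crosstab(records, row_index, col_index):
    """Count records for each (row, column) pair of annotation fields."""
    return Counter((r[row_index], r[col_index]) for r in records)


# Errors per (domain, gender) cell, plus per-domain totals for shares.
domain_by_gender = crosstab(errors, 0, 1)
domain_totals = Counter(r[0] for r in errors)

for (domain, gender), n in sorted(domain_by_gender.items()):
    share = n / domain_totals[domain]
    print(f"{domain:16s} {gender:7s} {n:3d}  ({share:.0%} of {domain} errors)")
```

Raw counts like these only become comparable across groups once they are normalised by how much speech each group contributed.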
In practice
- Focus ASR training on dialect-specific features.
- Include diverse community-based speech data.
- Analyze errors by social group.
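One way to analyze errors by social group is to compute word error rate (WER) separately for each group. The sketch below is a minimal version under stated assumptions: the group labels and utterances are invented, WER is pooled per group (total edits divided by total reference words), and text normalisation is reduced to lowercasing and whitespace splitting. It is not the evaluation pipeline used in the study.

```python
from collections import defaultdict


def word_edit_distance(ref, hyp):
    """Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer_by_group(utterances):
    """Pool edits and reference word counts per group, then divide."""
    edits = defaultdict(int)
    words = defaultdict(int)
    for group, reference, hypothesis in utterances:
        ref = reference.lower().split()
        hyp = hypothesis.lower().split()
        edits[group] += word_edit_distance(ref, hyp)
        words[group] += len(ref)
    return {g: edits[g] / words[g] for g in words if words[g] > 0}


# Hypothetical (group, reference transcript, ASR output) triples.
utterances = [
    ("older", "I am going to the shop", "I am going to the ship"),
    ("older", "he was away home", "he was a way home"),
    ("younger", "we went down the town", "we went down the town"),
]

rates = wer_by_group(utterances)
for group, rate in sorted(rates.items()):
    print(f"{group}: WER = {rate:.2f}")
```

Pooling by group rather than averaging per-utterance WER keeps short utterances from dominating the result; either choice is defensible, but it should be reported alongside the numbers.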
Topics
- Automatic Speech Recognition
- ASR Bias
- Sociolinguistic Analysis
- Dialectal Variation
- Newcastle English
Best for: NLP Engineer, Research Scientist, AI Researcher, AI Scientist, AI Ethicist
Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.