Automatic Speech Recognition for Child Reading: A Phonemic Approach using Isolated Words in Brazilian Portuguese

Source: Paper Index on ACL Anthology · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, quick

Summary

A new methodology improves Automatic Speech Recognition (ASR) models for assessing reading decoding in Brazilian Portuguese-speaking children. The researchers fine-tuned Wav2Vec2.0 models and changed the transcription target from orthographic to phonemic. Using a novel corpus of 5,400 isolated-word audio samples from 2nd- and 3rd-grade elementary school children, the study compared models pre-trained on Portuguese with multilingual models. The phonemic approach, combined with fine-tuning, data augmentation, and adapted tokenization, significantly reduced the Phoneme Error Rate (PER). The work addresses two challenges, the scarcity of child speech data and the high variability of child speech, and it supports the use of ASR for detailed diagnosis of decoding errors and phonological acquisition, going beyond the limits of commercial tools.
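
Since PER is the headline metric here, a minimal sketch of how it is commonly computed may help: the total edit distance between reference and predicted phoneme sequences, divided by the total number of reference phonemes. The function names and the phoneme symbols below are illustrative assumptions, not taken from the paper, which may normalize or aggregate differently.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] is the distance between the ref prefix processed so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference phoneme
                curr[j - 1] + 1,     # insertion of an extra phoneme
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: Sequence[Sequence[str]],
                       hyps: Sequence[Sequence[str]]) -> float:
    """Corpus-level PER: total edits divided by total reference phonemes."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_ref = sum(len(r) for r in refs)
    if total_ref == 0:
        raise ValueError("reference set contains no phonemes")
    return total_edits / total_ref


# Hypothetical phoneme sequences for two isolated words (symbols are illustrative).
references = [["k", "a", "z", "a"], ["g", "a", "t", "u"]]
predictions = [["k", "a", "s", "a"], ["g", "a", "t", "u"]]
print(phoneme_error_rate(references, predictions))  # one substitution over 8 phonemes
```

Aggregating at corpus level, rather than averaging per-word rates, keeps short words from dominating the score; that is a design choice in this sketch, not a detail confirmed by the source.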

Key takeaway

Research scientists developing tools for child literacy assessment should consider adopting a phonemic transcription paradigm when fine-tuning ASR models. Combined with data augmentation and adapted tokenization, this approach helps overcome child speech variability and data scarcity, enabling more accurate diagnosis of reading decoding errors than current commercial solutions.
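
To illustrate why phonemic output supports diagnosis, the sketch below compares a target phoneme sequence with a recognized one and lists where they differ. It uses Python's standard `difflib.SequenceMatcher`, which produces a readable diff but does not guarantee a minimal edit script; the function name `diagnose` and the phoneme symbols are hypothetical, and the paper's own error analysis may work differently.

```python
from difflib import SequenceMatcher
from typing import List, Sequence, Tuple

DecodingError = Tuple[str, List[str], List[str]]


def diagnose(target: Sequence[str], produced: Sequence[str]) -> List[DecodingError]:
    """List the spans where the produced phonemes differ from the target.

    Each entry is (kind, target_span, produced_span), where kind is one of
    "replace", "delete", or "insert" as reported by difflib.
    """
    # autojunk=False disables difflib's heuristic for very long sequences.
    matcher = SequenceMatcher(a=list(target), b=list(produced), autojunk=False)
    errors: List[DecodingError] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            errors.append((tag, list(target[i1:i2]), list(produced[j1:j2])))
    return errors


# Hypothetical example: the child drops a consonant from a cluster.
print(diagnose(["p", "r", "a", "t", "u"], ["p", "a", "t", "u"]))
# Hypothetical example: one phoneme is substituted for another.
print(diagnose(["k", "a", "z", "a"], ["k", "a", "s", "a"]))
```

An orthographic transcript would typically hide both of these errors behind a single "wrong word" outcome, which is the practical argument for the phonemic target.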

Key insights

A phonemic approach with fine-tuned Wav2Vec2.0 significantly improves ASR for child reading assessment in Brazilian Portuguese.

Method

The methodology fine-tunes Wav2Vec2.0 models, switches the transcription target from orthographic to phonemic, applies data augmentation, and adapts tokenization for child speech.
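
The summary does not say how tokenization was adapted. One plausible reading, sketched below under that assumption, is to replace the orthographic character vocabulary with one token per phoneme, plus padding and unknown tokens for CTC training. All names and symbols here are illustrative. A JSON vocabulary of this shape is the kind of file that Hugging Face's `Wav2Vec2CTCTokenizer` loads, and the model's output layer would need to match its size.

```python
import json
from typing import Dict, Iterable, List, Sequence


def build_phoneme_vocab(transcriptions: Iterable[Sequence[str]],
                        pad_token: str = "[PAD]",
                        unk_token: str = "[UNK]") -> Dict[str, int]:
    """Map every phoneme seen in the corpus to an integer id."""
    # Sorting makes the id assignment deterministic across runs.
    phonemes = sorted({p for t in transcriptions for p in t})
    vocab = {pad_token: 0, unk_token: 1}
    for p in phonemes:
        vocab[p] = len(vocab)
    return vocab


def encode(phonemes: Sequence[str], vocab: Dict[str, int],
           unk_token: str = "[UNK]") -> List[int]:
    """Convert a phoneme sequence to label ids, mapping unseen symbols to UNK."""
    return [vocab.get(p, vocab[unk_token]) for p in phonemes]


# Hypothetical phonemic transcriptions of two isolated words.
corpus = [["k", "a", "z", "a"], ["g", "a", "t", "u"]]
vocab = build_phoneme_vocab(corpus)

# ensure_ascii=False keeps IPA symbols readable if the vocab is written to disk.
print(json.dumps(vocab, ensure_ascii=False))
print(encode(["k", "a", "z", "a"], vocab))
```

Treating each phoneme as an atomic token avoids splitting multi-character phoneme symbols into meaningless pieces, which a character-level orthographic tokenizer would do.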

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Paper Index on ACL Anthology.