Deep Supervised Contrastive Learning of Pitch Contours for Robust Pitch Accent Classification in Seoul Korean

· Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, quick

Summary

Dual-Glob is a deep supervised contrastive learning framework for classifying fine-grained pitch accent patterns in Seoul Korean. It addresses a core difficulty in the Autosegmental-Metrical (AM) model of intonational phonology: mapping continuous F0 contours to discrete tonal categories when F0 realizations vary widely in natural speech. Rather than relying on conventional local predictive models, Dual-Glob captures holistic F0 contour shapes by enforcing structural consistency between clean and augmented views in a shared latent space. The researchers also introduce the first large-scale benchmark dataset for this task, comprising 10,093 manually annotated Accentual Phrases in Seoul Korean. In the reported experiments, Dual-Glob reaches a state-of-the-art accuracy of 77.75% and an F1-score of 51.54%, significantly outperforming strong baseline models.

Key takeaway

For research scientists working on intonational phonology or speech processing for tonal languages, Dual-Glob offers a robust method for classifying pitch accent patterns. Consider deep supervised contrastive learning when you need to capture holistic F0 contour shapes, especially given the variability of real-world speech. In the reported experiments, this approach improved both classification accuracy and F1-score over traditional local predictive models.

Key insights

Dual-Glob uses deep supervised contrastive learning to robustly classify Seoul Korean pitch accent patterns from continuous F0 contours.

Principles

Method

Dual-Glob employs deep supervised contrastive learning to enforce structural consistency between clean and augmented F0 contour views in a shared latent space, enabling robust classification of pitch accent patterns.
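
The idea of pulling clean and augmented views of the same class together in a shared latent space can be sketched with a generic supervised contrastive loss. The following is a minimal illustration under stated assumptions, not the authors' implementation: the summary does not specify Dual-Glob's encoder, augmentations, or exact loss, so `augment`, `encode`, and `supcon_loss` are hypothetical names, the encoder is a simple mean-centering stand-in, and the "rising" and "falling" contours are toy classes rather than the paper's pitch accent labels.

```python
import math
import random


def augment(contour, rng, shift=0.1, jitter=0.02):
    """Hypothetical augmentation: a global pitch offset plus per-frame noise."""
    offset = rng.uniform(-shift, shift)
    return [f0 + offset + rng.gauss(0.0, jitter) for f0 in contour]


def encode(contour):
    """Stand-in encoder (assumption): mean-center, then scale to unit length.

    A real system would use a learned neural encoder here.
    """
    mean = sum(contour) / len(contour)
    centered = [f0 - mean for f0 in contour]
    norm = math.sqrt(sum(x * x for x in centered)) or 1.0
    return [x / norm for x in centered]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss over unit-norm embeddings.

    For each anchor, every other sample with the same label is a positive;
    all other samples in the batch form the softmax denominator.
    """
    n = len(embeddings)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        logits = {
            j: sum(a * b for a, b in zip(embeddings[i], embeddings[j])) / temperature
            for j in range(n) if j != i
        }
        # Log-sum-exp with the maximum subtracted for numerical stability.
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))
        total -= sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


rng = random.Random(0)
rising = [0.0, 0.33, 0.67, 1.0]    # toy normalized F0 contour
falling = [1.0, 0.67, 0.33, 0.0]   # toy normalized F0 contour
clean = [rising, falling]
views = clean + [augment(c, rng) for c in clean]  # clean views, then augmented views
embeddings = [encode(v) for v in views]

# Labels that pair each clean view with its own augmented view.
loss_true = supcon_loss(embeddings, [0, 1, 0, 1])
# Labels that wrongly pair rising with falling contours.
loss_wrong = supcon_loss(embeddings, [0, 0, 1, 1])
print(loss_true, loss_wrong)
```

In this toy setup the loss is near zero when the labels group each contour with its own augmented view and much larger when they group opposite shapes, which is the signal a trained encoder would be optimized against.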

Best for: Research Scientist, AI Scientist, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.