TMASC: Transmasculine Attitude and Speech Corpus

2026-06-15 · Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Natural Language Processing · Depth: Expert, quick

Summary

The Transmasculine Attitude and Speech Corpus (TMASC) is a newly introduced multimodal dataset with contributions from 196 transmasculine individuals. It pairs questionnaire responses, including questions on aspects of vocal health, with 66 corresponding audio recordings. The audio covers cough and throat-clearing samples, a standardized reading passage, and answers to session-specific questions. The accompanying paper describes how the corpus was developed and how the data were collected. Three case studies illustrate its use: integrating perceptual and acoustic data, identifying group-level vocal characteristics, and calibrating acoustic measurements, each aimed at supporting transmasculine individuals.

Key takeaway

For research scientists developing speech technology or studying vocal health in specific populations, TMASC offers a specialized dataset. Consider integrating this multimodal corpus to make your models more accurate and inclusive when analyzing transmasculine voices. The paper's case studies suggest it can help calibrate acoustic measurements and identify group-level vocal characteristics.

Key insights

TMASC is a multimodal corpus for research into transmasculine vocal health and voice characteristics.

Principles

Multimodal data enhances vocal health research.
Crowd-sourcing can build specialized corpora.

Method

The corpus was developed by collecting questionnaire responses from 196 transmasculine individuals, together with 66 corresponding audio recordings (cough, throat-clearing, a standardized reading passage, and answers to session-specific questions).

In practice

Integrate perceptual and acoustic data.
Identify group-level vocal characteristics.
Calibrate acoustic measurements.
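
The first two steps can be sketched in a few lines of Python. This is a minimal illustration only: the record types, field names (`self_rated_vocal_effort`, `mean_f0_hz`), and all values are hypothetical placeholders, not the actual TMASC schema or data. It assumes each questionnaire response and each audio-derived measurement carries a shared participant identifier, and that participants without audio are simply left out of the joined set.

```python
from dataclasses import dataclass
from statistics import mean

# All field names and values are hypothetical toy placeholders,
# not the real TMASC schema or data.


@dataclass
class QuestionnaireRow:
    participant_id: str
    self_rated_vocal_effort: int  # hypothetical 1-5 self-rating


@dataclass
class AcousticRow:
    participant_id: str
    mean_f0_hz: float  # hypothetical mean fundamental frequency (Hz)


def join_perceptual_acoustic(questionnaire, acoustic):
    """Inner-join on participant_id; rows without audio are dropped."""
    acoustic_by_id = {row.participant_id: row for row in acoustic}
    joined = []
    for q in questionnaire:
        a = acoustic_by_id.get(q.participant_id)
        if a is not None:
            joined.append((q, a))
    return joined


def mean_f0_by_rating(joined):
    """Group-level summary: mean F0 for each self-rating level."""
    groups = {}
    for q, a in joined:
        groups.setdefault(q.self_rated_vocal_effort, []).append(a.mean_f0_hz)
    return {rating: mean(values) for rating, values in sorted(groups.items())}


questionnaire = [
    QuestionnaireRow("p01", 2),
    QuestionnaireRow("p02", 4),
    QuestionnaireRow("p03", 2),
    QuestionnaireRow("p04", 3),  # questionnaire only, no audio
]
acoustic = [
    AcousticRow("p01", 120.0),
    AcousticRow("p02", 140.0),
    AcousticRow("p03", 130.0),
]

joined = join_perceptual_acoustic(questionnaire, acoustic)
summary = mean_f0_by_rating(joined)
print(len(joined), summary)
```

The inner join mirrors the corpus's shape, in which there are fewer audio recordings than questionnaire responses; whether to drop or keep questionnaire-only participants depends on the analysis.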

Topics

Transmasculine Voices
Speech Corpus
Vocal Health
Multimodal Data
Acoustic Analysis
Data Collection

Best for: AI Scientist, Research Scientist, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.