The BD-LSC Dataset: Facilitating the Benchmarking of Models for Lexical Semantic Change Detection in Slang and Standard Usage

Source: Computation and Language · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Natural Language Processing) · Depth: Expert, quick

Summary

The BD-LSC and ST-WSD datasets have been introduced to improve benchmarking for lexical semantic change (LSC) detection, particularly for bi-directional shifts and words with both slang and standard meanings. The Bi-Directional Lexical Semantic Change (BD-LSC) dataset tracks sense gain, loss, and stability across three time periods, enabling the study of complex semantic trajectories. Complementarily, the SlangTrack Word Sense Disambiguation (ST-WSD) dataset offers fine-grained, instance-level sense annotations for words combining slang and standard usages. Evaluations across unsupervised clustering, supervised machine learning, transformer-based models, and large language models revealed that few-shot GPT-4o achieved the strongest aggregate performance on Exact Sense Match (ESM) and multi-label accuracy. However, Macro-F1 scores near 0.5 across all systems highlight that rare slang senses remain a significant open challenge.

Key takeaway

For NLP engineers building semantic change detection models, BD-LSC and ST-WSD provide benchmarks for bi-directional and slang-inclusive LSC. Track Macro-F1 alongside aggregate accuracy: every system evaluated, including few-shot GPT-4o, scores near 0.5 on Macro-F1, which points to weak handling of rare slang senses. Adding these datasets to your evaluation pipeline can expose model weaknesses on complex semantic shifts that aggregate metrics hide.
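To illustrate why aggregate accuracy and Macro-F1 can diverge on rare senses, here is a minimal sketch. It assumes that Exact Sense Match means the predicted sense set equals the gold sense set exactly, which the summary does not define. The function names and the toy labels are hypothetical and are not drawn from the datasets.

```python
from collections import defaultdict


def exact_sense_match(gold, pred):
    """Fraction of instances whose predicted sense set equals the gold set.

    Assumed definition of ESM; the source summary does not spell it out.
    """
    assert len(gold) == len(pred)
    hits = sum(1 for g, p in zip(gold, pred) if set(g) == set(p))
    return hits / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-sense F1, so a rare sense weighs as much as a common one."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for g, p in zip(gold, pred):
        g, p = set(g), set(p)
        for sense in g & p:
            tp[sense] += 1
        for sense in p - g:
            fp[sense] += 1
        for sense in g - p:
            fn[sense] += 1
    senses = set(tp) | set(fp) | set(fn)
    scores = []
    for sense in senses:
        denom = 2 * tp[sense] + fp[sense] + fn[sense]
        scores.append(2 * tp[sense] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Toy data (hypothetical): nine standard-sense instances, one slang instance.
gold = [{"standard"}] * 9 + [{"slang"}]
# A model that always predicts the frequent sense.
pred = [{"standard"}] * 10

print(exact_sense_match(gold, pred))   # 0.9
print(round(macro_f1(gold, pred), 3))  # 0.474
```

In this toy case the always-standard predictor matches nine of ten instances, yet its Macro-F1 falls below 0.5 because the slang sense contributes an F1 of zero. The numbers are illustrative only and say nothing about how the reported scores were produced.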

Key insights

New benchmarks address bi-directional lexical semantic change and slang usage, revealing challenges with rare slang senses.

Method

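The BD-LSC dataset captures bi-directional semantic change (sense gain, loss, and stability) across three time periods, while ST-WSD provides instance-level sense annotations for words with both slang and standard usages. Models from four methodological families are evaluated: unsupervised clustering, supervised machine learning, transformer-based models, and large language models.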
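The idea of tracking sense gain, loss, and stability across periods can be sketched as a comparison of sense inventories between consecutive time slices. This is an illustration under assumptions: the actual BD-LSC schema is not described here, and the function name `label_transitions` and the sense identifiers `s1` to `s3` are hypothetical.

```python
def label_transitions(senses_by_period):
    """Label each sense as gain, loss, stable, or absent between consecutive periods.

    senses_by_period: list of sets of sense identifiers, ordered in time.
    Returns a dict mapping each sense to one label per period transition.
    """
    all_senses = set().union(*senses_by_period)
    transitions = {}
    for sense in sorted(all_senses):
        labels = []
        for earlier, later in zip(senses_by_period, senses_by_period[1:]):
            if sense in earlier and sense in later:
                labels.append("stable")
            elif sense in later:
                labels.append("gain")
            elif sense in earlier:
                labels.append("loss")
            else:
                labels.append("absent")
        transitions[sense] = labels
    return transitions


# Hypothetical sense inventories for one word over three periods.
periods = [{"s1", "s2"}, {"s1", "s3"}, {"s1"}]
print(label_transitions(periods))
# {'s1': ['stable', 'stable'], 's2': ['loss', 'absent'], 's3': ['gain', 'loss']}
```

Sense `s3` is gained and then lost, which is the kind of two-way trajectory that a two-period, gain-only setup could not represent.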

Best for: AI Engineer, Research Scientist, AI Scientist, NLP Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.