Dependency Parsing Across the Resource Spectrum: Evaluating Architectures on High and Low-Resource Languages

Source: Computation and Language · Field: Technology & Digital / Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

A study evaluated four dependency parsers (Biaffine LSTM, Stack-Pointer Network, AfroXLMR-large, and RemBERT) across ten typologically diverse languages, including low-resource African languages. The goal was to compare Transformer-based models with simpler architectures when annotated data is scarce. The Biaffine LSTM consistently outperforms the Transformer models in low-resource settings. Transformers begin to show an advantage as training data increases, and the crossover point falls within the range of treebank sizes typical of under-resourced languages. Morphological complexity, quantified by MATTR (moving-average type-token ratio), was identified as a significant secondary factor in the Transformers' relative disadvantage, even after accounting for corpus size.

Key takeaway

AI engineers developing syntactic tools for low-resource languages should initially favor the Biaffine LSTM architecture, which performs better until a substantial amount of annotated training data is available. Once enough data has been collected, they can transition to a pre-trained Transformer model to use its full representational capacity.
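One way to reason about when to switch is to fit a learning curve per parser and solve for the training-set size where the curves meet. The sketch below assumes each curve is log-linear, score(n) = a + b * ln(n), which is a modeling assumption of this example and not a form stated in the summary. The function names and the coefficients are hypothetical placeholders, not results from the study.

```python
import math


def score(a, b, n):
    """Assumed log-linear learning curve: attachment score at n training sentences."""
    return a + b * math.log(n)


def crossover_size(a_small, b_small, a_large, b_large):
    """Training-set size n* at which two log-linear curves intersect.

    Setting a_small + b_small*ln(n) = a_large + b_large*ln(n) gives
    ln(n*) = (a_small - a_large) / (b_large - b_small).
    """
    if b_large == b_small:
        raise ValueError("parallel curves never cross")
    return math.exp((a_small - a_large) / (b_large - b_small))


# Hypothetical coefficients for illustration only, NOT values from the study:
# the LSTM starts higher but improves more slowly with data.
LSTM = (60.0, 2.0)
TRANSFORMER = (40.0, 5.0)

n_star = crossover_size(*LSTM, *TRANSFORMER)
print(f"curves cross near {n_star:.0f} training sentences")
```

In practice the coefficients would come from fitting each parser's scores at several training-set sizes, and the estimate is only as reliable as the assumed curve shape.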

Key insights

Biaffine LSTMs excel in low-resource dependency parsing, outperforming Transformers until sufficient data is available.

Method

The study evaluated Biaffine LSTM, Stack-Pointer Network, AfroXLMR-large, and RemBERT parsers on ten languages, with a focus on low-resource African languages, and analyzed parsing performance against corpus size and morphological complexity.
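The summary names MATTR as the measure of morphological complexity. As a rough illustration of what that metric computes, the sketch below averages the type-token ratio over every sliding window of a fixed length. The function name `mattr`, the default window of 500 tokens, and the fallback for short texts are assumptions of this example; the summary does not state the study's window size or tokenization.

```python
from collections import Counter


def mattr(tokens, window=500):
    """Moving-average type-token ratio over all sliding windows of `window` tokens.

    Texts shorter than the window fall back to a plain type-token ratio
    (an assumption of this sketch).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not tokens:
        return 0.0
    if len(tokens) <= window:
        return len(set(tokens)) / len(tokens)

    # Count word types in the first window, then slide one token at a time.
    counts = Counter(tokens[:window])
    total = len(counts) / window
    for i in range(window, len(tokens)):
        leaving = tokens[i - window]
        counts[leaving] -= 1
        if counts[leaving] == 0:
            del counts[leaving]  # type no longer present in the window
        counts[tokens[i]] += 1
        total += len(counts) / window

    n_windows = len(tokens) - window + 1
    return total / n_windows


# Toy usage: four windows of length 3 with ratios 2/3, 1, 2/3, 1.
print(mattr("a b a c a b".split(), window=3))
```

Because every window has the same length, the result is less sensitive to corpus size than a plain type-token ratio, which is useful when comparing treebanks of very different sizes.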

Best for: AI Engineer, Machine Learning Engineer, Research Scientist, AI Scientist, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.