Multilingual Coreference Resolution via Cycle-Consistent Machine Translation

2026-06-06 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, long

Summary

A coreference resolution (CR) pipeline improves performance in low-resource languages by using machine translation (MT) to generate or expand training data. The system extends the Maverick model with a multilingual encoder (mmBERT-base) and uses Claude Sonnet 4.6 to translate annotated English samples into a target language and then back into English. Translation quality is validated automatically with BERTScore, which measures cosine similarity between the original and back-translated English texts in a BERT model's embedding space. The resulting similarity score s enters the loss function as a weighting factor s^p during training. Experiments on French, Hungarian, Romanian, and Russian show substantial performance gains, and the approach enables accurate CR even for Romanian, where no prior corpus existed.
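The round trip described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `round_trip`, `CycleSample`, `toy_translate`, and `toy_similarity` are hypothetical names, the translator is a lookup table standing in for an LLM-based MT call, and the similarity is simple token overlap standing in for BERTScore. Projecting coreference annotations onto the translated text is left out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CycleSample:
    source_en: str    # original English text
    target_text: str  # translation into the target language
    back_en: str      # back-translation into English
    score: float      # cycle-consistency similarity between source_en and back_en


def round_trip(
    texts: List[str],
    translate: Callable[[str, str, str], str],
    similarity: Callable[[str, str], float],
    target_lang: str,
) -> List[CycleSample]:
    """Translate en -> target -> en and score each sample's cycle consistency."""
    samples = []
    for text in texts:
        target = translate(text, "en", target_lang)
        back = translate(target, target_lang, "en")
        samples.append(CycleSample(text, target, back, similarity(text, back)))
    return samples


# Toy stand-ins (assumptions for illustration only).
_TOY_TABLE = {
    ("en", "fr"): {"the cat sleeps": "le chat dort", "she sleeps": "elle dort"},
    ("fr", "en"): {"le chat dort": "the cat sleeps", "elle dort": "it sleeps"},
}


def toy_translate(text: str, src: str, tgt: str) -> str:
    """Lookup-table 'translator'; a real pipeline would call an MT system here."""
    return _TOY_TABLE[(src, tgt)].get(text, text)


def toy_similarity(a: str, b: str) -> float:
    """Jaccard overlap of lowercase tokens; a placeholder for BERTScore."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if (ta | tb) else 1.0


samples = round_trip(["the cat sleeps", "she sleeps"], toy_translate, toy_similarity, "fr")
```

In this toy run the second sentence loses its pronoun on the way back ("she" becomes "it"), so its score drops. That is the kind of translation artifact the cycle-consistency check is meant to flag.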

Key takeaway

NLP engineers building coreference resolution systems for low-resource languages should consider a cycle-consistent machine translation pipeline. It can generate or augment training data even for languages with no existing corpora, which significantly boosts CR performance. Weighting training samples by back-translation quality reduces the influence of noisy translations and helps the model achieve higher precision.

Key insights

Coreference resolution in low-resource languages can be significantly improved by cycle-consistent machine translation data augmentation.

Principles

Weighting translated training samples by back-translation cycle consistency improves model precision.
Multilingual encoders enable a single CR model across diverse languages.
Zero-shot LLMs lag specialized CR models by 10-20% F1 on benchmarks like CoNLL-2012/OntoNotes.

Method

The pipeline translates English CR data to a target language, back-translates it to English, computes BERTScore for cycle consistency, and weights the CR model's loss function with s^p during training.
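The weighting step can be illustrated with a small sketch. The summary only states that s^p weights the loss; the per-sample multiplication, the averaging over the batch, and the clamping of scores to [0, 1] below are assumptions, and `cycle_weighted_loss` is a hypothetical name rather than part of Maverick.

```python
from typing import Sequence


def cycle_weighted_loss(
    losses: Sequence[float],
    scores: Sequence[float],
    p: float = 1.0,
) -> float:
    """Mean of per-sample losses, each scaled by its similarity score raised to p.

    losses: per-sample CR losses (plain floats for illustration).
    scores: cycle-consistency similarity s per sample.
    p:      exponent; a larger p down-weights imperfect translations more strongly.
    """
    if len(losses) != len(scores):
        raise ValueError("losses and scores must have the same length")
    if not losses:
        raise ValueError("at least one sample is required")

    total = 0.0
    for loss, s in zip(losses, scores):
        s = max(0.0, min(1.0, s))  # assumption: keep the weight in [0, 1]
        total += (s ** p) * loss
    return total / len(losses)


# A perfectly round-tripped sample (s = 1.0) keeps its full loss,
# while a noisier one (s = 0.5) contributes only a quarter of its loss at p = 2.
batch_loss = cycle_weighted_loss([1.0, 2.0], [1.0, 0.5], p=2.0)
```

Setting p to 0 recovers the unweighted mean, so under these assumptions the exponent acts as a dial between trusting every translated sample equally and trusting only near-perfect round trips.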

In practice

Generate CR training data for languages lacking resources using LLM-based MT.
Expand existing small CR datasets with cycle-consistent translated samples.
Employ BERTScore over BLEU for semantic similarity in back-translation quality assessment.
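To show why an embedding-based score suits back-translation checks better than exact word matching, here is a toy version of the greedy cosine matching that BERTScore is built on. The three-dimensional vectors in `TOY_EMB` are made up for illustration; the real metric uses contextual embeddings from a BERT-style model and optional IDF weighting, and `exact_overlap_f1` is a simplified unigram stand-in, not full BLEU.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def greedy_match_f1(ref: List[Vector], cand: List[Vector]) -> float:
    """Match each token to its most similar counterpart, then combine P and R."""
    recall = sum(max(cosine(r, c) for c in cand) for r in ref) / len(ref)
    precision = sum(max(cosine(c, r) for r in ref) for c in cand) / len(cand)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def exact_overlap_f1(ref: List[str], cand: List[str]) -> float:
    """Unigram F1 that only rewards identical tokens."""
    common = len(set(ref) & set(cand))
    if common == 0:
        return 0.0
    precision = common / len(set(cand))
    recall = common / len(set(ref))
    return 2 * precision * recall / (precision + recall)


# Hypothetical embeddings: synonyms are placed close together on purpose.
TOY_EMB: Dict[str, Vector] = {
    "the": [1.0, 0.0, 0.0],
    "doctor": [0.0, 1.0, 0.1],
    "physician": [0.0, 1.0, 0.2],
    "left": [0.0, 0.1, 1.0],
    "departed": [0.0, 0.2, 1.0],
}

original = "the doctor left".split()
back_translated = "the physician departed".split()

soft_f1 = greedy_match_f1(
    [TOY_EMB[t] for t in original],
    [TOY_EMB[t] for t in back_translated],
)
hard_f1 = exact_overlap_f1(original, back_translated)
```

The back-translation here is a faithful paraphrase, yet the exact-match score penalizes it heavily because only "the" is shared, while the embedding-based score stays high. A round trip through another language often changes wording without changing meaning, which is the case this comparison is meant to illustrate.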

Topics

Coreference Resolution
Low-Resource Languages
Machine Translation
Data Augmentation
BERTScore
Claude Sonnet 4.6
Maverick Model

Best for: Research Scientist, AI Engineer, AI Scientist, NLP Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.