CLewR: Curriculum Learning with Restarts for Machine Translation Preference Learning

· Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, extended

Summary

CLewR (Curriculum Learning with Restarts) is a curriculum learning strategy that plugs into existing preference optimization (PO) algorithms to improve machine translation (MT) quality. It mitigates catastrophic forgetting by restarting the same easy-to-hard data curriculum at every training epoch, so easy examples are revisited after the hard ones. Applied with PO techniques such as DPO, CPO, and ARPO, CLewR yields consistent and statistically significant MT gains across several large language model (LLM) families, including Gemma2, Qwen2.5, and Llama3.1. The work also introduces CLewR-z, which derives its curriculum score from the ARPO distance, and an enhanced ARPO variant that incorporates external semantic signals from MT metrics such as BLEU and COMET-22. The code for CLewR is publicly available on GitHub.

Key takeaway

Research scientists fine-tuning LLMs for machine translation should consider CLewR. Integrating this curriculum-with-restarts strategy into a preference optimization algorithm such as DPO, CPO, or ARPO yields consistent and statistically significant gains, particularly for generic LLMs. The restarts mitigate catastrophic forgetting, helping models retain what they learned from easier examples while they learn harder ones.

Key insights

CLewR improves machine translation by integrating curriculum learning with restarts into preference optimization to mitigate catastrophic forgetting.

Principles

Method

CLewR sorts preference triplets based on a similarity score derived from MT metrics (BLEU, COMET-22, METEOR). Training proceeds in an easy-to-hard order, with this permutation reused in every epoch to mitigate catastrophic forgetting.
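The ordering and restart logic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `token_overlap` is a toy stand-in for a real MT metric (BLEU, COMET-22, METEOR), the function names `curriculum_order` and `iterate_with_restarts` are hypothetical, and the choice to treat triplets whose chosen and rejected translations are more similar as harder is an assumption made here for illustration.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

# (source sentence, chosen translation, rejected translation)
Triplet = Tuple[str, str, str]


def token_overlap(a: str, b: str) -> float:
    """Toy Jaccard similarity over whitespace tokens.

    Hypothetical stand-in for an MT metric such as BLEU or COMET-22.
    """
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def curriculum_order(
    triplets: Sequence[Triplet],
    similarity: Callable[[str, str], float] = token_overlap,
) -> List[int]:
    """Return triplet indices sorted easy-to-hard.

    Assumption (not confirmed by the source): the more similar the chosen
    and rejected translations are, the harder the triplet.
    """
    scores = [similarity(chosen, rejected) for _, chosen, rejected in triplets]
    return sorted(range(len(triplets)), key=lambda i: scores[i])


def iterate_with_restarts(
    triplets: Sequence[Triplet],
    order: Sequence[int],
    num_epochs: int,
    batch_size: int,
) -> Iterator[Tuple[int, List[Triplet]]]:
    """Yield (epoch, batch) pairs, reusing the same permutation every epoch.

    Because the order is never reshuffled, each epoch restarts from the
    easiest examples.
    """
    for epoch in range(num_epochs):
        for start in range(0, len(order), batch_size):
            batch_indices = order[start:start + batch_size]
            yield epoch, [triplets[i] for i in batch_indices]


# Tiny illustrative dataset (invented for this sketch).
data: List[Triplet] = [
    ("Guten Morgen", "Good morning", "Good morning everyone"),
    ("Danke", "Thank you", "Table"),
    ("Wie geht es dir?", "How are you?", "How are you doing?"),
]

order = curriculum_order(data)
batches = list(iterate_with_restarts(data, order, num_epochs=2, batch_size=2))
```

Each yielded batch would be passed to whatever PO loss is in use (DPO, CPO, or ARPO). The only change relative to a standard training loop is that the data order is fixed by the curriculum score and repeated each epoch instead of being reshuffled.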

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.