COMPASS: COntinual Multilingual PEFT with Adaptive Semantic Sampling

· Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, quick

Summary

COMPASS (COntinual Multilingual PEFT with Adaptive Semantic Sampling) is a data-centric framework for adapting large language models (LLMs) to target languages while reducing performance disparities between languages and negative cross-lingual interference. It uses parameter-efficient fine-tuning (PEFT), training lightweight, language-specific adapters on a carefully chosen subset of auxiliary multilingual data. The framework's core is a distribution-aware sampling strategy that uses multilingual embeddings and clustering to identify semantic gaps between the current training data and the target usage distribution. By prioritizing auxiliary data from under-represented semantic clusters, COMPASS aims to increase positive cross-lingual transfer and limit interference. An extension, COMPASS-ECDA, adds continual learning: it monitors for data distribution shifts in production and updates adapters dynamically to prevent model staleness while preserving existing knowledge. The approach is reported to consistently outperform baseline methods on the Phi-4-Mini, Llama-3.1-8B, and Qwen2.5-7B architectures across benchmarks including Global-MMLU, MMLU-ProX, and OneRuler.

Key takeaway

For AI engineers developing or maintaining multilingual LLMs, COMPASS offers a way to improve per-language performance and prevent model staleness. Its distribution-aware semantic sampling targets the adaptation data at the semantic regions your training set covers poorly, and its continual learning extension keeps adapters current as usage shifts. Consider integrating COMPASS-ECDA in production environments that require ongoing adaptation.
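
As a rough illustration of the kind of shift monitoring described for COMPASS-ECDA, the sketch below compares how often each semantic cluster appears in a reference window and in recent production traffic. The summary does not specify the drift measure, so the total variation distance, the function names, and the 0.2 threshold are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def total_variation(ref_ids, live_ids, num_clusters):
    """Total variation distance between two cluster-usage histograms.

    ref_ids and live_ids are non-empty lists of cluster indices, one per
    request. Returns a value in [0, 1]; 0 means identical usage.
    """
    ref = Counter(ref_ids)
    live = Counter(live_ids)
    return 0.5 * sum(
        abs(ref.get(c, 0) / len(ref_ids) - live.get(c, 0) / len(live_ids))
        for c in range(num_clusters)
    )


def needs_adapter_update(ref_ids, live_ids, num_clusters, threshold=0.2):
    """Hypothetical trigger: flag an update when drift exceeds the threshold."""
    return total_variation(ref_ids, live_ids, num_clusters) > threshold


# Reference traffic is split across clusters 0 and 1; live traffic has
# moved entirely to cluster 2, so the check should fire.
reference = [0, 0, 1, 1]
shifted = [2, 2, 2, 2]
drift = total_variation(reference, shifted, num_clusters=3)
fire = needs_adapter_update(reference, shifted, num_clusters=3)
```

In a real deployment the threshold would need tuning against observed traffic, and the cluster assignments would come from the same embedding and clustering step used for sampling.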

Key insights

COMPASS adapts LLMs to new languages using PEFT and semantic sampling to minimize cross-lingual interference.

Principles

Method

COMPASS uses multilingual embeddings and clustering to identify semantic gaps, then samples auxiliary data from under-represented clusters for PEFT adapter training.
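
The sampling step above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes every example has already been embedded and assigned to a cluster, and it allocates the sampling budget in proportion to each cluster's shortfall (target share minus training share). The scoring rule, the function names, and the `budget` parameter are assumptions introduced here.

```python
import random
from collections import Counter, defaultdict


def cluster_shares(cluster_ids, num_clusters):
    """Fraction of items in each cluster (cluster_ids must be non-empty)."""
    counts = Counter(cluster_ids)
    total = len(cluster_ids)
    return [counts.get(c, 0) / total for c in range(num_clusters)]


def sample_for_gaps(train_ids, target_ids, aux_ids, num_clusters, budget, seed=0):
    """Pick auxiliary item indices, favouring clusters where the training
    data is under-represented relative to the target usage distribution."""
    train_share = cluster_shares(train_ids, num_clusters)
    target_share = cluster_shares(target_ids, num_clusters)
    # A gap is positive only where target usage exceeds training coverage.
    gaps = [max(0.0, t - s) for t, s in zip(target_share, train_share)]
    total_gap = sum(gaps)
    if total_gap == 0:
        return []  # training already matches the target distribution

    # Group auxiliary item indices by cluster.
    pool = defaultdict(list)
    for idx, c in enumerate(aux_ids):
        pool[c].append(idx)

    rng = random.Random(seed)
    chosen = []
    for c, gap in enumerate(gaps):
        # Quota is floored and capped by what the auxiliary pool holds,
        # so the total selected may fall short of the budget.
        quota = min(int(budget * gap / total_gap), len(pool[c]))
        chosen.extend(rng.sample(pool[c], quota))
    return sorted(chosen)


# Training data is concentrated in cluster 0; target usage is mostly cluster 2.
train = [0, 0, 0, 0, 1, 1, 1, 2]
target = [0, 1, 2, 2, 2, 2, 2, 2]
aux = [0, 2, 1, 2, 2, 0, 2]
picked = sample_for_gaps(train, target, aux, num_clusters=3, budget=3)
```

Here only cluster 2 has a positive gap, so all three selected auxiliary items come from it. The selected subset would then be used to train the language-specific adapter.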

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.