KD4MT: A Survey of Knowledge Distillation for Machine Translation

Source: cs.CL updates on arXiv.org · Field: Technology & Digital, Artificial Intelligence & Machine Learning · Depth: Advanced, long

Summary

The survey synthesizes 105 papers on Knowledge Distillation (KD) in Machine Translation (MT) published through October 1, 2025. It argues that KD in MT is more than model compression: it is a general-purpose knowledge transfer mechanism that shapes supervision, translation quality, and efficiency. The survey introduces MT and KD fundamentals, categorizes KD4MT advances by methodological contribution and practical application, and identifies trends, research gaps, and the absence of unified evaluation practices. It also gives practical guidelines for selecting a KD method, discusses risks such as increased hallucination and bias amplification, and examines the evolving role of Large Language Models (LLMs) in KD4MT. A public database and glossary accompany the survey.

Key takeaway

AI scientists and research scientists who develop or deploy Machine Translation systems should treat Knowledge Distillation as more than a compression tool. Apply KD deliberately to improve translation quality, expand language coverage, or adapt models to specific domains, especially under resource constraints or when specializing general-purpose LLMs for MT. Watch for hallucination and bias amplification in the distilled model.

Key insights

KD in Machine Translation is a versatile knowledge transfer mechanism, not solely a compression technique.

Principles

Method

KD first trains a powerful "teacher" model, then trains a smaller "student" model under the trained teacher's supervision by minimizing the divergence between the two models' output distributions.
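The divergence-minimization step can be illustrated with a minimal sketch. The summary does not say which divergence or hyperparameters the surveyed methods use, so everything specific here is an assumption: KL divergence from teacher to student, a softmax temperature of 2.0, averaging over target positions, and the hypothetical function names `softmax`, `kl_divergence`, and `distillation_loss`. The logits are made-up toy values, not output from any real model.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution, softened by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(
        pi * (math.log(pi + eps) - math.log(qi + eps))
        for pi, qi in zip(p, q)
    )


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Mean per-position KL(teacher || student) over one target sentence.

    Assumption: both models score the same target positions and vocabulary.
    """
    if not teacher_logits or len(teacher_logits) != len(student_logits):
        raise ValueError("teacher and student must score the same positions")
    per_position = [
        kl_divergence(softmax(t, temperature), softmax(s, temperature))
        for t, s in zip(teacher_logits, student_logits)
    ]
    return sum(per_position) / len(per_position)


# Toy data: 2 target positions, vocabulary of 3 tokens (illustrative values).
teacher = [[4.0, 1.0, 0.5], [0.2, 3.5, 0.1]]
close_student = [[3.5, 1.2, 0.4], [0.3, 3.0, 0.2]]  # roughly agrees with teacher
far_student = [[0.5, 1.0, 4.0], [3.5, 0.2, 0.1]]    # prefers different tokens

loss_close = distillation_loss(teacher, close_student)
loss_far = distillation_loss(teacher, far_student)
print(f"close student: {loss_close:.4f}, far student: {loss_far:.4f}")
```

A student whose distributions track the teacher's gets a lower loss than one that prefers different tokens, and a student identical to the teacher gets a loss of zero. In a real training loop this term would be computed on model outputs with an autodiff framework and is often combined with the usual cross-entropy on reference translations; that combination is likewise an assumption, not something the summary states.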

Best for: AI Scientist, Research Scientist, AI Researcher, NLP Engineer, AI Student

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.