Better Literary Translation: A Multi-Aspect Data Generation and LLM Training Approach

2026-06-04 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, quick

Summary

A new multi-aspect iterative refinement framework targets two obstacles in literary translation: the scarcity of high-quality annotated data and the need to balance fluent expression with literary effect. The framework uses specialized LLM translators, each targeting a distinct quality dimension, to generate high-quality translation references and preference data. The generated data supports both supervised fine-tuning (SFT) and reinforcement learning (RL). In experiments, SFT on the generated references outperforms SFT on the original ground truth by 8.65 CEA100 points. For RL, DPO degraded performance, whereas GRPO added a further 1.51 points, a gain attributed to its stability and online exploration. The resulting LitMT-8B and LitMT-14B models reach 67.25 and 69.07 CEA100, respectively, on MetaphorTrans English-to-Chinese. These scores are competitive with Claude Sonnet 4.5 (68.43 CEA100), and the models generalize well to out-of-domain literary works.

Key takeaway

For NLP engineers building literary translation systems, this multi-aspect data generation and training approach addresses data scarcity and improves model performance. Consider using specialized LLM translators for iterative data refinement, and GRPO rather than DPO for reinforcement learning. In the reported experiments, GRPO was more stable than DPO and benefited from online exploration, and the resulting models were competitive with Claude Sonnet 4.5.

Key insights

Multi-aspect data generation and iterative refinement significantly enhance LLM literary translation.

Principles

Specialized LLM translators can target distinct quality dimensions.
Generated references can surpass original ground truth for SFT.
GRPO offers stability and online exploration for RL in this context.
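
To make the GRPO principle concrete, here is a minimal sketch of the group-relative advantage that gives the method its name: several translations are sampled for the same source passage, each is scored, and each score is normalized against the group's mean and standard deviation. The summary does not say which reward the authors use, so the rewards below are assumed to be CEA100-style scores, and the function name, the `eps` constant, and the choice of population standard deviation are illustrative assumptions rather than the paper's implementation.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against its own sampling group.

    `rewards` holds one score per sampled translation of the SAME source
    passage. Assumption: scores are on a CEA100-like scale; any scalar
    reward would work the same way.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations differ on this choice
    # eps keeps the division defined when every sample gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical scores for three sampled translations of one passage
advantages = group_relative_advantages([60.0, 70.0, 80.0])
```

Samples that score above the group mean receive a positive advantage and those below receive a negative one, so no separate value model is needed. Because the samples come from the current policy, this is also where the online exploration mentioned above enters.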

Method

A multi-aspect iterative refinement framework generates high-quality translation references and preference data via specialized LLM translators, then uses this data for supervised fine-tuning and GRPO-based reinforcement learning.
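
The data-generation loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: the `Translator` and `Scorer` signatures, the two aspect names, the greedy keep-the-best rule, and the best-versus-worst pairing are all hypothetical, and the toy functions stand in for real LLM calls and a real quality metric.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interfaces: a specialist revises a draft for one quality
# dimension; a scorer rates a candidate translation of the source.
Translator = Callable[[str, str], str]   # (source, draft) -> revised draft
Scorer = Callable[[str, str], float]     # (source, candidate) -> score


def refine(source: str, draft: str, translators: Dict[str, Translator],
           score: Scorer, rounds: int = 2) -> Tuple[str, List[Tuple[str, float]]]:
    """Pass the best draft through each aspect specialist, keeping improvements."""
    history = [(draft, score(source, draft))]
    best, best_score = history[0]
    for _ in range(rounds):
        for translate in translators.values():
            candidate = translate(source, best)
            s = score(source, candidate)
            history.append((candidate, s))
            if s > best_score:  # keep a revision only if it scores higher
                best, best_score = candidate, s
    return best, history


def preference_pair(history: List[Tuple[str, float]]) -> Tuple[str, str]:
    """Assumed pairing rule: highest-scored draft is chosen, lowest is rejected."""
    chosen = max(history, key=lambda item: item[1])
    rejected = min(history, key=lambda item: item[1])
    return chosen[0], rejected[0]


# Toy stand-ins so the sketch runs without any model or metric.
def toy_fluency(source: str, draft: str) -> str:
    return draft if "[fluent]" in draft else draft + " [fluent]"


def toy_literary(source: str, draft: str) -> str:
    return draft if "[literary]" in draft else draft + " [literary]"


def toy_score(source: str, candidate: str) -> float:
    return float(candidate.count("["))


specialists = {"fluency": toy_fluency, "literary_effect": toy_literary}
reference, history = refine("source text", "draft", specialists, toy_score)
chosen, rejected = preference_pair(history)
```

In this sketch the final `reference` would serve as an SFT target, while the `(chosen, rejected)` pair is the kind of record a preference-based method such as DPO consumes. How the paper actually selects references and builds pairs is not stated in this summary.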

In practice

Generate superior SFT data using specialized LLM translators.
Employ GRPO for reinforcement learning in literary translation tasks.

Topics

Literary Translation
Large Language Models
Data Generation
Reinforcement Learning
Supervised Fine-tuning
GRPO
MetaphorTrans Benchmark

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.