Loong: A Human-Like Long Document Translation Agent with Observe-and-Act Adaptive Context Selection

2026-05-28 · Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, quick

Summary

Loong, a novel human-like long document translation agent, addresses the challenges of limited context windows and redundant information in large language models for document-level translation. It incorporates a 3E memory module, storing summaries, sentence pairs, and entity records as historical context. Unlike traditional methods that passively attend to all history, Loong actively identifies optimal context for translation guidance through deep reasoning. The agent's context policy is optimized using reinforcement learning, utilizing preference data derived from its own sampled observe-and-act reasoning trajectories. Empirical evaluations demonstrate that Loong achieves substantial translation quality improvements, with average gains of up to 13.0 points across three evaluation metrics in English ⇔ Chinese, German, and French translation directions. Furthermore, Loong exhibits strong generalization across domains, robustness against contextual noise, and remarkable stability in ultra-long document translation. Its code is publicly available on GitHub.

Key takeaway

For NLP engineers building document-level translation systems, Loong suggests replacing passive attention over the full translation history with adaptive context selection whose policy is trained by reinforcement learning. The reported benefits are higher translation quality and greater robustness, particularly on ultra-long documents and across domains, because the approach mitigates both limited context windows and redundant historical information.

Key insights

Loong is a human-like agent that uses adaptive context selection and reinforcement learning for long document translation.

Principles

Adaptive context selection improves LLM translation.
Reinforcement learning optimizes context policies.
Memory modules enhance historical context use.

Method

Loong employs a 3E memory module for historical context, performs deep reasoning to adaptively select optimal context, and optimizes its policy via reinforcement learning using observe-and-act trajectories.
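The memory-then-select flow described above can be illustrated with a minimal Python sketch. Everything here is hypothetical: the `Memory` class, its field names, and `overlap_selector` are illustrative names and are not taken from the Loong codebase. The word-overlap heuristic is only a simple stand-in for the agent's reasoning step, which in Loong is a learned policy rather than a fixed rule.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    """Hypothetical container for the three kinds of history the summary names."""
    summaries: list[str] = field(default_factory=list)
    sentence_pairs: list[tuple[str, str]] = field(default_factory=list)
    entities: dict[str, str] = field(default_factory=dict)

    def update(self, source, translation, summary=None, entities=None):
        """Record one translated sentence plus any optional summary or entities."""
        self.sentence_pairs.append((source, translation))
        if summary:
            self.summaries.append(summary)
        if entities:
            self.entities.update(entities)


def overlap_selector(memory, source, k=2):
    """Stand-in selector (an assumption): rank stored pairs by word overlap.

    Returns only the history that looks relevant to `source`, instead of
    passing the whole memory to the translator.
    """
    words = set(source.lower().split())

    def overlap(pair):
        return len(words & set(pair[0].lower().split()))

    ranked = sorted(memory.sentence_pairs, key=overlap, reverse=True)
    pairs = [p for p in ranked[:k] if overlap(p) > 0]
    entities = {
        src: tgt for src, tgt in memory.entities.items()
        if src.lower() in source.lower()
    }
    return {
        "summaries": memory.summaries[-1:],  # most recent summary, if any
        "sentence_pairs": pairs,
        "entities": entities,
    }
```

In a real agent the selector would be replaced by a call to the model that reasons over the memory contents; the point of the sketch is only that selection returns a subset of the history rather than all of it.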

In practice

Implement 3E memory for context management.
Use RL to fine-tune context selection.
Evaluate gains across several translation directions (for example English ⇔ Chinese, German, and French).
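As a rough illustration of the RL step, the sketch below turns sampled trajectories into preference pairs. The summary states only that preference data is derived from the agent's own sampled observe-and-act trajectories; how they are scored and paired is not specified, so the `Trajectory` fields, the `margin` threshold, and the pairing-by-score rule are all assumptions made for this example.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Trajectory:
    """Hypothetical record of one sampled observe-and-act rollout."""
    context_choice: str  # what the agent chose to consult
    translation: str     # the translation produced with that context
    score: float         # quality score from some metric (assumed given)


def build_preference_pairs(trajectories, margin=0.05):
    """Pair trajectories for the same source, preferring the higher score.

    Pairs whose scores differ by less than `margin` are skipped, since a
    near-tie carries little preference signal (an assumption of this sketch).
    """
    pairs = []
    for a, b in combinations(trajectories, 2):
        if abs(a.score - b.score) < margin:
            continue
        chosen, rejected = (a, b) if a.score > b.score else (b, a)
        pairs.append({
            "chosen": chosen.context_choice,
            "rejected": rejected.context_choice,
        })
    return pairs
```

Pairs in this chosen/rejected shape could then feed a preference-optimization trainer; which algorithm Loong actually uses is not stated in this summary.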

Topics

Long Document Translation
Large Language Models
Reinforcement Learning
Context Management
Machine Translation Agents
3E Memory Module

Code references

YutongWang1216/LoongDocMT

Best for: AI Engineer, Research Scientist, AI Scientist, NLP Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.