Loong: A Human-Like Long Document Translation Agent with Observe-and-Act Adaptive Context Selection
Summary
Loong, a novel human-like long document translation agent, addresses the challenges of limited context windows and redundant information in large language models for document-level translation. It incorporates a 3E memory module, storing summaries, sentence pairs, and entity records as historical context. Unlike traditional methods that passively attend to all history, Loong actively identifies optimal context for translation guidance through deep reasoning. The agent's context policy is optimized using reinforcement learning, utilizing preference data derived from its own sampled observe-and-act reasoning trajectories. Empirical evaluations demonstrate that Loong achieves substantial translation quality improvements, with average gains of up to 13.0 points across three evaluation metrics in English ⇔ Chinese, German, and French translation directions. Furthermore, Loong exhibits strong generalization across domains, robustness against contextual noise, and remarkable stability in ultra-long document translation. Its code is publicly available on GitHub.
Key takeaway
For NLP engineers building document-level translation systems, Loong replaces passive attention over the full translation history with active selection of the context that is useful for the current translation. Consider adding adaptive context selection, with the selection policy tuned by reinforcement learning, to your LLM-based translation workflows. The reported results show gains in translation quality and robustness to contextual noise, with stable performance on ultra-long documents and across domains. Selecting context in this way limits the effect of small context windows and redundant history on your current models.
Key insights
Loong is a human-like agent that uses adaptive context selection and reinforcement learning for long document translation.
Principles
- Adaptive context selection improves LLM translation.
- Reinforcement learning optimizes context policies.
- Memory modules enhance historical context use.
Method
Loong employs a 3E memory module for historical context, performs deep reasoning to adaptively select optimal context, and optimizes its policy via reinforcement learning using observe-and-act trajectories.
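The memory-then-select flow described above can be illustrated with a small sketch. It is not Loong's implementation: the class name `Memory`, its fields, and the function `select_context` are hypothetical, and the word-overlap heuristic is only a stand-in for the agent's reasoning-based selection, which the summary does not specify in detail.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    """Hypothetical store for the three kinds of history named in the summary."""
    summaries: list[str] = field(default_factory=list)
    sentence_pairs: list[tuple[str, str]] = field(default_factory=list)
    entities: dict[str, str] = field(default_factory=dict)

    def update(self, source: str, target: str, summary: str,
               new_entities: dict[str, str]) -> None:
        """Record one translated sentence and the state derived from it."""
        self.sentence_pairs.append((source, target))
        self.summaries.append(summary)
        # Keep the first translation seen for each entity so that later
        # sentences can reuse it consistently.
        for src, tgt in new_entities.items():
            self.entities.setdefault(src, tgt)


def select_context(memory: Memory, source: str, max_pairs: int = 2) -> dict:
    """Pick a subset of history for one source sentence.

    Assumption: a simple word-overlap score replaces the agent's reasoning.
    """
    words = set(source.lower().split())
    scored = []
    for index, (src, _) in enumerate(memory.sentence_pairs):
        overlap = len(words & set(src.lower().split()))
        if overlap > 0:
            scored.append((overlap, index))
    # Highest overlap first; ties go to the more recent sentence.
    scored.sort(reverse=True)
    chosen = sorted(index for _, index in scored[:max_pairs])
    return {
        "summary": memory.summaries[-1] if memory.summaries else "",
        "sentence_pairs": [memory.sentence_pairs[i] for i in chosen],
        "entities": {
            src: tgt for src, tgt in memory.entities.items()
            if src.lower() in source.lower()
        },
    }
```

The point of the sketch is the shape of the interface: the translator receives only the selected subset of history, not the whole memory, so unrelated earlier sentences never reach the prompt.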
In practice
- Implement 3E memory for context management.
- Use RL to fine-tune context selection.
- Evaluate gains in multi-language MT.
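As a starting point for the RL step, preference data can be built from the agent's own sampled trajectories. The sketch below is a guess at one simple way to do that, not the paper's procedure: the `Trajectory` fields, the `build_preference_pair` function, and the best-versus-worst pairing rule are all assumptions, and `score` can come from whichever translation quality metric you already use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Trajectory:
    """One sampled observe-and-act attempt for a single source segment."""
    reasoning: str      # the sampled context-selection text (hypothetical field)
    translation: str    # the translation produced under that context
    score: float        # quality score; assumed higher is better


def build_preference_pair(trajectories: list[Trajectory],
                          min_margin: float = 0.0) -> Optional[dict]:
    """Pair the best and worst trajectories as chosen and rejected.

    Returns None when there are too few samples or when the score gap is
    not larger than min_margin, since such pairs carry little signal.
    """
    if len(trajectories) < 2:
        return None
    best = max(trajectories, key=lambda t: t.score)
    worst = min(trajectories, key=lambda t: t.score)
    if best.score - worst.score <= min_margin:
        return None
    return {"chosen": best.reasoning, "rejected": worst.reasoning}
```

The resulting `chosen` and `rejected` pairs are in the form most preference-optimization trainers expect. Which optimization algorithm to use is left open here because the summary does not name one.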
Topics
- Long Document Translation
- Large Language Models
- Reinforcement Learning
- Context Management
- Machine Translation Agents
- 3E Memory Module
Best for: AI Engineer, Research Scientist, AI Scientist, NLP Engineer, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.