Improving Code Translation with Syntax-Guided and Semantic-aware Preference Optimization

· Source: Takara TLDR - Daily AI Papers · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Expert, quick

Summary

CTO (Code Translation Optimization) is a method for improving large language model (LLM) performance on code translation, targeting two failure modes: syntactically invalid output and output that does not preserve the source program's behavior. Existing preference-based learning methods often rely on unreliable semantic rewards, which typically come from sparse test cases or from overly restrictive reference translations. CTO instead derives its semantic reward directly from the source code. It trains a cross-lingual semantic model with contrastive learning so that functional equivalence between source and translated code can be assessed directly. This semantic signal is then combined with compiler-based syntactic feedback inside a direct preference optimization framework, which treats code translation as a multi-objective optimization problem. In experiments on translation among C++, Java, and Python, CTO is reported to significantly outperform current baselines and other preference optimization strategies.
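
To make the contrastive step concrete, here is a minimal sketch of an InfoNCE-style loss over paired source and translation embeddings. It is an illustration only: the summary does not state which contrastive objective CTO uses, and the function names, the temperature value, and the toy two-dimensional embeddings are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(source_embs, target_embs, temperature=0.1):
    """Mean InfoNCE loss, assuming source_embs[i] and target_embs[i]
    embed functionally equivalent programs (the positive pair) and every
    other target in the batch acts as a negative."""
    total = 0.0
    for i, src in enumerate(source_embs):
        logits = [cosine(src, tgt) / temperature for tgt in target_embs]
        # Numerically stable log-sum-exp over all candidates in the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(source_embs)


# Toy embeddings (hypothetical): row i of each list is meant to be a pair.
source_embs = [[1.0, 0.0], [0.0, 1.0]]
aligned_targets = [[0.9, 0.1], [0.1, 0.9]]
misaligned_targets = [[0.1, 0.9], [0.9, 0.1]]

aligned_loss = info_nce_loss(source_embs, aligned_targets)
misaligned_loss = info_nce_loss(source_embs, misaligned_targets)
```

The loss is small when each source embedding sits closest to its own translation and large when the pairs are swapped, which is the property that lets a trained encoder score a candidate translation against the source without test cases.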

Key takeaway

For research scientists developing code translation models, CTO shows one way to improve both syntactic correctness and semantic consistency. Consider integrating source-derived semantic rewards and compiler-based syntactic feedback into your preference optimization framework: doing so avoids the dependence on test cases or reference translations that limits traditional methods, and may yield more robust, functionally equivalent translations.

Key insights

CTO improves code translation by unifying syntax-guided and source-derived semantic feedback within a preference optimization framework.

Method

CTO trains a cross-lingual semantic model via contrastive learning to assess functional equivalence, then unifies this semantic signal with compiler-based syntactic feedback using direct preference optimization.
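
The sketch below shows one way the two signals could feed a direct preference optimization (DPO) update; it is not CTO's actual procedure. The assumptions are: the `Candidate` fields and all numeric values are hypothetical, Python's built-in `compile()` stands in for a real compiler invocation on the target language, and the lexicographic ranking (syntax first, then semantic score) is a simplification of the multi-objective formulation, whose exact form the summary does not give.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    code: str
    semantic_score: float   # hypothetical similarity to the source program
    policy_logp: float      # log-probability under the model being trained
    reference_logp: float   # log-probability under the frozen reference model


def compiles(code: str) -> bool:
    """Syntax check for a Python target; a stand-in for compiler feedback."""
    try:
        compile(code, "<candidate>", "exec")
        return True
    except SyntaxError:
        return False


def reward(candidate: Candidate):
    # Assumed ranking rule: valid syntax outranks any semantic score.
    return (compiles(candidate.code), candidate.semantic_score)


def build_preference_pair(candidates):
    """Return (chosen, rejected): the best- and worst-ranked candidates."""
    ranked = sorted(candidates, key=reward, reverse=True)
    return ranked[0], ranked[-1]


def dpo_loss(chosen: Candidate, rejected: Candidate, beta: float = 0.1) -> float:
    """Standard DPO loss for one pair: -log(sigmoid(margin))."""
    margin = beta * (
        (chosen.policy_logp - chosen.reference_logp)
        - (rejected.policy_logp - rejected.reference_logp)
    )
    # Stable form of log(1 + exp(-margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


candidates = [
    Candidate("def add(a, b):\n    return a + b\n", 0.92, -4.0, -5.0),
    Candidate("def add(a, b)\n    return a + b\n", 0.95, -3.0, -3.5),  # missing colon
    Candidate("def add(a, b):\n    return a - b\n", 0.40, -4.5, -4.2),
]

chosen, rejected = build_preference_pair(candidates)
loss = dpo_loss(chosen, rejected)
```

Here the candidate with the missing colon is rejected despite its high semantic score, because it fails the syntax check. Among the candidates that compile, the semantic score decides which one is chosen.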

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.