Improving Code Translation with Syntax-Guided and Semantic-aware Preference Optimization

2026-05-15 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Expert, extended

Summary

CTO is an approach to code translation that combines syntax-guided and semantic-aware preference optimization. Large Language Models (LLMs) often struggle with syntactic correctness and semantic consistency when translating code, and existing preference-based learning methods are hindered by unreliable semantic rewards. CTO addresses this by training a cross-lingual semantic model via contrastive learning to directly assess functional equivalence between source and translated code. This semantic signal is then combined with compiler-based syntactic feedback within a direct preference optimization (DPO) framework, which treats code translation as a multi-objective optimization problem. In experiments on translation among C++, Java, and Python, CTO outperforms existing baselines and alternative preference optimization strategies, with accuracy gains of up to 3.66% on TransCoder-Test and 4.27% on HumanEval-X with CodeT5, and higher gains with CodeLlama-7B and Qwen2.5-Coder-7B.

Key takeaway

For AI engineers and research scientists working on cross-lingual code migration, CTO offers a way to improve translation quality. Combining direct semantic equivalence assessment with compiler-based syntactic feedback yields higher accuracy and better functional alignment than the baselines tested. Consider CTO's multi-objective preference optimization where supervised finetuning alone falls short, especially for legacy modernization projects in which translation reliability is critical.

Key insights

CTO unifies syntax-guided and semantic-aware preference optimization for robust, accurate code translation.

Principles

Semantic rewards must derive directly from source code.
Compiler feedback provides a reliable, deterministic signal of syntactic correctness.
Multi-objective optimization improves code translation accuracy.
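
To make the first principle concrete, here is a minimal sketch of a contrastive objective that scores a translation directly against its source. The summary does not state which contrastive loss CTO uses; InfoNCE over cosine similarity is a common choice and is an assumption here, as are the function names, the temperature value, and the toy embedding vectors.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: negative log-probability of the positive among all candidates.

    anchor    -- embedding of the source program (hypothetical encoder output)
    positive  -- embedding of a functionally equivalent translation
    negatives -- embeddings of non-equivalent translations
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]

# Toy embeddings (assumed values, for illustration only).
source_vec = [1.0, 0.0]
equivalent_vec = [0.9, 0.1]
wrong_vecs = [[0.0, 1.0], [-1.0, 0.0]]
loss = info_nce(source_vec, equivalent_vec, wrong_vecs)
```

The loss is small when the equivalent translation sits closest to the source in embedding space and grows when a non-equivalent one is closer, which is the property a semantic reward needs.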

Method

CTO trains a cross-lingual semantic model via contrastive learning, then unifies its semantic reward with compiler-based syntactic feedback within a DPO framework, formulating code translation as a multi-objective optimization problem.
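
For reference, the standard per-pair DPO loss can be sketched as below. This is the generic DPO objective, not CTO's exact multi-objective formulation, which the summary does not spell out. The argument names and the default beta are illustrative; each argument is the summed log-probability of a translation under either the policy being trained or the frozen reference model.

```python
import math

def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    return -(max(-x, 0.0) + math.log1p(math.exp(-abs(x))))

def dpo_loss(policy_chosen: float, policy_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one (chosen, rejected) translation pair.

    The margin measures how much more the policy prefers the chosen
    translation over the rejected one, relative to the reference model.
    """
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    return -log_sigmoid(beta * margin)

# When the policy matches the reference, the margin is zero and the loss is log(2).
baseline = dpo_loss(-5.0, -5.0, -5.0, -5.0)
```

In a CTO-style setup, the syntactic and semantic signals would determine which translation counts as chosen and which as rejected; the loss itself then pushes the policy toward the chosen side.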

In practice

Use Qwen3-8B for negative sample generation.
Employ LoRA for finetuning larger models like CodeLlama-7B.
Prioritize syntactic correctness in preference dataset construction.
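
One way to prioritize syntactic correctness when building preference pairs is to rank candidates first by whether they compile and only then by semantic score. The sketch below is an assumption about how such a rule might look, not the procedure from the paper. It uses Python's built-in compile() as a stand-in for a target-language compiler, and semantic_score is a hypothetical callable standing in for the cross-lingual semantic model.

```python
from typing import Callable, Optional

def compiles(code: str) -> bool:
    """Syntax check for Python targets; a stand-in for a real compiler call."""
    try:
        compile(code, "<candidate>", "exec")
        return True
    except (SyntaxError, ValueError):
        return False

def build_pair(source: str, candidates: list[str],
               semantic_score: Callable[[str, str], float]) -> Optional[dict]:
    """Pick a (chosen, rejected) pair, ranking syntax before semantics."""
    if len(candidates) < 2:
        return None
    # Sort key is (compiles, score): any compiling candidate outranks
    # every non-compiling one, regardless of semantic score.
    ranked = sorted(candidates,
                    key=lambda c: (compiles(c), semantic_score(source, c)),
                    reverse=True)
    chosen, rejected = ranked[0], ranked[-1]
    if not compiles(chosen) or chosen == rejected:
        return None  # no usable preference signal
    return {"prompt": source, "chosen": chosen, "rejected": rejected}

# Toy example with hand-assigned scores (assumed values).
good = "def add(a, b):\n    return a + b\n"
broken = "def add(a, b)\n    return a + b\n"   # missing colon
wrong = "def add(a, b):\n    return a - b\n"
scores = {good: 0.9, broken: 0.95, wrong: 0.2}
pair = build_pair("int add(int a, int b) { return a + b; }",
                  [good, broken, wrong],
                  lambda src, cand: scores[cand])
```

Here the broken candidate ends up as the rejected sample even though its toy semantic score is the highest, because the syntax check takes precedence in the ranking.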

Topics

Code Translation
Large Language Models
Direct Preference Optimization
Cross-lingual Semantic Model
Syntactic Correctness

Code references

nju-websoft/CTO

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.