AdaTrans: Automated C to Rust Transformation via Error-Adaptive Repair
Summary
The AdaTrans framework, introduced in 2026, automates C-to-Rust code transformation by addressing Rust's strict ownership and borrowing semantics. It combines three components: a Strategy-Driven Retrieval-Augmented Generation (RAG) mechanism that maps compiler errors to specific repair strategies, an Error-Stratified Transformation Strategy (ESTS) that classifies compiler diagnostics for adaptive temperature scheduling, and a multi-stage validation pipeline that checks compilability and functional equivalence. Evaluated on a dataset of 104 algorithmic problems from LeetCode, AdaTrans achieved a mean compilation pass rate of 95.51% (± 1.11%) and a mean solve rate of 81.09% (± 3.09%), with an unsafe file rate of 1.19%. These results significantly improve upon existing LLM-based tools and zero-shot baselines using gpt-4o-mini, indicating that error-adaptive repair can reconcile transformation correctness with memory safety.
Key takeaway
For AI engineers migrating C codebases to Rust, AdaTrans shows one way to achieve both functional equivalence and memory safety: build error-adaptive repair loops that use compiler diagnostics and dynamic temperature scheduling. In the reported results this approach significantly outperforms brute-force sampling. It addresses Rust's strict ownership rules systematically and reduces reliance on unsafe blocks.
Key insights
Adapting LLM repair strategies to compiler error categories significantly improves C-to-Rust transformation correctness and memory safety.
Principles
- LLMs struggle with deterministic static constraints like Rust's ownership system.
- Compiler feedback is often underutilized for targeted program repair.
- Uniform repair strategies are ineffective across heterogeneous error types.
Method
AdaTrans runs a generate-verify-repair loop. In each round it classifies the compiler diagnostics into one of four categories (SL, MS, LB, AF), retrieves error-specific repair templates via RAG, and adapts the LLM generation temperature to the error category and to stagnation in the repair process.
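The control flow of such a loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the AdaTrans implementation: the category names, the keyword-based `classify` function, the `TEMPLATES` table (a stand-in for RAG retrieval), and the `generate` and `compile_check` callables are all hypothetical, and the paper's own SL/MS/LB/AF classes are not reproduced. Temperature adaptation is left out to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical repair templates keyed by error category. A plain dict lookup
# stands in for retrieval; the paper's SL/MS/LB/AF classes are not used here.
TEMPLATES = {
    "syntax": "Fix the syntax or type error without changing program logic.",
    "ownership": "Resolve the move or borrow conflict, e.g. by borrowing or cloning.",
    "other": "Re-examine the translation against the original C code.",
}


def classify(diagnostic: str) -> str:
    """Toy keyword classifier over compiler output (an assumption, not the paper's rules)."""
    text = diagnostic.lower()
    if "borrow" in text or "moved value" in text:
        return "ownership"
    if "expected" in text or "mismatched types" in text:
        return "syntax"  # this bucket covers both syntax and type errors
    return "other"


@dataclass
class Result:
    code: str       # last candidate produced
    compiled: bool  # whether that candidate passed the compile check
    rounds: int     # number of repair rounds performed


def repair_loop(
    c_source: str,
    generate: Callable[[str], str],                 # prompt -> candidate Rust code
    compile_check: Callable[[str], Optional[str]],  # code -> diagnostic, or None if it compiles
    max_rounds: int = 5,
) -> Result:
    """Generate a translation, then verify and repair until it compiles or rounds run out."""
    code = generate(f"Translate this C code to Rust:\n{c_source}")
    for round_no in range(max_rounds):
        diagnostic = compile_check(code)
        if diagnostic is None:
            return Result(code, True, round_no)
        category = classify(diagnostic)
        # Build a category-specific repair prompt from the retrieved template.
        prompt = f"{TEMPLATES[category]}\nError:\n{diagnostic}\nCode:\n{code}"
        code = generate(prompt)
    return Result(code, compile_check(code) is None, max_rounds)
```

In a real setup, `compile_check` would invoke the Rust compiler on the candidate and return its error output, and `generate` would call an LLM. Passing both in as callables keeps the loop testable with stubs.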
In practice
- Categorize compiler errors to tailor LLM repair strategies effectively.
- Integrate RAG with LLMs for context-aware code transformation guidance.
- Implement dynamic temperature scheduling for varied error types (e.g., low for syntax, high for logic).
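The temperature scheduling idea in the last point can be sketched as follows. This is a hedged illustration only: the base temperatures, the step size, the cap, and the definition of stagnation (the same diagnostic repeating across rounds) are assumptions chosen for the example, not values or rules from the paper. The sketch follows the summary only in using a lower temperature for syntax errors than for logic errors.

```python
from typing import List

# Illustrative base temperatures per error category (assumed values).
BASE_TEMPERATURE = {"syntax": 0.2, "ownership": 0.5, "logic": 0.8}


def count_stagnant_rounds(history: List[str]) -> int:
    """Count how many rounds directly before the latest one produced the same diagnostic."""
    if not history:
        return 0
    last = history[-1]
    count = 0
    for diagnostic in reversed(history[:-1]):
        if diagnostic != last:
            break
        count += 1
    return count


def next_temperature(
    category: str,
    stagnant_rounds: int,
    step: float = 0.1,
    cap: float = 1.0,
) -> float:
    """Start from the category's base temperature and raise it while repair is stuck."""
    base = BASE_TEMPERATURE.get(category, 0.5)  # unknown categories get a middle value
    return min(cap, base + step * stagnant_rounds)
```

For example, a syntax error seen for the first time would be retried at the low base temperature, while the same error repeating over several rounds would raise the temperature step by step to encourage a different repair attempt.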
Topics
- C-to-Rust Transformation
- Large Language Models
- Automated Program Repair
- Memory Safety
- Retrieval-Augmented Generation
- Error-Stratified Repair
Best for: Research Scientist, AI Scientist, AI Engineer, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.