Regression Language Models for Code

2026-06-17 · Source: cs.SE updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Expert, extended

Summary

Regression Language Models (RLMs) offer a unified approach to code-to-metric regression, predicting numeric outcomes directly from code text, bypassing complex feature engineering. A 300M parameter RLM, initialized from T5Gemma, demonstrates strong performance, achieving over 0.9 Spearman-rank on APPS competitive programming for memory footprint prediction in Python and C++. This single model also attains over 0.5 average Spearman-rank across 17 CodeNet languages and a 0.46 average Kendall-Tau on five Neural Architecture Search (NAS) design spaces, outperforming graph neural networks. RLMs can simultaneously predict Triton GPU kernel latency and neural network accuracy/speed from ONNX representations. Key findings also highlight benefits from language and regression pretraining, the superiority of decoder-based numeric outputs over MSE heads, and the impact of encoder size and custom tokenization.

Key takeaway

For AI Scientists and Machine Learning Engineers optimizing code performance or designing neural architectures, Regression Language Models (RLMs) provide a compelling, unified solution. You can leverage RLMs to predict metrics like memory, latency, and accuracy directly from raw code or ONNX graphs, eliminating tedious feature engineering. Consider integrating RLMs into your development pipeline to accelerate performance prediction and guide optimization decisions across diverse programming languages and hardware platforms.

Key insights

A unified Regression Language Model (RLM) can predict diverse code metrics directly from text, simplifying computational graph regression.

Principles

Text-to-text regression simplifies complex feature engineering.
Pretraining on language data and synthetic metrics accelerates convergence.
Decoder-based numeric outputs are superior and normalization-free.

Method

The RLM method treats regression as a next-token prediction problem using an encoder-decoder architecture. It employs explicit digit-by-digit numeric tokenization for y-values, like `<+><-><1><7><2><5>`, and constrained decoding to ensure valid numerical outputs without normalization.

In practice

Employ RLMs for memory and latency prediction in Python and C++.
Use RLMs to predict neural network performance from ONNX graphs.
Pretrain RLMs on synthetic metrics to boost real-task convergence.

Topics

Regression Language Models
Code Performance Prediction
Neural Architecture Search
ONNX
T5Gemma
Multi-task Learning

Code references

google-deepmind/regress-lm

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.