Detecting Functional Memorization in Code Language Models

2026-06-11 · Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Software Development & Engineering · Depth: Expert, quick

Summary

Functional memorization in code language models occurs when a model reproduces the functional logic of its training code even though the output shares little text with the original, so textual overlap metrics fail to detect it. This work investigates the phenomenon with a counterfactual setup built on Olmo-3-32B. The researchers compared a midtrained model, which had been exposed to specific target code, against a pretrained reference model that had not. Both models were prompted with Python function signatures, and their outputs were evaluated for similarity to the target code using textual metrics and two functional similarity assessments: an LLM-as-a-judge approach and execution-based comparison. The results provide clear evidence of functional memorization, underscoring the need for auditing metrics that go beyond textual overlap when assessing code generation and potential data extraction.

Key takeaway

For AI Security Engineers and Machine Learning Engineers auditing code generation models: textual overlap metrics alone are insufficient to detect functional memorization. This research demonstrates that models such as Olmo-3-32B can reproduce functional logic from training data even when the generated code is textually dissimilar from it. Integrate functional similarity assessments, such as LLM-as-a-judge or execution-based evaluation, into your auditing pipeline to identify and mitigate the risks of unintended data exposure and intellectual property leakage.

Key insights

Code LLMs can memorize functional logic even when their output is textually dissimilar from the training code, so audits need functional checks in addition to textual ones.

Principles

Functional equivalence differs from textual similarity.
Auditing code LLMs requires more than textual metrics.
Counterfactual setups reveal hidden memorization.
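
The first principle can be illustrated with a minimal sketch, which is not taken from the original article. The two snippets, the `total` function, the test inputs, and the use of `difflib` as the textual metric are all assumptions chosen for illustration; the paper's actual metrics are not specified in this summary.

```python
import difflib

# Hypothetical target code and a hypothetical model output (illustration only).
TARGET = '''
def total(values):
    result = 0
    for v in values:
        result += v
    return result
'''

GENERATED = '''
def total(values):
    return sum(values)
'''

def textual_similarity(a: str, b: str) -> float:
    """Character-level overlap ratio in [0, 1], computed with difflib."""
    return difflib.SequenceMatcher(None, a, b).ratio()

def load_function(source: str, name: str):
    """Run source in a fresh namespace and return the named function.
    Only safe for trusted code; a real audit would need a sandbox."""
    namespace = {}
    exec(source, namespace)
    return namespace[name]

def functional_agreement(src_a: str, src_b: str, name: str, inputs) -> float:
    """Fraction of inputs on which both functions return equal results."""
    fa, fb = load_function(src_a, name), load_function(src_b, name)
    matches = sum(1 for args in inputs if fa(*args) == fb(*args))
    return matches / len(inputs)

# Each entry is a tuple of positional arguments for `total`.
INPUTS = [([],), ([1, 2, 3],), ([-5, 5],), ([10],)]

print("textual:", textual_similarity(TARGET, GENERATED))
print("functional:", functional_agreement(TARGET, GENERATED, "total", INPUTS))
```

The two snippets agree on every input while sharing only part of their text, which is the kind of gap a purely textual audit would overlook.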

Method

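A counterfactual setup compares two Olmo-3-32B models: a midtrained model that was exposed to the target code and a pretrained reference model that was not. Both are prompted with Python function signatures, and their outputs are scored with textual metrics and with two functional similarity measures, an LLM-as-a-judge and execution-based comparison.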
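
One way to read a counterfactual comparison of this kind is as a per-prompt gap between the exposed and unexposed models. The sketch below is an assumption about how such a gap could be computed, not the paper's procedure; the function name `counterfactual_gap` is hypothetical and the scores are made-up illustrative numbers, not reported results.

```python
from statistics import mean

def counterfactual_gap(exposed_scores, reference_scores):
    """Mean similarity-to-target of the exposed model minus that of the
    reference model. Scores must be paired by prompt and lie in [0, 1]."""
    if len(exposed_scores) != len(reference_scores):
        raise ValueError("scores must be paired per prompt")
    if not exposed_scores:
        raise ValueError("at least one prompt is required")
    return mean(exposed_scores) - mean(reference_scores)

# Made-up functional similarity scores for four prompts (illustration only).
exposed = [1.0, 0.75, 1.0, 0.5]      # midtrained model, saw the target code
reference = [0.5, 0.25, 0.5, 0.5]    # pretrained model, did not see it

gap = counterfactual_gap(exposed, reference)
print("functional similarity gap:", gap)
```

A gap near zero would suggest the exposed model is no closer to the target than a model that never saw it. A clearly positive gap on a functional metric, paired with a small gap on a textual metric, is the pattern the summary describes as functional memorization.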

In practice

Implement functional similarity checks for code LLMs.
Use execution-based metrics for code generation audits.
Design counterfactual experiments for model evaluation.
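
As a rough sketch of how these checks might fit into an audit pipeline, the snippet below flags generations that score high on functional similarity but low on textual similarity. The `AuditRecord` type, the `flag_functional_only` helper, and both thresholds are hypothetical choices for illustration, not values from the original article.

```python
from dataclasses import dataclass

@dataclass
class AuditRecord:
    prompt_id: str
    textual: float      # textual overlap with the target code, in [0, 1]
    functional: float   # functional similarity to the target code, in [0, 1]

def flag_functional_only(records, textual_max=0.5, functional_min=0.9):
    """Return ids of samples a purely textual audit would miss:
    functionally close to the target but textually dissimilar.
    Both thresholds are assumed defaults and should be tuned."""
    return [
        r.prompt_id
        for r in records
        if r.functional >= functional_min and r.textual < textual_max
    ]

# Made-up records (illustration only).
records = [
    AuditRecord("a", textual=0.95, functional=1.0),   # caught by textual audit
    AuditRecord("b", textual=0.30, functional=1.0),   # functional-only match
    AuditRecord("c", textual=0.20, functional=0.40),  # no match either way
]

print(flag_functional_only(records))
```

In a fuller pipeline, the functional score could come from execution-based comparison or an LLM judge, and flagged samples could then be checked against a reference model that never saw the target code.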

Topics

Code Language Models
Functional Memorization
Model Auditing
Olmo-3-32B
Python Function Signatures
Execution-based Evaluation

Best for: Research Scientist, CTO, VP of Engineering/Data, AI Scientist, Machine Learning Engineer, AI Security Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.