A Small-Scale System for Autoregressive Program Synthesis Enabling Controlled Experimentation
Summary
Cadmus is a small-scale system for autoregressive program synthesis that gives researchers a controlled environment for studying program completion and inductive reasoning. It comprises an integer virtual machine (VM), a diverse dataset of true programs, and an autoregressive transformer model trained for under $200 in compute cost. The system gives researchers fine-grained control over the training distribution and lets them inspect the model in detail, both of which are often cost-prohibitive with large language models (LLMs). Cadmus models achieved 100% accuracy on integer arithmetic program completion tasks in the Cadmus domain-specific language (DSL), outperforming GPT-5, which scored 95%. The comparison also shows that GPT-5 brings unknown priors to the task, which complicates any investigation where the relationship between the training set and the task must be fully understood.
Key takeaway
For research scientists investigating program synthesis or inductive reasoning, Cadmus offers a cost-effective and transparent alternative to large language models. Cadmus models reach 100% accuracy on integer arithmetic program completion in the system's DSL while leaving you in full control of the training data and model internals, so you avoid the confounding effect of the unknown priors that come with larger, pre-trained models.
Key insights
Small, controlled program synthesis systems make research into model reasoning and training-distribution effects transparent.
Principles
- Smaller models enable affordable, fine-grained experimental control.
- Unknown priors in LLMs can confound research into training set relationships.
Method
Cadmus combines an integer VM, a dataset of true programs, and a low-cost autoregressive transformer into one system for controlled program synthesis research.
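To make the role of the VM concrete, here is a minimal Python sketch of how an integer virtual machine can score a program completion by executing it. Everything in it is an assumption for illustration: the stack-based design, the opcodes (PUSH, ADD, SUB, MUL), and the names `run` and `completion_is_correct` are hypothetical and are not taken from the Cadmus DSL, which this summary does not describe.

```python
from typing import List, Tuple

# Hypothetical instruction format: (opcode, operand).
# The operand is ignored by every opcode except PUSH.
Instr = Tuple[str, int]


def run(program: List[Instr]) -> int:
    """Execute a toy stack program and return the value on top of the stack."""
    stack: List[int] = []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op in ("ADD", "SUB", "MUL"):
            # Pop the right operand first, then the left one.
            b, a = stack.pop(), stack.pop()
            if op == "ADD":
                stack.append(a + b)
            elif op == "SUB":
                stack.append(a - b)
            else:
                stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack[-1]


def completion_is_correct(
    prefix: List[Instr], completion: List[Instr], expected: int
) -> bool:
    """Score a completion by running prefix + completion on the VM.

    A completion that crashes the VM (stack underflow, unknown opcode)
    counts as incorrect.
    """
    try:
        return run(prefix + completion) == expected
    except (IndexError, ValueError):
        return False
```

In this sketch a model would be given `prefix` as context and asked to generate `completion`. Because the VM executes the result, scoring depends on what the program computes and not on whether its text matches a reference, so accuracy on a completion task can be measured exactly.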
In practice
- Use Cadmus for studying out-of-distribution representations.
- Instrument the model to inspect its behavior on complex reasoning tasks.
Topics
- Autoregressive Program Synthesis
- Small Language Models
- Transformer Architecture
- Out-of-Distribution Generalization
- Inductive Reasoning
Best for: Research Scientist, AI Researcher, AI Scientist, Deep Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Apple Machine Learning Research.