How We Built Zeta2: Training an Edit Prediction Model in Production — Ben Kunkle, Zed

2026-05-30 · Source: AI Engineer · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Advanced, medium

Summary

Zed built Zeta2, an edit prediction model that suggests the next code edit around the user's cursor and must respond fast enough to run on every keystroke. Its training pipeline starts from opt-in production data: code snapshots, cursor positions, type definitions, and diagnostics. Training labels come from distillation: a frontier model generates an initial prediction, and a "repair step" uses another frontier model to fix predictions identified as bad. The repaired predictions become the student model's expected output. The system also collects "settled data," user-completed edits captured after a 10-second pause, although this data is noisy. To filter the noise, Zed has student models generate multiple predictions and compares each to the settled state by Levenshtein distance, which identifies the best training examples. Offline evaluation uses a held-out test set and tracks metrics such as "delta chrF" and "reversal ratio," while A/B tests in production measure acceptance rate and latency.

Key takeaway

For AI engineers building or improving code prediction models, invest in a data pipeline that cleans noisy production data. Distill from larger models and add an iterative repair step to raise training example quality. To control cost, filter "settled data" with your own student models instead of expensive frontier model calls. A/B test new model versions on a controlled share of production traffic to validate real-world performance and user acceptance before full deployment.
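
One common way to route a controlled share of traffic to a new model is deterministic hash bucketing, sketched below. This is an illustration only, not Zed's implementation: the function name `assign_variant`, the experiment label, and the 10% share are all hypothetical.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, treatment_share: float) -> str:
    """Deterministically map a user to 'control' or 'treatment'.

    Hypothetical helper: hashing the experiment name together with the user id
    keeps a user in the same arm across sessions without storing any state.
    """
    if not 0.0 <= treatment_share <= 1.0:
        raise ValueError("treatment_share must be between 0 and 1")
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer in [0, 2**64).
    bucket = int.from_bytes(digest[:8], "big")
    return "treatment" if bucket < treatment_share * 2**64 else "control"


# Example: send roughly 10% of users to the candidate model.
variant = assign_variant("user-123", "new-model-rollout", 0.10)
```

Because assignment depends only on the inputs, acceptance rate and latency can later be grouped by variant without a lookup table.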

Key insights

Training a specialized code edit prediction model from production data combines frontier model distillation with filtered user-settled data to select high-quality training examples.

Principles

Distillation from frontier models is effective for specialized tasks.
Production data, even when noisy, can inform model training.
Iterative repair steps improve training data quality.

Method

Capture production snapshots, distill frontier model predictions, repair bad outputs, and format prompts. Filter noisy "settled data" by comparing student model predictions to user-completed edits via Levenshtein distance.
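
The settled-data filter described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Zed's code: the function names, the length-normalized distance, and the `max_ratio` threshold of 0.1 are all hypothetical choices.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def closest_prediction(
    predictions: list[str], settled: str, max_ratio: float = 0.1
) -> Optional[str]:
    """Return the prediction nearest the settled text, or None if none is close.

    max_ratio is an assumed cutoff: edit distance divided by the longer length.
    """
    best, best_ratio = None, None
    for candidate in predictions:
        longest = max(len(candidate), len(settled), 1)
        ratio = levenshtein(candidate, settled) / longest
        if best_ratio is None or ratio < best_ratio:
            best, best_ratio = candidate, ratio
    if best_ratio is not None and best_ratio <= max_ratio:
        return best
    return None
```

A snapshot whose best student prediction lands close to the settled state is kept as a training example; one where every prediction is far away is treated as noise and dropped.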

In practice

Use JSONL for flexible data pipeline stages.
Filter noisy user data with cheaper student model inferences.
A/B test new models with partial production traffic.
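
As a rough illustration of why JSONL suits staged pipelines, the sketch below treats each stage as a function from one record to one record (or to nothing, to drop it). The field names `snapshot` and `cursor_offset` and both example stages are hypothetical, not taken from Zed's pipeline.

```python
import json
from typing import Callable, Iterable, Iterator, Optional

Record = dict
Stage = Callable[[Record], Optional[Record]]


def read_jsonl(lines: Iterable[str]) -> Iterator[Record]:
    """Parse one JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def run_stages(records: Iterable[Record], stages: list[Stage]) -> Iterator[Record]:
    """Apply stages in order; a stage returning None drops the record."""
    for record in records:
        for stage in stages:
            record = stage(record)
            if record is None:
                break
        else:
            yield record


def to_jsonl(records: Iterable[Record]) -> str:
    """Serialize records back to one JSON object per line."""
    return "".join(json.dumps(r, sort_keys=True) + "\n" for r in records)


# Hypothetical stages: drop records without a cursor, then attach a prompt.
def require_cursor(record: Record) -> Optional[Record]:
    return record if "cursor_offset" in record else None


def add_prompt(record: Record) -> Record:
    return {**record, "prompt": record["snapshot"][: record["cursor_offset"]]}
```

Since every stage reads and writes the same line-oriented format, a stage can be added, removed, or rerun on a saved intermediate file without touching the others.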

Topics

Edit Prediction
Code Generation
Machine Learning Pipelines
Model Distillation
A/B Testing
Production ML
Data Filtering

Best for: AI Engineer, Machine Learning Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by AI Engineer.