The Optimization That Broke the GasTown Experiment

Source: LLM on Medium · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Software Development & Engineering, Robotics & Autonomous Systems) · Depth: Advanced, medium

Summary

A Spotify playlist steganography tool, first built in 18 minutes with Claude Code (33 files, 16 tests), ran into serious problems when the build was repeated with a GasTown-inspired agentic workflow. The second attempt paired a human orchestrator ("Claude Mayor") with a ~30B-class polecat model, qwen3.6-27b. The model performed well on ten scoped tasks, including writing idiomatic Go and handling API interactions, but the project failed at units B11 and B12. The root cause was two "optimizations" introduced by the human orchestrator: a switch from an opportunistic 1-3 bytes per track to a fixed 2 bytes, and a change to the letter extraction logic. The orchestrator, not the model, made these departures from the original algorithm. They invalidated the comparison and exposed a structural gap in agentic orchestration: nothing checks the orchestrator's own changes. That gap motivated "Ratchet," a system with an "Adjudicator" role that verifies the design decomposition and preserves the original constraints.
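To see why a change like the bytes-per-track one is not a neutral optimization, consider a toy model. This is not the article's encoder: the per-track capacities below are invented, and the rule that a track unable to hold the fixed width is simply unusable is an assumption made for illustration.

```python
def opportunistic_capacity(track_caps):
    """Total payload when each track carries as many bytes as it can (1 to 3)."""
    return sum(min(max(cap, 0), 3) for cap in track_caps)


def fixed_capacity(track_caps, width=2):
    """Total payload when every track must carry exactly `width` bytes.

    Assumption: a track that cannot hold `width` bytes is skipped entirely.
    """
    return sum(width for cap in track_caps if cap >= width)


# Hypothetical per-track capacities in bytes (not taken from the article).
caps = [1, 3, 2, 3, 1]

print(opportunistic_capacity(caps))  # 10
print(fixed_capacity(caps))          # 6
```

Under these assumptions the same playlist carries a different payload and uses a different set of tracks, so anything built against one scheme cannot be compared directly with output from the other.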

Key takeaway

AI architects designing agentic workflows should make sure the orchestration layer is itself subject to oversight. Unchecked "optimizations" or implicit changes introduced during design decomposition can invalidate model evaluations and mask true capability gaps. Implement an "Adjudicator" role or a similar mechanism that explicitly verifies that work unit specifications preserve the original algorithmic constraints. This prevents accidental divergence, so that experiments measure model performance rather than the effects of flawed human input.
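A minimal sketch of what such a check might look like, assuming constraints can be written down as key-value pairs. The names `WorkUnit` and `adjudicate`, the constraint key, and the unit IDs are hypothetical; the article does not describe Ratchet's actual interface.

```python
from dataclasses import dataclass, field


@dataclass
class WorkUnit:
    """A scoped task handed to the executor model (hypothetical structure)."""
    unit_id: str
    constraints: dict = field(default_factory=dict)


def adjudicate(design_constraints, units):
    """Return (unit_id, key, expected, actual) for each constraint a unit alters.

    Only keys a unit declares are compared; a unit does not have to restate
    every design constraint.
    """
    violations = []
    for unit in units:
        for key, actual in unit.constraints.items():
            if key in design_constraints and actual != design_constraints[key]:
                violations.append(
                    (unit.unit_id, key, design_constraints[key], actual)
                )
    return violations


# Hypothetical design constraint and work units, for illustration only.
design = {"bytes_per_track": "1-3 opportunistic"}
units = [
    WorkUnit("U1", {"bytes_per_track": "1-3 opportunistic"}),
    WorkUnit("U2", {"bytes_per_track": "2 fixed"}),
]

for violation in adjudicate(design, units):
    print(violation)
```

The point of the design is that the check runs on the orchestrator's specifications before any unit reaches the executor, so a silent change is caught at decomposition time and not discovered later as a failed unit.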

Key insights

Unchecked human "optimizations" in agentic workflows can invalidate experiments and mask true model capabilities.

Method

The article describes a structured workflow where a human orchestrator decomposes a design into scoped work units for an executor model to build.

Best for: AI Scientist, Research Scientist, AI Engineer, Machine Learning Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by LLM on Medium.