Keep Deterministic Work Deterministic

2026-03-19 · Source: AI & ML – Radar · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Advanced, long

Summary

An analysis of an LLM-driven blackjack simulation illustrates the "March of Nines" problem: each step up in reliability (for example, from 90% to 99%) demands disproportionate engineering effort because failures cascade. In initial runs, where an LLM played hands against plain-English strategies, the pass rate was 37%. The author demonstrates the compounding error with an 8-prompt exercise using ChatGPT 5.3 Instant, in which a single early miscalculation can derail the entire sequence and produce incorrect final scores. The article stresses that LLMs struggle with deterministic tasks such as counting characters, because they operate on tokens rather than individual characters, which leaves multi-step pipelines prone to cascading failures. The author iterated through eight versions of the blackjack pipeline, improving the pass rate from 31% to 94% by moving deterministic work (such as card dealing and strategy validation) into explicit code and applying structural constraints (such as Chain of Thought and rigid output formats) to the remaining LLM calls.

Key takeaway

AI engineers building multi-step LLM workflows should prioritize identifying deterministic tasks and offloading them to conventional code. Making steps such as data validation or rule application deterministic yields significantly higher pipeline reliability and simpler debugging. This approach, exemplified by replacing an LLM validator with a 10-line script for a 31% pass rate jump, is more effective than extensive prompt engineering for tasks that don't require LLM judgment.
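To make the validator idea concrete, here is a minimal sketch of a deterministic rule check. It is not the article's script: the toy strategy (stand at 17 or higher, otherwise hit), the record layout, and the names `expected_action` and `validate_hand` are all assumptions made for illustration.

```python
# Hypothetical sketch: the strategy, record layout, and function names
# are illustrative assumptions, not taken from the original article.

def expected_action(player_total: int, stand_threshold: int = 17) -> str:
    """Toy strategy: stand at or above the threshold, otherwise hit."""
    return "stand" if player_total >= stand_threshold else "hit"


def validate_hand(record: dict) -> list[str]:
    """Return the rule violations for one played hand (empty list means valid)."""
    errors = []
    for step in record["steps"]:
        want = expected_action(step["total"])
        if step["action"] != want:
            errors.append(
                f"total {step['total']}: expected {want}, got {step['action']}"
            )
    return errors
```

Because the check is ordinary code, the same input always gives the same verdict, so a failed hand points at the step that played it and never at the validator.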

Key insights

Deterministic tasks in LLM pipelines should be handled by code to prevent cascading failures and improve reliability.

Principles

Each additional "nine" of reliability costs about as much effort as the one before it, even though it buys a smaller gain.
Deterministic work should be handled by deterministic code.
Cascading failures are inherent in chained LLM operations.
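The arithmetic behind cascading failures can be sketched in a few lines. This is a simplified model, not the article's analysis: it assumes every step succeeds independently with the same probability, and the function name `pipeline_success_rate` is hypothetical. The step count of 8 mirrors the 8-prompt exercise.

```python
# Simplified model (assumption): steps fail independently with equal probability.

def pipeline_success_rate(step_success: float, steps: int) -> float:
    """Probability that every step in a chain succeeds."""
    return step_success ** steps


for p in (0.90, 0.99, 0.999):
    rate = pipeline_success_rate(p, 8)
    print(f"{p:.3f} per step -> {rate:.3f} over 8 steps")
```

Under this model a step that is right 90% of the time yields an 8-step chain that finishes correctly well under half the time, which is why removing LLM calls from steps that never needed them pays off so quickly.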

Method

Improve LLM pipeline reliability by identifying steps that are deterministic but currently handled by an LLM, replacing them with code, and applying structural constraints such as Chain of Thought to the steps that still depend on the LLM.

In practice

Use code for arithmetic, string matching, and rule evaluation.
Prompt for Chain of Thought reasoning on the steps that stay with the LLM to reduce errors.
Employ rigid output formats to guide LLM responses.
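The practices above can be sketched for the blackjack case as follows. This is an illustrative sketch, not code from the article or from the octobatch repository: the card encoding, the `ACTION: HIT|STAND` final-line format, and the names `hand_value` and `parse_action` are assumptions.

```python
import re

# Hypothetical card encoding: ranks as strings, aces counted as 11 first.
CARD_VALUES = {"A": 11, "K": 10, "Q": 10, "J": 10,
               **{str(n): n for n in range(2, 11)}}


def hand_value(cards: list[str]) -> int:
    """Score a hand in code, demoting aces from 11 to 1 while the hand is bust."""
    total = sum(CARD_VALUES[c] for c in cards)
    soft_aces = cards.count("A")
    while total > 21 and soft_aces:
        total -= 10
        soft_aces -= 1
    return total


# Assumed rigid format: the model may reason freely, but the final line
# must be exactly "ACTION: HIT" or "ACTION: STAND".
ACTION_LINE = re.compile(r"ACTION: (HIT|STAND)")


def parse_action(llm_output: str) -> str:
    """Extract the action from the last line, rejecting anything off-format."""
    lines = llm_output.strip().splitlines()
    match = ACTION_LINE.fullmatch(lines[-1].strip()) if lines else None
    if match is None:
        raise ValueError("LLM output does not end with a valid ACTION line")
    return match.group(1)
```

Scoring happens in code, so the LLM never does the addition, and an off-format reply fails loudly at the step that produced it and does not flow silently into later steps.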

Topics

LLM Reliability
Agentic Engineering
Cascading Failures
LLM Pipelines
Deterministic Systems

Code references

andrewstellman/octobatch

Best for: AI Engineer, Machine Learning Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by AI & ML – Radar.