Keep Deterministic Work Deterministic

· Source: AI & ML – Radar · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Advanced, long

Summary

An analysis of an LLM-driven blackjack simulation illustrates the "March of Nines" problem: each step up in reliability (e.g., from 90% to 99%) takes disproportionate engineering effort because failures cascade. Initial runs of the simulation, in which an LLM played hands according to strategies written in plain English, had a 37% pass rate. The author demonstrates the compounding error with an 8-prompt exercise using ChatGPT 5.3 Instant, where a single early miscalculation derails the rest of the sequence and produces an incorrect final score. The article emphasizes that LLMs struggle with deterministic tasks such as counting the characters inside a token, which makes multi-step pipelines susceptible to cascading failures. The author iterated through eight versions of the blackjack pipeline, improving the pass rate from 31% to 94% by moving deterministic work (like card dealing and strategy validation) into explicit code and applying structural constraints (like Chain of Thought and rigid output formats) to the remaining LLM calls.
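The compounding effect can be sketched with simple arithmetic. The snippet below is an illustration, not the article's measurement: it assumes every step fails independently with the same probability, which is a simplification, and the function name is hypothetical.

```python
def pipeline_success_rate(step_reliability: float, steps: int) -> float:
    """Chance that every step succeeds, ASSUMING independent, equally reliable steps."""
    return step_reliability ** steps


# Eight chained steps, matching the length of the 8-prompt exercise.
for p in (0.90, 0.99, 0.999):
    rate = pipeline_success_rate(p, 8)
    print(f"{p:.3f} per step over 8 steps -> {rate:.3f} end to end")
```

Under that assumption, eight steps that are each 90% reliable finish correctly only about 43% of the time, while 99% per step yields roughly 92%.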

Key takeaway

AI engineers building multi-step LLM workflows should first identify the deterministic tasks and offload them to conventional code. Making steps like data validation or rule application deterministic raises pipeline reliability and makes failures easier to debug. For tasks that need no LLM judgment, this is more effective than extensive prompt engineering; the article's example is replacing an LLM validator with a 10-line script, which produced a 31% pass rate jump.
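A minimal sketch of what such a deterministic validator might look like follows. It is not the author's script: the strategy (stand at a fixed total, otherwise hit), the card notation, and every function name are assumptions made for illustration. Only the standard blackjack scoring rules are taken as given.

```python
# Card ranks as strings; face cards count 10, aces start at 11.
CARD_VALUES = {"A": 11, "K": 10, "Q": 10, "J": 10,
               **{str(n): n for n in range(2, 11)}}


def hand_value(cards: list[str]) -> int:
    """Best blackjack total: demote aces from 11 to 1 while the hand is bust."""
    total = sum(CARD_VALUES[c] for c in cards)
    aces = cards.count("A")
    while total > 21 and aces:
        total -= 10
        aces -= 1
    return total


def expected_action(cards: list[str], stand_at: int = 17) -> str:
    """Hypothetical strategy: stand at `stand_at` or above, otherwise hit."""
    return "stand" if hand_value(cards) >= stand_at else "hit"


def validate_action(cards: list[str], action: str, stand_at: int = 17) -> bool:
    """Check the LLM's chosen action against the rule, with no model call."""
    return action == expected_action(cards, stand_at)
```

Because the check is plain code, the same hand and action always produce the same verdict, so a failed run points at the LLM step rather than at the validator.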

Key insights

Deterministic tasks in LLM pipelines should be handled by code to prevent cascading failures and improve reliability.

Principles

Method

Improve LLM pipeline reliability by identifying and replacing deterministic LLM steps with code, and applying structural constraints like Chain of Thought for remaining LLM-dependent steps.
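One way to apply a rigid output format is to let the model reason freely but require a fixed final line that code parses strictly. The sketch below assumes a hypothetical `ACTION: hit` or `ACTION: stand` closing line; the format and the function name are illustrative and do not come from the article.

```python
import re

# Hypothetical contract: the reply must END with exactly one of these lines.
_ACTION_LINE = re.compile(r"^ACTION: (hit|stand)$")


def parse_action(reply: str) -> str:
    """Return the action from the final line, or raise if the format is violated."""
    lines = reply.strip().splitlines()
    if not lines:
        raise ValueError("empty reply")
    match = _ACTION_LINE.match(lines[-1].strip())
    if match is None:
        raise ValueError(f"final line is not a valid action: {lines[-1]!r}")
    return match.group(1)
```

For example, `parse_action("Total is 16, below 17.\nACTION: hit")` returns `"hit"`, while a reply that drifts from the format raises `ValueError` immediately instead of passing a malformed value to the next step.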

Best for: AI Engineer, Machine Learning Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by AI & ML – Radar.