Your model is probabilistic. Your system of record can’t be.

2026-06-03 · Source: AI Advances - Medium · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Advanced, long

Summary

The article examines the probabilistic nature of large language models (LLMs) and the gap between their outputs and what a deterministic system of record requires. An LLM's final layer applies a softmax, scaled by a "temperature" knob (T), to turn token scores into a probability distribution from which the next token is sampled; in the article's words, the "most likely token is not the correct token." Setting T=0 (greedy decoding) always selects the highest-probability token, which provides repeatability but not correctness, because the distribution still ranks tokens by plausibility rather than truth. The author argues that integrating LLMs into software without acknowledging this is a "category error": it treats a sampler as a deterministic function. The proposed remedy is to draw a clear line between outputs where probabilism is acceptable (advisory, transient, bounded blast radius) and outputs that require determinism (they touch shared state, drive automation, or must be auditable). For the latter, the article proposes a "membrane" architecture with five steps: propose, gate, pin, execute deterministically, and account. The model only suggests; a deterministic system decides.
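The temperature mechanism described above can be sketched in a few lines. This is a minimal illustration, not the article's code: the function names and the three example logits are hypothetical, and T=0 is treated here as a plain argmax rather than a limit of the softmax.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower T sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; use greedy() for T=0")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample(logits, temperature, rng):
    """Draw one token index at random, weighted by the softmax probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

def greedy(logits):
    """T=0 decoding: always return the index of the highest score."""
    return max(range(len(logits)), key=lambda i: logits[i])

# Hypothetical scores for three candidate tokens (assumption for illustration).
logits = [2.0, 1.5, 0.1]
```

Calling `greedy(logits)` repeatedly returns the same index every time, which is the repeatability the summary mentions. Nothing in the computation checks whether that token is correct; it is only the highest-scoring one.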

Key takeaway

AI engineers and architects integrating LLMs into production systems must explicitly separate probabilistic model outputs from deterministic system-of-record requirements. Do not rely on temperature settings for correctness; instead, build a "membrane" around critical outputs. The model proposes, and deterministic gates then validate, pin, and execute the proposal, so that results are auditable and reproducible, at the cost of added latency and engineering overhead.

Key insights

The core challenge in integrating LLMs is managing their inherent probabilistic nature within deterministic software systems.

Principles

LLMs sample from plausibility, not truth.
Probabilism is inherent, not a switch.
Determinism is built, not configured.

Method

The article proposes a "membrane" architecture for outputs requiring determinism: propose, gate, pin, execute deterministically, and account. This ensures the model suggests, but a deterministic system decides and records.
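One way to picture the five steps is the sketch below. It is an illustration under stated assumptions, not the article's implementation: the `Ledger` class, the allowed actions, the amount limit, and the `request_id` field are all hypothetical, and the model's proposal is represented as a plain dictionary that has already been parsed.

```python
import hashlib
import json
from dataclasses import dataclass, field

ALLOWED_ACTIONS = {"refund", "credit"}  # hypothetical allow-list
MAX_AMOUNT_CENTS = 10_000               # hypothetical policy limit

@dataclass
class Ledger:
    """Toy system of record: balances, an audit log, and applied digests."""
    balances: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)
    applied: set = field(default_factory=set)

def gate(proposal):
    """Gate: deterministic checks; returns a list of rejection reasons."""
    errors = []
    if proposal.get("action") not in ALLOWED_ACTIONS:
        errors.append("unknown action")
    amount = proposal.get("amount_cents")
    is_int = isinstance(amount, int) and not isinstance(amount, bool)
    if not is_int or not 0 < amount <= MAX_AMOUNT_CENTS:
        errors.append("amount out of bounds")
    for key in ("account", "request_id"):
        if not isinstance(proposal.get(key), str):
            errors.append("missing " + key)
    return errors

def pin(proposal):
    """Pin: freeze the accepted proposal as canonical JSON plus its hash."""
    canonical = json.dumps(proposal, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return canonical, digest

def execute(ledger, canonical, digest):
    """Execute: apply the pinned record; replaying the same digest is a no-op."""
    if digest in ledger.applied:
        return False
    record = json.loads(canonical)
    account = record["account"]
    ledger.balances[account] = ledger.balances.get(account, 0) + record["amount_cents"]
    ledger.applied.add(digest)
    return True

def handle(ledger, proposal):
    """Propose -> gate -> pin -> execute -> account, for one model proposal."""
    errors = gate(proposal)
    if errors:
        ledger.audit_log.append({"status": "rejected", "errors": errors})
        return False
    canonical, digest = pin(proposal)
    changed = execute(ledger, canonical, digest)
    status = "applied" if changed else "duplicate"
    ledger.audit_log.append({"status": status, "pinned": canonical, "sha256": digest})
    return changed
```

The model never touches `Ledger` directly. Everything after the proposal is ordinary deterministic code, and every decision, including rejections and duplicates, leaves an audit entry that can be replayed from the pinned record.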

In practice

Classify outputs by shared state, automation, auditability.
Implement a "membrane" for deterministic outputs.
Ensure model proposals are idempotent.
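The classification step above might be encoded as a small routing check. This is a sketch under assumptions: the `OutputProfile` type and its field names are hypothetical, and treating any single criterion as sufficient to require the membrane is a conservative reading of the summary rather than a rule the article states.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OutputProfile:
    """Hypothetical description of where a model output ends up."""
    writes_shared_state: bool   # other users or systems read the result
    triggers_automation: bool   # downstream code acts on it without review
    must_be_auditable: bool     # someone may later need to reproduce it

def needs_membrane(profile):
    """Route through deterministic gates if any criterion applies (assumption)."""
    return (
        profile.writes_shared_state
        or profile.triggers_automation
        or profile.must_be_auditable
    )

# Illustrative examples: an advisory draft versus a write to a ledger.
draft_reply = OutputProfile(False, False, False)
ledger_update = OutputProfile(True, True, True)
```

Here `needs_membrane(draft_reply)` is false, so the output can stay probabilistic, while `needs_membrane(ledger_update)` is true and the output should go through the deterministic path.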

Topics

Large Language Models
Probabilistic AI
System of Record
Deterministic Systems
AI Architecture
Model Output Management

Best for: AI Engineer, Machine Learning Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by AI Advances - Medium.