N-Version Programming with Coding Agents
Summary
A study of N-Version Programming (NVP) with AI coding agents evaluated 48 agent-generated implementations of the Launch Interceptor Program (LIP) specification on 1,000,000 randomized test inputs. The study found significant common-mode failures: 429 coincident-failure cases were observed against 115.36 predicted by an independence model, a z-statistic of 29.20. Despite this fault correlation, redundancy still paid off. Majority voting over three-version units reduced the mean failure count from 387.44 to 130.99, and 11,844 units showed zero observed failures. The experiment used five agent systems (Cursor, Claude Code, OpenAI Codex, Gemini, OpenCode), 23 models, and three languages (Python, Rust, Pascal). Failures often concentrated in ambiguous parts of the specification, such as LICs #9 and #14.
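The reported z-statistic can be reproduced from the two counts above under one assumption that this summary does not state: that the standard deviation of the coincident-failure count under independence is roughly the square root of its expected value. The study may compute its variance differently. A minimal sketch of that arithmetic, which also derives the relative drop in mean failure count:

```python
import math

# Figures quoted in the summary.
observed_coincident = 429
expected_coincident = 115.36
mean_failures_single = 387.44
mean_failures_voted = 130.99

# Assumption (not stated in the summary): under independence the count's
# variance is approximately equal to its mean, so std dev ~ sqrt(expected).
std_dev = math.sqrt(expected_coincident)
z = (observed_coincident - expected_coincident) / std_dev

# Relative reduction in mean failure count from three-version voting.
reduction = 1 - mean_failures_voted / mean_failures_single

print(f"z = {z:.2f}")                     # z = 29.20
print(f"reduction = {reduction:.1%}")     # reduction = 66.2%
```

That the quoted z-value falls out of this simple form suggests, but does not confirm, that the independence model treats the count as approximately Poisson.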
Key takeaway
For teams building critical systems, N-Version Programming with AI coding agents is a viable way to improve reliability. Apply majority voting across multiple agent-generated versions: it reduces failure rates substantially even when the individual versions have correlated faults. Also analyze common failure patterns to locate and tighten ambiguous parts of your specification, since those ambiguities are a shared source of faults.
Key insights
N-Version Programming with AI coding agents is feasible and improves reliability despite correlated failures stemming from specification ambiguities.
Principles
- Fault independence is not achieved with AI coding agents.
- Specification ambiguities drive common-mode failures.
- Redundancy provides reliability gains despite fault correlation.
Method
Replicate the Knight–Leveson experiment: generate versions with coding agents, screen them against an oracle, run 1,000,000 randomized tests on the surviving versions, analyze coincident failures, and simulate majority-vote N-version units.
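The last step, simulating majority-vote units, can be sketched from per-version failure data alone. The snippet below is an illustration, not the study's code: the names `unit_failures` and `simulate_units` and the toy data are hypothetical. It assumes a three-version unit fails on a test whenever at least two of its versions fail on it, which holds for a single boolean output and is a pessimistic simplification otherwise.

```python
from itertools import combinations

def unit_failures(a, b, c):
    """Test ids on which at least two of the three versions fail."""
    return (a & b) | (a & c) | (b & c)

def simulate_units(failures):
    """Map every three-version combination to its failure count.

    failures: dict of version name -> set of failing test ids.
    """
    results = {}
    for trio in combinations(sorted(failures), 3):
        fa, fb, fc = (failures[v] for v in trio)
        results[trio] = len(unit_failures(fa, fb, fc))
    return results

# Toy data (hypothetical): failing test ids per version.
failures = {
    "v1": {1, 2, 3},
    "v2": {2, 3, 4},
    "v3": {5},
    "v4": {6},
}

units = simulate_units(failures)
mean_single = sum(len(f) for f in failures.values()) / len(failures)
mean_unit = sum(units.values()) / len(units)
zero_units = sum(1 for n in units.values() if n == 0)

print(mean_single, mean_unit, zero_units)  # 2.0 1.0 2
```

In the toy data, v1 and v2 share two failing inputs, so any unit containing both still fails on those inputs. This is the same effect that fault correlation has on the voted units in the study.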
In practice
- Combine agent-generated versions via majority voting.
- Analyze common failure modes to identify specification weaknesses.
- Vary agent, model, and language for version generation.
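A runtime voter for the first practice above might look like the following. This is a minimal sketch under stated assumptions: `majority_vote` is a hypothetical helper, outputs must be hashable and directly comparable, and the caller decides what to do when no strict majority exists.

```python
from collections import Counter

def majority_vote(outputs):
    """Return the value produced by a strict majority of versions.

    Returns None when the list is empty or no value has more than
    half of the votes, so the caller can fall back or raise an alarm.
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None

# Three versions return a launch decision; one disagrees.
print(majority_vote([True, True, False]))   # True
# No agreement at all: the voter reports no majority.
print(majority_vote(["a", "b", "c"]))       # None
```

Returning `None` instead of picking an arbitrary winner keeps disagreement visible, which also helps with the second practice: inputs that split the vote are candidates for specification ambiguity.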
Topics
- N-Version Programming
- AI Coding Agents
- Software Reliability
- Fault Tolerance
- Common-Mode Failures
- Specification Ambiguity
Best for: AI Architect, Research Scientist, CTO, AI Scientist, Machine Learning Engineer, Software Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.