Claude Opus 4.6 vs. GPT-5.3 Codex: How I shipped 93,000 lines of code in 5 days

Source: Lenny's Newsletter · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Advanced, extended

Summary

OpenAI's GPT-5.3 Codex and Anthropic's Claude Opus 4.6 coding models were tested by shipping 44 pull requests over five days, focused mainly on a marketing site redesign and an internal component refactor. Codex, available as a desktop app, is built around Git primitives such as repositories, branches, and worktrees, and offers "skills" and "automations" for consistent task execution. During the marketing site redesign, however, GPT-5.3 Codex interpreted instructions too literally and overfit to them, and it struggled with broad, creative tasks. Claude Opus 4.6, tested within the Cursor desktop app, excelled at generative, greenfield work, producing a superior site redesign and an effective first pass at the refactor. A combined workflow emerged: Opus handled initial development, taking work to roughly 80-90% completion, and Codex then performed architectural review, bug detection, and code polishing, in effect playing the role of a principal software engineer.

Key takeaway

For AI Architects and VPs of Engineering evaluating new coding models, consider a dual-model strategy. Deploy Claude Opus 4.6 for initial product development and feature implementation, where its strength in generative and creative tasks pays off. Then bring in GPT-5.3 Codex for rigorous code review, architectural validation, and edge-case detection, treating it as a "principal engineer" that hardens code quality. This approach can significantly accelerate development cycles and improve code robustness, despite the higher token costs associated with advanced models.

Key insights

Combining Claude Opus 4.6 for generative coding with GPT-5.3 Codex for code review optimizes development workflows.

Principles

Method

Use Opus 4.6 for initial feature implementation and broad redesigns, then employ Codex for architectural review, bug detection, and code polishing before shipping to production.

Best for: AI Architect, CTO, VP of Engineering/Data, AI Engineer, Machine Learning Engineer, Software Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Lenny's Newsletter.