Claude Opus 4.6 vs. GPT-5.3 Codex: How I shipped 93,000 lines of code in 5 days
Summary
OpenAI's GPT-5.3 Codex and Anthropic's Claude Opus 4.6 coding models were tested by shipping 44 pull requests over five days, mostly on a marketing site redesign and an internal component refactor. Codex, available as a desktop app, is built around Git primitives (repositories, branches, and worktrees) and offers "skills" and "automations" for consistent task execution. However, GPT-5.3 Codex interpreted prompts too literally and overfit to them during the marketing site redesign, and it struggled with broad, creative tasks. Claude Opus 4.6, tested within the Cursor desktop app, excelled at generative, greenfield work, producing a superior site redesign and an effective initial refactor. A combined workflow emerged: Opus handled initial development, taking the work to 80-90% completion, and Codex then performed architectural review, bug detection, and code polishing, much as a principal software engineer would.
Key takeaway
For AI Architects and VPs of Engineering evaluating new coding models, consider a dual-model strategy. Deploy Claude Opus 4.6 for initial product development and feature implementation, where its strength in generative and creative tasks pays off. Then bring in GPT-5.3 Codex for rigorous code review, architectural validation, and edge-case detection, treating it as a "principal engineer" that hardens code quality. This approach can significantly accelerate development cycles and improve code robustness, despite the higher token costs associated with advanced models.
Key insights
Combining Claude Opus 4.6 for generative coding with GPT-5.3 Codex for code review optimizes development workflows.
Principles
- Models overfit to literal prompts.
- Generative models excel at greenfield work.
- Review models identify architectural flaws.
Method
Use Opus 4.6 for initial feature implementation and broad redesigns, then employ Codex for architectural review, bug detection, and code polishing before shipping to production.
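The generate-then-review flow described above can be sketched as a small pipeline. This is an illustrative sketch only, not the author's tooling: `two_stage_pipeline`, `PipelineResult`, and the prompt wording are hypothetical, and the `generator` and `reviewer` arguments are placeholders for whatever client code you would write around each model. No real model API is called here.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: any function that takes a prompt and returns text.
# In practice each one would wrap your own client for a given model.
ModelFn = Callable[[str], str]


@dataclass
class PipelineResult:
    draft: str    # first-pass implementation from the generative model
    review: str   # review notes from the reviewing model
    final: str    # draft after the review notes are applied


def two_stage_pipeline(task: str, generator: ModelFn, reviewer: ModelFn) -> PipelineResult:
    """Run a generate-then-review flow (assumed structure, for illustration)."""
    # Stage 1: the generative model produces the initial implementation.
    draft = generator(f"Implement the following task:\n{task}")

    # Stage 2: the reviewing model looks for architectural flaws and bugs.
    review = reviewer(
        "Review this change for architectural flaws, bugs, and edge cases:\n"
        f"{draft}"
    )

    # Stage 3: the reviewing model polishes the draft using its own notes.
    final = reviewer(
        "Apply these review notes to the code.\n"
        f"Notes:\n{review}\nCode:\n{draft}"
    )
    return PipelineResult(draft=draft, review=review, final=final)
```

Passing the models in as plain callables keeps the sketch independent of any vendor SDK, so either stage can be swapped or stubbed out in tests.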
In practice
- Redesign marketing sites with Opus 4.6.
- Refactor complex components using Opus.
- Use Codex for final code review and bug fixes.
Topics
- AI Code Generation
- GPT-5.3 Codex
- Claude Opus 4.6
- Software Development Workflows
- Code Review
Best for: AI Architect, CTO, VP of Engineering/Data, AI Engineer, Machine Learning Engineer, Software Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Lenny's Newsletter.