A new version of OpenAI’s Codex is powered by a new dedicated chip
Summary
OpenAI has released GPT-5.3-Codex-Spark, a lightweight version of its agentic coding tool designed for faster inference and real-time collaboration. The new model runs on a dedicated Cerebras Wafer Scale Engine 3 (WSE-3) chip, marking a deeper integration of Cerebras hardware into OpenAI's physical infrastructure. The WSE-3 is Cerebras' third-generation wafer-scale megachip, featuring 4 trillion transistors. This release is the "first milestone" in a multi-year partnership between OpenAI and Cerebras, worth over $10 billion and announced last month. Spark is intended for rapid prototyping and daily productivity, complementing the original GPT-5.3-Codex model, which handles heavier tasks. It is currently available in research preview for ChatGPT Pro users within the Codex app.
Key takeaway
For CTOs and VPs of Engineering evaluating AI infrastructure, OpenAI's use of Cerebras' WSE-3 chip to run GPT-5.3-Codex-Spark illustrates the value of specialized hardware in achieving ultra-low latency for agentic coding tools. Rather than relying solely on general-purpose compute, your teams should consider dedicated AI accelerators for specific, performance-critical workflows to enhance developer productivity and enable new interaction paradigms.
Key insights
Dedicated AI hardware accelerates lightweight models for real-time developer collaboration and rapid iteration.
Principles
- Low latency enhances AI interaction patterns.
- Specialized hardware optimizes specific AI workflows.
In practice
- Use lightweight models for rapid prototyping.
- Integrate specialized chips for low-latency AI tasks.
Topics
- OpenAI Codex
- Cerebras Wafer Scale Engine
- AI Hardware
- Code Generation
- Low-Latency Inference
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Engineer, Machine Learning Engineer, Software Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by AI News & Artificial Intelligence | TechCrunch.