OpenAI unveils its first custom chip, built by Broadcom

· Source: AI News & Artificial Intelligence | TechCrunch · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure, Robotics & Autonomous Systems · Depth: Intermediate, extended

Summary

OpenAI has unveiled "Jalapeño," its first custom-built inference processor, developed with Broadcom over the past 18 months. The chip is designed specifically for OpenAI's inference systems, and OpenAI's own AI models assisted in its development. Early testing indicates significantly better performance per watt than existing alternatives. The initiative aims to reduce OpenAI's reliance on Nvidia GPUs and follows similar efforts by Google and Amazon. Jalapeño targets low operating costs for real-time coding models, though pre-training may still run on Nvidia hardware. OpenAI plans to deploy 10 gigawatts of the new systems starting late next year and to expand rapidly over three years toward a total capacity near 30 gigawatts. The company emphasizes optimizing the entire AI stack, from chip architecture to user experience, to make models faster, more reliable, and more affordable, with the ultimate goal of compute abundance.

Key takeaway

For Directors of AI/ML scaling large language model services, OpenAI's Jalapeño chip highlights the shift toward custom silicon for inference. Your strategy should evaluate vertical integration opportunities and partnerships for purpose-built AI accelerators that reduce operating costs and improve performance per watt. This approach is essential to meeting escalating demand and achieving compute abundance, and it moves specific high-volume workloads beyond reliance on general-purpose GPUs.

Key insights

OpenAI's custom chip, Jalapeño, signals a strategic move toward vertical integration to lower AI inference costs and support scale.

Principles

Method

Design custom inference chips with partners such as Broadcom, apply AI models to optimize the design, and integrate across the full system stack, including networking and algorithms.

Best for: Investor, VP of Engineering/Data, AI Architect, AI Hardware Engineer, Director of AI/ML, CTO

Editorial summary, takeaway, and curation by AIssential. Original article published by AI News & Artificial Intelligence | TechCrunch.