Why The CPU Is Back
Summary
Arm recently posted a record quarter, with its AGI CPU demand doubling in six weeks, driven by a shift in the AI scaling curve. AI scaling has moved through three distinct regimes, each shifting the computational bottleneck to a different silicon layer. The first two regimes, pretraining scaling and inference-time scaling, primarily benefited NVIDIA. The current third regime, agentic scaling, has shifted demand toward CPUs, which specifically benefits Arm. This shift does not come at NVIDIA's expense: it adds a further layer of compute consumption, so total compute demand is expanding across hardware types.
Key takeaway
VPs of Engineering and Data evaluating future AI infrastructure investments should recognize that the shift to agentic scaling changes hardware requirements. Infrastructure strategy should now account for increased CPU demand, particularly for Arm-based solutions, alongside existing GPU infrastructure, to optimize for emerging AI workloads and maintain competitive performance.
Key insights
AI scaling has shifted to agentic workloads, driving significant CPU demand for Arm.
Principles
- AI scaling involves three distinct regimes.
- Each regime redistributes compute consumption.
- Agentic scaling drives CPU demand.
In practice
- Monitor agentic AI workload growth.
- Evaluate CPU architectures for AI inference.
Topics
- Arm CPU Demand
- AI Scaling Regimes
- Agentic Scaling
- Pretraining Scaling
- Inference-Time Scaling
Best for: Investor, VP of Engineering/Data, MLOps Engineer, Director of AI/ML, AI Architect, CTO
Editorial summary, takeaway, and curation by AIssential. Original article published by The Business Engineer.