Powering the AI Inference Wave with EPRI's Ben Sooter - Ep. 292
Summary
EPRI's Director of R&D, Ben Sooter, discusses the evolving relationship between AI data centers and the electric grid, highlighting that approximately 80% of an AI model's lifetime energy consumption occurs during inference, not training. He introduces the concept of micro data centers: smaller, geographically dispersed facilities designed to handle inference loads closer to end users. These facilities, typically 3-20 megawatts, can be sited near underutilized electrical substations to leverage existing infrastructure, enhance grid resilience, and ease pressure on interconnection queues. This distributed approach aims to meet the growing demand for low-latency AI services while optimizing energy distribution and potentially integrating with clean energy solutions such as solar, wind, and energy storage.
Key takeaway
CTOs and VPs of Engineering planning AI infrastructure should recognize that inference will dominate long-term energy costs and demand patterns. Investigate distributed micro data center strategies, particularly co-location with underutilized substations, to deliver low-latency service and support grid stability while avoiding the bottlenecks and high costs associated with large, centralized training facilities.
Key insights
Inference accounts for approximately 80% of an AI model's lifetime energy consumption, which motivates distributed micro data centers for grid resilience and low-latency services.
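To make the 80% figure concrete, here is a minimal arithmetic sketch in Python. It assumes the remaining 20% of lifetime energy is attributed entirely to training, which is a simplification; the function name and the 1,000 MWh training figure are hypothetical and chosen only for illustration.

```python
def lifetime_energy_split(training_mwh: float, inference_share: float = 0.80):
    """Estimate lifetime and inference energy from a training energy figure.

    Assumption (not from the episode): everything that is not inference
    is training, so training_mwh = (1 - inference_share) * lifetime total.
    """
    if not 0.0 <= inference_share < 1.0:
        raise ValueError("inference_share must be in [0, 1)")
    total_mwh = training_mwh / (1.0 - inference_share)
    inference_mwh = total_mwh - training_mwh
    return total_mwh, inference_mwh


# Hypothetical input: a model that took 1,000 MWh to train.
total, inference = lifetime_energy_split(1000.0)
print(f"lifetime: {total:.0f} MWh, inference: {inference:.0f} MWh")
```

Under that assumption, an 80% inference share means inference uses about four times the energy of training over the model's lifetime.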
Principles
- Inference drives most AI energy demand.
- Geographic proximity improves AI service latency.
- Utilize existing grid infrastructure for efficiency.
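The proximity principle can be illustrated with a rough propagation-delay estimate. The sketch below is not from the episode: it assumes a typical optical-fiber refractive index of about 1.47 and counts only propagation time, ignoring routing, queuing, and model compute time. The function name and distances are hypothetical.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
FIBER_REFRACTIVE_INDEX = 1.47      # assumed typical value for optical fiber


def fiber_rtt_ms(distance_km: float) -> float:
    """Round-trip propagation delay over fiber, in milliseconds.

    Covers propagation only; real network latency will be higher.
    """
    speed_km_s = SPEED_OF_LIGHT_KM_S / FIBER_REFRACTIVE_INDEX
    return 2.0 * distance_km / speed_km_s * 1000.0


# Hypothetical comparison: a nearby site versus a distant centralized one.
for km in (50, 500, 2000):
    print(f"{km:>5} km: {fiber_rtt_ms(km):.2f} ms round trip")
```

Propagation delay scales linearly with distance, so a site tens of kilometers away contributes well under a millisecond of round-trip delay, while one thousands of kilometers away contributes tens of milliseconds before any processing begins.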
Method
Deploy micro data centers (3-20 MW) near underutilized substations, distributing compute capacity across multiple sites to meet regional demand and optimize grid integration.
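The siting method above can be sketched as a simple greedy allocation. This is an illustrative sketch only, not EPRI's methodology: the `Substation` fields, the headroom definition (rated capacity minus existing peak load), the greedy ordering, and all numbers are assumptions made for the example. Only the 3-20 MW site range comes from the summary.

```python
from dataclasses import dataclass

MIN_SITE_MW = 3.0   # lower bound of the micro data center range cited above
MAX_SITE_MW = 20.0  # upper bound of the same range


@dataclass
class Substation:
    name: str            # hypothetical identifier
    capacity_mw: float   # rated capacity (illustrative)
    peak_load_mw: float  # existing peak load (illustrative)

    @property
    def headroom_mw(self) -> float:
        # Assumed definition of "underutilized": unused capacity at peak.
        return max(0.0, self.capacity_mw - self.peak_load_mw)


def plan_sites(substations, demand_mw):
    """Greedily assign regional demand to substations, most headroom first.

    Returns (plan, unmet_mw), where plan maps substation name to site size
    in MW. Each site stays within the 3-20 MW range and within headroom.
    """
    plan = {}
    remaining = demand_mw
    for sub in sorted(substations, key=lambda s: s.headroom_mw, reverse=True):
        if remaining < MIN_SITE_MW:
            break  # what is left is too small to justify another site
        size = min(sub.headroom_mw, MAX_SITE_MW, remaining)
        if size < MIN_SITE_MW:
            continue  # not enough spare capacity at this substation
        plan[sub.name] = size
        remaining -= size
    return plan, remaining


# Hypothetical region with 35 MW of inference demand.
region = [
    Substation("A", capacity_mw=50.0, peak_load_mw=25.0),
    Substation("B", capacity_mw=30.0, peak_load_mw=22.0),
    Substation("C", capacity_mw=10.0, peak_load_mw=9.0),
    Substation("D", capacity_mw=40.0, peak_load_mw=28.0),
]
plan, unmet = plan_sites(region, demand_mw=35.0)
print(plan, unmet)
```

A real siting study would also weigh factors such as land, fiber access, cooling, and utility interconnection requirements; the greedy rule here only shows how demand might be spread across multiple sites rather than concentrated in one.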
In practice
- Consider micro data centers for latency-sensitive AI applications.
- Explore substation co-location for new data center builds.
- Integrate energy storage for demand flexibility.
Topics
- AI Energy Demand
- Micro Data Centers
- AI Inference
- Electric Grid Integration
- Distributed Inference
Best for: CTO, VP of Engineering/Data, Director of AI/ML, Machine Learning Engineer, MLOps Engineer, AI Operations Specialist
Editorial summary, takeaway, and curation by AIssential. Original article published by NVIDIA AI Podcast.