Measurement of Generative AI Workload Power Profiles for Whole-Facility Data Center Infrastructure Planning
Summary
A new methodology measures the power consumption of generative AI workloads at 0.1-second resolution and links these high-resolution measurements to whole-facility energy demand for data center infrastructure planning. Researchers used NLR's high-performance computing data center, equipped with NVIDIA H100 GPUs, to profile AI training, fine-tuning, and inference jobs. Workloads were characterized using MLCommons benchmarks for training and fine-tuning and vLLM benchmarks for inference, which keeps the profiling reproducible. The collected power profiles were then scaled to the whole-facility level using a bottom-up, event-driven energy model. This approach captures realistic temporal fluctuations driven by AI workloads and user behavior, providing data needed for planning grid connections, on-site energy generation, and distributed microgrids. The dataset of power consumption profiles is publicly available.
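To illustrate how a 0.1-second power trace relates to energy demand, the following is a minimal sketch that integrates evenly spaced power samples into kilowatt-hours. The function name `energy_kwh` and the trapezoidal rule are assumptions made for illustration; the summary does not say how the authors convert power to energy.

```python
def energy_kwh(power_w, dt_s=0.1):
    """Integrate evenly spaced power samples (watts) into energy (kWh).

    Hypothetical helper: uses the trapezoidal rule over consecutive
    samples spaced dt_s seconds apart (0.1 s matches the measurement
    resolution described in the summary).
    """
    if len(power_w) < 2:
        return 0.0  # a single sample spans no time interval
    joules = sum((a + b) / 2.0 * dt_s for a, b in zip(power_w, power_w[1:]))
    return joules / 3.6e6  # 1 kWh = 3.6e6 J


if __name__ == "__main__":
    # One hour at a constant 1 kW, sampled once per second.
    print(energy_kwh([1000.0] * 3601, dt_s=1.0))
```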
Key takeaway
For data center infrastructure planners and CTOs evaluating future energy needs, understanding the precise power demands of generative AI workloads is crucial. Your teams should integrate high-resolution AI power profiles into facility planning to accurately forecast energy consumption and optimize infrastructure investments for grid connections, on-site generation, and microgrids. This data-driven approach mitigates risks associated with under- or over-provisioning.
Key insights
High-resolution AI workload power profiles enable accurate whole-facility data center energy planning.
Principles
- Standardized benchmarks ensure reproducible workload profiling.
- Bottom-up modeling scales workload data to facility level.
Method
Measure AI workload power at 0.1-second resolution on NVIDIA H100 GPUs, characterize the workloads with MLCommons benchmarks (training and fine-tuning) and vLLM benchmarks (inference), then scale the resulting profiles to whole-facility energy demand with a bottom-up, event-driven model.
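The scaling step can be sketched roughly as follows. This is not the authors' model: `JobEvent`, `facility_profile`, the constant idle baseline, and the PUE-style overhead multiplier are all hypothetical simplifications, chosen only to show how measured per-job profiles might be superimposed on a shared timeline as jobs start.

```python
from dataclasses import dataclass
from typing import List

DT_S = 0.1  # sampling interval, matching the 0.1-second measurement resolution


@dataclass(frozen=True)
class JobEvent:
    """Hypothetical job arrival: a start time plus a measured power profile."""
    start_s: float          # seconds from the start of the simulated window
    profile_w: List[float]  # power samples in watts, one per DT_S


def facility_profile(events, horizon_s, idle_w=0.0, pue=1.0):
    """Sum per-job profiles onto one timeline (bottom-up aggregation).

    idle_w is an assumed constant IT baseline; pue is an assumed fixed
    multiplier standing in for cooling and other facility overhead.
    """
    n = round(horizon_s / DT_S)
    it_w = [idle_w] * n
    for ev in events:
        first = round(ev.start_s / DT_S)
        for k, p in enumerate(ev.profile_w):
            i = first + k
            if 0 <= i < n:  # ignore samples outside the simulated window
                it_w[i] += p
    return [p * pue for p in it_w]


if __name__ == "__main__":
    jobs = [JobEvent(0.0, [100.0, 200.0]), JobEvent(0.1, [50.0, 50.0])]
    print(facility_profile(jobs, horizon_s=0.4, idle_w=10.0, pue=2.0))
```

In a fuller model, the event list would be drawn from job schedules or user arrival patterns, which is where the temporal fluctuations described in the summary would enter.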
In practice
- Use the publicly available dataset of AI power consumption profiles.
- Inform grid connection planning with temporal power profiles.
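As one example of what a temporal power profile can feed into grid connection planning, the sketch below extracts peak demand and the steepest sample-to-sample ramp from a trace. The function name `peak_and_max_ramp` and the choice of these two metrics are assumptions for illustration, not quantities the summary attributes to the paper.

```python
def peak_and_max_ramp(power_w, dt_s=0.1):
    """Return (peak power in W, largest absolute ramp in W/s).

    Hypothetical helper: the ramp is the biggest change between
    consecutive samples divided by the sampling interval dt_s.
    """
    if len(power_w) < 2:
        raise ValueError("need at least two samples to compute a ramp")
    peak = max(power_w)
    max_step = max(abs(b - a) for a, b in zip(power_w, power_w[1:]))
    return peak, max_step / dt_s


if __name__ == "__main__":
    peak, ramp = peak_and_max_ramp([100.0, 150.0, 120.0], dt_s=0.5)
    print(peak, ramp)
```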
Topics
- Generative AI Workloads
- Data Center Energy Planning
- Power Consumption Profiling
- NVIDIA H100 GPUs
- MLCommons Benchmarks
Best for: CTO, VP of Engineering/Data, Director of AI/ML, MLOps Engineer, AI Architect, AI Hardware Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.