Measurement of Generative AI Workload Power Profiles for Whole-Facility Data Center Infrastructure Planning

Source: Takara TLDR - Daily AI Papers · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure · Depth: Advanced, medium

Summary

A new methodology measures the power consumption of generative AI workloads at 0.1-second resolution and links these measurements to whole-facility energy demand for data center infrastructure planning. The researchers used NLR's high-performance computing data center, equipped with NVIDIA H100 GPUs, to profile AI training, fine-tuning, and inference jobs. For reproducibility, training and fine-tuning workloads were characterized with MLCommons benchmarks and inference workloads with vLLM benchmarks. The collected power profiles were then scaled to the whole-facility level using a bottom-up, event-driven energy model. This approach captures realistic temporal fluctuations driven by AI workloads and user behavior, which planners need when sizing grid connections, on-site energy generation, and distributed microgrids. The dataset of power consumption profiles is publicly available.
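To illustrate why the 0.1-second sampling interval matters, the sketch below summarizes a power trace and compares it with the same trace averaged to 1-second resolution. It is a minimal example under stated assumptions, not the researchers' tooling: the trace is synthetic, and the names `summarize_trace` and `downsample_mean` are hypothetical helpers introduced here for illustration.

```python
import numpy as np

DT = 0.1  # seconds per sample, matching the 0.1-second resolution in the summary


def summarize_trace(power_w, dt=DT):
    """Return energy, mean, peak, and steepest ramp for one power trace (watts)."""
    power_w = np.asarray(power_w, dtype=float)
    energy_j = power_w.sum() * dt      # rectangle-rule integration of power over time
    ramps = np.diff(power_w) / dt      # change in power per second between samples
    return {
        "energy_wh": energy_j / 3600.0,
        "mean_w": power_w.mean(),
        "peak_w": power_w.max(),
        "max_ramp_w_per_s": np.abs(ramps).max() if ramps.size else 0.0,
    }


def downsample_mean(power_w, factor):
    """Average consecutive blocks of `factor` samples (factor=10 gives 1-second data)."""
    power_w = np.asarray(power_w, dtype=float)
    usable = (power_w.size // factor) * factor  # drop a trailing partial block
    return power_w[:usable].reshape(-1, factor).mean(axis=1)


# Synthetic trace (an assumption, not measured data):
# 2 s idle at 100 W, then 8 s alternating 700 W / 300 W every 0.5 s.
idle = np.full(20, 100.0)
burst = np.tile(np.concatenate([np.full(5, 700.0), np.full(5, 300.0)]), 8)
trace = np.concatenate([idle, burst])

fine = summarize_trace(trace)                                  # 0.1 s resolution
coarse = summarize_trace(downsample_mean(trace, 10), dt=1.0)   # 1 s resolution

print(fine)
print(coarse)
```

In this synthetic case both resolutions report the same total energy, but the 1-second average hides the 700 W peaks and the steep ramps between power levels, which are the quantities that matter when sizing electrical infrastructure.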

Key takeaway

Data center infrastructure planners and CTOs evaluating future energy needs require precise power-demand data for generative AI workloads. Integrating high-resolution AI power profiles into facility planning lets teams forecast energy consumption more accurately and size investments in grid connections, on-site generation, and microgrids, which reduces the risk of under- or over-provisioning.

Key insights

High-resolution AI workload power profiles enable accurate whole-facility data center energy planning.

Method

Measure AI workload power at 0.1-second resolution on NVIDIA H100 GPUs, characterize the workloads with MLCommons benchmarks (training and fine-tuning) and vLLM benchmarks (inference), then scale the resulting profiles to whole-facility energy demand with a bottom-up, event-driven model.
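The scaling step can be pictured with a small simulation. The sketch below is one possible reading of a bottom-up, event-driven model, not the model from the paper: job arrivals are events, each job replays a per-node power trace, and node power is summed into a facility profile. The FIFO scheduling rule, the constant `overhead` multiplier (a stand-in for cooling and power-delivery losses), the function name `facility_profile`, and all numeric values are assumptions made for illustration.

```python
import heapq
import numpy as np

DT = 0.1  # seconds per sample


def facility_profile(jobs, profiles, n_nodes, idle_w, horizon_s, overhead=1.2, dt=DT):
    """Build a facility power time series (watts) from job arrival events.

    jobs     : list of (arrival_time_s, kind) tuples
    profiles : dict mapping kind -> per-node power trace sampled every `dt` seconds
    Jobs run first-come, first-served on the earliest free node (an assumption).
    """
    n_steps = int(round(horizon_s / dt))
    it_power = np.full(n_steps, n_nodes * idle_w, dtype=float)  # all nodes idle
    free_at = [0] * n_nodes          # min-heap of step indices when nodes free up
    heapq.heapify(free_at)

    for arrival_s, kind in sorted(jobs):
        trace = np.asarray(profiles[kind], dtype=float)
        # A job starts at arrival, or later if it must wait for a node.
        start = max(int(round(arrival_s / dt)), heapq.heappop(free_at))
        end = start + trace.size
        heapq.heappush(free_at, end)
        if start >= n_steps:
            continue                 # job falls entirely outside the horizon
        stop = min(end, n_steps)
        # Replace the node's idle draw with the job's measured draw.
        it_power[start:stop] += trace[: stop - start] - idle_w

    return it_power * overhead       # hypothetical constant facility overhead


# Placeholder flat traces; real inputs would be measured 0.1 s profiles.
profiles = {
    "training": np.full(50, 700.0),   # 5 s at 700 W
    "inference": np.full(10, 400.0),  # 1 s at 400 W
}
jobs = [(0.0, "training"), (1.0, "inference"), (1.5, "inference")]

facility = facility_profile(jobs, profiles, n_nodes=2, idle_w=100.0, horizon_s=10.0)
print(facility.max(), facility.mean())
```

With two nodes, the third job arrives while both are busy and waits until the first inference job finishes, so queueing behavior shows up directly in the facility profile. Replacing the flat placeholder traces with measured profiles and the fixed job list with a sampled arrival process would bring the sketch closer to the kind of model the summary describes.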

Best for: CTO, VP of Engineering/Data, Director of AI/ML, MLOps Engineer, AI Architect, AI Hardware Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.