Unleash the Power of Intel® Xeon® 6 Processors with P-cores as AI Host CPU with Priority Core Turbo

· Source: Artificial Intelligence (AI) articles · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure, AI Hardware Optimization · Depth: Advanced, medium

Summary

Intel's Priority Core Turbo (PCT) feature, available on select SKUs of Intel Xeon 6 processors with P-cores, lets designated high-priority cores reach higher peak turbo frequencies. That headroom matters for demanding AI workloads, where the host CPU can limit GPU utilization. In long-context inference tests, an Intel Xeon 6776P processor with PCT and eight NVIDIA HGX B300 GPUs achieved 218 tokens/sec with the Qwen3-235B model and a 100K-token context in FP16, a 1.8x improvement over the 121 tokens/sec measured without PCT, while sustaining a request rate of 6 and meeting 400 ms goodput SLOs. For checkpointing under Distributed Checkpointing, PCT shortened completion times for large-scale Llama 3.3 models: a 5.4% reduction for Llama 3.3-70B and a 7.2% improvement for a synthetic Llama 3.3-140B model. The gains stem from the host CPU's role in orchestrating tasks such as tokenization and data movement: when those run faster, the GPUs can produce the first token sooner.
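
The arithmetic behind these figures can be checked with a short sketch. The throughput numbers come from the summary above; the helper names and the 100-second checkpoint baseline are hypothetical, chosen only to show what a 5.4% reduction means in absolute terms.

```python
def speedup(with_pct: float, without_pct: float) -> float:
    """Throughput ratio: how many times faster the PCT run is."""
    return with_pct / without_pct


def time_after_reduction(baseline_s: float, reduction_pct: float) -> float:
    """Completion time after a given percentage reduction."""
    return baseline_s * (1 - reduction_pct / 100)


# Throughput figures reported in the summary (tokens/sec).
ratio = speedup(218, 121)
print(f"{ratio:.2f}x")  # 1.80x

# Hypothetical 100 s baseline, used only to illustrate a 5.4% reduction.
print(f"{time_after_reduction(100.0, 5.4):.1f} s")  # 94.6 s
```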

Key takeaway

For AI architects designing high-performance inference or training infrastructure, Intel Xeon 6 processors with Priority Core Turbo (PCT) can improve overall system efficiency. Consider PCT-enabled SKUs to accelerate CPU-bound tasks such as tokenization and checkpointing, which directly raises GPU utilization and goodput. This matters most where low latency and strict service-level objectives must be met. Evaluate binding your NVIDIA HGX GPUs to PCT cores for maximum benefit.

Key insights

Intel's Priority Core Turbo significantly boosts AI inference and training performance by accelerating CPU-bound tasks.

Principles

Method

Optimize AI host CPU performance by enabling Priority Core Turbo (PCT) on Intel Xeon 6 processors and binding GPUs to these PCT-enabled cores.
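
One way to read "binding" here is pinning the host process that feeds a GPU to the high-priority cores. The sketch below does that with Linux CPU affinity from Python's standard library. It is an illustration under assumptions, not Intel's documented procedure: the core IDs in `ASSUMED_PCT_CORES` are hypothetical, and the real set depends on the SKU and on platform configuration.

```python
import os

# Hypothetical core IDs. Which cores are designated high priority depends
# on the SKU and platform configuration; check your own system.
ASSUMED_PCT_CORES = {0, 1, 2, 3, 4, 5, 6, 7}


def pick_cores(priority_cores, available_cores):
    """Return the assumed priority cores this process is allowed to use."""
    usable = set(priority_cores) & set(available_cores)
    if not usable:
        raise ValueError("none of the assumed priority cores are available")
    return usable


def pin_to_priority_cores(priority_cores=ASSUMED_PCT_CORES):
    """Pin the current process to the assumed priority cores (Linux only)."""
    if not hasattr(os, "sched_setaffinity"):
        raise OSError("CPU affinity is not supported on this platform")
    available = os.sched_getaffinity(0)  # cores this process may run on
    cores = pick_cores(priority_cores, available)
    os.sched_setaffinity(0, cores)       # 0 means the calling process
    return cores
```

Calling `pin_to_priority_cores()` at the start of a GPU-serving worker restricts that worker to the chosen cores. Tools such as `taskset` or `numactl` achieve the same effect from the command line.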

Best for: CTO, VP of Engineering/Data, Director of AI/ML, Machine Learning Engineer, AI Architect, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence (AI) articles.