NVIDIA NVbandwidth: Your Essential Tool for Measuring GPU Interconnect and Memory Performance

· Source: NVIDIA Technical Blog · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Intermediate, medium

Summary

NVIDIA NVbandwidth is a CUDA-based tool that measures memory bandwidth and latency across a range of memory copy patterns on single-GPU, multi-GPU, and multi-node NVIDIA systems. It performs copies with either the copy engine (CE) or kernel copy methods and reports the bandwidth currently measured on the system, covering transfers between CPU memory and GPU memory as well as between GPUs. Supported tests include unidirectional, bidirectional, multi-GPU, and multi-node bandwidth tests, along with latency tests. NVbandwidth is topology-agnostic: it works across NVLink, NVLink C2C, and PCIe interconnects, and it can emit results as plain text or JSON. It requires a CUDA-enabled NVIDIA GPU, the CUDA Toolkit (version 11.x or later for single-node, 12.3 or later for multi-node), a compatible NVIDIA display driver, a C++17 compiler, CMake 3.20 or later, and the Boost program_options library.
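As a rough illustration of choosing a test and the JSON output mode, the helper below assembles an NVbandwidth command line in Python. The flag names (`-t` for the test case, `-b` for buffer size in MiB, `-j` for JSON) and the test case name are taken from the project's README and should be verified against `./nvbandwidth --help` for your build. The binary path and the `build_command` helper are hypothetical.

```python
def build_command(binary="./nvbandwidth", testcase=None, buffer_mib=None, json_output=False):
    """Assemble an argument list for a hypothetical NVbandwidth invocation."""
    cmd = [binary]
    if testcase is not None:
        cmd += ["-t", testcase]          # assumed flag: select a test case
    if buffer_mib is not None:
        cmd += ["-b", str(buffer_mib)]   # assumed flag: buffer size in MiB
    if json_output:
        cmd.append("-j")                 # assumed flag: JSON instead of plain text
    return cmd


cmd = build_command(
    testcase="device_to_device_memcpy_read_ce",
    buffer_mib=512,
    json_output=True,
)
print(" ".join(cmd))
```

On a machine where the tool has been built, the resulting list can be passed to `subprocess.run(cmd, check=True, capture_output=True, text=True)` to capture the output for later processing.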

Key takeaway

For ML infrastructure engineers and system architects evaluating or tuning GPU deployments, NVbandwidth provides measured bandwidth and latency figures for data transfers between CPU and GPU memory and between GPUs. Integrate it into your validation workflows to benchmark new hardware, identify bottlenecks in existing systems, and run regression tests after software or driver updates. Doing so helps confirm that data movement in your CUDA applications performs as well as the hardware and topology allow.
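A regression check of this kind can be as small as comparing each measured bandwidth against a stored baseline. The sketch below is illustrative only: the test names, the numbers, and the 5% tolerance are hypothetical placeholders, and it assumes the bandwidth figures have already been extracted from NVbandwidth's output into plain dictionaries.

```python
def find_regressions(baseline, measured, tolerance=0.05):
    """Return {test: (baseline, measured)} for tests that are missing
    or whose bandwidth dropped by more than `tolerance` (a fraction)."""
    regressions = {}
    for name, base in baseline.items():
        got = measured.get(name)
        if got is None or got < base * (1.0 - tolerance):
            regressions[name] = (base, got)
    return regressions


# Hypothetical values in GB/s, for illustration only.
baseline = {"host_to_device": 55.0, "device_to_device": 250.0}
measured = {"host_to_device": 54.2, "device_to_device": 210.0}

for name, (base, got) in find_regressions(baseline, measured).items():
    print(f"{name}: baseline {base} GB/s, measured {got} GB/s")
```

A relative tolerance is used here because run-to-run variation tends to scale with the link speed. A suitable threshold depends on the system and should be chosen from repeated runs on known-good hardware.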

Key insights

NVbandwidth measures GPU memory transfer performance, giving you the data needed to tune CUDA applications and validate system configurations.

Method

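NVbandwidth measures bandwidth by enqueuing a spin kernel first, followed by a start event, a series of memcpy iterations, and a stop event. The spin kernel holds the queued work back until everything has been enqueued, so the cost of enqueuing the operations is excluded from the timed interval between the two events.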
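The arithmetic behind such a measurement can be sketched as follows. This is a minimal illustration, not NVbandwidth's actual source: the function names, the decimal GB/s unit, and the use of a median across repeated samples are assumptions made for the example.

```python
from statistics import median
from typing import Sequence


def bandwidth_gb_per_s(buffer_bytes: int, iterations: int, elapsed_ms: float) -> float:
    """Bandwidth for one timed sample.

    `elapsed_ms` is the time between the start and stop events, which
    bracket `iterations` copies of `buffer_bytes` each.
    """
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    total_bytes = buffer_bytes * iterations
    seconds = elapsed_ms / 1e3
    return total_bytes / seconds / 1e9  # assumed unit: decimal GB/s


def summarize(samples_gb_per_s: Sequence[float]) -> float:
    """Collapse repeated samples into one figure (assumption: median)."""
    return median(samples_gb_per_s)


# Example: a 512 MiB buffer copied 16 times in 200 ms.
sample = bandwidth_gb_per_s(512 * 1024**2, 16, 200.0)
print(f"{sample:.2f} GB/s")
```

Because the elapsed time covers only the copies between the two events, any time spent enqueuing work before the spin kernel is released does not enter the calculation.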

Best for: Machine Learning Engineer, AI Architect, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by NVIDIA Technical Blog.