Introduction to profiling tools for AMD hardware

· Source: AMD ROCm Blogs · Field: Technology & Digital — Software Development & Engineering, Artificial Intelligence & Machine Learning · Depth: Intermediate, long

Summary

AMD provides a suite of profiling tools to help developers optimize applications for AMD hardware, including "Zen" Core CPUs, RDNA™ GPUs, and CDNA™ accelerators. The article introduces rocprofiler-sdk, rocprofv3, rocprof-sys, rocprof-compute, Radeon™ GPU Profiler, and AMD uProf, and describes the capabilities, supported architectures, and supported operating systems of each. It stresses that optimizing performance takes more than benchmarking execution time: developers need to know where a program spends its time and where its bottlenecks lie. The post organizes tool selection around three profiling objectives: identifying hot spots, assessing hardware utilization (for example, through Roofline Analysis), and using hardware metrics to find the root causes of the observed performance. It also lists several superseded tools along with their modern replacements.

Key takeaway

AI engineers and HPC developers optimizing applications on AMD hardware should choose a profiling tool based on their objective and target architecture. Start by identifying performance hot spots with a timeline tracing tool such as rocprof-sys or rocprofv3. Then use rocprof-compute for detailed kernel analysis, or AMD uProf for broader system insight, to diagnose the root causes of bottlenecks and confirm that the hardware is being used efficiently.
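
The objective-based selection described in this summary can be sketched as a small lookup table. This is only an illustration: the dictionary keys and the `suggest_tools` helper are hypothetical names introduced here, and the tool lists simply restate the pairings given in this summary rather than anything taken from the original article's code.

```python
# Hypothetical lookup table: profiling objective -> tools named in this summary.
# The keys and the helper name are illustrative, not part of any AMD API.
TOOLS_BY_OBJECTIVE = {
    "hot_spots": ["rocprof-sys", "rocprofv3"],
    "hardware_utilization": ["rocprof-compute", "AMD uProf"],
    "hardware_metrics": ["rocprofv3", "rocprof-sys", "rocprof-compute", "AMD uProf"],
}


def suggest_tools(objective: str) -> list[str]:
    """Return the tools this summary associates with a profiling objective."""
    if objective not in TOOLS_BY_OBJECTIVE:
        known = ", ".join(sorted(TOOLS_BY_OBJECTIVE))
        raise ValueError(f"unknown objective {objective!r}; expected one of: {known}")
    # Return a copy so callers cannot mutate the shared table.
    return list(TOOLS_BY_OBJECTIVE[objective])


if __name__ == "__main__":
    for objective in TOOLS_BY_OBJECTIVE:
        print(objective, "->", ", ".join(suggest_tools(objective)))
```

Target architecture and operating system also constrain the choice, so a fuller version would filter this table by platform; those details are not given in this summary and are left out here.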

Key insights

AMD's profiling tools span its CPU, GPU, and accelerator architectures, and each tool targets specific architectures and operating systems.

Method

First, identify hot spots with timeline traces (rocprof-sys, rocprofv3). Next, assess hardware utilization with Roofline Analysis (rocprof-compute, AMD uProf). Finally, collect hardware metrics to understand what causes the observed performance (rocprofv3, rocprof-sys, rocprof-compute, AMD uProf).
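
To make the Roofline step concrete, here is a minimal sketch of the standard roofline model, in which attainable performance is the smaller of peak compute and peak memory bandwidth times arithmetic intensity. The function names and the hardware numbers are hypothetical placeholders, not measurements of any AMD device and not the output format of rocprof-compute or AMD uProf.

```python
def attainable_gflops(peak_gflops: float, peak_bandwidth_gbs: float,
                      arithmetic_intensity: float) -> float:
    """Roofline ceiling in GFLOP/s.

    arithmetic_intensity is in FLOP per byte moved to or from memory.
    """
    return min(peak_gflops, peak_bandwidth_gbs * arithmetic_intensity)


def classify(peak_gflops: float, peak_bandwidth_gbs: float,
             arithmetic_intensity: float) -> str:
    """Label a kernel by which roof limits it."""
    # The ridge point is the intensity at which the two roofs meet.
    ridge = peak_gflops / peak_bandwidth_gbs
    return "memory-bound" if arithmetic_intensity < ridge else "compute-bound"


if __name__ == "__main__":
    # Hypothetical device: 1000 GFLOP/s peak compute, 100 GB/s peak bandwidth.
    peak, bandwidth = 1000.0, 100.0
    for intensity in (2.0, 50.0):
        ceiling = attainable_gflops(peak, bandwidth, intensity)
        label = classify(peak, bandwidth, intensity)
        print(f"intensity={intensity} FLOP/byte: ceiling={ceiling} GFLOP/s, {label}")
```

Comparing a kernel's measured throughput against this ceiling shows how much headroom remains, and the memory-bound or compute-bound label suggests which hardware metrics are worth collecting in the final step.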

Best for: Machine Learning Engineer, AI Engineer, Software Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by AMD ROCm Blogs.