ROCm 7.13: Expanding Hardware, Tools, and Reach
Summary
AMD has released ROCm Core 7.13, AMD GPU Driver 31.30, and AMD GPU Virtualization 9.0, expanding hardware support and developer tooling for enterprise datacenters. ROCm 7.13 adds support for AMD Instinct MI350-series accelerators, including bare-metal and Kubernetes support for the MI350P on Ubuntu 24.04.4, Ubuntu 26.04, and RHEL 9.6, and extends GPU partitioning to the MI350X and MI355X. GPU virtualization is broadened across KVM and VMware ESXi: passthrough for the MI300X on ESXi 8.0 Update 3, virtualization for the MI350X and MI355X on ESXi 9.1, and Multi-VF support for the MI300X on KVM, which exposes up to 8 virtual GPUs. On the developer side, the release streamlines rocprofiler-systems, open-sources the ROCprof Trace decoder, adds live attach/detach profiling, and introduces the ROCm Optiq 0.4.0 visualization tool. Multi-node communication is optimized for Strix Halo through RCCL and rocSHMEM, and a modular packaging structure simplifies installation.
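The virtualization combinations named in the summary can be captured as a small lookup table, which may be handy as a pre-flight check in deployment tooling. The sketch below is an illustration only: the `VirtSupport` type, the `SUPPORT` table, and the `supported_modes` helper are hypothetical names, and the entries are transcribed from this summary rather than from an official AMD compatibility matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtSupport:
    """One (GPU, hypervisor, mode) combination mentioned in the summary."""
    gpu: str
    hypervisor: str
    mode: str


# Transcribed from the summary text; NOT an official compatibility matrix.
SUPPORT = (
    VirtSupport("MI300X", "ESXi 8.0 Update 3", "passthrough"),
    VirtSupport("MI350X", "ESXi 9.1", "virtualization"),
    VirtSupport("MI355X", "ESXi 9.1", "virtualization"),
    VirtSupport("MI300X", "KVM", "multi-vf"),
)


def supported_modes(gpu: str, hypervisor: str) -> list[str]:
    """Return the modes the summary lists for this GPU and hypervisor.

    An empty list only means the summary does not mention the combination;
    it does not prove the combination is unsupported.
    """
    return [s.mode for s in SUPPORT if s.gpu == gpu and s.hypervisor == hypervisor]


if __name__ == "__main__":
    print(supported_modes("MI300X", "KVM"))
    print(supported_modes("MI355X", "ESXi 9.1"))
```

Always confirm the combination against AMD's release notes before relying on it, since this table reflects only what the summary states.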
Key takeaway
For MLOps engineers deploying AI workloads, ROCm 7.13 broadens the options for managing AMD GPU infrastructure. You can now run MI350-series accelerators with expanded bare-metal and Kubernetes support, or use the enhanced GPU virtualization on KVM and VMware ESXi to share GPUs across workloads. Consider adopting the new modular packaging for streamlined deployments, and explore ROCm Optiq 0.4.0 for deeper performance analysis. Together, these changes make it easier to scale AI development from prototyping to production.
Key insights
ROCm 7.13 expands AMD GPU hardware support, virtualization, and developer tools for AI workloads.
Principles
- Open-source development accelerates adoption and improves stability.
- Modular packaging simplifies core and specialized installations.
- Unified tools enable prototyping on consumer GPUs.
Method
TheRock, AMD's automated build system, packages ROCm as a Core SDK plus optional domain-specific extensions, and uses a pure CMake build to produce stable nightly builds.
In practice
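- Configure the MI350P in SPX mode (the whole GPU as a single partition) for maximum throughput, or in CPX mode (several smaller partitions) to run multiple workloads side by side.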
- Use Multi-VF on MI300X for up to 8 isolated virtual GPUs.
- Prototype AI apps on Radeon laptops before datacenter deployment.
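The first two recommendations above can be expressed as a tiny planning helper. This is a minimal sketch under stated assumptions: `choose_partition_mode` and `validate_vf_count` are hypothetical functions, not part of any ROCm tool, and the rule "one workload means SPX, more than one means CPX" is a simplification of the guidance above. The code only returns a recommendation; applying a partition mode or creating virtual functions is done with AMD's own tooling.

```python
# Upper bound stated in the article for Multi-VF on MI300X.
MAX_MI300X_VFS = 8


def choose_partition_mode(concurrent_workloads: int) -> str:
    """Suggest a partition mode from the number of workloads sharing a GPU.

    Simplified rule (an assumption): a single workload gets the whole GPU
    (SPX); several workloads get separate partitions (CPX).
    """
    if concurrent_workloads < 1:
        raise ValueError("at least one workload is required")
    return "SPX" if concurrent_workloads == 1 else "CPX"


def validate_vf_count(requested: int) -> int:
    """Check a requested virtual GPU count against the stated limit of 8."""
    if not 1 <= requested <= MAX_MI300X_VFS:
        raise ValueError(f"VF count must be between 1 and {MAX_MI300X_VFS}")
    return requested


if __name__ == "__main__":
    print(choose_partition_mode(1))
    print(choose_partition_mode(4))
    print(validate_vf_count(8))
```

A check like this fits naturally in a deployment pipeline, where it can reject an invalid request before any GPU configuration is attempted.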
Topics
- ROCm 7.13
- AMD Instinct Accelerators
- GPU Virtualization
- AI Development Tools
- Performance Profiling
- Modular Packaging
- MLOps
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Engineer, Machine Learning Engineer, MLOps Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by AMD ROCm Blogs.