The NVIDIA Rubin Platform: Six New Chips, One AI Supercomputer
Summary
NVIDIA has unveiled the Vera Rubin AI supercomputer architecture, a system comprising six distinct chips engineered to operate as one. The architecture features the custom-designed Vera CPU, which doubles the performance of its predecessor, and the Rubin GPU, co-designed with the CPU for bidirectional, coherent, low-latency data sharing. The Vera Rubin compute board integrates 17,000 components, including one Vera CPU and two Rubin GPUs, and delivers 100 petaflops of AI performance, a five-fold increase over the previous generation. Data transfer is handled by ConnectX-9, which provides 1.6 terabits per second of bandwidth per GPU, and by the BlueField-4 DPU, which offloads storage and security tasks. The compute tray is redesigned to eliminate cables, hoses, and fans, and houses a BlueField-4 DPU, eight ConnectX-9s, two Vera CPUs, and four Rubin GPUs. The NVLink switch, now in its sixth generation, connects 18 compute nodes and scales up to 72 Rubin GPUs, while Spectrum-X Ethernet Photonics enables scaling out to thousands of racks. The first Vera Rubin NVL72 rack, the culmination of 15,000 engineer-years of development, contains six breakthrough chips, 18 compute trays, nine NVLink switch trays, and 220 trillion transistors, and weighs nearly two tons.
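The per-board, per-tray, and per-rack figures in the summary can be cross-checked with a little arithmetic. The Python sketch below is a back-of-the-envelope calculation, not an official specification: the constant and function names are hypothetical, and it assumes that each compute tray's two CPUs and four GPUs correspond to two compute boards and that per-board performance adds up linearly across the rack.

```python
# Figures quoted in the summary above.
GPUS_PER_BOARD = 2             # two Rubin GPUs per compute board
CPUS_PER_BOARD = 1             # one Vera CPU per compute board
PFLOPS_PER_BOARD = 100         # AI petaflops per compute board
TRAYS_PER_RACK = 18            # compute trays in one rack
SCALE_OUT_TBPS_PER_GPU = 1.6   # ConnectX-9 bandwidth per GPU, in Tb/s

# Assumption (not stated in the summary): a tray's two CPUs and four GPUs
# correspond to two compute boards.
BOARDS_PER_TRAY = 2


def rack_totals() -> dict:
    """Derive rack-level totals, assuming simple linear scaling."""
    boards = BOARDS_PER_TRAY * TRAYS_PER_RACK
    gpus = boards * GPUS_PER_BOARD
    return {
        "gpus": gpus,
        "cpus": boards * CPUS_PER_BOARD,
        "ai_petaflops": boards * PFLOPS_PER_BOARD,
        "scale_out_tbps": round(gpus * SCALE_OUT_TBPS_PER_GPU, 1),
    }


if __name__ == "__main__":
    for name, value in rack_totals().items():
        print(f"{name}: {value}")
```

Under these assumptions the sketch gives 72 GPUs, which matches the 72 Rubin GPUs quoted for the rack, along with 36 CPUs and 3,600 petaflops of aggregate AI performance. The aggregate figures are derived here rather than quoted from the source.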
Key takeaway
For CTOs and VPs of Engineering evaluating next-generation AI infrastructure, the Vera Rubin architecture represents a significant leap in compute density and interconnectivity. Your teams should consider its integrated CPU/GPU/DPU design and NVLink capabilities for demanding AI workloads, especially those requiring massive scale-out and high data throughput. The system's design, with its cable-free trays and advanced networking, could simplify deployment and maintenance for large-scale AI factories.
Key insights
The Vera Rubin architecture integrates co-designed CPUs, GPUs, and DPUs, linked by NVLink and Spectrum-X Ethernet, to scale AI supercomputing from a single compute board to thousands of racks.
Principles
- Co-design CPUs and GPUs for optimal data sharing.
- Offload non-compute tasks to DPUs.
- Eliminate cables for improved reliability and cooling.
In practice
- Use NVLink for high-bandwidth GPU interconnects.
- Deploy Spectrum-X Ethernet for large-scale AI factories.
- Integrate BlueField DPUs to offload security and storage tasks.
Topics
- Vera Rubin Architecture
- AI Processors
- High-Speed Interconnects
- Data Processing Units
Best for: CTO, VP of Engineering/Data, AI Architect, MLOps Engineer, Director of AI/ML
Editorial summary, takeaway, and curation by AIssential. Original article published by NVIDIA.