Unbox one of NVIDIA's first co-packaged optics switches with us. See why we bet on CPO early.

· Source: The Lambda Deep Learning Blog · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure, Robotics & Autonomous Systems · Depth: Advanced, medium

Summary

NVIDIA's Quantum-X InfiniBand Photonics Q3450-LD switch, featuring co-packaged optics (CPO), represents a significant shift in large GPU cluster networking, particularly for 800G deployments at NVIDIA GB300 NVL72 scale. This 4U, liquid-cooled switch, powered by a 48V DC busbar and built around an NVIDIA Quantum-X800 ASIC, offers 144 × 800G InfiniBand ports and 115.2 Tb/s of non-blocking switching capacity. CPO reduces networking power consumption, freeing that power for GPUs; for example, a 41,472-GPU cluster could gain the power equivalent of 3,137 GPUs. It also improves reliability: a 128,000-GPU data center would no longer need 655,000 discrete pluggable transceivers, each of which is a potential failure point. The Q3450-LD places optical conversion directly next to the switch ASIC, shortening electrical paths from centimeters to micrometers and cutting signal loss from 20 dB to 4 dB, which removes the need for power-intensive DSPs. Early access to engineering samples allows pre-production planning for rack design, cooling, and fiber management.
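The figures in the summary can be sanity-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above. The derived ratios (percentage gain, transceivers per GPU, and the linear power ratio behind the dB change) are our own back-of-the-envelope calculations, not figures from the original article. The per-GPU ratio assumes transceivers are spread evenly across GPUs.

```python
# Inputs: figures quoted in the summary above.
PORTS = 144
PORT_SPEED_GBPS = 800
CLUSTER_GPUS = 41_472
GPU_EQUIVALENTS_GAINED = 3_137
DATACENTER_GPUS = 128_000
TRANSCEIVERS_REMOVED = 655_000
LOSS_PLUGGABLE_DB = 20
LOSS_CPO_DB = 4

# Aggregate switching capacity: ports x per-port speed, in Tb/s.
capacity_tbps = PORTS * PORT_SPEED_GBPS / 1000

# Power freed for GPUs, expressed as a share of the cluster size.
gpu_gain_pct = 100 * GPU_EQUIVALENTS_GAINED / CLUSTER_GPUS

# Average pluggable transceivers per GPU that CPO makes unnecessary
# (assumption: transceivers are spread evenly across GPUs).
transceivers_per_gpu = TRANSCEIVERS_REMOVED / DATACENTER_GPUS

# A difference in dB maps to a linear power ratio of 10^(dB / 10).
loss_delta_db = LOSS_PLUGGABLE_DB - LOSS_CPO_DB
loss_power_ratio = 10 ** (loss_delta_db / 10)

print(f"Switching capacity:   {capacity_tbps:.1f} Tb/s")
print(f"GPU-equivalent gain:  {gpu_gain_pct:.1f}% of the cluster")
print(f"Transceivers per GPU: {transceivers_per_gpu:.2f}")
print(f"Loss reduction:       {loss_delta_db} dB (about {loss_power_ratio:.0f}x in power terms)")
```

The port count and port speed reproduce the quoted 115.2 Tb/s. The other ratios only put the headline numbers on a per-cluster and per-GPU scale.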

Key takeaway

AI architects and MLOps engineers designing large-scale GPU clusters should consider adopting co-packaged optics (CPO) switches such as the NVIDIA Q3450-LD. CPO lets you fit more GPUs within an existing power envelope and improves network reliability for agentic workloads. Plan cooling, power, and fiber management early so that CPO integrates cleanly and your cluster's token throughput and uptime benefit.

Key insights

Co-packaged optics (CPO) significantly improves power efficiency and reliability in large-scale GPU cluster networking.

Method

Installing CPO switches requires upfront planning for rack fit, busbar alignment, liquid-cooling connections, pressure checks, and fiber routing, with vendor and deployment teams working together throughout.

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Architect, MLOps Engineer, AI Hardware Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by The Lambda Deep Learning Blog.