Startup Boosts Scale-Up to 1000+ GPUs in a Single Domain

2026-05-27 · Source: Big Data & AI News - EE Times · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure · Depth: Advanced, short

Summary

Delos Data, a startup, is developing a cluster management software stack and a new server design to enable GPU scale-up domains of more than 1,000 GPUs for AI inference workloads. Its Nonstop AI platform offers flexible topology options, a disaggregated server design, and 72 ports of 200 Gb/s each per server, exposed through OSFP connectors. The approach aims to cut cost and power per token by improving GPU utilization, which matters because distributed inference is sensitive to latency at the nanosecond scale. The system is intended to support very large scale-up domains, potentially up to 10,000 GPUs, and includes the Mosaic software stack for graceful failure handling and data rerouting. Broader availability is planned for the fourth quarter of 2026.
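As a rough illustration of what the port figures in the summary add up to, the arithmetic can be sketched as follows. This is only a back-of-envelope calculation: the function name is hypothetical, and it assumes raw line rate with no allowance for encoding or protocol overhead, which the summary does not describe.

```python
# Figures taken from the summary: 72 ports per server at 200 Gb/s each.
PORTS_PER_SERVER = 72
GBPS_PER_PORT = 200


def aggregate_bandwidth_gbps(ports: int = PORTS_PER_SERVER,
                             gbps_per_port: int = GBPS_PER_PORT) -> int:
    """Sum of raw port line rates for one server, in Gb/s (assumes no overhead)."""
    return ports * gbps_per_port


total_gbps = aggregate_bandwidth_gbps()
# Convert to terabits per second and terabytes per second for readability.
print(f"{total_gbps} Gb/s = {total_gbps / 1000} Tb/s = {total_gbps / 8 / 1000} TB/s")
# prints "14400 Gb/s = 14.4 Tb/s = 1.8 TB/s"
```

Under those assumptions, each server would expose about 14.4 Tb/s of raw port bandwidth.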

Key takeaway

For MLOps engineers optimizing large-scale AI inference, Delos Data's disaggregated architecture offers a path to GPU scale-up domains significantly larger than current NVLink domains allow. Evaluate whether flexible topologies and software-managed resilience could reduce your operating costs and improve GPU utilization for latency-sensitive workloads, and factor the planned Q4 2026 broader availability into future infrastructure planning.

Key insights

Disaggregated server design and software enable resilient, large-scale GPU inference clusters with flexible topologies.

Principles

Distributed inference is sensitive to nanosecond-scale latency and demands always-on reliability.
Modular architectures enhance flexibility and physical disaggregation.
Software is crucial for managing resilience in large-scale networks.

In practice

Design for 1000+ GPU scale-up domains.
Use OSFP-based cabling for flexible interconnects.
Implement software for graceful failure handling.
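To make the last point concrete, here is a minimal sketch of rerouting around a failed link. It is not Delos Data's Mosaic stack, whose internals the article summary does not describe. The four-node ring, the node names, and the `shortest_path` helper are all hypothetical, and plain breadth-first search stands in for whatever routing logic a real system would use.

```python
from collections import deque


def shortest_path(links, src, dst):
    """Breadth-first search over undirected links; returns a node list or None."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    previous = {src: None}  # also serves as the visited set
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the predecessor chain back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None  # destination unreachable


# Hypothetical four-node ring: gpu0 - gpu1 - gpu2 - gpu3 - gpu0
links = {("gpu0", "gpu1"), ("gpu1", "gpu2"), ("gpu2", "gpu3"), ("gpu3", "gpu0")}

healthy_route = shortest_path(links, "gpu0", "gpu1")
# Simulate a failure of the direct link and recompute the route.
degraded_route = shortest_path(links - {("gpu0", "gpu1")}, "gpu0", "gpu1")

print(healthy_route)   # ['gpu0', 'gpu1']
print(degraded_route)  # ['gpu0', 'gpu3', 'gpu2', 'gpu1']
```

The sketch shows why redundant paths matter: traffic still arrives after the failure, but over a longer route, so software that reroutes also has to account for the added hops.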

Topics

GPU Scale-Up
AI Inference
Cluster Management
Disaggregated Systems
Network Topology
Data Center Interconnects

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Architect, MLOps Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Big Data & AI News - EE Times.