NVIDIA Partners With Microsoft on Unified Stack for Agentic AI Deployment, From Windows Devices to Cloud to Local

· Source: NVIDIA Blog · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure, Software Development & Engineering · Depth: Expert, extended

Summary

NVIDIA and Microsoft have expanded their partnership to deliver a stack for agentic AI deployment across Windows devices, Azure cloud, and local environments. Key announcements include RTX Spark PCs, offering 1 petaflop of AI performance and up to 128GB of unified memory, and DGX Station for Windows, a deskside AI supercomputer with 20 petaflops of FP4 performance for 1-trillion-parameter models. Both run NVIDIA OpenShell as a secure agent runtime. The collaboration also accelerates Microsoft Fabric Data Warehouse with NVIDIA GPUs, achieving up to 7x faster SQL execution. NVIDIA's open models, such as Nemotron 3 Ultra and Cosmos 3, are now available on Microsoft Foundry to support enterprise-scale agentic workflows. The partnership further extends to physical AI, secure agent development in GitHub Copilot, and the deployment of NVIDIA Grace Blackwell and Vera Rubin systems in Azure's Fairwater AI factories, which is intended to raise inference throughput and reduce costs. Microsoft also introduced new MAI models and Frontier Tuning for custom enterprise AI.

Key takeaway

For AI architects evaluating agentic AI deployments, this partnership offers a full-stack option spanning edge to cloud. Consider RTX Spark or DGX Station for Windows for local agent development, and GPU-accelerated Microsoft Fabric for data-intensive workflows. Use NVIDIA OpenShell as a secure agent runtime, and explore Frontier Tuning to build custom enterprise AI models that improve continuously on your own tasks.

Key insights

NVIDIA and Microsoft are building a full-stack, secure, and scalable ecosystem for agentic AI, from edge devices to cloud.

Method

Frontier Tuning enables organizations to build "hill-climbing machines" by defining private evals, contexts, and tools, and by training models on their own data for continuous, task-specific agent improvement.
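
The "hill-climbing" idea can be illustrated with a generic loop: propose a change, score it against a private eval, and keep it only if the score improves. The sketch below is a toy illustration in Python and does not use or represent the Frontier Tuning API. The names hill_climb, private_eval, and small_step are hypothetical, and a single number stands in for whatever an organization actually tunes (prompts, tools, or model weights).

```python
import random
from typing import Callable, List, Tuple


def hill_climb(
    evaluate: Callable[[float], float],
    propose: Callable[[float, random.Random], float],
    start: float,
    rounds: int = 200,
    seed: int = 0,
) -> Tuple[float, List[float]]:
    """Greedy hill climbing: accept a candidate only if its eval score improves."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    best = start
    best_score = evaluate(best)
    history = [best_score]  # best score seen after each round
    for _ in range(rounds):
        candidate = propose(best, rng)
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
        history.append(best_score)
    return best, history


# Hypothetical stand-in for a private eval: higher is better, peak at x = 3.0.
def private_eval(x: float) -> float:
    return -((x - 3.0) ** 2)


# Hypothetical stand-in for "produce a slightly different candidate".
def small_step(x: float, rng: random.Random) -> float:
    return x + rng.uniform(-0.5, 0.5)


best, history = hill_climb(private_eval, small_step, start=0.0)
print(f"start score {history[0]:.3f}, final score {history[-1]:.3f}")
```

Because a candidate is accepted only when it scores higher, the recorded best score never decreases. The quality of the result therefore depends entirely on how well the eval reflects the real task, which is why the method starts from private, task-specific evals.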

Best for: CTO, VP of Engineering/Data, MLOps Engineer, AI Engineer, AI Architect, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by NVIDIA Blog.