Nvidia Backs DeepInfra’s $107 Million Series B and the Investment Is About More Than One Inference Startup - Startup Fortune

2026-05-04 · Source: Startup Fortune via Google News · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation, Cloud Computing & IT Infrastructure) · Depth: Intermediate, medium

Summary

DeepInfra, an inference platform providing API access to open-weight models at competitive prices, secured a $107 million Series B funding round, bringing its total funding to over $133 million. Nvidia participated in this round, continuing its strategic pattern of investing in AI companies such as CoreWeave and Anthropic. DeepInfra has grown its processing volume 8,000-fold since 2022 by hosting models on Nvidia hardware that it owns, significantly undercutting hyperscaler rates. The investment highlights the growing strategic importance of the AI inference layer and Nvidia's expanding influence across the AI stack, from chip design to cloud infrastructure and inference platforms. DeepInfra's growth also points to a stratifying market: independent clouds attract cost-sensitive developers, while hyperscalers target enterprise clients with compliance and support.

Key takeaway

For AI architects evaluating inference solutions, DeepInfra's funding and Nvidia's involvement signal a shift. Choosing an inference provider now means weighing not just cost and performance, but also the web of hardware supplier investments behind each provider. When a provider's GPU supplier is also one of its investors, that supplier has a financial interest in the provider's product roadmap and hardware procurement, which can affect your long-term costs and strategic flexibility.

Key insights

Nvidia's investment in DeepInfra highlights its strategic vertical integration across the AI stack, influencing inference infrastructure.

Principles

Owning hardware can reduce inference costs.
Price and model variety drive developer adoption.
Nvidia investments deepen hardware dependency.

Method

DeepInfra hosts open-weight models on Nvidia hardware that it owns and serves them via API at prices 5 to 10 times lower than those of major providers, targeting cost-sensitive developers and enterprises.
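
To make the 5 to 10 times price gap concrete, here is a minimal Python sketch of the arithmetic. The baseline price and monthly token volume are hypothetical placeholders chosen for illustration, not figures from the article or from any provider's price list.

```python
def monthly_cost(tokens_per_month: int, price_per_million_tokens: float) -> float:
    """Return the monthly bill in USD for a given token volume and unit price."""
    return tokens_per_month / 1_000_000 * price_per_million_tokens


# Hypothetical inputs: neither number comes from the article.
BASELINE_PRICE = 10.0        # USD per million tokens at a "major provider"
TOKENS = 2_000_000_000       # tokens processed per month

baseline = monthly_cost(TOKENS, BASELINE_PRICE)
print(f"baseline: ${baseline:,.2f}/month")

# The article's claimed range: prices 5 to 10 times lower.
for factor in (5, 10):
    discounted = monthly_cost(TOKENS, BASELINE_PRICE / factor)
    print(f"{factor}x lower: ${discounted:,.2f}/month "
          f"(saves ${baseline - discounted:,.2f})")
```

Under these assumed inputs the saving scales linearly with volume, so the gap matters most for high-throughput workloads.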

In practice

Consider independent clouds for lower inference costs.
Evaluate platform concentration risk.
Understand investor influence on infrastructure.
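
One rough way to put a number on concentration risk is a Herfindahl-Hirschman index (HHI) over inference spend, computed once per provider and once per underlying GPU vendor. This is a sketch using a standard concentration measure, not a method described in the article; the provider names, vendor labels, and spend figures are all hypothetical.

```python
from collections import defaultdict


def hhi(spend_by_group: dict[str, float]) -> float:
    """Herfindahl-Hirschman index: sum of squared spend shares (1.0 = fully concentrated)."""
    total = sum(spend_by_group.values())
    if total <= 0:
        raise ValueError("total spend must be positive")
    return sum((value / total) ** 2 for value in spend_by_group.values())


def group_spend(rows: list[tuple[str, str, float]], key_index: int) -> dict[str, float]:
    """Sum spend (third field) grouped by provider (index 0) or GPU vendor (index 1)."""
    grouped: dict[str, float] = defaultdict(float)
    for row in rows:
        grouped[row[key_index]] += row[2]
    return dict(grouped)


# Hypothetical monthly spend: (provider, GPU vendor behind it, USD).
SPEND = [
    ("independent_cloud_a", "vendor_x", 70_000.0),
    ("hyperscaler_b", "vendor_x", 20_000.0),
    ("hyperscaler_c", "vendor_y", 10_000.0),
]

by_provider = hhi(group_spend(SPEND, 0))
by_vendor = hhi(group_spend(SPEND, 1))
print(f"HHI by provider:   {by_provider:.2f}")
print(f"HHI by GPU vendor: {by_vendor:.2f}")
```

With these made-up numbers, spend looks more concentrated by GPU vendor than by provider, which is the kind of hidden dependency the takeaway warns about: spreading workloads across providers does not reduce exposure if they share a hardware supplier.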

Topics

DeepInfra Funding
NVIDIA Investment Strategy
AI Inference Platforms
Open-Weight Models
Cloud AI Infrastructure

Best for: CTO, AI Architect, MLOps Engineer, Investor, Director of AI/ML, VP of Engineering/Data

Editorial summary, takeaway, and curation by AIssential. Original article published by Startup Fortune, via Google News.