Open Model Inference at Scale on Foundry: What’s New with Fireworks AI

2026-05-14 · Source: Microsoft Foundry Blog articles · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure, Software Development & Engineering · Depth: Intermediate, short

Summary

Microsoft Foundry has expanded its Fireworks AI offerings with two new models, DeepSeek V4 Pro and Kimi K2.6, along with enhanced provisioned throughput units (PTUs) in the EU Data Zone and new PTU support in the US Data Zone. Launched two months ago, Fireworks AI on Foundry provides low-latency, high-throughput open model inference directly within Azure, so teams can get frontier open model performance without rearchitecting infrastructure or managing a separate compliance process. The platform pairs Fireworks' inference engine with Azure's enterprise control plane, which supplies unified access controls, audit logs, content filtering, and compliance tooling. With this expansion, US-based customers, particularly those in regulated industries, can get predictable, steady-state performance while keeping data resident in US Azure regions, and can run the new models on either serverless or PTU deployments.

Key takeaway

For CTOs and VPs of Engineering evaluating open model adoption, this update means you can deploy frontier models like DeepSeek V4 Pro and Kimi K2.6 with predictable performance while meeting strict data residency requirements, all within your existing Azure ecosystem. Because Azure's built-in enterprise governance and PTU support in the US Data Zone are already in place, your teams can move from experimentation to production faster and avoid the overhead of new security reviews or fragmented tooling.

Key insights

Microsoft Foundry expands its Fireworks AI offerings with new models and data zone support for enterprise-grade open model inference.

Principles

Combine performance with enterprise controls.
Data residency is critical for regulated industries.

Method

Integrate Fireworks AI's inference engine with Azure's enterprise control plane to deliver high-performance open models under existing governance, with PTUs providing predictable throughput and data zone support keeping data within the required region.

In practice

Benchmark DeepSeek V4 Pro for code generation.
Use Kimi K2.6 for complex reasoning tasks.
Request PTU quota for consistent production throughput.
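
The benchmarking step above can be sketched as a small latency harness. This is a minimal illustration, not a Foundry or Fireworks API: `benchmark` and `fake_model` are hypothetical names, and `fake_model` is a stand-in that only sleeps briefly. In a real test you would replace it with a function that sends the prompt to your own deployment and returns the completion.

```python
import math
import statistics
import time
from typing import Callable, Dict, Sequence


def benchmark(
    call_model: Callable[[str], str],
    prompts: Sequence[str],
    runs_per_prompt: int = 3,
) -> Dict[str, float]:
    """Time every call and summarize latency in milliseconds."""
    if not prompts or runs_per_prompt < 1:
        raise ValueError("need at least one prompt and one run per prompt")

    latencies_ms = []
    for prompt in prompts:
        for _ in range(runs_per_prompt):
            start = time.perf_counter()
            call_model(prompt)
            latencies_ms.append((time.perf_counter() - start) * 1000.0)

    latencies_ms.sort()
    # Nearest-rank 95th percentile over the sorted samples.
    p95_index = max(0, math.ceil(0.95 * len(latencies_ms)) - 1)
    return {
        "calls": len(latencies_ms),
        "p50_ms": statistics.median(latencies_ms),
        "p95_ms": latencies_ms[p95_index],
        "max_ms": latencies_ms[-1],
    }


def fake_model(prompt: str) -> str:
    # Hypothetical stand-in for a real model call (assumption):
    # it sleeps for about a millisecond and returns a fixed completion.
    time.sleep(0.001)
    return "def add(a, b):\n    return a + b"


if __name__ == "__main__":
    code_prompts = [
        "Write a Python function that adds two numbers.",
        "Write a Python function that reverses a string.",
    ]
    print(benchmark(fake_model, code_prompts, runs_per_prompt=5))
```

Running the same prompt set against a serverless deployment and a PTU deployment, then comparing the p50 and p95 figures, gives a rough view of how steady the latency is under each option. Output quality for code generation would need a separate evaluation.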

Topics

Fireworks AI
Microsoft Foundry
Open Model Inference
DeepSeek V4 Pro
Kimi K2.6

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Engineer, MLOps Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by Microsoft Foundry Blog articles.