Open Model Inference at Scale on Foundry: What’s New with Fireworks AI
Summary
Microsoft Foundry has expanded its Fireworks AI offering with two new models, DeepSeek V4 Pro and Kimi K2.6, along with enhanced provisioned throughput units (PTUs) in the EU Data Zone and new PTU support in the US Data Zone. Fireworks AI on Foundry launched two months ago to provide low-latency, high-throughput open model inference directly within Azure, so teams can get frontier open model performance without rearchitecting infrastructure or managing a separate compliance process. The platform pairs Fireworks' inference engine with Azure's enterprise control plane, which supplies unified access controls, audit logs, content filtering, and compliance tooling. With this expansion, US-based customers, particularly those in regulated industries, can run the new models on serverless or PTU deployments, keep data resident in US Azure regions, and get predictable, steady-state performance.
Key takeaway
For CTOs and VPs of Engineering evaluating open model adoption, this update means you can deploy frontier models like DeepSeek V4 Pro and Kimi K2.6 inside your existing Azure ecosystem with predictable performance while meeting strict data residency requirements. Because Azure's built-in enterprise governance and PTU support in the US Data Zone are already in place, your teams can move from experimentation to production faster and avoid the overhead of new security reviews or fragmented tooling.
Key insights
Microsoft Foundry expands its Fireworks AI offerings with new models and data zone support for enterprise-grade open model inference.
Principles
- Combine performance with enterprise controls.
- Data residency is critical for regulated industries.
Method
Integrate Fireworks AI's inference engine with Azure's enterprise control plane to deliver high-performance open models, using PTUs for predictable throughput and data zone support for compliance and data residency.
In practice
- Benchmark DeepSeek V4 Pro for code generation.
- Use Kimi K2.6 for complex reasoning tasks.
- Request PTU quota for consistent production throughput.
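As a starting point for the benchmarking step above, here is a minimal latency harness sketch in Python. It is not taken from the original article and does not use any Foundry or Fireworks API: `benchmark`, `call_model`, and `fake_model` are hypothetical names introduced for illustration. In a real test, `call_model` would be replaced by whatever client call your deployment actually exposes, and the prompts would come from your own code generation workload.

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def benchmark(
    call_model: Callable[[str], str],
    prompts: List[str],
    runs_per_prompt: int = 3,
) -> Dict[str, float]:
    """Time repeated calls to a model and summarize wall-clock latency.

    `call_model` is an assumption: any function that takes a prompt and
    returns the model's text. It is not part of any real SDK.
    """
    latencies: List[float] = []
    for prompt in prompts:
        for _ in range(runs_per_prompt):
            start = time.perf_counter()
            call_model(prompt)
            latencies.append(time.perf_counter() - start)

    latencies.sort()
    # Nearest-rank 95th percentile over the sorted samples.
    p95_index = max(0, math.ceil(0.95 * len(latencies)) - 1)
    return {
        "count": float(len(latencies)),
        "mean_s": statistics.mean(latencies),
        "p50_s": statistics.median(latencies),
        "p95_s": latencies[p95_index],
        "max_s": latencies[-1],
    }


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real inference call."""
    time.sleep(0.001)  # simulate a small amount of work
    return "def add(a, b):\n    return a + b"


if __name__ == "__main__":
    summary = benchmark(fake_model, ["Write an add function."], runs_per_prompt=5)
    for name, value in summary.items():
        print(f"{name}: {value:.4f}")
```

Tail latency (p95 and max) is usually the more useful signal than the mean when deciding whether a workload needs reserved PTU capacity or can stay on serverless, since steady-state predictability is the stated reason for choosing PTUs.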
Topics
- Fireworks AI
- Microsoft Foundry
- Open Model Inference
- DeepSeek V4 Pro
- Kimi K2.6
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Engineer, MLOps Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published on the Microsoft Foundry Blog.