Now in Foundry: IBM Granite 4.1, NVIDIA Nemotron Nano Omni, and Qwen3.6-35B-A3B

2026-05-06 · Source: Microsoft Foundry Blog articles · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Advanced, medium

Summary

Microsoft Foundry has integrated three new major model families: IBM Granite 4.1, NVIDIA Nemotron-3-Nano-Omni-30B-A3B-Reasoning, and Qwen3.6-35B-A3B. IBM's Granite 4.1 is a family of 10 models, including 3B, 8B, and 30B LLMs, a safety model, a vision-language model for document extraction, and a multilingual speech recognition model. NVIDIA's Nemotron-3-Nano-Omni-30B-A3B-Reasoning is a 31B Mamba2-Transformer Hybrid Mixture-of-Experts (MoE) model, activating only 3B parameters per pass, offering multimodal capabilities for video, audio, image, and text with a 256,000-token context. Qwen3.6-35B-A3B is a 35B MoE model with 3B active parameters, designed for agentic coding, featuring thinking preservation across conversation turns and a context window extensible to 1 million tokens.

Key takeaway

For AI Architects and NLP Engineers evaluating new model deployments, these additions to Microsoft Foundry provide specialized and general-purpose options. Consider IBM Granite 4.1 for comprehensive enterprise document processing and safety, NVIDIA Nemotron-3-Nano-Omni for efficient multimodal media analysis, and Qwen3.6-35B-A3B for advanced agentic coding workflows requiring long context and preserved reasoning. Your choice should align with specific modality requirements and desired inference cost efficiencies.

Key insights

New models in Microsoft Foundry offer diverse capabilities from multimodal reasoning to agentic coding and document intelligence.

Principles

Hybrid architectures enhance efficiency.
Multimodality simplifies complex workflows.
Context preservation improves iterative tasks.

Method

Deploy models from the Hugging Face collection in Microsoft Foundry or directly from the Hugging Face Hub using one-click deployment to managed endpoints for secure, scalable inference.

In practice

Use Granite 4.1 for multilingual RAG and tool calling.
Employ Nemotron-3-Nano-Omni for meeting intelligence.
Apply Qwen3.6-35B-A3B for repository-level code changes.

Topics

Microsoft Foundry
IBM Granite 4.1
NVIDIA Nemotron Nano Omni
Qwen3.6-35B-A3B
Multimodal AI Models

Best for: AI Architect, NLP Engineer, Computer Vision Engineer, AI Engineer, Machine Learning Engineer, MLOps Engineer

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by Microsoft Foundry Blog articles.