Run NVIDIA Nemotron 3 Super on Amazon Bedrock

· Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure, Software Development & Engineering · Depth: Intermediate, medium

Summary

NVIDIA Nemotron 3 Super, a 120B parameter hybrid Mixture of Experts (MoE) model with 12B active parameters, is now available as a fully managed, serverless offering on Amazon Bedrock. This model features a Hybrid Transformer-Mamba architecture and supports context lengths up to 256K tokens, processing text input to text output in English, French, German, Italian, Japanese, Spanish, and Chinese. It achieves up to 5x higher throughput efficiency and 2x higher accuracy than its predecessor, excelling in reasoning and agentic tasks across benchmarks like AIME 2025 and SWE Bench. Key architectural innovations include Latent MoE, which allows 4x more experts at the same inference cost, and Multi-token Prediction (MTP) for increased throughput in long reasoning sequences. Nemotron 3 Super is designed for multi-agent applications and specialized agentic AI systems, with open weights, datasets, and recipes for customization.

Key takeaway

For AI Engineers and Machine Learning Engineers building agentic AI applications, Nemotron 3 Super on Amazon Bedrock provides a powerful, managed solution. Its advanced MoE architecture and multi-token prediction capabilities can significantly improve the accuracy and efficiency of your reasoning and multi-agent workflows. You should explore its use for complex tasks like distributed system design, code generation, and specialized industry applications, leveraging the serverless infrastructure to reduce operational overhead.

Key insights

Nemotron 3 Super offers high efficiency and accuracy for agentic AI via a hybrid MoE architecture.

Principles

Method

Access Nemotron 3 Super via Amazon Bedrock console, AWS CLI, or AWS SDKs (Boto3, OpenAI-compatible API) using model ID "nvidia.nemotron-super-3-120b" for generative AI applications.

In practice

Topics

Best for: AI Engineer, Machine Learning Engineer, Software Engineer

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.