NVIDIA Releases Nemotron-Cascade 2: An Open 30B MoE with 3B Active Parameters, Delivering Better Reasoning and Strong Agentic Capabilities

Source: Machine Learning ML & Generative AI News · Field: Technology & Digital, Artificial Intelligence & Machine Learning · Depth: Advanced, quick

Summary

NVIDIA has released Nemotron-Cascade 2, an open-weight Mixture-of-Experts (MoE) model with 30B total parameters, of which 3B are active, designed to raise "intelligence density." It is the second open-weight model to reach gold-medal-level performance on IMO 2025 and IOI 2025. Its core innovation is the integration of Cascade RL with Multi-domain On-Policy Distillation (MOPD), which supplies a dense, token-level advantage signal and improves sample efficiency relative to sequence-level reward methods such as GRPO. Nemotron-Cascade 2 performs strongly in math, coding, and instruction following, outperforming Qwen3.5-35B-A3B on AIME 2025 and ArenaHard v2, at the cost of weaker performance on knowledge-intensive tasks. It also offers a 1M context window and a "Thinking Mode" for complex reasoning and agentic workflows.

Key takeaway

For AI Architects and Research Scientists evaluating open-weight models for complex reasoning, Nemotron-Cascade 2 is a compelling option because of its strong math, coding, and agentic performance. Consider its 1M context window and "Thinking Mode" for applications that require deep logical processing, but keep in mind its weaker performance in knowledge-intensive domains.

Key insights

NVIDIA's Nemotron-Cascade 2 MoE model excels in reasoning and agentic tasks via Cascade RL and MOPD.

Principles

Method

Nemotron-Cascade 2 integrates Cascade RL with Multi-domain On-Policy Distillation (MOPD). MOPD provides a dense, token-level advantage signal, which improves sample efficiency and helps recover performance that regresses during training.
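To make the contrast between a sequence-level reward and a dense token-level signal concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not NVIDIA's implementation: the summary does not give the MOPD objective, so the per-token signal below uses a common on-policy distillation form (teacher log-probability minus student log-probability on tokens the student sampled). The function names, the population standard deviation in the GRPO-style normalization, and all numbers are hypothetical.

```python
import math
from statistics import mean, pstdev


def grpo_sequence_advantages(rewards, eps=1e-6):
    """GRPO-style advantage: one scalar per sampled response,
    normalized against the other responses in the same group.
    (Population std is an assumption; implementations vary.)"""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def broadcast_to_tokens(advantage, num_tokens):
    """A sequence-level advantage carries no per-token information:
    every token in the response receives the same value."""
    return [advantage] * num_tokens


def distillation_token_advantages(teacher_logprobs, student_logprobs):
    """Assumed on-policy distillation signal: for each token sampled by
    the student, log p_teacher(token) - log p_student(token).
    Negative values mark tokens the teacher finds less likely."""
    if len(teacher_logprobs) != len(student_logprobs):
        raise ValueError("log-prob sequences must have the same length")
    return [t - s for t, s in zip(teacher_logprobs, student_logprobs)]


if __name__ == "__main__":
    # Four sampled responses to one prompt, scored 1 (correct) or 0 (wrong).
    seq_adv = grpo_sequence_advantages([1.0, 0.0, 0.0, 1.0])
    print("sequence-level:", broadcast_to_tokens(seq_adv[1], 3))

    # Hypothetical per-token log-probs for one 3-token response.
    teacher = [-0.1, -2.0, -0.3]
    student = [-0.2, -0.5, -0.3]
    print("token-level:   ", distillation_token_advantages(teacher, student))
```

In the sequence-level case all three tokens of a wrong answer share one negative value, while the token-level signal singles out the second token as the one the teacher disagrees with most. This is the sense in which a dense advantage can be more sample-efficient: each response yields one learning signal per token instead of one per sequence.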

Best for: AI Scientist, Research Scientist, AI Architect, AI Researcher, AI Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning ML & Generative AI News.