Alibaba's open model Qwen3.6 leads Google's Gemma 4 across agentic coding benchmarks

· Source: The Decoder · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Intermediate, quick

Summary

Alibaba has introduced Qwen3.6-35B-A3B, a new open mixture-of-experts AI model that activates only about 3 billion of its 35 billion parameters at a time, which Alibaba states reduces compute costs without compromising quality. According to Alibaba, the model significantly outperforms its predecessor, Qwen3.5-35B-A3B, on agentic coding tasks. Against Google's open Gemma 4-31B, Qwen3.6-35B-A3B leads across all listed coding benchmarks, scoring 73.4 to 52.0 on SWE-bench Verified and 51.5 to 42.9 on Terminal-Bench 2.0. It also scores higher on reasoning tests such as GPQA (86.0 to 84.3) and AIME26 (92.7 to 89.2), and reportedly matches Claude Sonnet 4.5 on image and video tasks. The model is available through Qwen Studio, through the Alibaba Cloud Model Studio API (as Qwen3.6 Flash), and for download on Hugging Face and ModelScope.
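To put the quoted figures in proportion, here is a minimal Python sketch that computes the share of parameters active at a time and the point margins over Gemma 4-31B. The numbers come from the summary above; the names `SCORES`, `active_fraction`, and `margins` are illustrative and not taken from Alibaba or The Decoder.

```python
# Figures quoted in the summary above; all names here are illustrative.
TOTAL_PARAMS_B = 35.0   # total parameters, in billions
ACTIVE_PARAMS_B = 3.0   # parameters active at a time, in billions

# benchmark -> (Qwen3.6-35B-A3B score, Gemma 4-31B score)
SCORES = {
    "SWE-bench Verified": (73.4, 52.0),
    "Terminal-Bench 2.0": (51.5, 42.9),
    "GPQA": (86.0, 84.3),
    "AIME26": (92.7, 89.2),
}


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters that are active at a time."""
    return active_b / total_b


def margins(scores: dict) -> dict:
    """Return the Qwen-minus-Gemma gap per benchmark, in points."""
    return {name: round(qwen - gemma, 1) for name, (qwen, gemma) in scores.items()}


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"Active share of parameters: {share:.1%}")
    for name, delta in margins(SCORES).items():
        print(f"{name}: +{delta} points")
```

On these figures, under a tenth of the parameters are active at a time, and the lead is much wider on the coding benchmarks than on the reasoning tests.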

Key takeaway

For AI engineers evaluating open models for agentic coding or reasoning tasks, Qwen3.6-35B-A3B is a strong candidate. Its benchmark lead over Gemma 4-31B and Qwen3.5-35B-A3B, combined with a mixture-of-experts architecture that keeps most parameters inactive at any given time, suggests it could be a more resource-friendly choice for deployment. Consider testing both its "thinking" and "non-thinking" modes against your specific application needs.

Key insights

Alibaba's Qwen3.6-35B-A3B model offers strong performance with reduced compute via a mixture-of-experts architecture.

Best for: AI Engineer, AI Architect, Research Scientist, AI Scientist, Machine Learning Engineer, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by The Decoder.