NEW GEMMA 4 beats GPT-5.4: The A4B Model

· Source: Discover AI · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

Google has released the Gemma 4 series of open-source models under the Apache 2.0 license, including 2B, 4B, 26B Mixture-of-Experts (MoE), and 31B dense models. This analysis live-tests the 26B MoE (which activates 3.88B parameters, referred to below as "4B active") and the 31B dense model on a complex "elevator puzzle" designed to assess causal reasoning and logical problem-solving without external tools. The 26B MoE consistently demonstrated stronger self-reflection, strategic planning, and constraint adherence, and ultimately found a valid solution of 10 button presses. In contrast, the 31B dense model struggled with optimization, often getting stuck in local minima and violating puzzle constraints, which produced invalid or suboptimal solutions. On this specific task, the 26B MoE's performance rivaled or exceeded that of larger proprietary models such as GPT-5.4 (non-X-High).

Key takeaway

AI scientists and machine learning engineers evaluating open-source LLMs for complex logical tasks should prioritize the Gemma 4 26B MoE (4B active) model. Its demonstrated self-correction and strategic planning make it a strong contender for applications that require robust causal reasoning, and it may outperform larger dense models and even some proprietary alternatives on such challenges. Consider the 31B dense counterpart primarily as a foundation for extensive fine-tuning.

Key insights

On the elevator puzzle, Gemma 4's 26B MoE model (4B active) showed stronger logical reasoning and self-correction than its 31B dense counterpart.

Principles

Method

The "elevator puzzle" assesses LLM causal reasoning: the model must find the shortest sequence of button presses to reach a target floor, where each button applies a mathematical function to the current floor, subject to energy limits and floor caps, and without calling external solvers.
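
To make the structure of such a puzzle concrete, here is a minimal Python sketch of a reference checker and a breadth-first solver. The exact rules used in the video are not given in this summary, so everything specific here is an assumption for illustration: the three buttons, their costs, the floor cap, and the energy limit are hypothetical, as are the helper names `is_valid` and `shortest_presses`. The models were asked to reason without tools; a script like this would only be used afterwards to verify a proposed answer.

```python
from collections import deque
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical buttons (NOT the ones from the video):
# name -> (function applied to the current floor, energy cost per press)
Button = Tuple[Callable[[int], int], int]
BUTTONS: Dict[str, Button] = {
    "A": (lambda f: f + 3, 1),
    "B": (lambda f: f * 2, 2),
    "C": (lambda f: f - 2, 1),
}


def is_valid(presses: List[str], start: int, target: int,
             max_floor: int, energy_limit: int,
             buttons: Dict[str, Button] = BUTTONS) -> bool:
    """Replay a proposed press sequence and check every constraint."""
    floor, energy = start, 0
    for name in presses:
        if name not in buttons:
            return False
        move, cost = buttons[name]
        floor, energy = move(floor), energy + cost
        # Reject as soon as the floor cap or the energy limit is violated.
        if not 0 <= floor <= max_floor or energy > energy_limit:
            return False
    return floor == target


def shortest_presses(start: int, target: int, max_floor: int,
                     energy_limit: int,
                     buttons: Dict[str, Button] = BUTTONS) -> Optional[List[str]]:
    """Breadth-first search over (floor, energy used) states.

    BFS explores sequences in order of length, so the first time the
    target floor is dequeued the path has the fewest possible presses.
    """
    queue = deque([(start, 0, [])])
    seen = {(start, 0)}
    while queue:
        floor, energy, path = queue.popleft()
        if floor == target:
            return path
        for name, (move, cost) in buttons.items():
            nxt, used = move(floor), energy + cost
            if not 0 <= nxt <= max_floor or used > energy_limit:
                continue  # constraint violated, prune this branch
            if (nxt, used) not in seen:
                seen.add((nxt, used))
                queue.append((nxt, used, path + [name]))
    return None  # no sequence satisfies the constraints


if __name__ == "__main__":
    solution = shortest_presses(start=1, target=20, max_floor=30, energy_limit=10)
    print(solution)
    print(is_valid(solution or [], 1, 20, 30, 10))
```

Tracking energy as part of the state matters: the same floor reached with less energy used can still lead to a solution when the more expensive route cannot, which is the kind of constraint interaction the summary says the dense model tended to violate.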

In practice

Topics

Best for: AI Scientist, Machine Learning Engineer, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by Discover AI.