What’s new in Gemma 4?

· Source: Google DeepMind · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Internet of Things (IoT) & Connected Devices · Depth: Intermediate, quick

Summary

Google DeepMind has released Gemma 4, a new family of open-source models under the Apache 2.0 license, built with the same research and technology as Gemini 3. Designed for local execution on devices like phones, laptops, and desktops, Gemma 4 supports complex logic, multi-step planning, and agentic workflows. The family includes 26B Mixture-of-Experts and 31B dense models for personal computers, offering frontier intelligence and local reasoning without data upload. Additionally, 2B and 4B models are optimized for memory efficiency on mobile and IoT devices, featuring combined audio/vision support, real-time processing, and native support for over 140 languages. Gemma 4 also provides native tool use capabilities and a context window of up to 1/4 million tokens, enabling analysis of entire codebases.

Key takeaway

For engineering teams evaluating local AI model deployment, Gemma 4 presents a compelling option due to its Apache 2.0 license and optimized variants for diverse hardware. Your team can leverage the 26B/31B models for secure, on-device reasoning and coding, or integrate the memory-efficient 2B/4B models for real-time, multilingual, and multimodal applications on mobile and IoT devices. Consider experimenting with its native tool use for agentic workflows.

Key insights

Gemma 4 offers a family of open-source, locally runnable models for diverse hardware, from mobile to desktop.

Principles

Method

Gemma 4 models are designed for agentic workflows, supporting complex logic, multi-step planning, and native tool use, optimizing token usage for intelligence.

In practice

Topics

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Engineer, Machine Learning Engineer, AI Student

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by Google DeepMind.