DiffusionGemma

· Source: Simon Willison's Weblog · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Intermediate, quick

Summary

Google has released DiffusionGemma, a new open-weight model under an Apache 2 license, available as "google/diffusiongemma-26B-A4B-it" on Hugging Face. This release follows an experimental Gemini Diffusion model briefly previewed in May 2025, which achieved 857 tokens/second. The current DiffusionGemma research, now openly available, demonstrates strong performance, with a test generating 2,409 tokens in 4.4 seconds, equating to at least 500 tokens/second. NVIDIA is currently providing free hosting for this model via its NIM cloud API, making it accessible for developers. This marks a significant return for the previously unannounced Gemini Diffusion research, now in an open and usable format.

Key takeaway

For AI Engineers evaluating new text generation models, DiffusionGemma offers a compelling, openly licensed option. You can integrate "google/diffusiongemma-26B-A4B-it" into your projects, leveraging its demonstrated high token generation rates. Consider utilizing NVIDIA's free NIM cloud API for initial testing and deployment, which provides an accessible way to experiment with this model without immediate infrastructure costs. This release enables you to explore advanced text generation capabilities with an open-source foundation.

Key insights

Google's experimental Gemini Diffusion model is now an open-weight, Apache 2 licensed DiffusionGemma.

Principles

In practice

Topics

Best for: NLP Engineer, AI Scientist, Research Scientist, AI Engineer, Machine Learning Engineer, Software Engineer

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by Simon Willison's Weblog.