TAI #206: Gemini 3.5 Flash Is Stronger, But Flash Is No Longer Cheap

· Source: Towards AI Newsletter · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Software Development & Engineering · Depth: Intermediate, long

Summary

Google I/O 2026 unveiled several AI advancements, most notably Gemini 3.5 Flash, positioned as Google's strongest agentic and coding model. It scored 76.2% on Terminal-Bench 2.1 and 83.6% on MMMU-Pro, and Artificial Analysis measured its output speed at over 280 tokens per second. Pricing, however, has risen sharply: output tokens cost $9.00 per million, a 30x jump from Gemini 1.5 Flash, and the cost of running the overall benchmark suite rose 5.5x from Gemini 3 Flash to $1,552. Google also introduced Gemini Omni for multimodal video generation, Antigravity 2.0 as a multi-agent development environment, and the Managed Agents API. Other industry news included Cohere's Command A+ open-weight model, OpenAI's Codex updates with Goal Mode, and Anthropic's expanded Claude Compliance API integrations. Together these point to a competitive landscape in which Google's release cadence and regional availability are critical for enterprise adoption.
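To make the pricing figures in the summary concrete, here is a small Python sketch that backs out the earlier prices implied by the stated multiples and estimates output-only cost and streaming time for a given token count. The constants are the numbers quoted in the summary; the function names are illustrative. Input-token pricing is not given here, so the cost estimate covers output tokens only, and the time estimate ignores time to first token.

```python
# Figures quoted in the summary; everything derived from them is an estimate.
OUTPUT_USD_PER_MILLION = 9.00     # Gemini 3.5 Flash output price
OUTPUT_PRICE_MULTIPLE = 30.0      # stated jump vs. Gemini 1.5 Flash
SUITE_COST_USD = 1552.0           # benchmark suite cost on Gemini 3.5 Flash
SUITE_COST_MULTIPLE = 5.5         # stated jump vs. Gemini 3 Flash
OUTPUT_TOKENS_PER_SECOND = 280.0  # quoted as "over 280", so treat as a floor


def implied_baseline(current: float, multiple: float) -> float:
    """Back out the earlier figure from the current one and the stated multiple."""
    return current / multiple


def output_cost_usd(output_tokens: int) -> float:
    """Output-token cost only; input pricing is not given in the summary."""
    return output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION


def generation_seconds(output_tokens: int) -> float:
    """Rough streaming time at the quoted throughput (ignores time to first token)."""
    return output_tokens / OUTPUT_TOKENS_PER_SECOND


if __name__ == "__main__":
    old_price = implied_baseline(OUTPUT_USD_PER_MILLION, OUTPUT_PRICE_MULTIPLE)
    old_suite = implied_baseline(SUITE_COST_USD, SUITE_COST_MULTIPLE)
    print(f"Implied Gemini 1.5 Flash output price: ${old_price:.2f} per million tokens")
    print(f"Implied Gemini 3 Flash suite cost: ${old_suite:.2f}")
    print(f"2M output tokens: ${output_cost_usd(2_000_000):.2f}")
    print(f"28k output tokens: about {generation_seconds(28_000):.0f} s")
```

The implied baselines are derived from the rounded multiples in the summary, so they are approximate and should not be read as official list prices.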

Key takeaway

AI engineers evaluating large language models for production should weigh Gemini 3.5 Flash's strong multimodal and agentic capabilities against its substantially higher cost and regional deployment limitations. Prioritize running your own evaluation suites across Gemini, GPT, and Claude models on your specific tasks, especially for vision-heavy workflows. Also account for operational overhead and compliance needs, because model availability and governance features are now critical for enterprise adoption.
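As one way to act on the advice to run your own evaluation suite, the sketch below scores several models on the same task set. It is a minimal illustration under stated assumptions, not a vendor API: `ModelFn`, `exact_match`, and `run_suite` are hypothetical names, each model is represented as a plain function from prompt to reply, and the stubs in the example stand in for real Gemini, GPT, or Claude clients that you would wrap yourself. Exact-match scoring is also an assumption; vision-heavy or agentic tasks will usually need a task-specific grader.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical type: a function that takes a prompt and returns the model's text reply.
ModelFn = Callable[[str], str]
Case = Tuple[str, str]  # (prompt, expected answer)


def exact_match(reply: str, expected: str) -> bool:
    """Case-insensitive comparison after trimming whitespace."""
    return reply.strip().lower() == expected.strip().lower()


def run_suite(models: Dict[str, ModelFn], cases: List[Case]) -> Dict[str, float]:
    """Return the fraction of cases each model answers correctly."""
    if not cases:
        raise ValueError("cases must not be empty")
    scores: Dict[str, float] = {}
    for name, ask in models.items():
        passed = sum(exact_match(ask(prompt), expected) for prompt, expected in cases)
        scores[name] = passed / len(cases)
    return scores


if __name__ == "__main__":
    # Stub models stand in for real API clients wrapped as ModelFn.
    cases = [("2 + 2 =", "4"), ("Capital of France?", "Paris")]
    answers = {"2 + 2 =": "4", "Capital of France?": "paris"}
    models: Dict[str, ModelFn] = {
        "stub-a": lambda prompt: answers.get(prompt, ""),
        "stub-b": lambda prompt: "4",
    }
    print(run_suite(models, cases))
```

Keeping the model behind a simple callable means the same cases and grader can be reused unchanged when a new model version ships, which makes cost and quality comparisons between releases easier to repeat.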

Key insights

Google's latest Gemini models offer strong capabilities but come with significantly increased costs and deployment challenges.

Best for: CTO, VP of Engineering/Data, AI Architect, AI Engineer, MLOps Engineer, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by Towards AI Newsletter.