Gemini 3.5 Flash: more expensive, but Google plan to use it for everything

2026-05-19 · Source: Simon Willison's Weblog · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Emerging Technologies & Innovation · Depth: Intermediate, quick

Summary

Google launched Gemini 3.5 Flash on May 19, 2026, making it generally available without a preview phase. This model, identified as "gemini-3.5-flash", is being integrated across numerous Google products, including the Gemini app, Google Search, Google Antigravity, and Gemini Enterprise. It features a January 2025 knowledge cut-off and supports 1,048,576 input tokens and 65,536 maximum output tokens. A new Interactions API, currently in beta, also accompanies the release, offering server-side history management. Notably, Gemini 3.5 Flash comes with a significant price increase, costing \$1.50 per million input tokens and \$9 per million output tokens, which is 3x the price of Gemini 3 Flash Preview and 6x that of Gemini 3.1 Flash-Lite. This pricing positions it close to Gemini 3.1 Pro and reflects a broader industry trend of rising LLM costs, as evidenced by benchmark costs like \$1,551.60 for Gemini 3.5 Flash (high) compared to \$892.28 for Gemini 3.1 Pro Preview.

Key takeaway

For AI Engineers and Directors of AI/ML evaluating LLM deployments, be aware that Gemini 3.5 Flash represents a significant price increase over previous Flash models, costing \$1.50/million input and \$9/million output tokens. This trend, mirrored by other vendors, suggests a new pricing floor for advanced models. You should re-evaluate your cost models and consider benchmark data, like Artificial Analysis's \$1,551.60 for 3.5 Flash, to accurately forecast operational expenses and optimize your LLM strategy.

Key insights

Gemini 3.5 Flash signals a new phase of higher LLM pricing and broad internal deployment across Google's ecosystem.

Principles

LLM API costs are increasing across major providers.
Internal product integration often precedes wider API rollout.
Benchmark costs reveal true operational expense.

In practice

Use benchmark data for LLM cost comparisons.
Assess new LLM versions for price-performance.
Track API pricing trends for budget planning.

Topics

Gemini 3.5 Flash
LLM Pricing
Google AI
API Costs
Benchmark Data
Interactions API

Best for: CTO, VP of Engineering/Data, Entrepreneur, AI Engineer, Director of AI/ML, AI Product Manager

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by Simon Willison's Weblog.