Google just dropped Gemini 3.5 Flash and the price hike is pretty insane.

· Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Intermediate, quick

Summary

Google has released Gemini 3.5 Flash at a substantially higher price than its predecessor, Gemini 3.0 Flash. The new model is approximately 5.5 times more expensive to run: input tokens have tripled to \$1.50 per million, and output tokens now cost \$9.00 per million. It is 70% faster, generating 280 tokens per second, but it takes an average of 49 steps on complex tasks, which in practice makes it 75% more expensive than the heavier 3.1 Pro model. It scored 55 on the IQ index, outperforming Grok 4.3 and Claude Sonnet 4.6, but its coding score is weaker at 45. Its hallucination rate fell by 31 points to 61%. OpenAI's GPT-5.5 and Claude Opus 4.7 show the same trend of higher prices for multi-step agentic systems, which points to a market shift toward more compute-intensive, autonomous AI.
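
To see how step count drives the bill, here is a rough sketch that multiplies the per-token prices quoted in the summary by the number of model calls in a task. The `Pricing` class, the `task_cost` helper, and the per-step token counts are illustrative assumptions, not figures from the article. It also assumes every step sends and receives the same number of tokens, which real agent loops rarely do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Per-million-token prices in USD."""
    input_per_million: float
    output_per_million: float


def task_cost(pricing: Pricing, steps: int,
              input_tokens_per_step: int, output_tokens_per_step: int) -> float:
    """Estimated USD cost of one task that makes `steps` model calls."""
    input_cost = steps * input_tokens_per_step * pricing.input_per_million / 1_000_000
    output_cost = steps * output_tokens_per_step * pricing.output_per_million / 1_000_000
    return input_cost + output_cost


# Prices and average step count as quoted in the summary.
GEMINI_35_FLASH = Pricing(input_per_million=1.50, output_per_million=9.00)
AVERAGE_STEPS = 49

# Hypothetical workload: these token counts are NOT from the article.
cost = task_cost(GEMINI_35_FLASH, steps=AVERAGE_STEPS,
                 input_tokens_per_step=4_000, output_tokens_per_step=500)
print(f"Estimated cost per task: ${cost:.4f}")
```

Because cost scales linearly with `steps` in this model, a cheaper per-token model that needs many more steps can still cost more per task than a pricier model that finishes in fewer.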

Key takeaway

If you are an AI Product Manager evaluating new model integrations, reassess your API budget projections in light of the rising cost of advanced multi-step AI systems such as Gemini 3.5 Flash. Your product's unit economics will change dramatically if users expect agentic behavior, which consumes significantly more compute. Consider benchmarking self-hosted alternatives to understand true operational costs and reduce reliance on increasingly expensive commercial APIs.
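
One way to run that budget check is a small projection like the sketch below. Every number in it (subscription price, tasks per user, user count, cost per task) is a made-up placeholder to swap for your own measurements, and the function names are hypothetical.

```python
def monthly_api_spend(cost_per_task: float, tasks_per_user: int, users: int) -> float:
    """Total monthly API bill in USD."""
    return cost_per_task * tasks_per_user * users


def gross_margin(price_per_user: float, cost_per_task: float, tasks_per_user: int) -> float:
    """Fraction of per-user revenue left after API costs (negative means a loss)."""
    api_cost_per_user = cost_per_task * tasks_per_user
    return (price_per_user - api_cost_per_user) / price_per_user


# Hypothetical product: $20/month plan, 100 tasks per user, 2,000 users.
PRICE_PER_USER = 20.0
TASKS_PER_USER = 100
USERS = 2_000

# Hypothetical per-task costs for a single-call feature and an agentic one.
for label, cost_per_task in [("single call", 0.01), ("agentic", 0.50)]:
    spend = monthly_api_spend(cost_per_task, TASKS_PER_USER, USERS)
    margin = gross_margin(PRICE_PER_USER, cost_per_task, TASKS_PER_USER)
    print(f"{label}: ${spend:,.0f}/month, margin {margin:.0%}")
```

With these placeholder inputs the single-call feature keeps a healthy margin, while the agentic feature costs more per user than the plan brings in. That is the kind of shift the takeaway warns about.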

Key insights

The market is shifting to expensive multi-step AI agents, challenging prior assumptions about cheap inference.

Best for: CTO, VP of Engineering/Data, AI Architect, Director of AI/ML, AI Product Manager, Entrepreneur

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.