Gemini Flash Gets Pricey, AI Act Delays, Agents Drive Online Traffic

· Source: The Batch | DeepLearning.AI · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Legal & Regulatory · Depth: Advanced, long

Summary

Google launched Gemini 3.5 Flash, a faster, mid-tier multimodal model with improved agentic capabilities and visual understanding. It accepts multimodal inputs of up to 1 million tokens and produces up to 64,000 output tokens at 204 tokens per second. It excels on benchmarks such as APEX-Agents-AA and MMMU-Pro, but it costs three times as much as Gemini 3 Flash, reflecting a trend of rising per-token prices. Concurrently, the EU amended its AI Act, delaying restrictions on "high-risk" systems until December 2027 and easing burdens on smaller companies, a change driven by competitiveness concerns. A Human Security report found that AI-driven internet traffic nearly tripled in 2025, with traffic from AI agents growing 80x, primarily on product and search pages, alongside a 47% rise in malicious scraping. Researchers introduced a fine-tuning method for text-to-image generators that uses a staged "plan, sketch, inspect, refine" process to improve the accuracy of spatial relationships and object attributes, boosting BAGEL-7B's GenEval score from 77% to 83%. Separately, the article notes the rise of AI Forward Deployed Engineers (FDEs) but predicts larger demand for generalist AI Engineers.

Key takeaway

For Directors of AI/ML evaluating team structures: prioritize hiring your own generalist AI Engineers rather than relying on vendor-embedded Forward Deployed Engineers (FDEs). Doing so preserves vendor optionality and builds internal expertise. New models like Gemini 3.5 Flash offer performance gains but cost significantly more per token (three times as much as Gemini 3 Flash), so budget accordingly. Prepare your infrastructure for rapidly increasing AI-driven internet traffic. Also consider advanced techniques such as staged image generation to improve the quality of your model's output.
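As a rough budgeting aid, the figures quoted in the summary (a 3x price relative to Gemini 3 Flash, 64,000 maximum output tokens, 204 tokens per second) can be turned into back-of-the-envelope estimates. The sketch below is only an illustration: the function names and the 1,000 USD baseline are hypothetical, and it assumes the 3x multiplier applies uniformly to an unchanged mix of input and output tokens, which the source does not state.

```python
def projected_monthly_spend(
    baseline_usd: float,
    price_multiplier: float = 3.0,  # 3x figure from the summary; assumed uniform across token types
    volume_growth: float = 1.0,     # hypothetical knob: 1.0 means unchanged token volume
) -> float:
    """Scale a current monthly spend by a per-token price multiplier and a volume factor."""
    return baseline_usd * price_multiplier * volume_growth


def seconds_for_full_output(
    max_output_tokens: int = 64_000,   # output limit quoted in the summary
    tokens_per_second: float = 204.0,  # throughput quoted in the summary
) -> float:
    """Time to stream a maximum-length response at a constant throughput."""
    return max_output_tokens / tokens_per_second


if __name__ == "__main__":
    # Illustrative baseline only: 1,000 USD per month on the older model.
    print(projected_monthly_spend(1_000.0))
    print(round(seconds_for_full_output(), 1))
```

Under these assumptions, the same workload costs three times its current spend, and a maximum-length response takes roughly 314 seconds to stream, a little over five minutes. Real throughput and pricing vary by request, so treat both numbers as planning estimates.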

Key insights

AI is creating new roles, driving up model costs, reshaping regulations, and advancing agentic capabilities and image generation.

Method

Fine-tune a multimodal model to compose images by cycling through "plan, sketch, inspect, refine" stages, using GPT-4o to generate training data for each step.
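The staged process can be pictured as a simple control loop. The following is a minimal sketch of that loop, not the researchers' implementation: the four stage functions are hypothetical callables supplied by the caller, the stopping rule (an empty critique means nothing is left to fix) and the round limit are assumptions, and the toy stand-ins operate on strings rather than images.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Trace:
    """Record of one staged generation run (hypothetical structure)."""
    plan: str
    drafts: List[str] = field(default_factory=list)
    critiques: List[str] = field(default_factory=list)


def compose(
    prompt: str,
    plan: Callable[[str], str],
    sketch: Callable[[str], str],
    inspect: Callable[[str, str], str],
    refine: Callable[[str, str], str],
    max_rounds: int = 3,  # assumed cap on inspect/refine cycles
) -> Tuple[str, Trace]:
    """Plan, sketch, then alternate inspect and refine until no critique remains."""
    layout = plan(prompt)
    trace = Trace(plan=layout)
    draft = sketch(layout)
    trace.drafts.append(draft)
    for _ in range(max_rounds):
        critique = inspect(prompt, draft)
        trace.critiques.append(critique)
        if not critique:  # assumption: empty critique means the draft passes inspection
            break
        draft = refine(draft, critique)
        trace.drafts.append(draft)
    return draft, trace


# Toy stand-ins (strings instead of images) to show the control flow only.
def toy_plan(prompt: str) -> str:
    return f"layout for: {prompt}"


def toy_sketch(layout: str) -> str:
    return "draft"


def toy_inspect(prompt: str, draft: str) -> str:
    return "" if "cat left of dog" in draft else "place cat left of dog"


def toy_refine(draft: str, critique: str) -> str:
    return f"{draft} + {critique}"


final, trace = compose("a cat to the left of a dog", toy_plan, toy_sketch, toy_inspect, toy_refine)
```

With the toy functions, the first inspection flags the missing spatial relationship, one refinement fixes it, and the second inspection returns an empty critique, so the loop stops after two inspections. In the method described above, a single fine-tuned multimodal model would play all four roles.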

Best for: CTO, VP of Engineering/Data, Executive, AI Engineer, Director of AI/ML, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by The Batch | DeepLearning.AI.