Gemini Flash Gets Pricey, AI Act Delays, Agents Drive Online Traffic
Summary
Google launched Gemini 3.5 Flash, a faster, mid-tier multimodal model with improved agentic capabilities and visual understanding. It accepts multimodal inputs of up to 1 million tokens and generates up to 64,000 output tokens at 204 tokens per second. It excels on benchmarks such as APEX-Agents-AA and MMMU-Pro, but it costs three times as much as Gemini 3 Flash, reflecting a trend of rising per-token prices. Separately, the EU amended its AI Act, citing competitiveness concerns: restrictions on "high-risk" systems are delayed until December 2027, and compliance burdens for smaller companies are eased. A HUMAN Security report found that AI-driven internet traffic nearly tripled in 2025. Traffic from AI agents grew 80-fold, mostly on product and search pages, and malicious scraping rose 47%. Finally, researchers introduced a fine-tuning method for text-to-image generators that uses a staged "plan, sketch, inspect, refine" process. The method improves the accuracy of spatial relationships and object attributes, raising BAGEL-7B's GenEval score from 77% to 83%. The article also notes the rise of AI Forward Deployed Engineers (FDEs) but predicts larger demand for generalist AI Engineers.
Key takeaway
Directors of AI/ML evaluating team structures should prioritize hiring their own generalist AI Engineers rather than relying on vendor-embedded Forward Deployed Engineers (FDEs). In-house engineers preserve vendor optionality and build internal expertise. New models like Gemini 3.5 Flash offer performance gains at a significantly higher per-token cost (three times that of Gemini 3 Flash), so budget for the increase before migrating. Prepare your infrastructure for rapidly growing AI-driven internet traffic. Also consider techniques such as staged image generation to improve model output quality.
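As a rough illustration of the budgeting point above, the sketch below projects monthly spend when the per-token price triples. Only the 3x multiplier comes from the summary. The baseline price, the token volume, and the single blended price (real pricing usually distinguishes input and output tokens) are hypothetical placeholders, not actual Gemini rates.

```python
def monthly_cost(tokens_per_month: int, price_per_million: float) -> float:
    """Return monthly spend for a given token volume and per-million-token price."""
    return tokens_per_month / 1_000_000 * price_per_million


# Hypothetical placeholder values, NOT real Gemini pricing.
BASELINE_PRICE = 1.00          # USD per million tokens (assumed)
TOKENS_PER_MONTH = 500_000_000  # assumed workload

# The only figure taken from the summary: the newer model costs 3x as much.
PRICE_MULTIPLIER = 3

baseline = monthly_cost(TOKENS_PER_MONTH, BASELINE_PRICE)
upgraded = monthly_cost(TOKENS_PER_MONTH, BASELINE_PRICE * PRICE_MULTIPLIER)
increase = upgraded - baseline

print(f"baseline: ${baseline:,.2f}  upgraded: ${upgraded:,.2f}  increase: ${increase:,.2f}")
```

Substituting your own measured token volume and the vendor's published prices turns this into a first-pass estimate of the migration cost.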
Key insights
AI is creating new engineering roles and prompting regulators to adjust their rules, while per-token model prices rise, AI agent traffic grows, and image generators gain accuracy through staged reasoning.
Principles
- Step-by-step reasoning improves AI output quality.
- Overregulation can stifle innovation and competitiveness.
- Companies prefer internal AI talent for vendor optionality.
Method
Fine-tune a multimodal model to compose images by cycling through "plan, sketch, inspect, refine" stages, using GPT-4o to generate training data for each step.
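The control flow of the four stages can be sketched as a simple loop. This is not the researchers' implementation: the source describes fine-tuning a single multimodal model to perform the stages itself, whereas the sketch below only shows the ordering of the steps. The function names, the `Draft` type, the `max_rounds` limit, and the string stand-ins for images are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Draft:
    plan: str
    image: str  # stand-in for image data in this toy example
    issues: List[str] = field(default_factory=list)


def staged_generate(
    prompt: str,
    plan: Callable[[str], str],
    sketch: Callable[[str], str],
    inspect: Callable[[str, str], List[str]],
    refine: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> Draft:
    """Plan, sketch, then alternate inspect/refine until no issues remain."""
    plan_text = plan(prompt)
    draft = Draft(plan=plan_text, image=sketch(plan_text))
    draft.issues = inspect(prompt, draft.image)
    rounds = 0
    while draft.issues and rounds < max_rounds:
        draft.image = refine(draft.image, draft.issues)
        draft.issues = inspect(prompt, draft.image)
        rounds += 1
    return draft


# Toy stand-ins for the model's stages (hypothetical).
def toy_plan(prompt: str) -> str:
    return f"layout for: {prompt}"


def toy_sketch(plan_text: str) -> str:
    return "a red cube"  # deliberately omits the sphere


def toy_inspect(prompt: str, image: str) -> List[str]:
    required = ["red cube", "blue sphere"]
    return [item for item in required if item not in image]


def toy_refine(image: str, issues: List[str]) -> str:
    return image + " left of a " + issues[0]


result = staged_generate(
    "a red cube left of a blue sphere",
    toy_plan, toy_sketch, toy_inspect, toy_refine,
)
print(result.image, result.issues)
```

Capping the number of refinement rounds keeps the loop from running indefinitely when the inspect step keeps finding problems.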
In practice
- Implement staged image generation for complex spatial prompts.
- Monitor internet traffic for new AI agent patterns.
- Prioritize internal AI engineering talent for flexibility.
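For the traffic-monitoring item above, a first pass could tally requests whose User-Agent header contains a known AI crawler or agent token. This is a minimal sketch, not the method used in the report: the marker list is illustrative and incomplete, the sample log entries are made up, and User-Agent strings can be spoofed, so production detection would need additional signals.

```python
from collections import Counter
from typing import Iterable

# Illustrative, incomplete list of lowercase User-Agent substrings (assumed).
AI_AGENT_MARKERS = ("gptbot", "chatgpt-user", "claudebot", "perplexitybot")


def classify_user_agent(user_agent: str) -> str:
    """Label a request as 'ai_agent' if its User-Agent contains a known marker."""
    ua = user_agent.lower()
    if any(marker in ua for marker in AI_AGENT_MARKERS):
        return "ai_agent"
    return "other"


def tally(user_agents: Iterable[str]) -> Counter:
    """Count requests per label."""
    return Counter(classify_user_agent(ua) for ua in user_agents)


# Made-up sample of User-Agent headers from an access log.
sample = [
    "Mozilla/5.0 (compatible; GPTBot/1.0)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/120.0",
    "Mozilla/5.0 (compatible; ClaudeBot/1.0)",
]

counts = tally(sample)
print(counts)
```

Tracking these counts per page type over time would show whether agent traffic is concentrating on product and search pages, as the report describes.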
Topics
- AI Engineering Roles
- Gemini 3.5 Flash
- Multimodal AI Models
- AI Regulation
- AI Agent Traffic
- Text-to-Image Generation
Best for: CTO, VP of Engineering/Data, Executive, AI Engineer, Director of AI/ML, AI Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by The Batch | DeepLearning.AI.