Last Week in AI #335 - Opus 4.6, Codex 5.3, Gemini 3 Deep Think, GLM 5, Seedance 2.0
Summary
Anthropic released Claude Opus 4.6, a significant upgrade that adds "agent teams" for parallel task execution and expands the context window to 1 million tokens, enough to work across large codebases and long documents. It also includes a native PowerPoint side panel for drafting and editing slides directly. OpenAI unveiled GPT-5.3-Codex, a frontier coding model available via CLI, IDE extension, web, and a new macOS app; it outperforms previous versions on SWE-Bench Pro and Terminal-Bench 2.0 while running 25% faster. Google introduced Gemini 3 Deep Think, a specialized "extended reasoning" mode for science and engineering that scores 84.6% on ARC-AGI-2 and reaches gold-medal-level performance on international Olympiads. Chinese AI labs DeepSeek and Zhipu AI also rolled out major upgrades: DeepSeek expanded its model's context window to over 1,000,000 tokens, and Zhipu AI launched GLM-5 for "agentic engineering." ByteDance pre-released Seedance 2.0, a multimodal video generator that accepts up to 12 inputs and outputs 4 to 15 second clips with precise reference capabilities, reportedly surpassing OpenAI's Sora 2.
Key takeaway
For CTOs and VPs of Engineering evaluating AI investments, the rapid advances in multi-agent capabilities, expanded context windows, and specialized reasoning models from Anthropic, OpenAI, and Google are reason to revisit current AI strategies. Your teams should explore integrating these agentic workflows and long-context models to improve productivity in software development, scientific research, and content creation, while also monitoring Chinese labs that offer high-performance, cost-effective alternatives.
Key insights
Frontier AI models are rapidly advancing in multi-agent collaboration, extended context, and specialized reasoning across diverse domains.
Principles
- Test-time compute improves accuracy in complex reasoning tasks.
- Reinforcement learning drives rapid, significant model improvements.
- Sparse attention balances long-context performance with efficiency.
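The first principle can be illustrated with a toy simulation. This is a minimal sketch of majority voting over repeated samples, one common way to spend extra test-time compute; it is not claimed to be the method used by any model mentioned here. The solver, its 60% per-sample accuracy, and the answer values are all assumptions made up for illustration.

```python
import random
from collections import Counter

CORRECT_ANSWER = 42  # hypothetical ground truth for the toy task


def noisy_solver(rng: random.Random, p_correct: float = 0.6) -> int:
    """Stand-in for one sampled model answer: right with probability p_correct."""
    if rng.random() < p_correct:
        return CORRECT_ANSWER
    # Wrong answers are spread over several distinct values.
    return rng.choice([41, 43, 44])


def majority_vote(n_samples: int, rng: random.Random) -> int:
    """Draw n_samples answers and return the most frequent one."""
    answers = [noisy_solver(rng) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]


def accuracy(n_samples: int, trials: int = 2000, seed: int = 0) -> float:
    """Fraction of trials in which the voted answer is correct."""
    rng = random.Random(seed)
    hits = sum(majority_vote(n_samples, rng) == CORRECT_ANSWER for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for n in (1, 5, 15):
        print(n, accuracy(n))
```

In this toy setting, accuracy rises with the number of samples because the correct answer is the single most likely one, so it wins the vote more often as samples accumulate. The cost is proportionally more compute per question.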
Method
Agent teams split complex tasks into parallel subtasks for faster completion. Models use internal verification to prune incorrect reasoning paths. Retrieval-aware distillation preserves critical attention heads while replacing the others with state-space model (SSM) recurrent heads.
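The idea of splitting a task into parallel subtasks can be sketched with Python's standard thread pool. This is only an illustration of the fan-out pattern, not Anthropic's implementation or API: `run_subtask` is a hypothetical placeholder for a call to an agent, and the subtask names are invented.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subtask(subtask: str) -> str:
    """Hypothetical stand-in for handing one subtask to an agent."""
    return f"done: {subtask}"


def run_agent_team(subtasks: list[str], max_workers: int = 4) -> list[str]:
    """Run the subtasks concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs,
        # regardless of which worker finishes first.
        return list(pool.map(run_subtask, subtasks))


if __name__ == "__main__":
    results = run_agent_team(["write tests", "refactor module", "update docs"])
    for line in results:
        print(line)
```

The speedup from this pattern depends on the subtasks being independent; work that needs the output of another subtask still has to run in sequence.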
In practice
- Use agent teams for complex, multi-step projects.
- Leverage native AI integrations for productivity apps.
- Explore specialized models for scientific and engineering tasks.
Topics
- AI Agent Systems
- Large Language Models
- Generative Video AI
- AI for Software Engineering
- Long Context Windows
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Engineer, AI Product Manager, AI Researcher
Editorial summary, takeaway, and curation by AIssential. Original article published by Last Week in AI.