Our eighth generation TPUs: two chips for the agentic era

Source: The Keyword · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure, Robotics & Autonomous Systems · Depth: Advanced, long

Summary

Google has introduced its eighth generation of custom Tensor Processing Units (TPUs), featuring two specialized chips: the TPU 8t and the TPU 8i. The TPU 8t is designed for massive, compute-intensive AI model training, offering nearly 3x the compute performance per pod of the previous generation and scaling to 9,600 chips with 121 ExaFlops of compute in a single pod. The TPU 8i is optimized for the low-latency inference workloads that AI agents depend on, and delivers 80% better performance-per-dollar than its predecessor. Both chips are engineered for high power efficiency, achieving up to 2x better performance-per-watt, and are supported by Google's Arm-based Axion CPUs and fourth-generation liquid cooling technology. Both TPUs will be generally available later this year.
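The headline figures above translate into a few back-of-envelope numbers that are easier to compare across hardware. The sketch below is illustrative only: the function names are hypothetical, the per-chip value is a simple average derived from the pod totals (this summary does not state the numeric precision behind the ExaFlops figure), and the energy ratio uses the "up to" 2x performance-per-watt figure, so it represents a best case.

```python
# Back-of-envelope arithmetic on the figures quoted in the summary.
# All helper names are hypothetical; the inputs are the summary's numbers.

POD_CHIPS = 9_600             # TPU 8t chips per pod
POD_EXAFLOPS = 121.0          # quoted pod compute (precision not stated here)
PERF_PER_DOLLAR_GAIN = 0.80   # TPU 8i: 80% better performance-per-dollar
PERF_PER_WATT_FACTOR = 2.0    # "up to" 2x performance-per-watt (best case)


def per_chip_petaflops(pod_exaflops: float, chips: int) -> float:
    """Average compute per chip in PetaFlops (1 ExaFlop = 1,000 PetaFlops)."""
    return pod_exaflops * 1_000 / chips


def relative_cost(perf_per_dollar_gain: float) -> float:
    """Cost of a fixed workload relative to the predecessor (1.0 = same cost)."""
    return 1.0 / (1.0 + perf_per_dollar_gain)


def relative_energy(perf_per_watt_factor: float) -> float:
    """Energy for a fixed workload relative to the predecessor (1.0 = same)."""
    return 1.0 / perf_per_watt_factor


print(f"Average per chip: {per_chip_petaflops(POD_EXAFLOPS, POD_CHIPS):.1f} PFLOPs")
print(f"Relative inference cost: {relative_cost(PERF_PER_DOLLAR_GAIN):.0%}")
print(f"Relative energy (best case): {relative_energy(PERF_PER_WATT_FACTOR):.0%}")
```

Read this way, "80% better performance-per-dollar" means the same inference workload costs roughly 56% of what it did before, a saving of about 44% rather than 80%.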

Key takeaway

For MLOps Engineers and CTOs deploying or developing large-scale AI, Google's new TPU 8t and 8i offer specialized hardware for training and inference, respectively. You should evaluate these chips for their potential to significantly reduce model development cycles and improve inference performance-per-dollar, especially for agentic AI workloads. Consider requesting more information now to prepare for their general availability later this year.

Key insights

Google's new TPU 8t and 8i chips specialize in AI training and inference for the agentic era.

Method

Google's co-design approach integrates custom silicon, networking, and software, including model architecture, to optimize power efficiency and performance across the entire AI supercomputing stack.

Best for: CTO, MLOps Engineer, VP of Engineering/Data, AI Engineer, Machine Learning Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by The Keyword.