Google’s $185B Infra Bet & DeepMind’s Memory Fix, India’s GPU Surge, and the Mobile RAM Crisis
Summary
Google is making a massive $185 billion infrastructure investment by 2026, primarily for data centers and compute, driven by surging AI demand and physical supply constraints. The company's Q4 revenue hit $113.8 billion, with full-year 2025 revenue exceeding $400 billion. Internally, AI agents now write 50% of Google's code, shifting engineers toward review work and producing productivity gains that help finance this infrastructure bet.
Concurrently, Google DeepMind introduced Reinforced Attention Learning (RA) and "nested learning" with the "Hope" model to combat catastrophic forgetting in AI, allowing systems to retain long-term memory.
Meanwhile, India is rapidly expanding its AI infrastructure, tendering for 25,000 additional GPUs to reach a 65,000-unit national cluster and offering subsidized pricing below $1 per hour despite global HBM memory shortages. This GPU procurement is part of a three-tiered strategy: buy GPUs, build supply chains, and eventually develop indigenous GPUs.
However, the AI boom is creating a "memory squeeze": HBM production for AI data centers cannibalizes standard RAM capacity at a 3:1 ratio. Soaring memory costs and potential spec downgrades for next-gen smartphones threaten on-device AI initiatives by companies like Qualcomm.
Key takeaway
Directors of AI/ML evaluating infrastructure investments and supply chain risks should recognize that AI's compute demands are driving a global reallocation of resources. Balance immediate access to high-end GPUs with long-term plans for diversified supply chains and, potentially, in-house AI development. Prepare for rising memory costs that affect on-device AI initiatives, and consider how internal AI-driven productivity gains can offset escalating infrastructure expenses.
Key insights
AI's rapid growth is driving massive infrastructure investments and creating critical supply chain pressures across the tech industry.
Principles
- AI productivity gains can finance infrastructure expansion.
- Continuous learning requires multi-speed memory systems.
- Subsidized compute can democratize AI access.
Method
Google DeepMind's Reinforced Attention Learning (RA) trains models to optimize their internal attention, while its nested learning paradigm lets different components update at different speeds. Together they aim to preserve long-term knowledge and avoid catastrophic forgetting.
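The idea of components updating at different speeds can be illustrated with a toy example. The sketch below is not DeepMind's method or the Hope model. It is a minimal illustration, assuming a two-parameter predictor in which a hypothetical `fast` weight updates every step and a hypothetical `slow` weight updates only every `slow_every` steps from an averaged gradient. All names and hyperparameters are invented for illustration.

```python
# Toy multi-speed update loop (illustrative assumption, not DeepMind's code).
# The prediction is fast + slow; the loss is 0.5 * (pred - target) ** 2,
# so the gradient with respect to either parameter is (pred - target).

def train(targets, fast_lr=0.5, slow_lr=0.1, slow_every=4):
    fast, slow = 0.0, 0.0
    slow_grad_acc = 0.0  # gradient accumulated between slow updates
    history = []         # (step, fast, slow) after each step

    for step, target in enumerate(targets, start=1):
        pred = fast + slow
        error = pred - target

        # Fast component: adapts on every step.
        fast -= fast_lr * error

        # Slow component: accumulates gradient, updates only periodically.
        slow_grad_acc += error
        if step % slow_every == 0:
            slow -= slow_lr * (slow_grad_acc / slow_every)
            slow_grad_acc = 0.0

        history.append((step, fast, slow))

    return fast, slow, history


if __name__ == "__main__":
    fast, slow, history = train([1.0] * 8)
    for step, f, s in history:
        print(f"step {step}: fast={f:.4f} slow={s:.4f}")
```

In this toy setup the slow weight changes far less often than the fast one, which loosely mirrors the intuition that slowly updated components are less easily overwritten by recent data.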
In practice
- Implement AI agents for code generation and review.
- Explore nested learning for persistent AI memory.
- Utilize subsidized national compute for model training.
Topics
- Google AI Strategy
- AI Infrastructure Investment
- Catastrophic Forgetting
- GPU Supply Chain
- On-Device AI
Best for: VP of Engineering/Data, Director of AI/ML, Executive, Investor, CTO, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by AIM Network.