MLWhiz Weekly Recsys/ML/GenAI Newsletter # 10 - The week AI infrastructure crossed from a technology story to a financial one
Summary
The AI infrastructure landscape is undergoing a significant financial transformation, marked by a $35 billion private financing deal led by Apollo and Blackstone for Broadcom's AI XPV Platform. This initiative aims to deliver over 20 gigawatts of AI compute capacity by 2028, with Anthropic and OpenAI as key clients, shifting AI compute funding from major tech company balance sheets to broader private capital. Concurrently, new models were released: Google's Gemma 4 12B, an encoder-free multimodal model for consumer GPUs; MiniMax M3, claiming 59.0% on SWE-Bench Pro with a 1M-token context and competitive pricing; and Microsoft's MAI-Code-1-Flash, a 137B-parameter coding model developed without OpenAI data, showing strong performance and integration into GitHub Copilot. Netflix also introduced Mult-DPO, an extension of Direct Preference Optimization for set-wise recommender system data.
Key takeaway
For AI/ML Directors evaluating infrastructure investments, the $35 billion Broadcom deal signals a new era in which private capital will drive compute capacity. You should anticipate potentially lower compute costs due to increased supply, but prepare for rising power and compliance expenses. Re-evaluate your long-term cloud strategy and consider new financing models for large-scale AI deployments.
Key insights
AI infrastructure financing is shifting from tech giants to private capital, impacting compute costs and accessibility.
Principles
- AI compute capacity is now financeable by diverse private capital.
- LLM alignment techniques can significantly improve RecSys performance.
- Investment in dev platforms compounds with AI agent adoption.
Method
Mult-DPO extends DPO to a multinomial formulation, learning that "all items in set S+ should outrank all items in S−" for recommender systems, directly handling set-wise preference data.
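To make the set-wise idea concrete, here is a minimal sketch of one plausible multinomial reading of it. It is an illustration, not Netflix's published formulation. It assumes the usual DPO implicit reward, beta * (log-prob under the policy minus log-prob under a frozen reference model), and scores a set by how much softmax mass lands on the preferred items. The function names, the beta default, and the exact loss form are all assumptions made for illustration.

```python
import math


def implicit_rewards(policy_logps, ref_logps, beta):
    """DPO-style implicit reward per item: beta * (log pi(i) - log pi_ref(i))."""
    return [beta * (p - r) for p, r in zip(policy_logps, ref_logps)]


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def setwise_dpo_loss(pos_policy, pos_ref, neg_policy, neg_ref, beta=0.1):
    """Hypothetical set-wise loss (an assumption, not the paper's exact loss).

    Returns -log of the softmax mass assigned to the preferred set S+ when
    the softmax runs over the implicit rewards of S+ and S- together.
    With one item in each set this equals the standard pairwise DPO loss,
    -log(sigmoid(r_pos - r_neg)).
    """
    pos_r = implicit_rewards(pos_policy, pos_ref, beta)
    neg_r = implicit_rewards(neg_policy, neg_ref, beta)
    return logsumexp(pos_r + neg_r) - logsumexp(pos_r)


# Toy usage: two preferred items and one rejected item, log-probs made up.
baseline = setwise_dpo_loss([-1.0, -1.0], [-1.0, -1.0], [-2.0], [-2.0])
improved = setwise_dpo_loss([-0.5, -0.8], [-1.0, -1.0], [-2.5], [-2.0])
```

In this toy run, `improved` comes from a policy that raises the preferred items and lowers the rejected one relative to the reference, so its loss is lower than `baseline`. A pairwise alternative (averaging the standard DPO loss over every S+/S− pair) would be another reasonable reading of "all items in S+ should outrank all items in S−".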
In practice
- Test Gemma 4 12B on consumer GPUs or laptops for multimodal tasks.
- Benchmark recommendation models against Netflix's Mult-DPO.
- Evaluate MiniMax M3 or MAI-Code-1-Flash for coding tasks.
Topics
- AI Infrastructure Investment
- Large Language Models
- Recommender Systems
- Multimodal AI
- Code Generation Models
- Direct Preference Optimization
Best for: Machine Learning Engineer, CTO, VP of Engineering/Data, Director of AI/ML, AI Engineer, Investor
Editorial summary, takeaway, and curation by AIssential. Original article published by MLWhiz: Recs|ML|GenAI.