MLWhiz Weekly Recsys/ML/GenAI Newsletter #9 - The week AI started its IPOs
Summary
The MLWhiz Weekly Recsys/ML/GenAI Newsletter #9 covers the AI industry's move toward public markets. Anthropic filed its S-1 on Sunday, targeting an October listing after closing a $65 billion Series H at a $965 billion valuation, with $47 billion in run-rate revenue. OpenAI is eyeing a September IPO at a valuation of over $1 trillion, and SpaceX is set to list on Nasdaq on June 12. Going public will force AI companies to disclose their actual cost structures, including what it costs to serve models like Claude.
The newsletter also covers new hardware and model releases: Nvidia's RTX Spark Superchip, Claude Opus 4.8 with dynamic workflows, and Liquid AI's LFM2.5-8B-A1B. On pricing, DeepSeek permanently cut its V4-Pro price by 75% to $0.87 per million tokens. Research coverage includes UniPinRec, which unifies retrieval and ranking at Pinterest scale, and single-stage sparse coding for efficient multi-vector retrieval. The issue closes with discussions of LLM evaluations at Spotify and of how LLMs could destroy value in software development.
Key takeaway
For AI/ML Directors evaluating build-vs-buy strategies, Anthropic's upcoming IPO filings will provide public financial data on Claude's actual cost structure. That transparency supports better-informed decisions about token pricing and operating expenses, so monitor the filings closely and adjust budget forecasts and resource allocation for large language model deployments accordingly. Also consider using LLM evaluations as a pre-experiment filter: screening output quality first saves A/B testing bandwidth for candidates that are worth a full deployment test.
Key insights
The AI industry is rapidly maturing, transitioning to public markets while new models and research advance core capabilities.
Principles
- Public market scrutiny will reveal true AI operational costs.
- LLM evaluations should filter quality before A/B testing business impact.
- Efficient indexing methods can significantly accelerate multi-vector retrieval.
Method
Pinterest's UniPinRec unifies generative retrieval and ranking using a shared transformer, Masked Action Modeling, blended training, and cross-stage KV cache sharing.
In practice
- Consider Nvidia RTX Spark for local inference on 120B-parameter LLMs.
- Evaluate Sparse Autoencoders to replace K-means in ColBERT indexing.
- Implement LLM evals as a pre-experiment filter for output quality.
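The pre-experiment filter in the last item can be sketched roughly as follows. This is a minimal illustration, not the workflow described in the newsletter or used at Spotify: the `Variant` class, the `filter_for_ab_test` function, the 0.7 threshold, and the `toy_judge` scorer are all hypothetical names and values chosen for the example. In a real setup, the judge would be an LLM-based or rubric-based evaluator that returns a quality score between 0 and 1.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Variant:
    """A candidate (prompt, model, or config) with outputs sampled offline."""
    name: str
    outputs: Sequence[str]


def mean_eval_score(variant: Variant, judge: Callable[[str], float]) -> float:
    """Average the judge's score over all sampled outputs of a variant."""
    scores = [judge(output) for output in variant.outputs]
    return sum(scores) / len(scores)


def filter_for_ab_test(
    variants: Sequence[Variant],
    judge: Callable[[str], float],
    min_score: float = 0.7,  # hypothetical quality bar, tune per use case
) -> List[Variant]:
    """Keep only variants whose mean eval score clears the quality bar.

    Variants with no sampled outputs are dropped, since there is
    nothing to evaluate.
    """
    return [
        v for v in variants
        if v.outputs and mean_eval_score(v, judge) >= min_score
    ]


def toy_judge(output: str) -> float:
    """Stand-in for an LLM-as-judge call: scores any non-blank output as 1.0."""
    return 1.0 if output.strip() else 0.0


candidates = [
    Variant("baseline_prompt", ["A clear answer.", "Another clear answer."]),
    Variant("aggressive_prompt", ["", "Partial answer.", "   "]),
    Variant("untested_prompt", []),
]

# Only variants that pass the offline quality check go on to A/B testing.
shortlisted = filter_for_ab_test(candidates, toy_judge)
print([v.name for v in shortlisted])
```

The point of this design is that the eval gate only checks output quality; business impact is still measured by the A/B test itself, which now runs on fewer and better candidates.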
Topics
- AI IPOs
- Large Language Models
- Recommendation Systems
- Machine Learning Operations
- Nvidia RTX Spark
- Claude Opus
Best for: Director of AI/ML, Machine Learning Engineer, Investor
Editorial summary, takeaway, and curation by AIssential. Original article published by MLWhiz: Recs|ML|GenAI.