Perplexity, CoreWeave Deal Boosts Inferencing
Summary
CoreWeave and AI search vendor Perplexity have finalized a multiyear agreement, announced on March 4, 2026, for CoreWeave to host Perplexity's AI inference workloads. The deal highlights the market's growing emphasis on AI inference over training. Perplexity will migrate its upcoming AI inference workloads to CoreWeave Cloud, using Nvidia GB200 NVL72 clusters to power its AI model, Sonar, and its Search API ecosystem. Perplexity will also use CoreWeave Kubernetes Service (CKS) and W&B Models (from Weights & Biases) for model management and deployment. The partnership aims to scale Perplexity's AI search and inference capabilities, delivering fast, real-time responses for enterprise customers, while diversifying CoreWeave's customer base beyond its major contracts with Microsoft, OpenAI, and Meta.
Key takeaway
For CTOs and VPs of Engineering evaluating AI infrastructure, this deal underscores the strategic importance of specialized inference providers. Rather than relying solely on hyperscalers, your teams should consider dedicated AI cloud platforms like CoreWeave for high-volume, real-time inference workloads where performance and cost efficiency matter. This approach can secure high-performance infrastructure without building it in-house.
Key insights
The AI market is shifting focus from training to inference, driving specialized cloud partnerships for real-time AI applications.
Principles
- Inference is a continuous, high-volume workload.
- Purpose-built AI clouds can offer performance advantages.
Method
Perplexity will migrate AI inference workloads to CoreWeave Cloud, using Nvidia GB200 NVL72 clusters for compute, and CKS and W&B Models for deployment and management.
In practice
- Migrate AI inference to specialized cloud providers.
- Utilize Kubernetes services for AI workload management.
Topics
- AI Inference
- Cloud Computing
- NVIDIA GPUs
- AI Search
- Model Deployment
Best for: CTO, VP of Engineering/Data, AI Architect, AI Product Manager, Director of AI/ML, Investor
Editorial summary, takeaway, and curation by AIssential. Original article published by aibusiness.