This 11-person London startup wants to make AI 100x cheaper every year
Summary
Doubleword, an 11-person London-based startup, aims to reduce AI inference costs by 100x annually, positioning itself as a significant British AI success story. The company, founded by Meryem Arik, focuses on making AI more accessible and affordable, particularly for smaller businesses and sovereign compute initiatives. Doubleword offers AI inference at $7,000 per A100 GPU per year, significantly undercutting competitors like CoreWeave and Lambda, which charge $91,000. Their strategy involves optimizing hardware and software, including 8-bit quantization and custom kernel development, to run large AI models more efficiently on less powerful GPUs. Doubleword also emphasizes building a strong talent pipeline in the UK, collaborating with universities and addressing the shortage of skilled AI engineers.
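The pricing gap quoted above is easier to judge as a ratio. The following is a minimal sketch of that arithmetic, assuming the $91,000 figure is on the same per-A100, per-year basis as the $7,000 figure (the summary implies this but does not state it). The function names and the 8-GPU fleet size are hypothetical, chosen only for illustration.

```python
# Figures quoted in the summary (USD per A100 GPU per year).
DOUBLEWORD_PRICE = 7_000
COMPETITOR_PRICE = 91_000  # assumed to be on the same per-GPU, per-year basis


def price_ratio(competitor: float, challenger: float) -> float:
    """How many times more expensive the competitor price is."""
    return competitor / challenger


def annual_saving(competitor: float, challenger: float, gpus: int) -> float:
    """Yearly saving for a fleet of `gpus` GPUs (fleet size is hypothetical)."""
    return (competitor - challenger) * gpus


def cost_after_years(initial_cost: float, yearly_factor: float, years: int) -> float:
    """Cost remaining if it falls by `yearly_factor` every year, compounded."""
    return initial_cost / (yearly_factor ** years)


ratio = price_ratio(COMPETITOR_PRICE, DOUBLEWORD_PRICE)
saving = annual_saving(COMPETITOR_PRICE, DOUBLEWORD_PRICE, gpus=8)
two_year_cost = cost_after_years(1.0, yearly_factor=100, years=2)

print(f"Competitor price is {ratio:.0f}x the Doubleword price")
print(f"Saving on a hypothetical 8-GPU fleet: ${saving:,.0f} per year")
print(f"Relative cost after two years of 100x reductions: {two_year_cost}")
```

On these figures the quoted competitor price is 13 times the Doubleword price. The last function shows what the headline ambition would mean if sustained: two consecutive 100x reductions compound to a 10,000x reduction.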
Key takeaway
For Directors of AI/ML evaluating infrastructure costs, Doubleword's approach to AI inference offers an alternative to high-priced GPU providers. Review its pricing model and technical optimizations, such as 8-bit quantization, as a possible route to lower costs for your AI workloads, especially if you are budget-constrained or want to deploy models on more accessible hardware.
Key insights
- AI inference costs can be drastically reduced through hardware and software optimization.
Principles
- Cost-efficiency drives AI adoption.
- Talent development is crucial for AI ecosystems.
Method
Optimize AI inference by combining 8-bit quantization, custom kernel development, and efficient model deployment on less powerful, readily available GPUs.
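To make the quantization step concrete, here is a minimal sketch of symmetric per-tensor 8-bit quantization in plain Python. It illustrates the general technique only and is not Doubleword's implementation, which the source does not describe. The function names and the sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float weights from the 8-bit integers."""
    return [v * scale for v in quantized]


def weight_memory_bytes(num_params, bytes_per_param):
    """Memory needed for the weights alone at a given precision."""
    return num_params * bytes_per_param


weights = [0.5, -1.0, 0.25, 0.0]  # hypothetical sample weights
quantized, scale = quantize_int8(weights)
restored = dequantize_int8(quantized, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("quantized:", quantized)
print("max round-trip error:", max_error)

# Storing one byte per weight instead of two (16-bit floats) halves weight memory.
params = 7_000_000_000  # hypothetical model size
print("16-bit:", weight_memory_bytes(params, 2) / 1e9, "GB")
print("8-bit: ", weight_memory_bytes(params, 1) / 1e9, "GB")
```

Each weight is stored in one byte rather than two, which is why a quantized model can fit on a GPU with less memory. The trade-off is a rounding error of at most half the scale per weight. Production systems typically use finer-grained scales (per channel or per block) to keep that error small.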
In practice
- Utilize 8-bit quantization for model compression.
- Develop custom kernels for specific hardware.
- Focus on sovereign compute infrastructure.
Topics
- Doubleword
- AI Inference
- Cost Reduction
- Sovereign Compute
- Custom AI Silicon
Best for: CTO, VP of Engineering/Data, MLOps Engineer, Entrepreneur, Investor, Director of AI/ML
Editorial summary, takeaway, and curation by AIssential. Original article published by Sifted.