Nvidia competitor Etched hits $5B valuation, $1B in sales for AI chip
Summary
Nvidia AI chip competitor Etched, founded in 2022, has reached a $5 billion post-money valuation after closing an unannounced $500 million funding round in December, bringing its total raised to $800 million. The startup has also secured $1 billion in contract orders for its "frontier inference clusters": full systems comprising custom chips, racks, and software designed for faster, cheaper, and more power-efficient AI inference. Etched's chips, which TSMC has successfully manufactured, target AI inference, a significant bottleneck and cost center. The company's technical approach has two parts: low-voltage inference, which runs at under half the voltage of GPUs to avoid thermal throttling, and cluster-scale memory, which uses custom interconnects to dramatically reduce chip-to-chip latency and increase memory bandwidth for decode operations. Vertical integration, from chip design to production, enables rapid development and deployment.
Key takeaway
For Directors of AI/ML evaluating future infrastructure, Etched's emergence signals a critical shift towards specialized inference hardware. You should assess whether your current general-purpose GPU clusters are creating bottlenecks and excessive costs for large-scale AI model deployment. Consider exploring vertically integrated, purpose-built inference solutions like Etched's that promise order-of-magnitude improvements in concurrency and tokens per watt, potentially enabling new, faster, and more accessible AI applications.
Key insights
Etched redefines AI chip design by optimizing for inference through low voltage and cluster-scale memory, enabling massive throughput.
Principles
- Specialized design beats general-purpose for specific use cases.
- Vertical integration accelerates product development and performance.
- Extreme urgency and risk-taking drive market velocity.
Method
Etched runs inference at low voltage to prevent thermal throttling and uses cluster-scale memory with custom interconnects to reduce chip-to-chip latency for high-bandwidth decode operations. Prefill and decode are disaggregated.
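To make the reasoning behind these two ideas concrete, here is a minimal back-of-the-envelope sketch. It does not use any figures or models from Etched. It assumes the textbook first-order approximation that CMOS dynamic power scales with voltage squared times clock frequency, and the common rule of thumb that decode throughput is capped by how many bytes must be read from memory per generated token. The function names and the example numbers are hypothetical, chosen only for illustration.

```python
def relative_dynamic_power(voltage_ratio: float, frequency_ratio: float = 1.0) -> float:
    """First-order CMOS dynamic power scaling, P ~ C * V^2 * f.

    Returns power relative to a baseline chip, assuming equal switched
    capacitance. Leakage power and voltage-dependent frequency limits are
    ignored, so this is an idealized estimate only.
    """
    if voltage_ratio <= 0 or frequency_ratio <= 0:
        raise ValueError("ratios must be positive")
    return voltage_ratio ** 2 * frequency_ratio


def decode_tokens_per_second_bound(bandwidth_bytes_per_s: float,
                                   bytes_read_per_token: float) -> float:
    """Upper bound on single-stream decode rate when memory-bandwidth bound.

    Assumes every generated token requires reading bytes_read_per_token
    from memory (for example, the model weights), with perfect overlap.
    """
    if bandwidth_bytes_per_s <= 0 or bytes_read_per_token <= 0:
        raise ValueError("inputs must be positive")
    return bandwidth_bytes_per_s / bytes_read_per_token


if __name__ == "__main__":
    # Hypothetical: half the voltage at the same clock frequency.
    print(relative_dynamic_power(0.5))  # 0.25

    # Hypothetical: 1 TB/s of bandwidth, 100 GB read per token.
    print(decode_tokens_per_second_bound(1e12, 1e11))  # 10.0
```

Under these idealized assumptions, halving the voltage at a fixed clock cuts dynamic power to about a quarter, which is why lower voltage eases thermal limits. Raising aggregate memory bandwidth raises the decode ceiling proportionally. Real chips deviate from both approximations.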
In practice
- Design for specific thermal constraints, not generic worst-case scenarios.
- Pre-fetch and parallelize development cycles to reduce time-to-market.
- Build a bimodal team that combines "legends" with "raw talent."
Topics
- AI Chips
- AI Inference
- Low Voltage Inference
- Cluster Scale Memory
- Vertical Integration
- Semiconductor Manufacturing
- TSMC
Best for: CTO, VP of Engineering/Data, AI Architect, AI Hardware Engineer, Director of AI/ML, Investor
Editorial summary, takeaway, and curation by AIssential. Original article published by AI News & Artificial Intelligence | TechCrunch.