Nvidia competitor Etched hits $5B valuation, $1B in sales for AI chip

· Source: AI News & Artificial Intelligence | TechCrunch · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Expert, extended

Summary

Etched, an Nvidia competitor in AI chips founded in 2022, has reached a $5 billion post-money valuation after closing an unannounced $500 million funding round in December, bringing its total raised to $800 million. The startup has also secured $1 billion in contract orders for its "frontier inference clusters": full systems comprising custom chips, racks, and software designed for faster, cheaper, and more power-efficient AI inference. Etched's chips, which TSMC has successfully manufactured, target AI inference, a significant bottleneck and cost center. The company's technical approach has two parts: low-voltage inference, running at under half the voltage of GPUs to avoid thermal throttling, and cluster-scale memory, which uses custom interconnects to dramatically reduce chip-to-chip latency and increase memory bandwidth for decode operations. This vertical integration, from chip design through production, enables rapid development and deployment.

Key takeaway

For Directors of AI/ML evaluating future infrastructure, Etched's emergence signals a critical shift towards specialized inference hardware. You should assess whether your current general-purpose GPU clusters are creating bottlenecks and excessive costs for large-scale AI model deployment. Consider exploring vertically integrated, purpose-built inference solutions like Etched's that promise order-of-magnitude improvements in concurrency and tokens per watt, potentially enabling new, faster, and more accessible AI applications.

Key insights

Etched optimizes its chip design specifically for inference, combining low-voltage operation with cluster-scale memory to raise throughput.

Method

Etched runs inference at low voltage to prevent thermal throttling, and uses cluster-scale memory with custom interconnects to reduce chip-to-chip latency for high-bandwidth decode operations. The design also disaggregates prefill from decode, so the two phases of inference can be served separately.
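As a rough illustration of why these two levers matter, the sketch below uses two textbook approximations that are not taken from the article: CMOS dynamic power scaling roughly as V² × f, and decode throughput being bounded by memory bandwidth divided by the bytes read per generated token. The function names and all numeric inputs are hypothetical and are not Etched or GPU specifications.

```python
def dynamic_power_ratio(voltage_ratio: float, frequency_ratio: float = 1.0) -> float:
    """Relative CMOS dynamic power under the approximation P ~ C * V^2 * f.

    Assumes switched capacitance is unchanged; real chips differ in process,
    leakage, and achievable frequency, so treat this as a first-order estimate.
    """
    return voltage_ratio ** 2 * frequency_ratio


def decode_tokens_per_second(bandwidth_bytes_per_s: float,
                             bytes_read_per_token: float) -> float:
    """Upper bound on decode rate when memory bandwidth is the limiting factor."""
    return bandwidth_bytes_per_s / bytes_read_per_token


if __name__ == "__main__":
    # Half the supply voltage at the same frequency (hypothetical comparison).
    print("relative dynamic power:", dynamic_power_ratio(0.5))

    # Hypothetical figures: 2 TB/s of bandwidth, 140 GB read per token.
    print("decode tokens/s bound:", decode_tokens_per_second(2e12, 1.4e11))
```

Under these assumptions, halving the voltage at a fixed frequency cuts dynamic power to about a quarter, which is one way to read the thermal-throttling claim. The second function shows why added memory bandwidth translates directly into decode speed when decode is bandwidth-bound.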

Best for: CTO, VP of Engineering/Data, AI Architect, AI Hardware Engineer, Director of AI/ML, Investor

Editorial summary, takeaway, and curation by AIssential. Original article published by AI News & Artificial Intelligence | TechCrunch.