Tensordyne Tapes Out LNS-Based AI Chip, Claims Huge Power Advantages

2026-06-15 · Source: Big Data & AI News - EE Times · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure, Emerging Technologies & Innovation · Depth: Advanced, medium

Summary

AI chip startup Tensordyne has taped out its data center inference chip, claiming an order-of-magnitude power efficiency improvement over leading GPUs. The company states its systems achieve 17x tokens per second per Watt and 13x tokens per second per rack compared to Nvidia GB300-based systems for the same workload. Built on TSMC 3 nm, the chip consumes 300 W, offers 2.1 PFLOPS (dense FP8) compute, and includes 144 GB HBM3e. Tensordyne's proprietary Pareto number system, based on the Logarithmic Number System (LNS) with dedicated hardware acceleration, underpins this advantage. Their 72-chip Napier server, air-cooled at 30 kW, holds 10 TB of HBM, sufficient for a 10T FP4 model. Full racks deliver 608 PFLOPS and 42 TB HBM at 120 kW. Development cloud access is planned by late 2026, with systems shipping by Q2 2027.
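
The capacity figures above can be sanity-checked with simple arithmetic. The sketch below uses only numbers quoted in this summary. It assumes decimal units (1 TB = 1000 GB) and exactly 4 bits per FP4 weight, ignoring scale factors, KV cache, and activations. The servers-per-rack count is inferred from the power figures and is not stated in the summary.

```python
# Figures quoted in the summary; nothing here is measured.
CHIPS_PER_SERVER = 72
HBM_PER_CHIP_GB = 144
SERVER_KW = 30
RACK_KW = 120

# Assumptions (not from the article): decimal units and 0.5 bytes per FP4
# weight, with no allowance for scale factors, KV cache, or activations.
server_hbm_tb = CHIPS_PER_SERVER * HBM_PER_CHIP_GB / 1000
fp4_weights_tb = 10e12 * 0.5 / 1e12  # 10T parameters at 4 bits each
headroom_tb = server_hbm_tb - fp4_weights_tb

# Inferred, not stated: the power figures imply about four servers per rack.
servers_per_rack = RACK_KW / SERVER_KW
rack_hbm_tb = servers_per_rack * server_hbm_tb

print(f"HBM per server:      {server_hbm_tb:.2f} TB")
print(f"10T FP4 weights:     {fp4_weights_tb:.2f} TB")
print(f"Headroom per server: {headroom_tb:.2f} TB")
print(f"HBM per rack:        {rack_hbm_tb:.2f} TB")
```

Under these assumptions, the weights of a 10T-parameter FP4 model take about half of one server's HBM, and the inferred rack total lands close to the quoted 42 TB.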

Key takeaway

For AI architects and MLOps engineers evaluating next-generation inference hardware, Tensordyne's LNS-based chip is a potential alternative to GPU-based systems. Its claimed 17x power efficiency and $11 per million tokens cost for large language models are vendor figures worth verifying, especially if your workloads involve 10T+ parameter models or agentic AI. Consider piloting the development cloud, planned by late 2026, to characterize performance on your own applications before systems ship by Q2 2027.
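
When characterizing a pilot run, it helps to note that tokens per second per watt is the same quantity as tokens per joule. The sketch below is one possible way to compute it and convert it to energy per million tokens. The function names are hypothetical, and the baseline run is a placeholder, not a measurement of any real system. The 17x factor is the vendor's claim, applied here only to show the arithmetic.

```python
def tokens_per_joule(tokens: int, seconds: float, avg_watts: float) -> float:
    """Tokens/s/W equals tokens per joule, since 1 W sustained for 1 s is 1 J."""
    if seconds <= 0 or avg_watts <= 0:
        raise ValueError("seconds and avg_watts must be positive")
    return tokens / (seconds * avg_watts)


def kwh_per_million_tokens(tok_per_joule: float) -> float:
    """Energy needed to generate one million tokens, in kWh (1 kWh = 3.6e6 J)."""
    return (1_000_000 / tok_per_joule) / 3.6e6


# Placeholder run (hypothetical): 2,000,000 tokens in 100 s at 5 kW average.
baseline = tokens_per_joule(2_000_000, 100.0, 5_000.0)

# Applying the claimed 17x ratio to that placeholder baseline.
claimed = 17 * baseline

print(f"baseline: {kwh_per_million_tokens(baseline):.4f} kWh per million tokens")
print(f"at 17x:   {kwh_per_million_tokens(claimed):.4f} kWh per million tokens")
```

A 17x gain in tokens per joule means energy per token falls to 1/17 of the baseline, roughly a 94% reduction, whatever the baseline turns out to be.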

Key insights

Tensordyne claims its LNS-based AI chip delivers significant power and cost savings for large-model inference by pairing a logarithmic number format with dedicated hardware.

Principles

Dedicated LNS hardware acceleration boosts efficiency.
Proprietary math systems can yield order-of-magnitude gains.
Cell-based NoC reduces tail latency in distributed systems.
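
To illustrate why a logarithmic number system can save hardware, here is a minimal sketch of textbook LNS arithmetic. It is not Tensordyne's Pareto format, whose details are not given here. The function names and the `FRAC_BITS` precision are assumptions chosen for illustration. Values are stored as a sign plus a fixed-point log2 of the magnitude, so multiplication reduces to integer addition, while addition needs a correction term that real designs typically approximate.

```python
import math

FRAC_BITS = 8  # hypothetical fixed-point precision; not a Tensordyne parameter
SCALE = 1 << FRAC_BITS


def lns_encode(x: float) -> tuple[int, int]:
    """Return (sign, fixed-point log2|x|). Zero has no logarithm, so it is rejected."""
    if x == 0:
        raise ValueError("zero needs a special code in LNS; omitted in this sketch")
    return (1 if x > 0 else -1), round(math.log2(abs(x)) * SCALE)


def lns_decode(v: tuple[int, int]) -> float:
    sign, log_fixed = v
    return sign * 2.0 ** (log_fixed / SCALE)


def lns_mul(a: tuple[int, int], b: tuple[int, int]) -> tuple[int, int]:
    # A linear-domain multiply becomes an integer add of the logs.
    return a[0] * b[0], a[1] + b[1]


def lns_add_same_sign(a: tuple[int, int], b: tuple[int, int]) -> tuple[int, int]:
    # log2(x + y) = log2(x) + log2(1 + 2**(log2(y) - log2(x))) for |x| >= |y|.
    # Hardware usually approximates the correction term; here it is computed directly.
    if a[0] != b[0]:
        raise ValueError("this sketch only adds values with the same sign")
    hi, lo = max(a[1], b[1]), min(a[1], b[1])
    correction = round(math.log2(1.0 + 2.0 ** ((lo - hi) / SCALE)) * SCALE)
    return a[0], hi + correction


a, b = lns_encode(3.0), lns_encode(5.0)
print(lns_decode(lns_mul(a, b)))            # close to 15, up to quantization error
print(lns_decode(lns_add_same_sign(a, b)))  # close to 8, up to quantization error
```

The trade-off is visible in the code: multiplies are cheap, but adds require the extra correction step, which is where dedicated hardware support matters.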

Method

Tensordyne's software stack handles all LNS conversions, so users never work with the proprietary number format directly. According to the company, AI agents can translate GPU-specific code from various frameworks to run on the platform.

In practice

Evaluate LNS-based hardware for 10T+ parameter model inference.
Consider cell-based NoC designs for low-latency distributed AI.
Utilize AI agents for framework-agnostic code translation.

Topics

AI Inference Chips
Logarithmic Number System
Tensordyne Napier
Data Center AI
Network-on-Chip
Large Language Models

Best for: Investor, CTO, VP of Engineering/Data, AI Hardware Engineer, AI Architect, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Big Data & AI News - EE Times.