Lumai Productizes Lens-Based Optical Computer
Summary
British startup Lumai is productizing its lens-based optical computer, the first such system to have run billion-parameter AI models. The technology computes in a 3D volume for massive parallelism and aims to accelerate the matrix-multiply operations in AI inference. Lumai claims its solution can deliver 50x the performance of current GPUs with a 90% reduction in power consumption, which would help address data center power limits. The system encodes input vectors onto laser light sources, passes the light through an electronic display that applies the weight multiplication, and combines the results with a final lens. The optical computation itself uses minimal energy, but converting between the electrical and optical domains and powering the lasers and detectors still consume power. Lumai's Iris Nova server, designed for hyperscale evaluation, currently runs Llama models. Future iterations, Iris Aura and Iris Tetra, are planned for multi-engine and cluster-scale deployments by 2029, targeting 100 TOPS/W (INT8) and 1 exaOPS within a 10 kW budget.
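The two 2029 targets in the summary can be checked against each other with simple arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative, and it assumes the whole 10 kW budget is available at the quoted efficiency.

```python
# Consistency check of the stated targets, using only figures from the summary.
TOPS_PER_WATT = 100        # target efficiency (INT8), tera-operations per second per watt
POWER_BUDGET_W = 10_000    # 10 kW power budget

# Assumption: the full power budget is spent at the quoted efficiency.
ops_per_second = TOPS_PER_WATT * 1e12 * POWER_BUDGET_W
exa_ops = ops_per_second / 1e18

print(f"{exa_ops:.1f} exaOPS")
```

Under that assumption, 100 TOPS/W across 10 kW works out to 1 exaOPS, so the two targets describe the same design point.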
Key takeaway
For CTOs and VPs of Engineering managing hyperscale data centers, Lumai's Iris Nova optical computer is a potential answer to the escalating power consumption and performance demands of AI inference. Its claimed 50x performance increase and 90% power reduction, particularly for compute-bound prefill tasks, could lower operating costs and raise throughput if the claims hold up in evaluation. Consider evaluating the Iris Nova server by late 2026 to assess its fit for your Llama-based workloads and future AI infrastructure scaling.
Key insights
Lumai's optical computer offers significant AI inference acceleration and power efficiency by performing matrix multiplications in the optical domain.
Principles
- 3D volume computation enables massive parallelism.
- Optical efficiency increases with matrix size.
- Disaggregation optimizes compute-bound prefill tasks.
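One way to see why efficiency could rise with matrix size is a toy count of operations per electrical/optical conversion. The sketch below is an illustrative model, not Lumai's: it assumes an n x n matrix-vector product in which the weights stay fixed on the display, so only the n inputs and n outputs cross the electrical/optical boundary. The function name is hypothetical.

```python
def macs_per_conversion(n):
    """Toy model: multiply-accumulates performed per domain conversion.

    Assumption (not from the source): weights are held on the display, so only
    n input values and n output values are converted per matrix-vector product.
    """
    macs = n * n           # multiply-accumulates done in the optical domain
    conversions = 2 * n    # n inputs to light, n results back to electrical
    return macs / conversions


for n in (64, 256, 1024):
    print(n, macs_per_conversion(n))
```

In this model the work done per conversion grows linearly with n, so the fixed conversion cost is spread over more computation as matrices get larger.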
Method
Input vectors are encoded into laser light, multiplied by weights via an electronic display, and summed by a lens. An orchestration layer offloads matrix multiplication from a CPU to the optical system.
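The three optical steps correspond to an ordinary matrix-vector product. The sketch below is a numerical analogy in plain Python, not a model of the hardware: the function name is hypothetical, the mapping of rows to detectors is an assumption, and physical details such as how negative values are represented in light are not covered by the source and are ignored here.

```python
def optical_matvec(weights, x):
    """Numerical analogy of the encode / multiply / sum pipeline."""
    # Step 1 (encode): each input element x[j] sets the intensity of one light source.
    # Step 2 (multiply): display element (i, j) scales beam j by weights[i][j].
    products = [[w_ij * x_j for w_ij, x_j in zip(row, x)] for row in weights]
    # Step 3 (sum): the lens gathers each row of products onto one detector.
    # Assumption: one detector per output element.
    return [sum(row) for row in products]


W = [[1, 2, 3],
     [4, 5, 6]]
x = [1, 0, 2]
y = optical_matvec(W, x)
print(y)  # [7, 16]
```

All of the element-wise products in step 2 exist at the same time, which is the parallelism the optical approach relies on; the loop order in this code is only an artifact of simulating it on a CPU.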
In practice
- Evaluate Iris Nova for Llama inference.
- Target compute-bound prefill in disaggregated data centers.
- Consider for agentic AI with long context lengths.
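The reason prefill is described as compute-bound can be illustrated with a rough arithmetic-intensity estimate for a single d x d weight matrix. This is a simplified sketch under stated assumptions (INT8 values at one byte each, every weight read once, no KV cache or attention traffic); the function name and the sizes used are hypothetical and not taken from the source.

```python
def ops_per_byte(tokens, d):
    """Rough arithmetic intensity of applying one d x d weight matrix to `tokens` vectors.

    Assumptions: INT8 operands (1 byte each), weights read once,
    activations read once and results written once.
    """
    ops = 2 * tokens * d * d                 # one multiply and one add per weight per token
    bytes_moved = d * d + 2 * tokens * d     # weights + input and output activations
    return ops / bytes_moved


d = 4096
decode = ops_per_byte(1, d)       # decode: one new token per step
prefill = ops_per_byte(2048, d)   # prefill: a 2048-token prompt processed together
print(f"decode:  {decode:.2f} ops/byte")
print(f"prefill: {prefill:.2f} ops/byte")
```

In this estimate prefill reuses each weight across every prompt token, so longer contexts push the workload further toward being limited by compute rather than memory traffic, which is the regime a matrix-multiply accelerator targets.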
Topics
- Lens-based Optical Computing
- AI Inference Acceleration
- Matrix Multiplication
- Power Efficiency
- Disaggregated Data Centers
Best for: Investor, CTO, VP of Engineering/Data, AI Hardware Engineer, AI Architect, MLOps Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Big Data & AI News - EE Times.