TinyTPU: SystemVerilog systolic array compiled to WASM, running live in browser - RTL golden-verified against numpy [P]

Source: Machine Learning · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Software Development & Engineering) · Depth: Intermediate, quick

Summary

TinyTPU is a 4x4 weight-stationary systolic array written in SystemVerilog, compiled to WebAssembly, and paired with a step-by-step browser visualization. Users enter their own matrices and watch the hardware execute: weights load into the processing elements (PEs), matrix A streams in diagonally, partial sums accumulate, and results drain. The project has three levels: L1 isolates a single multiply-accumulate (MAC) cell, L2 runs a real matrix multiplication on the full 4x4 array, and L3 shows tiling for matrices larger than the hardware. The visualization reads its state directly from the compiled register-transfer level (RTL) design, which is golden-verified against numpy, so what appears on screen is the hardware's own behavior rather than a separate animation. The result is a concrete picture of how matrix multiplication maps to hardware and why TPUs are efficient.
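The dataflow described above can be approximated in a few lines of plain Python. This is a behavioral sketch, not the project's RTL: the function names (`systolic_matmul`, `reference_matmul`), the choice to move activations rightward and partial sums downward, and the exact cycle counts are all assumptions made for illustration. Row k of the array receives column k of A delayed by k cycles, which produces the diagonal wavefront.

```python
SIZE = 4  # the array is SIZE x SIZE; PE[k][j] permanently holds W[k][j]

def reference_matmul(A, W):
    """Plain triple-loop matrix multiply used as the golden model."""
    return [[sum(A[m][k] * W[k][j] for k in range(SIZE)) for j in range(SIZE)]
            for m in range(len(A))]

def systolic_matmul(A, W):
    """Cycle-by-cycle model of a weight-stationary array (illustrative only).

    A is M x SIZE and streams in from the left, W is SIZE x SIZE and stays put.
    """
    M = len(A)
    a_reg = [[0] * SIZE for _ in range(SIZE)]  # activation register in each PE
    p_reg = [[0] * SIZE for _ in range(SIZE)]  # partial-sum register in each PE
    C = [[0] * SIZE for _ in range(M)]

    # The last result, C[M-1][SIZE-1], leaves the array on cycle M + 2*SIZE - 3.
    for t in range(M + 2 * SIZE - 2):
        new_a = [[0] * SIZE for _ in range(SIZE)]
        new_p = [[0] * SIZE for _ in range(SIZE)]
        for k in range(SIZE):
            for j in range(SIZE):
                if j == 0:
                    m = t - k  # skew: row k starts k cycles late
                    a_in = A[m][k] if 0 <= m < M else 0
                else:
                    a_in = a_reg[k][j - 1]  # activation from the left neighbor
                p_in = p_reg[k - 1][j] if k > 0 else 0  # partial sum from above
                new_a[k][j] = a_in
                new_p[k][j] = p_in + a_in * W[k][j]  # the MAC operation
        a_reg, p_reg = new_a, new_p

        # Drain: the bottom row of column j now holds C[m][j] for m = t-(SIZE-1)-j.
        for j in range(SIZE):
            m = t - (SIZE - 1) - j
            if 0 <= m < M:
                C[m][j] = p_reg[SIZE - 1][j]
    return C

A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
W = [[2, 0, 1, 3], [1, 1, 0, 2], [0, 4, 1, 1], [3, 2, 2, 0]]
assert systolic_matmul(A, W) == reference_matmul(A, W)
```

Comparing the cycle-level model against a straightforward reference mirrors, in miniature, the golden-verification idea mentioned in the title; the real project checks its RTL against numpy.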

Key takeaway

For AI hardware engineers and machine learning engineers learning hardware acceleration, TinyTPU is a hands-on way to see how matrix multiplication maps to hardware and why TPUs are efficient. Work through the levels in order: L1 for a single MAC cell, L2 for full-array execution, and L3 for matrix tiling. Watching real RTL state change cycle by cycle makes concrete what papers usually describe only in diagrams.

Key insights

TinyTPU provides an interactive, browser-based visualization of a 4x4 SystemVerilog systolic array, clarifying hardware matrix multiplication.

Method

TinyTPU compiles the SystemVerilog RTL to WebAssembly and visualizes its execution in the browser at three levels: single MAC cell operation, full-array matrix multiplication, and tiling of matrices larger than the array.
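The tiling step can be sketched as follows, assuming zero-padding at the matrix edges and a loop order that keeps each weight tile loaded while all row tiles of A stream through it. The helper names (`hw_matmul_4x4`, `get_tile`, `tiled_matmul`) are hypothetical, and `hw_matmul_4x4` is a software stand-in for one pass through the 4x4 array; the project's actual tiling schedule may differ.

```python
TILE = 4  # edge length of the hardware array

def hw_matmul_4x4(a_tile, w_tile):
    """Software stand-in for one pass through the 4x4 array (hypothetical)."""
    return [[sum(a_tile[i][k] * w_tile[k][j] for k in range(TILE))
             for j in range(TILE)] for i in range(TILE)]

def get_tile(mat, r0, c0):
    """Cut a TILE x TILE block starting at (r0, c0), zero-padding past the edges."""
    rows, cols = len(mat), len(mat[0])
    return [[mat[r][c] if r < rows and c < cols else 0
             for c in range(c0, c0 + TILE)]
            for r in range(r0, r0 + TILE)]

def tiled_matmul(A, B):
    """Multiply A (M x K) by B (K x N) using only 4x4 block products."""
    M, K, N = len(A), len(B), len(B[0])
    C = [[0] * N for _ in range(M)]
    for j0 in range(0, N, TILE):
        for k0 in range(0, K, TILE):
            # Load one weight tile, then reuse it for every row tile of A.
            w_tile = get_tile(B, k0, j0)
            for i0 in range(0, M, TILE):
                part = hw_matmul_4x4(get_tile(A, i0, k0), w_tile)
                # Accumulate the partial product, skipping padded rows and columns.
                for i in range(min(TILE, M - i0)):
                    for j in range(min(TILE, N - j0)):
                        C[i0 + i][j0 + j] += part[i][j]
    return C
```

Accumulating over the `k0` loop is what lets a fixed 4x4 array handle an inner dimension larger than 4: each pass contributes one partial product to the same output block.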

Best for: AI Hardware Engineer, AI Student, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.