NAND Reimagined in High-Bandwidth Flash to Complement HBM
Summary
High-bandwidth flash (HBF) is an emerging memory solution designed to significantly boost capacity and performance for AI inference workloads by vertically stacking multiple layers of NAND dies. Unlike high-bandwidth memory (HBM), which stacks DRAM for bandwidth, HBF stacks 3D NAND arrays to enhance parallel I/O. SK Hynix recently presented a hybrid memory architecture combining eight HBM3E stacks with eight HBF stacks alongside an Nvidia Blackwell (B200) GPU, demonstrating a 2.69× improvement in performance per watt over HBM-only setups. HBF leverages HBM's vertical stacking concept and a parallel sub-array architecture for high-density integration and independent read/write channels, making it ideal for read-intensive inference tasks. SK Hynix and Sandisk anticipate trial versions and samples in 2026, with commercial products expected by early 2027.
Key takeaway
For machine learning engineers designing AI inference systems, HBF offers a way around the memory wall by providing significantly higher capacity at lower cost than HBM. Consider integrating HBF into hybrid memory architectures, especially for read-intensive edge AI applications or as a capacity extension for HBM in data centers, to improve performance per watt and store terabytes of model data.
Key insights
HBF stacks NAND dies for high-capacity, high-throughput memory, complementing HBM in AI inference architectures.
Principles
- HBF excels in read-intensive AI inference.
- Hybrid HBM-HBF architectures improve performance per watt.
- Vertical stacking enhances memory density and bandwidth.
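A performance-per-watt comparison such as the 2.69× figure quoted in the summary is a ratio of two ratios: throughput divided by power for the new configuration, over the same quantity for the baseline. The sketch below shows only that arithmetic. The function names and the input values are placeholders chosen for illustration, not measurements from SK Hynix or the original article.

```python
def perf_per_watt(throughput: float, power_watts: float) -> float:
    """Return work done per watt (for example, tokens/s per watt)."""
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    return throughput / power_watts


def perf_per_watt_gain(base_tput: float, base_power: float,
                       new_tput: float, new_power: float) -> float:
    """Return how many times better the new setup is than the baseline."""
    return perf_per_watt(new_tput, new_power) / perf_per_watt(base_tput, base_power)


# Placeholder inputs (assumed, not measured): an HBM-only baseline and a
# hybrid HBM+HBF configuration with higher throughput and lower power.
gain = perf_per_watt_gain(base_tput=1000.0, base_power=1000.0,
                          new_tput=1500.0, new_power=750.0)
print(f"{gain:.2f}x")
```

Because the metric divides by power, a gain can come from higher throughput, lower power draw, or both, so a headline ratio alone does not say which factor moved.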
Method
HBF combines 3D NAND flash with advanced packaging and interconnection, using a parallel sub-array architecture for independent read/write channels to achieve high throughput.
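To see why independent channels matter for throughput, the following minimal sketch models each HBF stack as a set of channels that can be read in parallel. It is an idealized upper bound that assumes data is striped evenly and channels never contend. The class name, channel counts, and per-channel bandwidth are hypothetical values for illustration, not HBF specifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlashStack:
    """Hypothetical model of one stack with independent read channels."""
    channels: int
    channel_gb_per_s: float  # assumed per-channel read bandwidth

    def read_bandwidth(self) -> float:
        # Independent channels add up when they are all kept busy.
        return self.channels * self.channel_gb_per_s


def stream_time_seconds(model_gb: float, stacks: list[FlashStack]) -> float:
    """Ideal time to read model_gb once, striped across all stacks."""
    total = sum(s.read_bandwidth() for s in stacks)
    if total <= 0:
        raise ValueError("total read bandwidth must be positive")
    return model_gb / total


# Assumed numbers: 8 stacks, 16 channels each, 4 GB/s per channel.
stacks = [FlashStack(channels=16, channel_gb_per_s=4.0) for _ in range(8)]
print(stream_time_seconds(1024.0, stacks))
```

In this model, doubling the number of channels halves the time to stream a model, which is the intuition behind using a parallel sub-array design for read-heavy inference.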
In practice
- Use HBF for large AI model storage.
- Deploy HBF in edge AI for pre-trained models.
- Pair HBF with HBM for capacity extension in data centers.
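One way to reason about pairing the two tiers is a simple placement rule: keep frequently written data (such as a KV cache) in HBM, and put read-mostly data (such as model weights) in HBF, since NAND suits read-intensive access. The sketch below is an assumption-laden illustration of that rule, not a vendor algorithm. The `Buffer` type, the `place` function, and all capacities are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Buffer:
    name: str
    size_gb: float
    write_heavy: bool  # True for data rewritten often during inference


def place(buffers: list[Buffer], hbm_gb: float, hbf_gb: float) -> dict[str, str]:
    """Assign each buffer to 'HBM' or 'HBF' with a greedy read/write rule."""
    placement: dict[str, str] = {}
    hbm_free, hbf_free = hbm_gb, hbf_gb
    # Handle write-heavy buffers first so they claim HBM before anything else.
    for buf in sorted(buffers, key=lambda b: not b.write_heavy):
        if buf.write_heavy:
            if buf.size_gb > hbm_free:
                raise MemoryError(f"{buf.name} does not fit in HBM")
            hbm_free -= buf.size_gb
            placement[buf.name] = "HBM"
        elif buf.size_gb <= hbf_free:
            hbf_free -= buf.size_gb
            placement[buf.name] = "HBF"
        elif buf.size_gb <= hbm_free:
            # Read-mostly data falls back to HBM only if HBF is full.
            hbm_free -= buf.size_gb
            placement[buf.name] = "HBM"
        else:
            raise MemoryError(f"{buf.name} does not fit in either tier")
    return placement


# Assumed sizes and capacities, chosen only to exercise the rule.
buffers = [
    Buffer("weights", 800.0, write_heavy=False),
    Buffer("kv_cache", 100.0, write_heavy=True),
    Buffer("activations", 20.0, write_heavy=True),
]
print(place(buffers, hbm_gb=192.0, hbf_gb=4096.0))
```

A real system would also weigh access latency and bandwidth per tier. This rule captures only the capacity-extension idea described above.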
Topics
- High-bandwidth Flash
- AI Inference
- Hybrid Memory Architectures
- 3D NAND
- Memory Capacity
Best for: Machine Learning Engineer, NLP Engineer, Computer Vision Engineer, AI Architect, AI Engineer, MLOps Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Big Data & AI News - EE Times.