ChatNeuroSim: An LLM Agent Framework for Automated Compute-in-Memory Accelerator Deployment and Optimization

2026-03-11 · Source: cs.MA updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, AI Hardware Design & Optimization, Emerging Technologies & Innovation · Depth: Expert, extended

Summary

ChatNeuroSim is a large language model (LLM)-based agent framework that automates the deployment and optimization of Compute-in-Memory (CIM) accelerators. Conventional CIM design flows require working through complex simulator manuals and many design-simulation iterations; ChatNeuroSim automates request parsing, parameter dependency checking, script generation, and simulation execution. The framework also integrates a CIM optimizer that uses design space pruning to speed up the search for optimal CIM configurations across deep neural network (DNN) workloads. Evaluated on 40 request-level testbenches with GPT-5.1, ChatNeuroSim achieved 100% accuracy in both script generation and simulation results. In a case study optimizing Swin Transformer Tiny under 22 nm technology, the CIM optimizer with design space pruning reduced average runtime by 0.42x–0.79x compared to a no-pruning baseline.
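
To make the parameter dependency checking and script generation steps concrete, here is a minimal sketch of how a parsed request might be validated before a simulation command is produced. Everything in it is an assumption for illustration: the parameter names (`memory_type`, `cell_bits`, `subarray_size`), the two rules, and the `run_simulation.py` script are placeholders, not options of any real simulator or part of ChatNeuroSim itself.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    """One dependency between parameters, with a human-readable description."""
    description: str
    holds: Callable[[dict], bool]


# Hypothetical rules; a real simulator manual would define its own.
RULES = [
    Rule(
        "cell_bits > 1 requires memory_type RRAM",
        lambda p: p["cell_bits"] == 1 or p["memory_type"] == "RRAM",
    ),
    Rule(
        "subarray_size must be a power of two",
        lambda p: p["subarray_size"] > 0
        and p["subarray_size"] & (p["subarray_size"] - 1) == 0,
    ),
]


def check_dependencies(params):
    """Return the descriptions of all rules the parameters violate."""
    return [rule.description for rule in RULES if not rule.holds(params)]


def render_script(params):
    """Build a command line for a placeholder script, refusing invalid input."""
    violations = check_dependencies(params)
    if violations:
        raise ValueError("; ".join(violations))
    flags = " ".join(f"--{key} {value}" for key, value in sorted(params.items()))
    return f"python run_simulation.py {flags}"


request = {"memory_type": "RRAM", "cell_bits": 2, "subarray_size": 128}
print(render_script(request))
```

Keeping the rules as data separate from the rendering step means a violated dependency can be reported back in plain language, which is the kind of feedback an agent would need in order to adjust the parameters and retry.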

Key takeaway

For researchers designing Compute-in-Memory (CIM) accelerators, ChatNeuroSim reduces manual effort and design cycle time by automating script generation, simulation, and design space exploration. It is worth evaluating for vision transformer workloads such as Swin Transformer Tiny (Swin-T), where the case study reported a 0.42x–0.79x reduction in average runtime from design space pruning relative to a no-pruning baseline. Faster identification of optimal hardware configurations leaves more time for higher-level architectural work.

Key insights

ChatNeuroSim automates CIM accelerator design and optimization using LLM agents and design space pruning for faster, more efficient exploration.

Principles

LLM agents can automate complex EDA workflows.
Design space pruning accelerates hardware optimization.
Transfer learning improves search efficiency across models.

Method

ChatNeuroSim employs three LLM agents (task parsing, parameter parsing, and adjustment) alongside a CIM optimizer that combines cross-space constraint projection, Top-K pruning, and stochastic de-pruning to automate design space exploration (DSE).
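
The pruning steps named above can be illustrated with a toy example. The sketch below is one plausible reading of Top-K pruning followed by stochastic de-pruning, not the paper's implementation: the parameter names (`subarray_size`, `adc_bits`, `cell_bits`), the `toy_cost` function standing in for simulator results on a base workload, and the per-value re-admission probability `p` are all assumptions, and cross-space constraint projection is omitted.

```python
import itertools
import random

# Hypothetical design space; names and values are illustrative only.
SPACE = {
    "subarray_size": [64, 128, 256],
    "adc_bits": [4, 5, 6, 7],
    "cell_bits": [1, 2],
}


def enumerate_configs(space):
    """Yield every configuration in the space as a dict."""
    names = list(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def toy_cost(config):
    """Stand-in for a simulator result on a base workload (lower is better)."""
    return (
        abs(config["subarray_size"] - 128) / 64
        + abs(config["adc_bits"] - 6)
        + 0.1 * config["cell_bits"]
    )


def top_k_prune(space, scored_configs, k):
    """Keep, per parameter, only values that occur in the k best configs."""
    best = sorted(scored_configs, key=lambda pair: pair[1])[:k]
    seen = {name: set() for name in space}
    for config, _cost in best:
        for name, value in config.items():
            seen[name].add(value)
    return {name: [v for v in space[name] if v in seen[name]] for name in space}


def stochastic_deprune(space, pruned, p, rng):
    """Re-admit each pruned value with probability p to limit over-pruning."""
    return {
        name: [v for v in values if v in pruned[name] or rng.random() < p]
        for name, values in space.items()
    }


scored = [(config, toy_cost(config)) for config in enumerate_configs(SPACE)]
pruned = top_k_prune(SPACE, scored, k=2)
relaxed = stochastic_deprune(SPACE, pruned, p=0.25, rng=random.Random(0))
print("pruned:", pruned)
print("relaxed:", relaxed)
```

Under these assumptions, a search on the target workload would run over `relaxed` rather than the full space. The de-pruning step trades a somewhat larger space for a chance of recovering values that the base workload ranked poorly but the target workload might prefer.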

In practice

Use ChatNeuroSim for automated CIM accelerator design.
Apply design space pruning for vision transformer optimization.
Choose a base model whose architecture is similar to the target workload when pruning.

Topics

Compute-in-Memory
LLM Agents
Design Space Exploration
Hardware Optimization
Vision Transformers

Best for: AI Scientist, Research Scientist, AI Engineer, AI Architect, AI Researcher

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.MA updates on arXiv.org.