A semantic memory layer for local AI agents — no vector DB, one file, runs on CPU
Summary
A new Python library, `SemanticMemory`, gives local AI agents persistent semantic memory without complex infrastructure. The single-file tool (198 lines) scans `.md`, `.txt`, and `.json` files, splits them into overlapping chunks, and encodes each chunk locally with the `all-MiniLM-L6-v2` model (22MB). It saves the index to a `.semantic_index.json` file and answers queries by ranking chunks by cosine similarity. Benchmarked on an M1 MacBook Air, it indexes 205 chunks in approximately 3.2 seconds and answers a query in about 85ms, with a 1.4MB index file. It suits setups with fewer than 10,000 memory chunks that run in a single process and need zero infrastructure, and it integrates with local LLM setups such as Ollama, LM Studio, and llama.cpp.
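The file-scanning step described above could look roughly like the following. This is a minimal sketch, not the library's code: the function name `find_memory_files`, the recursive walk, and the decision to skip the index file itself are all assumptions made for illustration.

```python
from pathlib import Path

# File types named in the summary above.
EXTENSIONS = {".md", ".txt", ".json"}

# The index file also ends in .json, so this sketch skips it (an assumption).
INDEX_NAME = ".semantic_index.json"


def find_memory_files(root: str) -> list[Path]:
    """Return every .md, .txt, and .json file under root, in sorted order."""
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file()
        and path.suffix.lower() in EXTENSIONS
        and path.name != INDEX_NAME
    )
```

Skipping the index file matters in a layout like this one, because otherwise a second indexing run would pick up the stored index as if it were a memory file.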
Key takeaway
For AI architects building local-first agents, `SemanticMemory` is a simple way to add persistent semantic memory without an external vector database. If your agent runs in a single process and manages fewer than 10,000 memory chunks, the library can replace heavier infrastructure and simplify development and deployment. Consider integrating it so your agent can retrieve relevant earlier context across sessions.
Key insights
A lightweight, local semantic memory solution for AI agents avoids complex vector databases for smaller datasets.
Principles
- Local-first design reduces infrastructure overhead.
- Semantic search can retrieve relevant passages that share no keywords with the query.
- Overlapping chunks preserve context that would otherwise be cut at chunk boundaries.
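To illustrate the overlap principle, here is a minimal sketch of word-based chunking. The source does not say whether the library splits on words or characters, nor what sizes it uses, so the function name `chunk_words` and the defaults of 200 words with a 40-word overlap are assumptions.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into chunks of `size` words, each sharing `overlap` words
    with the previous chunk. Sizes are illustrative assumptions."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")

    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

With `size=4` and `overlap=2`, the text `"a b c d e f g"` yields `"a b c d"`, `"c d e f"`, and `"e f g"`: every boundary word appears in two chunks, so a sentence that straddles a boundary is still seen whole in at least one of them.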
Method
The `SemanticMemory` library indexes text files by chunking, encoding with `all-MiniLM-L6-v2`, and storing embeddings in a local JSON file for cosine similarity-based querying.
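The storage and ranking steps can be sketched without any dependencies. The code below is an illustration under stated assumptions, not the library's implementation: the JSON layout (a list of `text` and `embedding` records) and the names `save_index`, `load_index`, and `search` are hypothetical, and plain Python lists stand in for the vectors that `all-MiniLM-L6-v2` would produce.

```python
import json
import math
from pathlib import Path


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def save_index(path: str, chunks: list[str], vectors: list[list[float]]) -> None:
    """Write chunks and their embeddings to a JSON file (layout is an assumption)."""
    records = [{"text": t, "embedding": list(v)} for t, v in zip(chunks, vectors)]
    Path(path).write_text(json.dumps(records), encoding="utf-8")


def load_index(path: str) -> list[dict]:
    """Read the records written by save_index."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def search(index: list[dict], query_vector: list[float], top_k: int = 3):
    """Return the top_k (score, text) pairs, highest cosine similarity first."""
    scored = [
        (cosine_similarity(query_vector, record["embedding"]), record["text"])
        for record in index
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# In a real setup the vectors would come from the model, for example:
#   from sentence_transformers import SentenceTransformer
#   vectors = SentenceTransformer("all-MiniLM-L6-v2").encode(chunks).tolist()
```

A linear scan like this compares the query against every stored vector, which is consistent with the guidance to stay under roughly 10,000 chunks; beyond that scale an approximate nearest-neighbor index is usually the better fit.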
In practice
- Use `SemanticMemory` when the agent runs in a single process and holds fewer than 10,000 local memory chunks.
- Integrate with Ollama by injecting query results into prompts.
- Install with `pip install sentence-transformers numpy`.
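Injecting query results into an Ollama prompt might look like the sketch below. It assumes the retrieved memories are already available as a list of strings, since the source does not show the library's query API. The helper names `build_prompt` and `ask_ollama`, the prompt wording, and the model name `llama3` are illustrative; the endpoint is Ollama's default local `/api/generate` route.

```python
import json
import urllib.request


def build_prompt(question: str, memories: list[str]) -> str:
    """Place retrieved memory chunks ahead of the user's question."""
    context = "\n".join(f"- {memory}" for memory in memories)
    return (
        "Use the following notes from memory if they are relevant.\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def ask_ollama(
    prompt: str,
    model: str = "llama3",  # assumption: any model already pulled locally
    url: str = "http://localhost:11434/api/generate",
) -> str:
    """Send the prompt to a locally running Ollama server and return its reply."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `ask_ollama(build_prompt(question, memories))` requires an Ollama server running on the same machine; `build_prompt` on its own has no such requirement and can be reused with LM Studio or llama.cpp front ends.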
Topics
- Local AI Agents
- Semantic Memory
- Sentence Embeddings
- CPU Inference
- Vector Search
Best for: AI Architect, NLP Engineer, Entrepreneur, AI Engineer, Machine Learning Engineer, Software Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning ML & Generative AI News.