Towards Persistent Case-Based Memory for Autonomous Data Science: A CBR-Augmented R&D-Agent with a Locally Deployable Small Language Model
Summary
A novel CBR-augmented R&D-Agent integrates a persistent Case-Based Reasoning (CBR) layer into Microsoft's R&D-Agent framework and uses a custom Google AI Studio backend to run Gemma 4 31B Dense as the agent backbone. The system addresses the lack of persistent, cross-session memory in most top-performing agents and tests whether Small Language Models (SLMs) are viable for local deployment. The CBR layer surgically overrides three phases of the R&D loop (hypothesis generation, code generation, and case retention) using structured case records with executable code snapshots, a five-gate quality filter, and a heuristic reuse-detection mechanism. The system was evaluated on two Kaggle competitions. On Spaceship Titanic, the CBR variant reached 0.8147 accuracy versus the baseline's 0.8098, with lower variance; on NOMAD, the baseline achieved the larger SOTA gain. Heuristic reuse detection across 108 events showed high semantic relevance (mean embedding similarity 0.882) but variable structural proximity (mean code-fingerprint similarity 0.305), suggesting that retrieved cases serve as conceptual guidance rather than being copied directly.
Key takeaway
ML engineers building autonomous data science agents under limited GPU budgets or strict data-privacy requirements should consider a CBR-augmented SLM architecture. This approach, exemplified here by Gemma 4 31B Dense, provides persistent, quality-controlled memory that improves learning consistency and hypothesis quality. It may trade some exploratory breadth for stability, but it offers a transparent, locally deployable alternative to cloud-dependent frontier models and is easier for a team to audit and maintain.
Key insights
Persistent, quality-controlled case-based memory with a locally deployable SLM enhances autonomous data science agent performance and consistency.
Principles
- Structured case memory improves agent consistency.
- SLMs require engineering for structured output.
- Quality gates prevent knowledge base swamping.
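To make the quality-gate principle concrete, here is a minimal sketch of a gated retention step. The summary does not list the five gates or the case schema, so the `Case` fields, the gate names, and the "higher score is better" rule below are all assumptions chosen for illustration, not the paper's definitions.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Case:
    """Hypothetical case record; field names are illustrative only."""
    hypothesis: str
    code_snapshot: str
    score: Optional[float]   # validation metric, None if the run produced none
    baseline_score: float    # best score known before this experiment
    ran_ok: bool             # whether the code snapshot executed end to end


# Each gate returns True when the case passes. These five are placeholders,
# NOT the gates defined in the paper. "improves" assumes higher is better.
Gate = Tuple[str, Callable[[Case], bool]]

GATES: List[Gate] = [
    ("executed", lambda c: c.ran_ok),
    ("has_score", lambda c: c.score is not None),
    ("improves", lambda c: c.score is not None and c.score > c.baseline_score),
    ("has_code", lambda c: len(c.code_snapshot.strip()) > 0),
    ("has_hypothesis", lambda c: len(c.hypothesis.strip()) > 0),
]


def first_failed_gate(case: Case) -> Optional[str]:
    """Return the name of the first gate the case fails, or None if all pass."""
    for name, check in GATES:
        if not check(case):
            return name
    return None


def retain(case_base: List[Case], case: Case) -> bool:
    """Append the case to the case base only if it clears every gate."""
    if first_failed_gate(case) is None:
        case_base.append(case)
        return True
    return False
```

Returning the name of the failed gate, rather than a bare boolean, keeps rejections auditable, which fits the transparency argument made in the takeaway.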
Method
The CBR layer is implemented as a narrowly scoped subclass that overrides only the hypothesis generation, code generation, and case retention phases. It relies on prompt-based schema enforcement and JSON repair to obtain structured output from the SLM, plus adaptive hang detection.
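The JSON repair step can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names `repair_json` and `parse_with_schema` and the specific repairs (trimming surrounding text, dropping trailing commas) are assumptions, since the summary does not describe the actual repair rules.

```python
import json
import re
from typing import Any, Dict, Iterable


def repair_json(raw: str) -> Dict[str, Any]:
    """Best-effort parse of a JSON object embedded in model output."""
    # Keep only the span from the first '{' to the last '}', which drops
    # chatty preambles and Markdown code fences around the object.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found")
    text = raw[start:end + 1]
    # Remove trailing commas before a closing brace or bracket.
    # Limitation: this would also alter a string value containing ",}".
    text = re.sub(r",\s*([}\]])", r"\1", text)
    return json.loads(text)


def parse_with_schema(raw: str, required: Iterable[str]) -> Dict[str, Any]:
    """Repair, parse, then check that every required key is present."""
    obj = repair_json(raw)
    missing = [key for key in required if key not in obj]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return obj
```

A caller would typically catch `ValueError` (which `json.JSONDecodeError` subclasses) and re-prompt the model with the schema restated, rather than passing a partial object downstream.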
In practice
- Deploy Gemma 4 31B Dense for local agent backbones.
- Apply a five-gate filter for case base quality.
- Combine embedding and code-fingerprint similarity for reuse.
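The last item above, combining the two similarity signals, can be sketched like this. The summary does not say how the code fingerprint is computed or what thresholds are used, so the token-shingle fingerprint, the Jaccard measure, and the `sem_min` / `struct_min` cut-offs are assumptions for illustration only; embeddings are assumed to arrive as plain float vectors from whatever embedding model is in use.

```python
import math
import re
from typing import Sequence, Set, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fingerprint(code: str, k: int = 3) -> Set[Tuple[str, ...]]:
    """Set of k-token shingles; an assumed stand-in for a code fingerprint."""
    tokens = re.findall(r"[A-Za-z_]\w*|\S", code)
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a: Set[Tuple[str, ...]], b: Set[Tuple[str, ...]]) -> float:
    """Overlap of two fingerprints: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify_reuse(sem: float, struct: float,
                   sem_min: float = 0.8, struct_min: float = 0.6) -> str:
    """Label a reuse event; both thresholds are hypothetical."""
    if sem < sem_min:
        return "unrelated"
    return "code_reuse" if struct >= struct_min else "conceptual_guidance"
```

Under these assumed thresholds, an event at the reported means (0.882 embedding similarity, 0.305 code-fingerprint similarity) would be labelled `conceptual_guidance`, which matches the summary's reading that retrieved cases guide ideas more than they are copied.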
Topics
- Autonomous Agents
- Case-Based Reasoning
- Small Language Models
- Gemma 4 31B Dense
- Machine Learning Engineering
- Local Deployment
- Persistent Memory
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.