CoreMem: Riemannian Retrieval and Fisher-Guided Distillation for Long-Term Memory in Dialogue Agents
Summary
CoreMem is a resource-efficient edge-cloud memory architecture for personalized dialogue agents, aimed at the memory and compute bottlenecks of consumer-grade hardware such as edge devices with 8 GB of VRAM. Information geometry ties its two components together. The first, Riemannian retrieval, replaces isotropic cosine similarity with a locally adaptive Fisher-Rao metric and a Mahalanobis distance, accelerated to O(Ndr) via the Woodbury identity, so that "hub memories" are penalized. The second, Fisher-guided discrete token distillation (FDTD), is a hierarchical sentence-to-token compression mechanism that ranks content by sensitivity scores derived from Fisher information traces. Benchmarked on LOCOMO and LongMemEval-S, CoreMem reports accuracy gains including +4.51 pp in Open-domain and +4.17 pp in Temporal reasoning, while staying within its 8 GB VRAM budget.
Key takeaway
For machine learning engineers deploying personalized dialogue agents on resource-constrained edge devices, CoreMem offers a long-term memory design that fits within a strict 8 GB VRAM budget. The reported accuracy gains are +4.51 pp in Open-domain and +4.17 pp in Temporal reasoning. Consider its Riemannian retrieval and Fisher-guided distillation when memory and compute are the limiting factors.
Key insights
CoreMem uses information geometry for Riemannian retrieval and Fisher-guided distillation to enable long-term memory on 8 GB VRAM edge devices.
Principles
- Information geometry unifies memory architecture for resource efficiency.
- Fisher-Rao metric penalizes hubness in high-dimensional retrieval.
- Fisher information traces guide principled context compression.
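The third principle can be illustrated with a small sketch. It relies on one standard fact: the trace of a Fisher information matrix equals the expected squared norm of the log-likelihood gradient. Everything else here is an assumption made for illustration and is not taken from the original article: the function names `fisher_trace_scores` and `hierarchical_compress`, the idea that per-token gradients are already available as an array, and the two-stage "keep the best sentences, then the best tokens" rule.

```python
import numpy as np

def fisher_trace_scores(token_grads):
    """Empirical Fisher trace per token (hypothetical helper).

    token_grads: array of shape (num_samples, num_tokens, dim) holding
    gradients of a log-likelihood with respect to each token embedding.
    The trace of the Fisher matrix is E[||grad||^2], estimated here by
    averaging squared gradient norms over the samples.
    """
    return np.mean(np.sum(token_grads ** 2, axis=-1), axis=0)

def hierarchical_compress(sentences, token_scores, keep_sentences, keep_tokens):
    """Two-stage selection (an assumed rule, not the paper's exact one).

    sentences: list of non-empty token lists.
    token_scores: flat array of scores, one per token, in reading order.
    """
    # Split the flat score array back into one chunk per sentence.
    offsets = np.cumsum([0] + [len(s) for s in sentences])
    chunks = [token_scores[offsets[i]:offsets[i + 1]]
              for i in range(len(sentences))]

    # Sentence level: keep the sentences with the highest mean sensitivity.
    mean_scores = [-chunk.mean() for chunk in chunks]
    kept = np.argsort(mean_scores, kind="stable")[:keep_sentences]

    compressed = []
    for i in sorted(kept):  # restore original sentence order
        # Token level: keep the most sensitive tokens, in original order.
        top = np.argsort(-chunks[i], kind="stable")[:keep_tokens]
        compressed.append([sentences[i][j] for j in sorted(top)])
    return compressed
```

In use, `token_scores` would come from `fisher_trace_scores` applied to gradients collected from whatever model drives the agent; the budget parameters `keep_sentences` and `keep_tokens` are placeholders for a real compression budget.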
Method
CoreMem employs Riemannian retrieval with a Fisher-Rao metric and O(Ndr) Woodbury acceleration for real-time search. It also uses Fisher-guided discrete token distillation (FDTD) for hierarchical sentence-to-token compression.
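The retrieval step described above can be sketched as follows. This is a minimal illustration of a Woodbury-accelerated Mahalanobis distance, not the authors' implementation. It assumes the metric has a diagonal-plus-low-rank form, Sigma = diag(D) + U U^T, and reads N as the number of stored memories, d as the embedding dimension, and r as the rank of U, none of which the summary spells out. The function names are hypothetical.

```python
import numpy as np

def woodbury_mahalanobis_sq(query, memories, diag, low_rank):
    """Squared Mahalanobis distances from one query to N memories.

    Assumed metric: Sigma = diag(diag) + low_rank @ low_rank.T, with
    diag of shape (d,) and low_rank of shape (d, r). The Woodbury identity
    gives Sigma^{-1} = D^{-1} - D^{-1} U (I + U^T D^{-1} U)^{-1} U^T D^{-1},
    so no d x d matrix is ever formed or inverted.
    """
    diff = memories - query                       # (N, d)
    scaled = diff / diag                          # rows of diff @ D^{-1}
    base = np.einsum("nd,nd->n", diff, scaled)    # diff^T D^{-1} diff

    proj = scaled @ low_rank                      # (N, r): the O(N d r) step
    r = low_rank.shape[1]
    # Small r x r "capacitance" matrix I + U^T D^{-1} U.
    capacitance = np.eye(r) + low_rank.T @ (low_rank / diag[:, None])
    solved = np.linalg.solve(capacitance, proj.T).T   # (N, r)
    correction = np.einsum("nr,nr->n", proj, solved)
    return base - correction

def retrieve_top_k(query, memories, diag, low_rank, k=3):
    """Indices of the k memories closest to the query under the metric."""
    dist_sq = woodbury_mahalanobis_sq(query, memories, diag, low_rank)
    return np.argsort(dist_sq)[:k]
```

The dominant cost is the `scaled @ low_rank` product, which is where the O(Ndr) scaling comes from; the r x r solve is negligible when r is much smaller than d. How CoreMem estimates the diagonal and low-rank factors locally is not covered by this sketch.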
In practice
- Deploy long-term memory agents on 8 GB VRAM edge devices.
- Improve Open-domain and Temporal reasoning in dialogue agents.
Topics
- CoreMem
- Dialogue Agents
- Long-Term Memory
- Edge Devices
- Riemannian Retrieval
- Fisher Information
- Context Compression
Best for: AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.