CoreMem: Riemannian Retrieval and Fisher-Guided Distillation for Long-Term Memory in Dialogue Agents

· Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, quick

Summary

CoreMem is a resource-efficient edge-cloud memory architecture for personalized dialogue agents. It targets the memory and compute bottlenecks of consumer-grade hardware, such as edge devices with 8 GB of VRAM. Both of its components are grounded in information geometry. Riemannian retrieval replaces isotropic cosine similarity with a locally adaptive Fisher-Rao metric and a Mahalanobis distance, accelerated to O(Ndr) with the Woodbury identity, in order to penalize "hub memories." Fisher-guided discrete token distillation (FDTD) is a hierarchical sentence-to-token compression mechanism that decides what to keep using sensitivity scores derived from Fisher information traces. On LOCOMO and LongMemEval-S, CoreMem reports accuracy gains that include +4.51 pp on Open-domain and +4.17 pp on Temporal reasoning, while staying within its 8 GB VRAM budget.

Key takeaway

For machine learning engineers deploying personalized dialogue agents on resource-constrained edge devices, CoreMem offers a long-term memory design that fits a strict 8 GB VRAM budget. Its reported gains are +4.51 pp on Open-domain and +4.17 pp on Temporal reasoning. Consider its Riemannian retrieval and Fisher-guided distillation when memory and compute limits rule out conventional retrieval and compression.

Key insights

CoreMem uses information geometry for Riemannian retrieval and Fisher-guided distillation to enable long-term memory on 8 GB VRAM edge devices.

Principles
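
The quantities named in the summary have standard textbook definitions, collected below as a reference sketch. The symbols (θ, q, m, Σ, U, λ, r) are notation chosen here, not taken from the paper, and the scaled-identity-plus-low-rank form of Σ is an assumption about how an O(Ndr) cost could arise rather than a statement of CoreMem's actual metric.

```latex
% Fisher information matrix of a model p(x | \theta), and its trace
F(\theta) = \mathbb{E}_{x \sim p(\cdot \mid \theta)}\!\left[
  \nabla_\theta \log p(x \mid \theta)\,
  \nabla_\theta \log p(x \mid \theta)^{\top} \right],
\qquad
\operatorname{tr} F(\theta) = \mathbb{E}_{x \sim p(\cdot \mid \theta)}\!\left[
  \lVert \nabla_\theta \log p(x \mid \theta) \rVert_2^{2} \right]

% Mahalanobis distance between a query q and a memory m
% under a symmetric positive-definite matrix \Sigma
d_{\Sigma}(q, m) = \sqrt{(q - m)^{\top}\, \Sigma^{-1}\, (q - m)}

% Woodbury identity for the assumed form
% \Sigma = \lambda I_d + U U^{\top}, with U \in \mathbb{R}^{d \times r}, \lambda > 0
\Sigma^{-1} = \lambda^{-1} I_d
  - \lambda^{-2}\, U \left( I_r + \lambda^{-1} U^{\top} U \right)^{-1} U^{\top}
```

With r much smaller than d, the last line replaces a d × d inverse with an r × r one, so scoring N memories costs on the order of N·d·r operations.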

Method

CoreMem employs Riemannian retrieval with a Fisher-Rao metric and O(Ndr) Woodbury acceleration for real-time search. It also uses Fisher-guided discrete token distillation (FDTD) for hierarchical sentence-to-token compression.
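
To make the O(Ndr) claim concrete, here is a minimal NumPy sketch of Woodbury-accelerated Mahalanobis scoring. It assumes a single global matrix of the form Σ = λI + UUᵀ with a d × r factor U, which is a simplification: the paper describes a locally adaptive metric and a hub penalty, neither of which is modeled here. The function name, its parameters, and the choice of Σ are illustrative assumptions, not CoreMem's API.

```python
import numpy as np


def woodbury_mahalanobis(query, memories, U, lam):
    """Squared Mahalanobis distances from `query` to each row of `memories`.

    Assumed metric (not from the paper): Sigma = lam * I + U @ U.T,
    with U of shape (d, r) and lam > 0. The d x d matrix is never formed.
    """
    diff = memories - query                    # (N, d)
    proj = diff @ U                            # (N, r): the O(N d r) step
    r = U.shape[1]
    M = np.eye(r) + (U.T @ U) / lam            # (r, r) Woodbury core
    solved = np.linalg.solve(M, proj.T).T      # (N, r), costs O(N r^2 + r^3)
    iso = np.einsum("nd,nd->n", diff, diff) / lam          # isotropic term
    corr = np.einsum("nr,nr->n", proj, solved) / lam ** 2  # low-rank correction
    return iso - corr


def top_k(query, memories, U, lam, k):
    """Indices of the k memories closest to `query` under the assumed metric."""
    dist = woodbury_mahalanobis(query, memories, U, lam)
    return np.argsort(dist)[:k]
```

When U has zero columns or all-zero entries, the correction term vanishes and the score reduces to a scaled squared Euclidean distance, which is a convenient sanity check.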

In practice
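
The Fisher-guided compression idea can be illustrated with a small NumPy sketch. It assumes that per-token score vectors (gradients of a log-likelihood with respect to each token's embedding) are already available, estimates each token's sensitivity as an empirical Fisher trace, and then selects sentences first and tokens second. The function names, the two-stage rule, and the use of mean scores per sentence are hypothetical choices made for illustration; the paper's FDTD procedure may differ.

```python
import numpy as np


def fisher_trace_scores(token_grads):
    """Empirical Fisher trace per token.

    token_grads: array of shape (S, T, d) holding S sampled score vectors
    for each of T tokens (an assumed input format). The trace of the
    empirical Fisher matrix equals the mean squared gradient norm.
    """
    token_grads = np.asarray(token_grads, dtype=float)
    return np.mean(np.sum(token_grads ** 2, axis=-1), axis=0)  # (T,)


def distill(tokens, sentence_ids, scores, keep_sentences, keep_tokens):
    """Hypothetical two-stage selection: sentences first, then tokens.

    Keeps the `keep_sentences` sentences with the highest mean score,
    then the `keep_tokens` highest-scoring tokens inside them,
    returned in their original order.
    """
    tokens = list(tokens)
    sentence_ids = np.asarray(sentence_ids)
    scores = np.asarray(scores, dtype=float)

    labels = np.unique(sentence_ids)
    sent_score = np.array([scores[sentence_ids == s].mean() for s in labels])
    top_sents = labels[np.argsort(-sent_score)[:keep_sentences]]

    candidates = np.flatnonzero(np.isin(sentence_ids, top_sents))
    best = candidates[np.argsort(-scores[candidates])[:keep_tokens]]
    return [tokens[i] for i in np.sort(best)]
```

For example, with the tokens of "I moved to Oslo . It is cold ." and scores that peak on "moved" and "Oslo", keeping one sentence and two tokens returns those two words in order. In a real system the scores would come from a model's gradients rather than being supplied by hand.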

Topics

Best for: AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.