MemRefine: LLM-Guided Compression for Long-Term Agent Memory
Summary
MemRefine is an LLM-guided framework that addresses the unbounded growth of memory stores in large language model (LLM) agents operating over long-term interactions. As agent memory accumulates, it fills with redundant entries, which raises storage costs and degrades retrieval, particularly on resource-constrained platforms. MemRefine frames this as a "storage-budgeted memory management" problem and uses similarity metrics only to propose candidate memory pairs. An LLM judge then decides whether to delete, merge, or preserve each pair based on its factual content, and the process iterates until a fixed memory budget is reached. MemRefine consistently meets target budgets across various memory frameworks and long-term conversation benchmarks, preserving downstream performance and outperforming rule-based baselines under tight budget constraints.
Key takeaway
For AI scientists and machine learning engineers building LLM agents with long-term memory, MemRefine offers a way to keep memory growth within a fixed budget on resource-constrained platforms. Consider an LLM-guided compression step when performance must hold under tight memory budgets: deferring delete and merge decisions to an LLM judge preserves critical factual information, rather than relying on less effective surface similarity metrics.
Key insights
MemRefine compresses agent memory by letting an LLM judge make delete, merge, or preserve decisions based on factual content, which outperforms decisions driven by surface similarity alone.
Principles
- Surface similarity poorly reflects factual value.
- LLM judges can make content-aware memory decisions.
- Iterative compression can meet fixed memory budgets.
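To make the first principle concrete, here is a minimal sketch using token-overlap (Jaccard) similarity as a stand-in surface metric. The summary does not say which similarity metric MemRefine uses, so the `jaccard` function and the example sentences are assumptions chosen purely for illustration.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard overlap of lowercased, whitespace-split tokens (an assumed surface metric)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


base = "Alice moved to Paris in 2021"

# Same wording, but the year differs: the two entries state conflicting facts.
conflicting = jaccard(base, "Alice moved to Paris in 2023")

# Different wording, but the same underlying fact.
paraphrase = jaccard(base, "In 2021 Alice relocated to France's capital")

print(f"conflicting facts: {conflicting:.2f}")
print(f"same fact, reworded: {paraphrase:.2f}")
```

Under this metric the conflicting pair scores higher than the paraphrase, so a rule that deletes or merges purely on a similarity threshold would risk discarding one of two distinct facts while keeping a true duplicate.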
Method
MemRefine proposes candidate memory pairs via similarity, then an LLM judge decides to delete, merge, or preserve based on factual content, iterating until the predefined memory budget is met.
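The loop described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the budget is taken to be a maximum number of entries, `jaccard` stands in for the unspecified similarity metric, `toy_judge` is a rule-based placeholder for the LLM call, and the early exit when every remaining pair has been preserved is a design choice made here so the sketch always terminates.

```python
from itertools import combinations
from typing import Callable, List, Optional, Tuple

# A judge returns ("delete", entry_to_drop), ("merge", merged_text), or ("preserve", None).
Decision = Tuple[str, Optional[str]]


def jaccard(a: str, b: str) -> float:
    """Assumed surface metric: Jaccard overlap of lowercased tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if (ta or tb) else 1.0


def compress(
    memories: List[str],
    budget: int,
    similarity: Callable[[str, str], float],
    judge: Callable[[str, str], Decision],
) -> List[str]:
    """Iteratively shrink `memories` toward `budget` entries (hypothetical budget unit)."""
    memories = list(memories)
    preserved = set()  # pairs the judge has already chosen to keep apart
    while len(memories) > budget:
        # Similarity is used only to propose the next candidate pair.
        candidates = [
            (similarity(a, b), a, b)
            for a, b in combinations(memories, 2)
            if frozenset((a, b)) not in preserved
        ]
        if not candidates:
            break  # assumption: stop if the judge preserved every remaining pair
        _, a, b = max(candidates, key=lambda c: c[0])
        action, payload = judge(a, b)
        if action == "delete":
            if payload not in (a, b):
                raise ValueError("judge must delete one of the two candidates")
            memories.remove(payload)
        elif action == "merge":
            memories[memories.index(a)] = payload  # merged text takes a's slot
            memories.remove(b)
        else:
            preserved.add(frozenset((a, b)))
    return memories


def toy_judge(a: str, b: str) -> Decision:
    """Placeholder for an LLM judge; a real one would prompt a model with both entries."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if ta == tb:
        return ("delete", b)  # exact duplicate up to case and word order
    if ta <= tb:
        return ("delete", a)  # b already states everything in a
    if tb <= ta:
        return ("delete", b)
    return ("preserve", None)  # entries may carry distinct facts


store = ["User likes tea", "User likes green tea", "User lives in Oslo", "user likes tea"]
result = compress(store, budget=2, similarity=jaccard, judge=toy_judge)
print(result)

# Conflicting facts are preserved, so this store stays above its budget of 1.
kept = compress(
    ["Alice moved to Paris in 2021", "Alice moved to Paris in 2023"],
    budget=1, similarity=jaccard, judge=toy_judge,
)
print(kept)
```

Passing the judge in as a callable keeps the loop independent of any particular model or API; swapping `toy_judge` for a function that queries an LLM would not change the control flow.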
In practice
- Implement LLM-guided memory compression for agents.
- Prioritize factual content over surface similarity in memory management.
- Apply iterative budget-constrained memory refinement.
Topics
- LLM Agents
- Memory Management
- Memory Compression
- Large Language Models
- Resource Constraints
- Factual Content Preservation
Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.