Memp: Exploring Agent Procedural Memory
Summary
$Mem^{p}$ is a task-agnostic framework designed to equip Large Language Model (LLM)-based agents with learnable, updatable, and lifelong procedural memory. It addresses the brittleness of existing procedural memory in agents by distilling past trajectories into both fine-grained, step-by-step instructions and higher-level, script-like abstractions. The framework systematically explores strategies for building, retrieving, and updating this memory, including dynamic regimens for continuous correction and deprecation. Empirical evaluations on the TravelPlanner and ALFWorld benchmarks demonstrate that as the memory repository is refined, agents achieve consistently higher success rates and greater efficiency on analogous tasks. Notably, procedural memory built from stronger models like GPT-4o can be transferred to weaker models such as Qwen2.5-14B-Instruct, yielding substantial performance gains, including a 5% increase in task completion rate and a 1.6-step reduction on TravelPlanner.
Key takeaway
For research scientists developing LLM-based agents, a dynamic procedural memory system like $Mem^{p}$ can improve agent robustness and efficiency. Build memory that combines abstract scripts with concrete trajectories, retrieve it with semantic-aware methods, and prioritize reflection-based updates so the agent keeps learning and adapting, especially on long-horizon, complex tasks.
Key insights
Procedural memory, dynamically built and updated, significantly enhances LLM agent performance and efficiency across diverse tasks.
Principles
- Procedural memory improves task accuracy and reduces execution steps.
- Abstracted scripts generalize better than raw trajectories.
- Procedural memory is transferable between models of varying strength.
Method
$Mem^{p}$ distills agent trajectories into fine-grained instructions and high-level scripts. It employs strategies for memory construction (trajectories, scripts, combined), retrieval (query, AveFact), and dynamic updating (vanilla, validation, adjustment/reflection).
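The build and retrieve stages described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `MemoryEntry` and `ProceduralMemory` are hypothetical, and a bag-of-words cosine similarity stands in for whatever embedding model a real system would use for query-based retrieval. AveFact retrieval is not covered.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class MemoryEntry:
    query: str             # task description, used as the retrieval key
    trajectory: List[str]  # concrete step-by-step actions from a past run
    script: str            # higher-level abstraction distilled from that run


def _bag_of_words(text: str) -> Counter:
    # Toy stand-in for an embedding: lowercase token counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ProceduralMemory:
    """Hypothetical store that keeps a trajectory and a script per task."""

    def __init__(self) -> None:
        self.entries: List[MemoryEntry] = []

    def build(self, query: str, trajectory: List[str], script: str) -> None:
        # "Combined" construction: keep both the concrete and abstract form.
        self.entries.append(MemoryEntry(query, list(trajectory), script))

    def retrieve(self, query: str, k: int = 1) -> List[MemoryEntry]:
        # Query-based retrieval: rank stored tasks by similarity to the new task.
        q = _bag_of_words(query)
        ranked = sorted(
            self.entries,
            key=lambda e: _cosine(q, _bag_of_words(e.query)),
            reverse=True,
        )
        return ranked[:k]


memory = ProceduralMemory()
memory.build(
    "heat an egg and put it on the table",
    ["go to fridge", "take egg", "heat egg in microwave", "put egg on table"],
    "Find the object, heat it, then place it at the target location.",
)
memory.build(
    "plan a three day trip to Denver",
    ["search flights", "search hotels", "draft itinerary"],
    "Gather transport and lodging options, then assemble a day-by-day plan.",
)

best = memory.retrieve("heat a potato and put it on the counter")[0]
print(best.script)
```

The retrieved entry exposes both its `script` and its `trajectory`, so an agent prompt could include either one or both.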
In practice
- Combine abstract guidelines with concrete trajectories for optimal memory.
- Use query-based or AveFact methods for precise memory retrieval.
- Implement reflection-based updates for continuous memory refinement.
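The summary names three update regimens without defining them, so the sketch below is only one plausible reading: vanilla appends every finished episode, validation keeps only successful ones, and adjustment also revises a retrieved memory when the run that used it failed. The `Episode` type, the function names, and `toy_revise` (a placeholder for an LLM reflection call) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Memory = Dict[str, List[str]]  # task query -> stored procedure (list of steps)


@dataclass
class Episode:
    query: str
    steps: List[str]
    success: bool
    retrieved_key: Optional[str] = None  # memory entry the agent relied on, if any


def vanilla_update(memory: Memory, episodes: List[Episode]) -> None:
    # Store every episode, successful or not.
    for ep in episodes:
        memory[ep.query] = list(ep.steps)


def validation_update(memory: Memory, episodes: List[Episode]) -> None:
    # Store only episodes that succeeded.
    for ep in episodes:
        if ep.success:
            memory[ep.query] = list(ep.steps)


def adjustment_update(
    memory: Memory,
    episodes: List[Episode],
    revise: Callable[[List[str], List[str]], List[str]],
) -> None:
    # Store successes; on failure, revise the memory entry that was retrieved.
    for ep in episodes:
        if ep.success:
            memory[ep.query] = list(ep.steps)
        elif ep.retrieved_key in memory:
            memory[ep.retrieved_key] = revise(memory[ep.retrieved_key], ep.steps)


def toy_revise(old_steps: List[str], failed_steps: List[str]) -> List[str]:
    # Placeholder for reflection: append a caution about the last failed step.
    if not failed_steps:
        return list(old_steps)
    return list(old_steps) + [f"avoid: {failed_steps[-1]}"]


episodes = [
    Episode("clean mug", ["take mug", "rinse mug"], success=True),
    Episode(
        "cool apple",
        ["take apple", "put apple in microwave"],
        success=False,
        retrieved_key="heat apple",
    ),
]

memory: Memory = {"heat apple": ["take apple", "put apple in microwave"]}
adjustment_update(memory, episodes, toy_revise)
print(memory["heat apple"])
```

Under this reading, only the adjustment regimen changes an existing entry in place, which is what lets outdated procedures be corrected rather than accumulating next to newer ones.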
Topics
- Procedural Memory
- LLM Agents
- Memory Management
- Trajectory Distillation
- Memory Transfer
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Robotics Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.