SWE-MeM: Learning Adaptive Memory Management for Long-Horizon Coding Agents

Source: cs.SE updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Software Development & Engineering) · Depth: Expert, extended

Summary

SWE-MeM is a training framework that teaches long-horizon software engineering agents to manage their own memory. It targets two problems: limited context budgets and noisy interaction histories. The agent is given a flexible memory tool and decides proactively, on demand, when, what, and how to compress historical context, based on trajectory state, task progress, and remaining context. The framework combines three components: memory-management trajectory synthesis, two-stage curriculum fine-tuning, and Memory-aware GRPO, which jointly optimizes memory decisions and task-solving performance. On SWE-Bench Verified under a 32K context budget, SWE-MeM reaches resolve rates of 43.4% with Qwen3-4B-Instruct and 60.2% with Qwen3-Coder-30B-A3B. It consistently outperforms existing memory management baselines in both performance and token efficiency, indicating that it provides a cleaner working context and makes better use of task-relevant information.
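To make the idea of a memory tool concrete, here is a minimal sketch of what such an interface could look like. It is not the SWE-MeM implementation: the names `Turn`, `WorkingContext`, and `compress` are hypothetical, token counts are supplied by the caller rather than by a real tokenizer, and the summary text is assumed to be written by the agent itself. The agent controls "when" by choosing to call the tool, "what" by choosing the span, and "how" by choosing the summary.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    role: str
    text: str
    tokens: int  # supplied by the caller; no tokenizer is assumed here


@dataclass
class WorkingContext:
    budget: int  # total context budget in tokens, e.g. 32_000
    turns: list[Turn] = field(default_factory=list)

    def used(self) -> int:
        """Tokens currently occupied by the interaction history."""
        return sum(t.tokens for t in self.turns)

    def remaining(self) -> int:
        """Tokens left before the budget is exhausted."""
        return self.budget - self.used()

    def compress(self, start: int, end: int, summary: str, summary_tokens: int) -> int:
        """Replace turns[start:end] with one summary turn; return tokens freed.

        Hypothetical tool call: the agent picks the span and writes the summary.
        """
        if not (0 <= start < end <= len(self.turns)):
            raise ValueError("invalid span")
        freed = sum(t.tokens for t in self.turns[start:end]) - summary_tokens
        self.turns[start:end] = [Turn("memory", summary, summary_tokens)]
        return freed
```

With a budget of 1,000 tokens and three turns of 600, 300, and 50 tokens, compressing the first two turns into a 20-token summary frees 880 tokens and leaves 930 remaining. A static rule would trigger this at a fixed threshold; the point of the learned approach is that the agent chooses the moment and the span itself.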

Key takeaway

Machine learning engineers building long-horizon software engineering agents should consider adaptive memory management in the style of SWE-MeM. Agents that learn to proactively compress irrelevant context can improve both issue resolution rates and token efficiency. In practice, this means exposing a flexible memory tool and training with memory-aware reinforcement learning, so that the agent makes on-demand compression decisions instead of following static rules. The approach prevents context overflow and also gives the agent a cleaner, more focused working context for complex tasks.

Key insights

Under a fixed context budget, long-horizon coding agents that manage memory adaptively and proactively outperform existing memory management baselines in both task performance and token efficiency.

Method

SWE-MeM trains agents using synthesized memory-management trajectories, curriculum fine-tuning, and Memory-aware GRPO, which employs trajectory splitting and step-level credit assignment for joint optimization.
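The summary does not spell out how Memory-aware GRPO splits trajectories or assigns credit, so the following is only an illustrative sketch under stated assumptions. It uses the standard GRPO group-relative advantage (reward minus group mean, divided by group standard deviation), splits each rollout into segments that end at a compression step, and simply broadcasts the rollout's advantage to each segment. The function names, the `"compress"` marker, and the broadcast rule are all hypothetical and may differ from the paper's actual scheme.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Standard GRPO-style advantage: normalize rewards within one group of rollouts."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def split_at_compressions(steps: list[str], marker: str = "compress") -> list[list[str]]:
    """Split one rollout into segments, closing a segment after each compression step.

    Assumption: each segment is what the model saw as one contiguous context.
    """
    segments: list[list[str]] = []
    current: list[str] = []
    for step in steps:
        current.append(step)
        if step == marker:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


def segment_credit(rollouts: list[list[str]], rewards: list[float]):
    """Pair every segment of every rollout with that rollout's advantage.

    Assumption: plain broadcast; a real scheme may weight memory steps differently.
    """
    advantages = group_relative_advantages(rewards)
    return [
        [(segment, adv) for segment in split_at_compressions(steps)]
        for steps, adv in zip(rollouts, advantages)
    ]
```

For example, a rollout `["edit", "compress", "test", "submit"]` splits into `[["edit", "compress"], ["test", "submit"]]`, so both the compression decision and the later task-solving steps receive a training signal from the same final outcome. That is one simple way to read "joint optimization" of memory and task behavior.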

Best for: Research Scientist, AI Engineer, AI Scientist, Machine Learning Engineer, Robotics Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.