MemRerank: Preference Memory for Personalized Product Reranking
Summary
MemRerank is a preference memory framework for personalized product reranking in LLM-based shopping agents. Rather than passing long, noisy purchase histories directly to the reranker, it distills them into concise, query-independent preference signals. A memory extractor trained with reinforcement learning (RL), supervised by downstream reranking performance, generates structured within-category and cross-category shopping preferences. On an end-to-end benchmark built around an LLM-based 1-in-5 selection task, MemRerank consistently outperformed baselines, with gains of up to +10.61 absolute accuracy points with the o4-mini reranker and +6.60 points with GPT-4.1-mini, especially when combined with "think tags" for explicit reasoning. The results support its use as a practical building block for agentic e-commerce personalization.
Key takeaway
AI engineers building agentic e-commerce recommender systems should prioritize explicit preference memory modules such as MemRerank. Feeding raw purchase histories directly to an LLM is often suboptimal; distill user preferences into concise, query-independent signals instead. Consider training the memory extractor with reinforcement learning, using downstream reranking accuracy as the direct optimization target, and use semi-structured prompts for better extraction quality. The reported gains from this approach reach +10.61 absolute accuracy points on the 1-in-5 reranking task.
Key insights
Distilling user purchase history into concise, query-independent preference memory significantly boosts LLM-based product reranking accuracy.
Principles
- RL training aligns the memory extractor with the downstream reranking task.
- Semi-structured, evidence-grounded prompts yield superior memory extraction.
- Explicit reasoning ("think tags") complements high-quality preference memory.
Method
MemRerank uses an LLM to extract structured within-category and cross-category preference memory from a user's purchase history. The extractor is then trained with Group Relative Policy Optimization (GRPO), using a reward function that combines format adherence with downstream 1-in-5 reranking accuracy.
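The reward described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the section tag names, the `format_weight` value, and the `rerank` callable are all hypothetical, and the summary does not state how the two reward terms are actually weighted. The group-relative advantage function follows the general GRPO idea of normalizing each sampled output's reward against its own group.

```python
import re
from statistics import mean, pstdev
from typing import Callable, List

# Hypothetical section tags: the summary only says the memory is structured
# into within-category and cross-category preferences.
SECTIONS = ("within_category", "cross_category")


def format_reward(memory: str) -> float:
    """Return 1.0 if each assumed section appears exactly once and is non-empty."""
    for name in SECTIONS:
        matches = re.findall(rf"<{name}>(.*?)</{name}>", memory, flags=re.DOTALL)
        if len(matches) != 1 or not matches[0].strip():
            return 0.0
    return 1.0


def total_reward(
    memory: str,
    rerank: Callable[[str], int],
    target_index: int,
    format_weight: float = 0.2,  # assumed weighting, not taken from the paper
) -> float:
    """Combine format adherence with 1-in-5 reranking correctness.

    `rerank` stands in for the downstream LLM reranker: it receives the
    extracted memory and returns the index of the candidate it selects.
    """
    fmt = format_reward(memory)
    acc = 1.0 if rerank(memory) == target_index else 0.0
    return format_weight * fmt + (1.0 - format_weight) * acc


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one sampled group (mean 0, roughly unit scale)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

In a training loop, several memories would be sampled for the same purchase history, each scored with `total_reward`, and the resulting advantages passed to the policy update. Memories that help the reranker pick the purchased item are reinforced relative to their siblings in the group.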
In practice
- Implement RL-trained memory extractors for personalized e-commerce agents.
- Design memory extraction prompts with semi-structured guidance and evidence grounding.
- Incorporate "think tags" in LLM reranking prompts to enhance performance.
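As one way to picture the last point, here is a minimal sketch of a 1-in-5 reranking prompt that places the preference memory ahead of the candidates and optionally asks for reasoning inside think tags. The prompt wording, the `<think>` and `<answer>` tag names, and both function names are assumptions for illustration; the summary does not give the paper's exact prompt format.

```python
import re
from typing import List, Optional


def build_rerank_prompt(
    memory: str, query: str, candidates: List[str], use_think: bool = True
) -> str:
    """Assemble a 1-in-5 selection prompt from memory, query, and candidates."""
    if len(candidates) != 5:
        raise ValueError("the 1-in-5 task expects exactly five candidates")
    lines = ["User preference memory:", memory, "", f"Query: {query}", "", "Candidates:"]
    lines += [f"{i}. {c}" for i, c in enumerate(candidates, start=1)]
    lines.append("")
    if use_think:
        # Assumed tag convention for explicit reasoning.
        lines.append(
            "Reason inside <think>...</think>, then give the chosen number "
            "inside <answer>...</answer>."
        )
    else:
        lines.append("Give the chosen number inside <answer>...</answer>.")
    return "\n".join(lines)


def parse_choice(response: str) -> Optional[int]:
    """Extract the selected candidate number, ignoring anything inside think tags."""
    visible = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL)
    match = re.search(r"<answer>\s*([1-5])\s*</answer>", visible)
    return int(match.group(1)) if match else None
```

Stripping the reasoning span before parsing keeps a tentative answer mentioned mid-reasoning from being mistaken for the final choice. A response with no parseable answer returns `None`, which an evaluation harness could count as incorrect.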
Topics
- Personalized Reranking
- LLM-based Recommender Systems
- Preference Memory
- Reinforcement Learning
- E-commerce AI Agents
- Prompt Engineering
Best for: Research Scientist, AI Product Manager, AI Scientist, Machine Learning Engineer, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.