From Volume to Value: Preference-Aligned Memory Construction for On-Device RAG
Summary
The paper introduces EPIC (Efficient Preference-aligned Index Construction), a method for on-device Retrieval-Augmented Generation (RAG) in personal AI agents. Because devices have limited memory budgets, EPIC treats user preferences as a stable and compact form of personal context. It uses these preferences throughout the RAG pipeline: it retains only preference-relevant information from the raw data and steers retrieval toward preference-aligned context. Across four benchmarks covering conversations, debates, explanations, and recommendations, EPIC achieved a 2,404-fold reduction in indexing memory, a 20.17 percentage point improvement in preference-following accuracy, and 33.33 times lower retrieval latency compared to the best baseline. In an on-device experiment, EPIC kept its memory footprint under 1 MB with a latency of 29.35 ms/query during streaming updates.
Key takeaway
For NLP engineers developing on-device personal AI agents, EPIC shows that indexing around user preferences, rather than storing all raw data, can cut memory use and latency together. Against the best baseline, the paper reports a 2,404x reduction in indexing memory and 33.33x lower retrieval latency, along with a 20.17 percentage point gain in preference-following accuracy. Consider preference-aligned memory construction when you need private, responsive user experiences within tight hardware constraints.
Key insights
EPIC optimizes on-device RAG by integrating user preferences to reduce memory and improve retrieval accuracy.
Principles
- User preferences are compact and stable context.
- Align retrieval with user preferences.
Method
EPIC retains only the preference-relevant parts of the raw data at indexing time and steers retrieval toward preference-aligned contexts at query time, so user preferences shape every stage of the RAG pipeline for efficient on-device personal AI.
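The two steps described above (filter at indexing time, then bias retrieval toward preferences) can be illustrated with a toy sketch. This is not the paper's implementation: the bag-of-words `embed` function stands in for a real embedding model, and the names `build_index`, `retrieve`, `keep_threshold`, and `pref_weight` are hypothetical choices made only for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks, preferences, keep_threshold=0.2):
    """Keep only chunks similar enough to at least one stated preference.

    keep_threshold is a hypothetical tuning knob, not a value from the paper.
    """
    pref_vecs = [embed(p) for p in preferences]
    index = []
    for chunk in chunks:
        vec = embed(chunk)
        relevance = max((cosine(vec, p) for p in pref_vecs), default=0.0)
        if relevance >= keep_threshold:
            index.append((chunk, vec, relevance))
    return index


def retrieve(index, query, top_k=1, pref_weight=0.5):
    """Rank retained chunks by query similarity blended with preference relevance."""
    q = embed(query)
    scored = [
        ((1 - pref_weight) * cosine(q, vec) + pref_weight * rel, chunk)
        for chunk, vec, rel in index
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


preferences = ["vegetarian food"]
chunks = [
    "I only eat vegetarian food at restaurants",
    "The train to the airport was delayed",
    "My sister recommended a vegetarian ramen place",
]
index = build_index(chunks, preferences)
print(len(index), "of", len(chunks), "chunks retained")
print(retrieve(index, "ramen place"))
```

In this toy setup the chunk about the delayed train never enters the index, which is where the memory saving would come from. The retained chunks are then ranked by a score that mixes query similarity with preference relevance.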
In practice
- Reduce RAG memory footprint on edge devices.
- Improve personal AI agent responsiveness.
- Enhance preference-following accuracy.
Topics
- On-Device AI
- Personal AI Agents
- Retrieval-Augmented Generation
- Preference Alignment
- EPIC Index Construction
Best for: NLP Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.