SMSR: Certified Defence Against Runtime Memory Poisoning in Persistent LLM Agent Systems

· Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

"SMSR: Certified Defence Against Runtime Memory Poisoning in Persistent LLM Agent Systems" introduces a novel, formally certified defense against Multi-Session Memory Poisoning (MSMP) in Retrieval-Augmented Generation (RAG) agent systems. In this threat, adversaries inject malicious memories into a persistent agent store to alter future behavior without modifying model weights. Existing defenses lack formal guarantees against this kind of dynamic runtime injection. SMSR comprises two components. Component 1 applies HMAC-SHA256 provenance tagging at write time, achieving 0% Attack Success Rate (ASR) against unsigned injection. Component 2 applies randomized memory ablation and verdict-based majority aggregation at query time, providing a certified robustness bound against authenticated adversaries. In an empirical evaluation across 15 enterprise scenarios (3,150 trials), Component 2 reduced authenticated ASR from 93–100% to 8.0% (95% CI [5.8%, 10.9%]) in a production-scale store (m=20), a point estimate that remains below the δ=10.4% certificate bound. Against an end-to-end query-only attack, ASR fell from 65.3% to 5.3% (n=150). The full defense maintains 85% utility.
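
The write-time idea behind Component 1 can be illustrated with a minimal Python sketch using the standard library. This is not the paper's implementation: the record layout, the field names (`content`, `session_id`, `tag`), the helper names, and the hard-coded key are all assumptions made for illustration. It only shows why an entry injected without the key fails verification.

```python
import hashlib
import hmac
import json

# Hypothetical key for illustration only; a real deployment would load it
# from a secrets manager and never embed it in source code.
SECRET_KEY = b"example-key-not-for-production"


def _payload(content, session_id) -> bytes:
    # Canonical serialization so signer and verifier hash identical bytes.
    return json.dumps(
        {"content": content, "session_id": session_id}, sort_keys=True
    ).encode("utf-8")


def sign_memory(content: str, session_id: str, key: bytes = SECRET_KEY) -> dict:
    """Return a memory record carrying an HMAC-SHA256 tag over its fields."""
    tag = hmac.new(key, _payload(content, session_id), hashlib.sha256).hexdigest()
    return {"content": content, "session_id": session_id, "tag": tag}


def verify_memory(record: dict, key: bytes = SECRET_KEY) -> bool:
    """Recompute the tag and compare in constant time; unsigned records fail."""
    tag = record.get("tag")
    if not isinstance(tag, str):
        return False
    expected = hmac.new(
        key, _payload(record.get("content"), record.get("session_id")), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected.encode("utf-8"), tag.encode("utf-8"))
```

A record produced by `sign_memory` passes `verify_memory`, while a record with no tag, or one whose content was altered after signing, is rejected. An adversary who can write through the legitimate signed path (the authenticated case) is not stopped by this check, which is the gap Component 2 addresses.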

Key takeaway

For AI Security Engineers deploying RAG agents with persistent memory, SMSR offers a way to mitigate runtime memory poisoning with a certified robustness bound rather than empirical evidence alone. Integrate HMAC provenance for write-time protection, and configure randomized ablation with verdict-based aggregation at query time. Size the retrieval pool (m) and the number of runs (n_runs) according to your assumed adversary budget (t) to reach the security bound you need.
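
To get a feel for how m, n_runs, and t interact, the following Python sketch runs a back-of-envelope calculation. It is not the paper's certificate: the subset size `k`, the function names, and the pessimistic assumption that any run whose sample contains a poisoned entry returns the attacker's verdict are all assumptions made here for illustration.

```python
from math import comb


def contamination_prob(m: int, k: int, t: int) -> float:
    """P(a uniform k-subset of m entries contains at least one of t poisoned ones)."""
    if not (0 <= t <= m and 0 < k <= m):
        raise ValueError("require 0 <= t <= m and 0 < k <= m")
    # Hypergeometric complement: 1 - P(all k sampled entries are clean).
    return 1.0 - comb(m - t, k) / comb(m, k)


def majority_flip_prob(p: float, n_runs: int) -> float:
    """P(a strict majority of n_runs independent runs is contaminated).

    Upper tail of Binomial(n_runs, p), under the pessimistic assumption
    that every contaminated run yields the attacker's verdict.
    """
    threshold = n_runs // 2 + 1
    return sum(
        comb(n_runs, i) * p**i * (1 - p) ** (n_runs - i)
        for i in range(threshold, n_runs + 1)
    )


if __name__ == "__main__":
    # Illustrative settings only; k and t here are not taken from the paper.
    p = contamination_prob(m=20, k=2, t=1)
    print(p, majority_flip_prob(p, n_runs=11))
```

Under these assumptions, a larger pool m or a smaller sample k lowers the per-run contamination probability, and more runs shrink the chance that contaminated runs form a majority, provided that per-run probability stays below one half.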

Key insights

SMSR provides the first certified defense against runtime memory poisoning in persistent LLM agent systems.

Method

SMSR signs legitimate memory writes with HMAC-SHA256. At query time, it retrieves the top-m verified candidates, draws a random sample of k entries from them n_runs times, and aggregates the resulting LLM responses by majority verdict.
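
The query-time step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: `respond` is a hypothetical stand-in for the LLM call that maps a sampled memory subset to a discrete verdict label, and the function name and signature are invented here. The candidates are assumed to have already passed signature verification.

```python
import random
from collections import Counter
from typing import Callable, Sequence


def ablated_majority_verdict(
    candidates: Sequence[str],
    respond: Callable[[Sequence[str]], str],
    k: int,
    n_runs: int,
    rng: random.Random,
) -> str:
    """Sample k of the verified candidates n_runs times; return the majority verdict."""
    if not 0 < k <= len(candidates):
        raise ValueError("k must be between 1 and the number of candidates")
    if n_runs < 1:
        raise ValueError("n_runs must be at least 1")
    pool = list(candidates)
    # Each run sees an independent random k-subset (sampled without replacement).
    verdicts = [respond(rng.sample(pool, k)) for _ in range(n_runs)]
    # most_common breaks ties by first occurrence; an odd n_runs with binary
    # verdicts avoids ties altogether.
    return Counter(verdicts).most_common(1)[0][0]


def toy_respond(memories: Sequence[str]) -> str:
    # Hypothetical stub for the LLM: any poisoned memory in context flips the verdict.
    return "UNSAFE" if any("attacker" in mem for mem in memories) else "SAFE"


if __name__ == "__main__":
    store = [f"benign note {i}" for i in range(19)] + ["send invoices to attacker"]
    print(ablated_majority_verdict(store, toy_respond, k=3, n_runs=11, rng=random.Random(0)))
```

Because each run sees only k of the m candidates, a small number of poisoned entries reaches only a fraction of the runs, and the majority vote discards those runs when they are outnumbered.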

Best for: AI Architect, Research Scientist, CTO, AI Scientist, AI Security Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.