SMSR: Certified Defence Against Runtime Memory Poisoning in Persistent LLM Agent Systems

2026-06-12 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

The paper "SMSR: Certified Defence Against Runtime Memory Poisoning in Persistent LLM Agent Systems" introduces a novel, formally certified defense against Multi-Session Memory Poisoning (MSMP) in Retrieval-Augmented Generation (RAG) agent systems. In this threat, adversaries inject malicious memories into persistent agent stores to alter future behavior without modifying model weights. Existing defenses lack formal guarantees against this kind of dynamic runtime injection. SMSR comprises two components. Component 1 applies HMAC-SHA256 provenance tagging at write time and achieves a 0% Attack Success Rate (ASR) against unsigned injection. Component 2 applies randomized memory ablation and verdict-based majority aggregation at query time, providing a certified robustness bound against authenticated adversaries. In an empirical evaluation across 15 enterprise scenarios (3,150 trials), Component 2 reduced authenticated ASR from 93–100% to 8.0% (95% CI [5.8%, 10.9%]) in a production-scale store (m=20), remaining below the δ=10.4% certificate bound. Against an end-to-end query-only attack, the defense reduced ASR from 65.3% to 5.3% (n=150). The full defense maintains 85% utility.

Key takeaway

For AI security engineers deploying RAG agents with persistent memory, SMSR offers a way to mitigate runtime memory poisoning with a certified robustness bound rather than a purely empirical one. In the reported evaluation, authenticated ASR fell from 93–100% to 8.0% at m=20. Integrate HMAC provenance for write-time protection, and configure randomized ablation with verdict-based aggregation at query time. Size the retrieval pool (m) and the number of runs (n_runs) against your assumed adversary budget (t) to reach the security bound you need.

Key insights

SMSR provides the first certified defense against runtime memory poisoning in persistent LLM agent systems.

Principles

Write-time provenance is essential for certified defense.
Randomised over-fetch ablation resists adaptive adversaries.
Verdict-based aggregation counters the Consistent Minority Effect.

Method

SMSR signs legitimate memory writes with HMAC-SHA256. At query time, it retrieves top-m verified candidates, samples k entries randomly n_runs times, and aggregates LLM responses via majority verdict.
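
The two steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names MemoryEntry, sign_entry, is_authentic, and answer_with_ablation are hypothetical, verdict_fn stands in for an LLM call that returns a discrete verdict label, and the candidate list is assumed to arrive already ranked by the retriever.

```python
import hashlib
import hmac
import random
from collections import Counter
from typing import Callable, List, NamedTuple


class MemoryEntry(NamedTuple):
    text: str
    tag: str  # hex HMAC-SHA256 tag attached at write time


def sign_entry(key: bytes, text: str) -> MemoryEntry:
    """Write path: attach an HMAC-SHA256 provenance tag to a memory."""
    tag = hmac.new(key, text.encode("utf-8"), hashlib.sha256).hexdigest()
    return MemoryEntry(text, tag)


def is_authentic(key: bytes, entry: MemoryEntry) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, entry.text.encode("utf-8"), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, entry.tag)


def answer_with_ablation(
    key: bytes,
    candidates: List[MemoryEntry],            # assumed ranked by the retriever
    verdict_fn: Callable[[List[str]], str],   # hypothetical LLM call
    m: int,
    k: int,
    n_runs: int,
    rng: random.Random,
) -> str:
    """Query path: keep the top-m verified entries, then vote over random k-subsets."""
    verified = [e for e in candidates if is_authentic(key, e)][:m]
    if not verified:
        raise ValueError("no verified memories available")
    k = min(k, len(verified))
    verdicts = []
    for _ in range(n_runs):
        subset = rng.sample(verified, k)  # random ablation of the pool
        verdicts.append(verdict_fn([e.text for e in subset]))
    # Majority (plurality) vote over verdict labels, not over raw text.
    return Counter(verdicts).most_common(1)[0][0]
```

In this sketch an unsigned or mis-signed entry never reaches the sampling step, which mirrors the role of Component 1. An entry that carries a valid tag can still be sampled, so limiting its influence is left to the repeated sampling and the vote.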

In practice

Store HMAC keys in HSMs or secrets managers.
Restrict memory store writes to trusted paths.
Size retrieval pool m based on adversary budget t.
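
To get a feel for how m interacts with t, the following back-of-the-envelope calculation may help. It is a simplified worst-case model and not the paper's certificate: it assumes each run draws k entries uniformly from m verified candidates, that t of those candidates are adversarial, that any run whose sample contains an adversarial entry returns the adversary's verdict, and that runs are independent. The function names are hypothetical, and the results should not be expected to reproduce the δ bound quoted in the summary.

```python
from math import comb


def p_sample_poisoned(m: int, k: int, t: int) -> float:
    """P(a uniform k-subset of m entries contains at least one of t poisoned entries)."""
    if t <= 0:
        return 0.0
    if k > m - t:
        return 1.0  # not enough clean entries to fill a sample
    return 1.0 - comb(m - t, k) / comb(m, k)


def p_majority_poisoned(p: float, n_runs: int) -> float:
    """P(strictly more than half of n_runs independent runs are poisoned)."""
    threshold = n_runs // 2 + 1
    return sum(
        comb(n_runs, i) * p**i * (1 - p) ** (n_runs - i)
        for i in range(threshold, n_runs + 1)
    )


if __name__ == "__main__":
    # Illustrative values only; k, t, and n_runs here are not taken from the paper.
    k, t, n_runs = 3, 2, 7
    for m in (10, 20, 40):
        p = p_sample_poisoned(m, k, t)
        print(f"m={m}: per-run={p:.3f}, majority={p_majority_poisoned(p, n_runs):.4f}")
```

Under these assumptions, enlarging m for a fixed t and k lowers the chance that any single run is contaminated, and the vote amplifies that gap as long as the per-run probability stays below one half.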

Topics

RAG Systems
LLM Agent Security
Memory Poisoning Attacks
Certified Robustness
HMAC Provenance
Randomised Ablation

Code references

tarun-ks/smsr

Best for: AI Architect, Research Scientist, CTO, AI Scientist, AI Security Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.