Amnesia: A Stealthy Replay Attack on Continual Learning Dreams
Summary
A new replay composition attack, "Amnesia," targets continual learning (CL) models by exploiting replay sampling interference to maximize degradation. This attack operates under limited-privilege insider conditions, controlling only replay index selection within auditable limits, such as queue priorities and class histogram divergence from a nominal baseline p0. Amnesia involves two steps: first, computing lightweight class utilities like EMA loss or confidence to bias p0 towards harmful classes; second, projecting this bias back into a delta-ball using efficient KL or TV optimizers. A windowed scheduler enforces rolling audits. Across challenging CL benchmarks and strong replay baselines, Amnesia consistently lowers final accuracy (ACC) and worsens backward transfer (-BWT). The KL variant achieves high impact while remaining largely undetected under multiple audit schemes, whereas the TV variant is more damaging but easier to detect, exposing replay control as a practical, auditable threat surface.
Key takeaway
If you deploy continual learning systems as an AI security engineer or machine learning engineer, treat replay index control as a critical, auditable threat surface. Monitor replay sampling distributions closely, even when sampling appears compliant with your audit scheme. Weigh the impact-visibility trade-off when designing defenses: attacks like Amnesia's KL variant can significantly degrade performance while evading detection, so basic telemetry checks alone are not enough.
Key insights
Replay sampling interference presents a practical, auditable threat surface for continual learning systems, enabling stealthy degradation.
Principles
- Insider replay index control is a viable attack vector.
- A principled impact-visibility trade-off exists in replay attacks.
- Auditable constraints do not guarantee security against all attacks.
Method
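Amnesia computes class utilities (EMA loss or confidence) to tilt a nominal class histogram p0 towards harmful classes, then projects this tilt back into a delta-ball around p0 using KL (Kullback-Leibler) or TV (total variation) optimizers. A windowed scheduler keeps the resulting replay histogram within limits across rolling audits.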
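The tilt-then-project idea can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`kl_tilt`, `tv_shift`), the exponential tilt with bisection on its strength, and the greedy mass shift are assumptions chosen because they are standard ways to maximize expected utility inside a KL or TV ball around p0.

```python
import numpy as np

def kl(q, p):
    """KL(q || p) for strictly positive p; terms with q == 0 contribute 0."""
    mask = q > 0
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))

def kl_tilt(p0, utility, delta, iters=60):
    """Hypothetical helper: exponentially tilt p0 toward high-utility
    classes while keeping KL(q || p0) <= delta."""
    u = utility - utility.max()  # shift for numerical stability

    def tilt(lam):
        w = p0 * np.exp(lam * u)
        return w / w.sum()

    lo, hi = 0.0, 1.0
    # Grow the upper bracket until the budget is exceeded (or a cap is hit).
    while kl(tilt(hi), p0) <= delta and hi < 1e6:
        hi *= 2.0
    # Bisect on the tilt strength; lo always satisfies the budget.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if kl(tilt(mid), p0) <= delta:
            lo = mid
        else:
            hi = mid
    return tilt(lo)

def tv_shift(p0, utility, delta):
    """Hypothetical helper: move up to delta probability mass from the
    lowest-utility classes to the highest-utility class, so that
    TV(q, p0) = 0.5 * ||q - p0||_1 <= delta."""
    q = np.asarray(p0, dtype=float).copy()
    target = int(np.argmax(utility))
    budget = delta
    for c in np.argsort(utility):  # ascending utility
        if c == target or budget <= 0:
            continue
        moved = min(q[c], budget)
        q[c] -= moved
        q[target] += moved
        budget -= moved
    return q

# Illustrative inputs only: four classes, uniform nominal histogram.
p0 = np.full(4, 0.25)
utility = np.array([0.1, 0.2, 0.9, 0.4])  # e.g. an EMA of per-class loss
q_kl = kl_tilt(p0, utility, delta=0.05)
q_tv = tv_shift(p0, utility, delta=0.05)
```

In this sketch the KL version spreads its change smoothly across all classes, while the TV version concentrates the whole budget on a few classes, which is one intuition for why a TV-style shift could be more visible in a class histogram.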
In practice
- KL variant offers high impact with low detectability.
- TV variant is more damaging but easier to detect.
- Monitor replay index selection for anomalies.
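As a starting point for the monitoring advice above, here is a minimal sketch of a rolling-window audit of replayed class labels. The class name `ReplayAuditor`, the window size, and the TV threshold are all hypothetical and not taken from the paper. Note that a check like this only bounds divergence from p0; the summary reports that the KL variant can cause damage while staying inside such limits, so it should be one signal among several.

```python
from collections import deque
import numpy as np

class ReplayAuditor:
    """Hypothetical rolling-window audit of replayed class labels
    against a nominal class histogram p0."""

    def __init__(self, p0, window, tv_limit):
        self.p0 = np.asarray(p0, dtype=float)
        self.window = deque(maxlen=window)
        self.tv_limit = tv_limit

    def observe(self, class_id):
        """Record one replayed sample's class.

        Returns None until the window is full, then (tv, flagged), where
        tv is the total variation distance between the windowed class
        histogram and p0."""
        self.window.append(class_id)
        if len(self.window) < self.window.maxlen:
            return None
        counts = np.bincount(
            np.fromiter(self.window, dtype=int), minlength=len(self.p0)
        )
        q = counts / counts.sum()
        tv = 0.5 * float(np.abs(q - self.p0).sum())
        return tv, tv > self.tv_limit

# Illustrative usage with two classes and a balanced nominal histogram.
auditor = ReplayAuditor(p0=[0.5, 0.5], window=4, tv_limit=0.2)
results = [auditor.observe(c) for c in [0, 1, 0, 1, 0, 0]]
```

Pairing a short window with a longer one, or tracking per-class accuracy alongside the histogram, may help surface slow drifts that a single threshold misses.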
Topics
- Continual Learning
- Replay Attacks
- Catastrophic Forgetting
- Machine Learning Security
- Data Poisoning
- Auditable Systems
Best for: Research Scientist, MLOps Engineer, AI Scientist, AI Security Engineer, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.