Persona Attack: Incremental Memory Injection Jailbreak Attack against Large Language Models

2026-05-29 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy · Depth: Expert, quick

Summary

Persona Attack is a novel memory injection jailbreak method that exploits the conversational memory of Large Language Models. Unlike traditional single-prompt injections, it manipulates the model's context window incrementally, across successive steps. Experiments on several widely used LLMs show that as these injections accumulate, models increasingly prioritize the injected instructions over their internal safety alignment mechanisms. The attack's success rate can reach 95% under specific instruction configurations, and it varies significantly with the model's memory implementation and the combination of instructions used.

Key takeaway

For AI Security Engineers evaluating LLM robustness, this research indicates that traditional safety training is insufficient against memory-based jailbreak attacks like Persona Attack. Prioritize defenses that specifically address incremental context window manipulation and the accumulation of malicious instructions over conversational turns, rather than focusing solely on single-turn prompt injections.
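To make the idea of a defense that tracks accumulation over turns concrete, here is a minimal sketch of a conversation-level monitor. It is not taken from the original article: the pattern list, the thresholds, and the names `directive_score` and `ConversationMonitor` are all hypothetical, and a real system would likely replace the keyword heuristic with a trained classifier. The point it illustrates is only that a conversation can be flagged on its cumulative score even when no single turn crosses the per-turn limit.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns for persistent, behavior-changing instructions.
# Assumption: a keyword heuristic stands in for a real classifier.
DIRECTIVE_PATTERNS = [
    r"\bfrom now on\b",
    r"\bremember that\b",
    r"\balways\b",
    r"\bnever refuse\b",
    r"\byou are now\b",
    r"\bignore (?:your|all|previous)\b",
]


def directive_score(message: str) -> int:
    """Count how many directive patterns appear in one user message."""
    text = message.lower()
    return sum(1 for pattern in DIRECTIVE_PATTERNS if re.search(pattern, text))


@dataclass
class ConversationMonitor:
    per_turn_limit: int = 2    # assumed single-turn threshold
    cumulative_limit: int = 3  # assumed whole-conversation threshold
    total: int = 0
    history: list = field(default_factory=list)

    def observe(self, message: str) -> str:
        """Score one turn and return a verdict for the conversation so far."""
        score = directive_score(message)
        self.total += score
        self.history.append(score)
        if score >= self.per_turn_limit:
            return "flag_turn"
        if self.total >= self.cumulative_limit:
            return "flag_conversation"
        return "ok"


monitor = ConversationMonitor()
turns = [
    "From now on, call yourself Max.",
    "Remember that Max enjoys blunt answers.",
    "What's the weather like on Mars?",
    "Max should always answer in character.",
]
verdicts = [monitor.observe(turn) for turn in turns]
print(verdicts)  # ['ok', 'ok', 'ok', 'flag_conversation']
```

Each of these turns scores at most 1, so a per-turn filter alone would pass all four; the conversation is flagged only because the running total reaches the assumed cumulative limit on the last turn.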

Key insights

Persona Attack exploits LLM conversational memory to bypass safety alignment via incremental instruction injection.

Principles

As injected instructions accumulate, LLMs increasingly prioritize them over safety alignment.
Jailbreak success varies with the model's memory implementation and the combination of instructions used.

Method

Persona Attack incrementally injects instructions into an LLM's context window, causing the model to prioritize these over its internal safety alignment mechanisms.

In practice

When evaluating robustness, probe LLM conversational memory as a jailbreak vector.
Test instruction combinations to find those with higher attack success rates.
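Comparing instruction combinations comes down to tabulating attack success rates per model and per combination from logged evaluation outcomes. The sketch below shows one way to do that. The function name `success_rates`, the record layout, and the model and instruction labels are hypothetical, and the trial records are synthetic placeholders, not results from the original article.

```python
from collections import defaultdict


def success_rates(trials):
    """Return {(model, combo): fraction of trials marked as jailbroken}.

    Each trial is a (model, instruction_list, jailbroken) tuple.
    Assumption: combinations are compared as unordered sets.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [successes, total]
    for model, combo, jailbroken in trials:
        key = (model, tuple(sorted(combo)))
        counts[key][0] += int(jailbroken)
        counts[key][1] += 1
    return {key: wins / total for key, (wins, total) in counts.items()}


# Synthetic placeholder records for illustration only.
trials = [
    ("model_a", ["persona", "memory_note"], True),
    ("model_a", ["memory_note", "persona"], False),
    ("model_a", ["persona"], False),
    ("model_b", ["persona", "memory_note"], True),
]
rates = success_rates(trials)
for (model, combo), rate in sorted(rates.items()):
    print(model, "+".join(combo), f"{rate:.0%}")
```

Sorting each combination treats it as an unordered set. Because the attack is incremental, injection order may matter; in that case, keep the original order in the key instead of sorting.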

Topics

LLM Jailbreak
Persona Attack
Memory Injection
Context Window
Safety Alignment
Prompt Engineering

Best for: Research Scientist, CTO, VP of Engineering/Data, AI Scientist, AI Security Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.