Context-Fractured Decomposition Attacks on Tool-Using LLM Agents: Exploiting Artifact Provenance Gaps

2026-06-08 · Source: Takara TLDR - Daily AI Papers · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy · Depth: Expert, quick

Summary

Context-Fractured Decomposition (CFD) attacks are a novel family of multi-step jailbreaks that target tool-using LLM agents by exploiting "provenance gaps" in how artifact state is tracked. Traditional jailbreaks assume a single contiguous conversation. CFD attacks instead exploit the fact that real agent pipelines enforce policy in fragments, across tools, modules, and time. An early interaction creates benign-looking intermediate artifacts. A later interaction, potentially with a different agent instance or workflow stage, then issues individually innocuous tool actions that combine with the earlier artifacts to elicit harmful behavior. The research operationalizes this deployment failure mode, instruments it with trace-level diagnostics, and outlines "provenance lineage tagging" as a verifiable mitigation. On agent-system jailbreak benchmarks, CFD attacks raise attack success rates by up to 28.3 percentage points over state-of-the-art baselines, even against strong single-turn judges.

Key takeaway

AI security engineers designing defenses for tool-using LLM agents must move beyond single-turn conversation assumptions. Defense strategies should explicitly track artifact provenance across all tools and workflow stages to prevent Context-Fractured Decomposition attacks. Implement provenance lineage tagging and robust cross-step composition reasoning to mitigate delayed jailbreaks that exploit fragmented enforcement.

Key insights

Context-Fractured Decomposition (CFD) attacks exploit untracked artifact provenance in LLM agents, enabling delayed, multi-step jailbreaks.

Principles

Defenses must reason about cross-step composition.
Artifact provenance gaps enable delayed attacks.
Fragmented enforcement creates vulnerabilities.

Method

A CFD attack first creates benign-looking intermediate artifacts in an early interaction. A later interaction, possibly in a different agent instance or workflow stage, then issues individually innocuous tool actions that combine with those artifacts to elicit harmful behavior.
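
To see why step-local checks can miss this pattern, here is a minimal toy sketch. It is not the paper's method or benchmark. It assumes a hypothetical `toy_judge` that blocks text only when two placeholder tokens (`alpha` and `beta`) appear together; the tokens, function names, and step strings are all illustrative assumptions standing in for a real safety judge and real tool actions.

```python
from typing import List

# Assumption: a placeholder for "content that is only a problem in combination".
BLOCKED_COMBINATION = {"alpha", "beta"}


def toy_judge(text: str) -> bool:
    """Return True if the text should be blocked (stand-in for a real judge)."""
    words = set(text.lower().split())
    return BLOCKED_COMBINATION <= words


def per_step_check(steps: List[str]) -> bool:
    """Judge every step in isolation, as a single-turn judge would."""
    return any(toy_judge(step) for step in steps)


def composed_check(steps: List[str]) -> bool:
    """Judge the steps together, as cross-step composition reasoning would."""
    return toy_judge(" ".join(steps))


# Step 1 happens in an early interaction; step 2 happens later, possibly elsewhere.
steps = ["store fragment alpha", "append fragment beta"]

per_step_blocked = per_step_check(steps)   # each step looks innocuous alone
composed_blocked = composed_check(steps)   # the combination is what matters
```

In this toy setting the per-step check lets both steps through, while the composed check flags them. The point of the sketch is only that a judge needs access to the earlier artifact to see the combination at all.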

In practice

Instrument agent pipelines with trace-level diagnostics.
Implement provenance lineage tagging for artifacts.
Evaluate defenses against multi-step, cross-context attacks.
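
One way the provenance lineage tagging item above could be approached is sketched below. The summary does not describe the paper's actual tagging scheme, so this is an assumption-laden illustration: `Artifact`, `LineageStore`, `crosses_context`, and the `origin` labels (for example `"session-1"`) are hypothetical names, and the in-memory dictionary stands in for whatever durable store a real pipeline would use.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, List, Set
import uuid


@dataclass(frozen=True)
class Artifact:
    """Tool output tagged with where it came from and what it was derived from."""
    content: str
    origin: str                               # hypothetical label: session or workflow stage
    parents: FrozenSet[str] = frozenset()     # ids of the artifacts this one was built from
    artifact_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class LineageStore:
    """In-memory registry of artifacts (assumption: real systems would persist this)."""

    def __init__(self) -> None:
        self._by_id: Dict[str, Artifact] = {}

    def register(self, artifact: Artifact) -> Artifact:
        self._by_id[artifact.artifact_id] = artifact
        return artifact

    def derive(self, content: str, origin: str, inputs: Iterable[Artifact]) -> Artifact:
        """Create a new artifact whose lineage points at its inputs."""
        parents = frozenset(a.artifact_id for a in inputs)
        return self.register(Artifact(content, origin, parents))

    def lineage(self, artifact: Artifact) -> List[Artifact]:
        """Return the artifact plus all known ancestors (depth-first, de-duplicated)."""
        seen: Set[str] = set()
        stack = [artifact]
        found: List[Artifact] = []
        while stack:
            node = stack.pop()
            if node.artifact_id in seen:
                continue
            seen.add(node.artifact_id)
            found.append(node)
            stack.extend(self._by_id[p] for p in node.parents if p in self._by_id)
        return found

    def origins(self, artifact: Artifact) -> Set[str]:
        """Every session or stage that contributed to this artifact."""
        return {a.origin for a in self.lineage(artifact)}


def crosses_context(store: LineageStore, artifact: Artifact, current_origin: str) -> bool:
    """True if any part of the lineage came from outside the current context."""
    return store.origins(artifact) != {current_origin}


# Usage: an artifact made in one session is reused to build another in a later session.
store = LineageStore()
early = store.register(Artifact("intermediate note", origin="session-1"))
late = store.derive("combined output", origin="session-2", inputs=[early])
needs_review = crosses_context(store, late, "session-2")
```

Under these assumptions, an artifact whose lineage crosses contexts could be routed to a judge that sees the full ancestry (`store.lineage(late)`) rather than the latest step alone. The recorded lineage also doubles as a simple trace for diagnostics.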

Topics

LLM Agents
Jailbreak Attacks
Provenance Gaps
Context-Fractured Decomposition
Artifact Security
AI Security

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Scientist, AI Security Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.