TEMPO-Diffusion: Temporally Exposed Malicious Poisoning of Diffusion Models
Summary
TEMPO-Diffusion is a targeted backdoor framework for diffusion models. It addresses three limitations of prior noise-based attacks: reliance on input-time trigger injection, untargeted activation, and out-of-distribution target generation. The framework localizes malicious distribution shifts to a temporal, in-distribution exposure, which makes the attack stealthier and more practically relevant. It supports targeted attacks on specific classes, allows multiple sub-image backdoors to reconstruct features across different output images and locations, and enables in-painting with time-conditioned triggers. To evaluate the practical security risk of using backdoored diffusion models to generate synthetic training data, the researchers also introduce CALISA, a balanced, region-aware dataset of Canadian and U.S. road signs. Experiments on CIFAR10, GTSRB, and CALISA show that TEMPO-Diffusion reliably poisons class-specific synthetic data generation and achieves high attack success rates in downstream classifiers trained on the compromised data.
Key takeaway
For AI security engineers and machine learning engineers who use diffusion models to generate synthetic data, TEMPO-Diffusion highlights a critical vulnerability: targeted backdoor poisoning. Assume that a synthetic dataset, even one that appears in-distribution, may have been manipulated so that classifiers trained on it carry a backdoor. Validate every synthetic data source before training on it, and consider per-class anomaly detection to reduce the risk from temporally exposed malicious shifts.
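One simple form such a validation step could take is a per-class comparison between synthetic samples and a small trusted reference set. The sketch below is not from the original article. It assumes you already have feature vectors (for example, embeddings from a classifier you trust) for both sets, and the function names and the 0.5 threshold are hypothetical. Because the attack is described as in-distribution, a check this coarse may well miss it, so treat it as a first filter only.

```python
import numpy as np

def per_class_shift_scores(ref_feats, ref_labels, syn_feats, syn_labels):
    """Score how far each class's synthetic mean sits from the trusted mean.

    The distance between class means is divided by the average distance of
    trusted samples from their own class mean, so the score is unitless.
    """
    scores = {}
    for c in np.unique(ref_labels):
        ref = ref_feats[ref_labels == c]
        syn = syn_feats[syn_labels == c]
        if len(syn) == 0:
            continue  # no synthetic samples for this class, nothing to score
        mu = ref.mean(axis=0)
        spread = np.linalg.norm(ref - mu, axis=1).mean() + 1e-12
        scores[int(c)] = float(np.linalg.norm(syn.mean(axis=0) - mu) / spread)
    return scores

def flag_classes(scores, threshold=0.5):
    """Return the classes whose score exceeds a (hypothetical) threshold."""
    return sorted(c for c, s in scores.items() if s > threshold)
```

Scoring each class separately matters here because a class-specific backdoor can leave dataset-wide statistics almost unchanged while shifting only the targeted class.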
Key insights
TEMPO-Diffusion enables stealthy, targeted backdoor attacks on diffusion models by localizing malicious shifts to temporal, in-distribution exposures.
Principles
- Backdoor attacks can be localized temporally and in-distribution.
- Multiple sub-image backdoors enhance attack versatility.
- Synthetic data generation is vulnerable to targeted poisoning.
Method
TEMPO-Diffusion localizes malicious distribution shifts to a temporal, in-distribution exposure, supporting class-specific attacks, multiple sub-image backdoors, and in-painting via time-conditioned triggers.
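The summary does not give the training objective, so the following is only a generic illustration of what "class-specific" and "time-conditioned" could mean in a diffusion training loop, not the authors' formulation. The forward noising step is the standard DDPM one; the gate `in_poison_window`, the helper `training_pair`, and the parameters `target_class` and `t_window` are hypothetical names introduced here.

```python
import numpy as np

def make_alpha_bar(T=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear DDPM noise schedule."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def forward_noise(x0, t, alpha_bar, eps):
    """Standard DDPM forward step: x_t = sqrt(abar_t) x0 + sqrt(1 - abar_t) eps."""
    a = alpha_bar[t]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps

def in_poison_window(label, t, target_class, t_window):
    """Hypothetical gate: true only for the target class inside [lo, hi)."""
    lo, hi = t_window
    return label == target_class and lo <= t < hi

def training_pair(x0_clean, x0_poison, label, t, alpha_bar, rng,
                  target_class, t_window):
    """Build one (x_t, eps) training pair.

    Assumption for illustration: the poisoned image replaces the clean one
    only when the gate fires. Every other (class, timestep) pair is trained
    on clean data, which is what keeps the shift localized.
    """
    eps = rng.standard_normal(x0_clean.shape)
    poisoned = in_poison_window(label, t, target_class, t_window)
    x0 = x0_poison if poisoned else x0_clean
    return forward_noise(x0, t, alpha_bar, eps), eps
```

In this toy version the model sees altered data only for one class and one band of timesteps, which is one way to read "localized" exposure. The actual trigger design, loss, and sub-image mechanism are in the original paper.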
In practice
- Poison synthetic training data for downstream classifiers.
- Generate specific malicious features within images.
- Exploit time-conditioned triggers for in-painting attacks.
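The summary reports high attack success rates in downstream classifiers but does not say how the rate is computed. A common convention, assumed here, is the fraction of triggered inputs from non-target classes that the classifier assigns to the attacker's target class, reported alongside accuracy on clean inputs. The function names below are illustrative.

```python
import numpy as np

def attack_success_rate(pred_on_triggered, true_labels, target_class):
    """Fraction of triggered, non-target-class inputs predicted as the target.

    Samples whose true label already equals the target class are excluded,
    since predicting the target for them is not evidence of a backdoor.
    """
    pred = np.asarray(pred_on_triggered)
    true = np.asarray(true_labels)
    eligible = true != target_class
    if not eligible.any():
        raise ValueError("no non-target samples to evaluate")
    return float((pred[eligible] == target_class).mean())

def clean_accuracy(pred_on_clean, true_labels):
    """Ordinary accuracy on inputs that carry no trigger."""
    return float((np.asarray(pred_on_clean) == np.asarray(true_labels)).mean())
```

Reporting both numbers together is useful because a stealthy backdoor is one where clean accuracy stays close to that of an unpoisoned model while the attack success rate is high.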
Topics
- Diffusion Models
- Backdoor Attacks
- Machine Learning Security
- Synthetic Data Poisoning
- CALISA Dataset
- Targeted Attacks
Best for: Research Scientist, CTO, VP of Engineering/Data, AI Scientist, AI Security Engineer, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.