Certified Robustness to Data Poisoning in Gradient-Based Training
Summary
A new framework provides provable guarantees on the behavior of models trained on potentially manipulated data, addressing data poisoning and backdoor attacks in modern machine learning pipelines. Developed by Mark N. Müller, Calvin Tsay, and Matthew Wicker, the method uses convex relaxations to over-approximate the parameter updates of gradient-based learning algorithms. It certifies robustness against untargeted, targeted, and backdoor attacks, covering both input and label manipulations. The approach is demonstrated on real-world datasets from energy consumption, medical imaging (OCTMNIST), and autonomous driving (PilotNet). The experiments show that a larger poisoning budget (e.g., more poisoned samples or stronger feature or label perturbations) leads to looser performance bounds, and that model size and learning rate also affect bound tightness.
Key takeaway
For machine learning engineers who train models on public or uncurated data, this framework offers a way to quantify robustness against poisoning attacks. Abstract Gradient Training yields provable bounds on model performance and backdoor success rates, moving beyond reactive, attack-specific defenses. This lets you assess the risk from untargeted, targeted, and backdoor manipulations before deployment.
Key insights
A framework provides provable guarantees against data poisoning and backdoor attacks in gradient-based machine learning models.
Principles
- Data poisoning can cause catastrophic model failures.
- Attack-specific defenses lead to an "arms race" without guarantees.
- Certified robustness requires bounding worst-case model behavior.
Method
Abstract Gradient Training (AGT) uses convex relaxations and CROWN-style bounds to over-approximate parameter updates, bounding the set of reachable parameters for gradient-based algorithms like SGD.
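As a rough illustration of the idea, the sketch below propagates an interval over a single scalar weight through SGD steps on a 1-D least-squares problem. It assumes an adversary may shift up to n_poison labels per batch by at most eps. It uses plain interval arithmetic rather than CROWN-style relaxations, and the names grad_bounds and abstract_sgd_step are illustrative, not taken from the paper's code.

```python
def grad_bounds(w_lo, w_hi, x, y_lo, y_hi):
    """Interval bounds on d/dw 0.5*(w*x - y)**2 = w*x*x - x*y
    for w in [w_lo, w_hi] and y in [y_lo, y_hi]."""
    xy = (x * y_lo, x * y_hi)
    return w_lo * x * x - max(xy), w_hi * x * x - min(xy)


def abstract_sgd_step(w_lo, w_hi, batch, lr, n_poison, eps):
    """One SGD step on a parameter interval (illustrative, label poisoning only)."""
    # Gradient bounds per sample if the label is clean vs. shifted by up to eps.
    clean = [grad_bounds(w_lo, w_hi, x, y, y) for x, y in batch]
    pois = [grad_bounds(w_lo, w_hi, x, y - eps, y + eps) for x, y in batch]
    # Worst case: the adversary poisons the n_poison samples that move
    # the summed gradient furthest down (lower bound) or up (upper bound).
    down = sorted(p[0] - c[0] for p, c in zip(pois, clean))[:n_poison]
    up = sorted((p[1] - c[1] for p, c in zip(pois, clean)), reverse=True)[:n_poison]
    g_lo = (sum(c[0] for c in clean) + sum(down)) / len(batch)
    g_hi = (sum(c[1] for c in clean) + sum(up)) / len(batch)
    # The lower parameter bound uses the upper gradient bound, and vice versa.
    return w_lo - lr * g_hi, w_hi - lr * g_lo


# Toy data (made up for this example): y is roughly 2 * x.
batch = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w_lo = w_hi = 0.0
for _ in range(10):
    w_lo, w_hi = abstract_sgd_step(w_lo, w_hi, batch, lr=0.02, n_poison=1, eps=0.5)
print(f"certified weight interval after 10 steps: [{w_lo:.4f}, {w_hi:.4f}]")
```

Under these assumptions, any SGD run on a dataset within the poisoning budget ends with a weight inside [w_lo, w_hi]. With n_poison=0 and a point interval, the step reduces to ordinary SGD.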
In practice
- Evaluate model robustness against untargeted, targeted, and backdoor attacks.
- Use AGT to quantify worst-case performance under poisoning.
- Consider a larger batch size to "dilute" the effect of poisoned samples.
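The dilution effect can be made concrete with a small calculation. Assume, for illustration only, that per-sample gradients are clipped to norm at most clip. Then, by the triangle inequality, replacing n_poison of the batch_size samples moves the averaged gradient by at most 2 * clip * n_poison / batch_size. The helper max_update_shift below is a hypothetical name, not part of any library.

```python
def max_update_shift(lr, clip, n_poison, batch_size):
    """Upper bound on how far one SGD step can move when up to n_poison
    samples in the batch are replaced, assuming clipped per-sample gradients."""
    n = min(n_poison, batch_size)  # cannot poison more samples than the batch holds
    return lr * 2.0 * clip * n / batch_size


# The same poisoning budget has less influence per step as the batch grows.
for b in (32, 128, 512):
    print(b, max_update_shift(lr=0.1, clip=1.0, n_poison=4, batch_size=b))
```

The bound shrinks in proportion to 1 / batch_size, which is one way to read the advice about batch size above.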
Topics
- Data Poisoning
- Certified Robustness
- Gradient-Based Training
- Abstract Gradient Training
- CROWN Bounds
- Machine Learning Security
Best for: Research Scientist, CTO, VP of Engineering/Data, AI Scientist, Machine Learning Engineer, AI Security Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.