Gaussian DP for Reporting Differential Privacy Guarantees in Machine Learning
Summary
The current standard for reporting differential privacy (DP) guarantees, (ε,δ)-DP, is often incomplete and misleading, which hinders comparisons across machine learning applications. This paper advocates Gaussian Differential Privacy (GDP) as the primary reporting method, with the full privacy profile as a fallback when GDP does not describe the mechanism accurately. GDP uses a single parameter (μ), which makes guarantees easier to compare, and it accurately captures privacy for many ML applications, including differentially private large-scale image classification and the U.S. Decennial Census's TopDown algorithm. Other formalisms, such as privacy loss random variables, are still needed for accounting, but their results can be converted to GDP efficiently with minimal loss of tightness. The authors provide a Python package (gdpnum) to facilitate this conversion and evaluation.
Key takeaway
AI scientists and ML engineers evaluating or deploying differentially private models should report privacy guarantees as μ-GDP rather than (ε,δ)-DP. This gives a single, directly comparable parameter, which simplifies privacy budget management and cross-algorithm evaluation. Use a numerical accountant and the provided "gdpnum" package to compute a conservative μ*-GDP guarantee and assess its fit with the Δ metric. If Δ exceeds 10^-2, provide the full trade-off curve or code for transparency.
Key insights
Gaussian Differential Privacy (GDP) offers a single, comparable parameter (μ) for reporting ML privacy guarantees, improving upon (ε,δ)-DP.
Principles
- Privacy guarantees should be concise and comparable.
- A single parameter should allow ordering of privacy strength.
- Reporting methods must accurately represent practical mechanisms.
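The ordering principle can be made concrete with the standard definition of GDP from the GDP literature (it is not spelled out in this summary, so treat the notation as an assumption). A mechanism is μ-GDP when its trade-off curve, the attacker's best achievable false negative rate β at each false positive rate α, lies on or above the Gaussian curve f_μ. Because f_μ decreases pointwise as μ grows, any two GDP guarantees can be ranked by μ alone, which is generally not possible for two (ε,δ) pairs:

```latex
f_\mu(\alpha) = \Phi\bigl(\Phi^{-1}(1-\alpha) - \mu\bigr), \qquad \alpha \in [0,1]

\mu_1 \le \mu_2 \;\Longrightarrow\; f_{\mu_1}(\alpha) \ge f_{\mu_2}(\alpha) \quad \text{for all } \alpha \in [0,1]
```

Here Φ is the standard normal CDF. A smaller μ means a higher curve, so every attacker does worse and the guarantee is stronger.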
Method
Compute a tight trade-off curve using numerical accounting, then derive a conservative μ*-GDP guarantee. Evaluate its fit using the Δ metric; if Δ < 10^-2, report μ*-GDP, otherwise share the full trade-off curve or code.
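The method can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the gdpnum API: it assumes the trade-off curve has already been sampled as (α, β) points by a numerical accountant, and it assumes Δ is measured as the largest vertical gap between that curve and the fitted GDP curve on the sampled grid (the paper's exact definition of Δ may differ). All function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal: cdf is Phi, inv_cdf is Phi^{-1}


def gdp_tradeoff(alpha, mu):
    """Trade-off curve of mu-GDP: f_mu(alpha) = Phi(Phi^{-1}(1 - alpha) - mu)."""
    return _STD.cdf(_STD.inv_cdf(1.0 - alpha) - mu)


def conservative_mu(alphas, betas):
    """Smallest mu with f_mu(alpha) <= beta at every sampled point.

    Solving f_mu(alpha) <= beta for mu gives
    mu >= Phi^{-1}(1 - alpha) - Phi^{-1}(beta), so we take the maximum.
    Assumes every alpha and beta lies strictly between 0 and 1.
    """
    return max(_STD.inv_cdf(1.0 - a) - _STD.inv_cdf(b)
               for a, b in zip(alphas, betas))


def max_gap(alphas, betas, mu):
    """Assumed stand-in for Delta: largest vertical gap on the grid."""
    return max(b - gdp_tradeoff(a, mu) for a, b in zip(alphas, betas))


def pure_dp_tradeoff(alpha, eps):
    """Trade-off curve of (eps, 0)-DP, used here as a non-Gaussian example."""
    return max(0.0, 1.0 - math.exp(eps) * alpha, math.exp(-eps) * (1.0 - alpha))


alphas = [i / 1000 for i in range(1, 1000)]

# Synthetic stand-ins for accountant output: one exactly Gaussian curve,
# one pure-DP curve that a single Gaussian curve cannot match closely.
gauss_betas = [gdp_tradeoff(a, 1.0) for a in alphas]
pure_betas = [pure_dp_tradeoff(a, 1.0) for a in alphas]

mu_gauss = conservative_mu(alphas, gauss_betas)
gap_gauss = max_gap(alphas, gauss_betas, mu_gauss)
mu_pure = conservative_mu(alphas, pure_betas)
gap_pure = max_gap(alphas, pure_betas, mu_pure)

for name, mu, gap in [("gaussian", mu_gauss, gap_gauss),
                      ("pure DP", mu_pure, gap_pure)]:
    verdict = "report mu*-GDP" if gap < 1e-2 else "share the full trade-off curve"
    print(f"{name}: mu* = {mu:.3f}, gap = {gap:.3g} -> {verdict}")
```

The fitted μ* is conservative only on the sampled grid, so a real implementation would need a fine grid or an analytic bound between grid points.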
In practice
- Use the "gdpnum" Python package for GDP conversion and evaluation.
- DP-SGD with noise multiplier σ ≥ 2 and T ≥ 400 iterations typically fits μ-GDP well (Δ < 10^-2).
- The U.S. Decennial Census TopDown algorithm is tightly characterized by μ-GDP with μ = 2.702.
Topics
- Differential Privacy
- Gaussian Differential Privacy
- Privacy Accounting
- Machine Learning Privacy
- DP-SGD
- Trade-off Curves
Best for: Research Scientist, AI Scientist, AI Security Engineer, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.