Cross-Silo De-Anonymization Under Local Differential Privacy: Threat Model, Phase Transition, and Coordination Necessity

Source: Machine Learning · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy) · Depth: Expert, quick

Summary

This paper develops an information-theoretic framework for analyzing cross-silo de-anonymization under local differential privacy (LDP) when a person's records appear in k independent data silos. It introduces cross-silo person-level DP (XSP-DP), a Pufferfish-style privacy notion, and verifies that standard basic composition bounds apply. The central finding is a phase transition for de-anonymization at k* = Theta(log n / epsilon^2), where n is the population size and epsilon is the per-silo randomized-response parameter: below this threshold de-anonymization fails, and above it attacks succeed. The paper demonstrates information synergy with an XOR + randomized-response construction in which each silo's output is individually uninformative, yet the joint mutual information is strictly positive. For non-coordinated binary randomized-response mechanisms, de-anonymization becomes inevitable once k exceeds k*, which is why cross-silo coordination is necessary to prevent such attacks. These results establish a baseline threat model and a Theta-level threshold for cross-silo inference attacks under LDP.
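As a rough illustration of how the threshold scales, the Python sketch below evaluates log n / epsilon^2. The function name `threshold_order` and the constant `c` are hypothetical: the Theta notation hides a constant that this summary does not state, so the output indicates scaling behavior only, not a safe silo count.

```python
import math


def threshold_order(n, epsilon, c=1.0):
    """Return c * ln(n) / epsilon**2, the order of the threshold k*.

    c is a hypothetical placeholder for the constant hidden by Theta(.);
    the summary does not give its value.
    """
    if n < 2:
        raise ValueError("population size n must be at least 2")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return c * math.log(n) / epsilon ** 2


if __name__ == "__main__":
    # Compare a few settings to see how the order of k* moves.
    for n, eps in [(10**6, 1.0), (10**6, 0.5), (10**12, 1.0)]:
        print(f"n={n:.0e} epsilon={eps}: log(n)/eps^2 = {threshold_order(n, eps):.1f}")
```

Under this expression, halving epsilon quadruples the value, while squaring the population size only doubles it, so the per-silo privacy parameter dominates the scaling.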

Key takeaway

AI Security Engineers designing privacy-preserving systems across distributed data silos should account for the de-anonymization phase transition at k* = Theta(log n / epsilon^2). Even with local differential privacy applied per silo, a system becomes vulnerable to identity inference once the number of silos k exceeds this threshold without coordination. Cross-silo coordination mechanisms are needed to keep information synergy from enabling de-anonymization attacks, so that privacy guarantees hold across the entire data ecosystem.

Key insights

Cross-silo de-anonymization under local differential privacy undergoes a phase transition, making coordination essential to prevent identity inference.
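The qualitative shape of such a transition can be seen in a toy Monte Carlo sketch. Everything in it is an assumption made for illustration rather than the paper's model: each person holds one uniform random bit per silo, each silo independently applies binary randomized response, and the attacker knows every true bit vector and matches a noisy k-bit report to the nearest vector in Hamming distance (which is the maximum-likelihood rule under symmetric randomized response with epsilon > 0). The names `flip_mask` and `success_rate` are hypothetical.

```python
import math
import random


def flip_mask(k, flip_prob, rng):
    """Return a k-bit mask whose bits are set independently with flip_prob."""
    mask = 0
    for i in range(k):
        if rng.random() < flip_prob:
            mask |= 1 << i
    return mask


def success_rate(n, k, epsilon, trials=200, seed=0):
    """Fraction of trials in which the target is the unique nearest match."""
    rng = random.Random(seed)
    # Randomized response reports the true bit with prob e^eps / (1 + e^eps),
    # so each bit is flipped with prob 1 / (1 + e^eps).
    flip_prob = 1.0 / (1.0 + math.exp(epsilon))
    hits = 0
    for _ in range(trials):
        population = [rng.getrandbits(k) for _ in range(n)]
        target = rng.randrange(n)
        report = population[target] ^ flip_mask(k, flip_prob, rng)
        distances = [bin(report ^ v).count("1") for v in population]
        best = min(distances)
        # Ties count as failures: the attacker cannot single out the target.
        if distances[target] == best and distances.count(best) == 1:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for k in (2, 8, 32, 128, 400):
        print(k, success_rate(n=100, k=k, epsilon=1.0, trials=100))
```

In this toy setting the success rate is near zero for very small k and approaches one for large k. Where the switch happens depends on the assumed model and on constants the summary does not give, so the sketch should not be read as reproducing the paper's threshold.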

Method

The paper develops an information-theoretic framework, proves Fano lower bounds and maximum-likelihood upper bounds, and uses an XOR + randomized-response construction to demonstrate synergy.
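One way to see the synergy effect is to compute mutual information exactly for a small XOR construction. The setup below is an assumed instance, not necessarily the paper's: a secret bit S is uniform, silo 1 holds a uniform bit A, silo 2 holds B = A XOR S, and each silo releases its bit through binary randomized response with parameter epsilon. The function names are hypothetical.

```python
import itertools
import math


def joint_distribution(epsilon):
    """Return {(s, y1, y2): probability} for the assumed XOR + RR setup."""
    keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))  # truthful-report prob
    joint = {}
    for s, a, y1, y2 in itertools.product((0, 1), repeat=4):
        b = a ^ s  # silo 2's true bit
        prob = 0.25  # S and A are independent uniform bits
        prob *= keep if y1 == a else 1.0 - keep
        prob *= keep if y2 == b else 1.0 - keep
        key = (s, y1, y2)
        joint[key] = joint.get(key, 0.0) + prob
    return joint


def mutual_information(joint, y_index):
    """I(S; Y) in bits, where Y keeps the outputs selected by y_index."""
    p_sy, p_s, p_y = {}, {}, {}
    for (s, y1, y2), prob in joint.items():
        y = tuple((y1, y2)[i] for i in y_index)
        p_sy[(s, y)] = p_sy.get((s, y), 0.0) + prob
    for (s, y), prob in p_sy.items():
        p_s[s] = p_s.get(s, 0.0) + prob
        p_y[y] = p_y.get(y, 0.0) + prob
    return sum(
        prob * math.log2(prob / (p_s[s] * p_y[y]))
        for (s, y), prob in p_sy.items()
        if prob > 0.0
    )


if __name__ == "__main__":
    joint = joint_distribution(epsilon=1.0)
    print("I(S; Y1)     =", mutual_information(joint, (0,)))
    print("I(S; Y2)     =", mutual_information(joint, (1,)))
    print("I(S; Y1, Y2) =", mutual_information(joint, (0, 1)))
```

In this setup each output alone is independent of S, because A and B are each uniform regardless of S, so the single-silo values are zero up to floating-point error. The pair carries information through Y1 XOR Y2, which equals S XOR-ed with the combined randomized-response noise.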

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Scientist, Research Scientist, AI Security Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.