Hybrid privacy-aware semantic search: SVD-truncated document geometry and CKKS-encrypted query reranking under a restricted threat model
Summary
This hybrid privacy-aware semantic search scheme addresses the vulnerability of dense embeddings to inversion attacks, which can reconstruct source text from leaked vector databases. Existing defenses fall short: full homomorphic encryption is too slow at scale, while noise-based privacy mechanisms degrade ranking. The approach exploits the asymmetry between static document collections and dynamic queries. Document vectors are protected geometrically: they are truncated onto a lower-dimensional SVD subspace and rotated with a secret orthogonal transform. Queries are protected cryptographically: reranking runs under CKKS homomorphic encryption, so the server never sees the query or the scores. The scheme maintains ranking quality, even improving it on strong encoders, with sub-second latency for one million documents. It thwarts off-the-shelf inversion attacks, but document protection is an empirical obfuscation layer rather than a cryptographic primitive, in contrast to the cryptographic confidentiality of the query.
Key takeaway
For machine learning engineers or AI security engineers designing privacy-aware semantic search systems, this hybrid approach offers a compelling middle ground. You can achieve sub-second latency and preserve ranking quality on large document collections by combining SVD-based document obfuscation with CKKS-encrypted query reranking. However, the two guarantees differ in strength: document protection is an empirical layer, while query confidentiality is cryptographic, so check both against the threat model of your specific application.
Key insights
Hybrid semantic search protects documents via SVD truncation and queries via CKKS encryption, balancing privacy and performance.
Principles
- Exploit static collection/dynamic query asymmetry for privacy.
- SVD truncation can act as a linear denoiser, improving ranking.
- Empirical obfuscation offers practical, not cryptographic, document protection.
Method
Truncate document embeddings onto a lower-dimensional SVD subspace and rotate them with a secret orthogonal transform. Rerank candidate documents against a CKKS-encrypted query, with the CKKS parameters benchmarked offline.
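The document-side step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `fit_protected_index`, the uncentered SVD, the QR-based random rotation, and the toy sizes are all assumptions made for illustration. The sketch shows one property that follows from orthogonality: the secret rotation leaves inner-product scores unchanged, so only the truncation affects ranking.

```python
import numpy as np

def fit_protected_index(doc_emb, k, seed=0):
    """Project documents onto the top-k SVD subspace, then apply a rotation.

    Hypothetical helper: in a real deployment the rotation would be drawn
    from a secret source of randomness and kept away from the server.
    """
    rng = np.random.default_rng(seed)
    # Rows of vt are the right singular vectors of the document matrix.
    _, _, vt = np.linalg.svd(doc_emb, full_matrices=False)
    v_k = vt[:k].T                                   # shape (d, k)
    # Random orthogonal matrix from the QR factorization of a Gaussian matrix.
    q, r = np.linalg.qr(rng.standard_normal((k, k)))
    rotation = q * np.sign(np.diag(r))               # shape (k, k), orthogonal
    transform = v_k @ rotation.T                     # shape (d, k)
    return doc_emb @ transform, transform

# Toy data standing in for real encoder outputs (assumed sizes).
rng = np.random.default_rng(42)
docs = rng.standard_normal((200, 32))
query = rng.standard_normal(32)

protected_docs, transform = fit_protected_index(docs, k=8)
protected_query = query @ transform
scores_protected = protected_docs @ protected_query

# Reference: scores after truncation alone, with no rotation applied.
_, _, vt = np.linalg.svd(docs, full_matrices=False)
proj = vt[:8].T
scores_truncated = (docs @ proj) @ (query @ proj)

# The rotation does not change any score, hence does not change the ranking.
same_scores = np.allclose(scores_protected, scores_truncated)
```

In this sketch the query is transformed with the same matrix as the documents; in the scheme summarized here that transformed query would then be encrypted before it leaves the client.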
In practice
- Apply SVD truncation to protect static document collections.
- Use CKKS for secure query reranking in semantic search.
- Benchmark CKKS parameters offline for efficiency.
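One way to picture the encrypted reranking step is the following sketch in math. The notation is assumed, not taken from the original article: $\tilde{q}$ is the transformed query, $\tilde{d}_i$ are the stored (truncated and rotated) candidate vectors held in the clear on the server, and $\varepsilon_i$ is the approximation error inherent to CKKS.

```latex
\begin{aligned}
\text{client:}\quad & c = \mathrm{Enc}_{pk}(\tilde{q}) \\
\text{server:}\quad & c_i = \big\langle c,\ \tilde{d}_i \big\rangle
    \quad \text{(evaluated homomorphically, for each candidate } i\text{)} \\
\text{client:}\quad & s_i = \mathrm{Dec}_{sk}(c_i) = \tilde{q}^{\top}\tilde{d}_i + \varepsilon_i,
    \qquad \text{rank candidates by } s_i
\end{aligned}
```

Because CKKS is an approximate scheme, $\varepsilon_i$ depends on the chosen parameters (for example the scale and the modulus chain), which is one reason to benchmark those parameters offline: the error must stay small relative to the score gaps that decide the ranking.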
Topics
- Semantic Search
- Homomorphic Encryption
- SVD Truncation
- CKKS
- Privacy-Preserving AI
- Embedding Inversion Attacks
- Vector Databases
Best for: AI Architect, Research Scientist, CTO, AI Scientist, Machine Learning Engineer, AI Security Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.