FEnc^2: Unifying Data Packing for Efficient Private Inference via Convolution and Architecture-Aware Fragment Encoding
Summary
FEnc^2 is a unified, fragment-based encoding framework for CKKS-based private inference on convolutional neural networks. It targets the computational and memory overheads that inefficient ciphertext packing causes in Fully Homomorphic Encryption (FHE). The framework has two components: Conv-aware Encoding, which selects fragment sizes that minimize rotations, and Arch-aware Ct Compression, which restores ciphertext density after feature or channel reduction. Together they improve slot utilization, rotation complexity, and ciphertext density, reducing homomorphic operations by one to two orders of magnitude. Against the state-of-the-art Orion, FEnc^2 reports speedups of up to 228.83x on GPU and 226.06x on CPU for LeNet on MNIST, and up to 4.55x on GPU and 9.43x on CPU for MobileNet on ImageNet. The results indicate that application-level data layout is a critical architectural design dimension.
Key takeaway
For AI architects and machine learning engineers designing privacy-preserving inference systems, FEnc^2's unified fragment-based encoding reduces FHE overheads by optimizing data packing at the application level, which complements existing primitive-level optimizations. The approach is worth evaluating for CKKS-based CNN deployments where speed and memory utilization are bottlenecks. Reported speedups over Orion reach 228.83x for LeNet on MNIST (GPU) but range from 4.55x to 9.43x for MobileNet on ImageNet, so expect the gain to depend heavily on the model.
Key insights
FEnc^2 unifies data packing for FHE private inference, drastically reducing computational and memory overheads.
Principles
- Inefficient ciphertext packing is a primary FHE overhead.
- Application-level data layout is a first-order architectural design dimension.
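To make the first principle concrete, here is a minimal sketch of how layout alone changes ciphertext count and slot utilization. It assumes a CKKS ring degree of N = 2^16 (so N/2 = 32768 slots per ciphertext) and a 64 x 16 x 16 feature map; the function name `packing_stats`, the two layouts, and all numbers are illustrative assumptions, not taken from the FEnc^2 paper.

```python
import math

def packing_stats(channels, height, width, slots, channels_per_ct=None):
    """Return (ciphertext count, slot utilization) for a C x H x W tensor.

    channels_per_ct=None packs as many whole channels as fit (dense layout);
    channels_per_ct=1 models a naive one-channel-per-ciphertext layout.
    Hypothetical helper for illustration only.
    """
    per_channel = height * width
    if per_channel > slots:
        raise ValueError("a single channel does not fit in one ciphertext")
    max_fit = slots // per_channel
    k = max_fit if channels_per_ct is None else min(channels_per_ct, max_fit)
    num_ct = math.ceil(channels / k)
    utilization = channels * per_channel / (num_ct * slots)
    return num_ct, utilization

SLOTS = 2 ** 15  # assumed: ring degree N = 2**16 gives N/2 slots

naive = packing_stats(64, 16, 16, SLOTS, channels_per_ct=1)
dense = packing_stats(64, 16, 16, SLOTS)
print("naive:", naive)
print("dense:", dense)
```

Under these assumptions the naive layout needs 64 ciphertexts at under 1% utilization, while the dense layout needs one ciphertext at 50%. Every homomorphic operation is paid per ciphertext, which is why the layout choice scales the whole workload.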
Method
FEnc^2 employs Conv-aware Encoding to select optimal fragment sizes and Arch-aware Ct Compression to restore ciphertext density, reshaping encrypted workload structure and reducing homomorphic operations.
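The density problem that Arch-aware Ct Compression addresses can be sketched on plaintext vectors. The example below assumes a stride-2 layer that leaves its outputs in their original slot positions, so only one slot in four stays valid, and then gathers the valid slots of several sparse vectors into a dense one. It is index bookkeeping only: a real CKKS pipeline would need masked multiplications and rotations, and the helper names (`strided_valid_mask`, `density`, `compact`) are hypothetical, not the paper's algorithm.

```python
def strided_valid_mask(height, width, stride):
    """Slot mask after a strided layer that leaves its outputs in place."""
    return [
        1 if (r % stride == 0 and c % stride == 0) else 0
        for r in range(height)
        for c in range(width)
    ]

def density(mask):
    """Fraction of slots that still carry valid data."""
    return sum(mask) / len(mask)

def compact(sparse_vectors, masks):
    """Gather valid slots from sparse vectors into as few dense vectors as possible.

    Plaintext stand-in for ciphertext compression (illustrative assumption).
    """
    slots = len(sparse_vectors[0])
    valid = [
        value
        for vec, mask in zip(sparse_vectors, masks)
        for value, keep in zip(vec, mask)
        if keep
    ]
    if not valid:
        return []
    dense = [valid[i:i + slots] for i in range(0, len(valid), slots)]
    dense[-1] += [0] * (slots - len(dense[-1]))  # zero-pad the last vector
    return dense

# Four 8 x 8 channels, one per 64-slot vector, after a stride-2 layer.
mask = strided_valid_mask(8, 8, 2)
sparse = [[ch * 100 + i for i in range(64)] for ch in range(4)]
dense = compact(sparse, [mask] * 4)
print("density before:", density(mask))
print("vectors before/after:", len(sparse), len(dense))
```

In this toy setting four vectors at 25% density collapse into a single fully used vector, so later layers would operate on one ciphertext instead of four.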
In practice
- Optimizes encrypted tensor layout before execution.
- Reduces ciphertext count and workload pressure on hardware.
Topics
- Fully Homomorphic Encryption
- Private Inference
- Convolutional Neural Networks
- Data Packing
- Ciphertext Compression
- CKKS Scheme
- Performance Optimization
Best for: Research Scientist, Computer Vision Engineer, AI Scientist, Machine Learning Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.