SecureRouter: Encrypted Routing for Efficient Secure Inference

· Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy · Depth: Expert, extended

Summary

SecureRouter is an end-to-end encrypted routing and inference framework that accelerates secure Transformer inference by selecting a model for each input while that input stays encrypted. Prior privacy-preserving inference systems run a single, fixed Transformer model on every encrypted input, which leads to high latency and cost. SecureRouter integrates a secure router with a model pool optimized for secure multi-party computation (MPC), coordinating routing, inference, and protocol execution while maintaining full data and model confidentiality. The framework has two components: an MPC-cost-aware secure router trained to predict per-model utility and cost from encrypted features, and an MPC-optimized model pool co-trained to minimize MPC communication and computation overhead. Experiments on GLUE benchmarks show that SecureRouter reduces latency by up to 1.95x with negligible accuracy loss compared to fixed-model MPC baselines, and achieves nearly 50% lower average latency than the SecFormer framework.
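To put the reported speed-up factor in percentage terms, the short sketch below converts a latency ratio into a percent reduction. It assumes "1.95x" means baseline latency divided by SecureRouter latency; the function name is hypothetical. The SecFormer comparison is a separate measurement reported in the summary, so this arithmetic should not be read as deriving that figure.

```python
def latency_reduction_percent(speedup: float) -> float:
    """Convert a speed-up factor (baseline latency / new latency)
    into the percentage by which latency drops."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    # New latency is 1/speedup of the baseline, so the saving is the remainder.
    return (1.0 - 1.0 / speedup) * 100.0


if __name__ == "__main__":
    # Assumption: 1.95x is read as baseline latency / SecureRouter latency.
    print(round(latency_reduction_percent(1.95), 1))  # 48.7
```

Under that reading, the best-case 1.95x figure corresponds to latency falling to roughly 51% of the fixed-model baseline.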

Key takeaway

For AI Architects and Research Scientists deploying privacy-preserving Transformer models, SecureRouter offers a practical way to reduce inference latency and cost. By routing each encrypted input to a suitable model in an MPC-optimized pool, systems can achieve up to a 1.95x speed-up over fixed-model MPC baselines with negligible accuracy loss. Consider this input-adaptive approach to ease the computational bottlenecks of traditional MPC-based inference, especially in latency-critical applications in medicine and finance.

Key insights

SecureRouter accelerates encrypted Transformer inference via input-adaptive model selection and an MPC-optimized model pool.

Method

SecureRouter uses an offline training phase to optimize an MPC-cost-aware router and an MPC-optimized model pool. In the subsequent online inference phase, the router selects a model for each input from its encrypted features, using a secure argmax protocol and oblivious transfer.
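The selection step can be pictured with a minimal plaintext sketch. It is not the paper's protocol: in SecureRouter the scores are computed on encrypted features and the argmax runs under MPC, whereas here everything is in the clear. The scoring rule (utility minus a weighted cost), the `cost_weight` parameter, the model names, and all numbers are assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    name: str        # hypothetical model identifier
    utility: float   # router's predicted utility for this input
    mpc_cost: float  # router's predicted MPC cost (normalized, illustrative)


def route(candidates: Sequence[Candidate], cost_weight: float = 0.5) -> Candidate:
    """Return the candidate with the highest (utility - cost_weight * cost).

    Plaintext stand-in for the secure argmax; the scoring rule is an
    assumption, not taken from the source.
    """
    if not candidates:
        raise ValueError("model pool is empty")
    return max(candidates, key=lambda c: c.utility - cost_weight * c.mpc_cost)


# Illustrative predictions for an "easy" and a "hard" input (made-up values).
easy_input = [Candidate("small", 0.92, 0.2), Candidate("large", 0.94, 1.0)]
hard_input = [Candidate("small", 0.40, 0.2), Candidate("large", 0.90, 1.0)]

if __name__ == "__main__":
    print(route(easy_input).name)  # small
    print(route(hard_input).name)  # large
```

With these made-up values, the easy input goes to the cheap model because the large model's small utility gain does not justify its cost, while the hard input goes to the large model. That per-input trade-off is what input-adaptive routing is meant to capture.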

Best for: Research Scientist, AI Architect, AI Engineer, AI Scientist, Machine Learning Engineer, AI Security Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.