Apple working to cram massive Gemini model into iPhone to power new Siri
Summary
Apple is working to integrate Google's Gemini AI model into Siri on the iPhone, a feature initially promised in 2024 and since delayed multiple times. The integration, expected later this year, will likely take a hybrid approach that combines on-device processing with cloud-based AI, despite Apple's historical preference for local AI on privacy grounds. Although Apple's Neural Engine is optimized for efficient AI processing, smartphones generally lack the RAM and processing power for massive models like Gemini, which have trillions of parameters compared with the few billion in on-device models. Apple is distilling large Gemini models for local use and has reportedly partnered with Nvidia to use its Confidential Computing platform for complex cloud-based Siri requests. The platform addresses privacy concerns by keeping data encrypted while it is processed on Nvidia GPUs, and the service may be offered under Apple's Private Cloud Compute branding.
Key takeaway
If you are an AI architect evaluating on-device AI strategies, recognize that even with optimized silicon, large conversational models like Gemini necessitate a hybrid cloud approach. Your privacy-focused solutions may require a confidential computing partnership, such as one with Nvidia, to process complex requests securely off-device. Be aware that this hybrid model enables advanced AI but might introduce noticeable latency for users compared with purely local processing.
Key insights
Integrating large AI models like Gemini into smartphones requires a hybrid on-device and cloud approach due to hardware limitations.
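As a rough illustration of the hardware limitation, the memory needed just to store a model's weights can be estimated as parameter count times bits per weight. The sketch below uses illustrative sizes only (1 trillion parameters at 16 bits, 3 billion parameters at 4 bits); the source gives "trillions" and "a few billion" without exact figures, and the function name is hypothetical.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed to hold the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

# Illustrative sizes, not figures from the article.
cloud_model_gb = weight_memory_gb(1e12, 16)  # 1 trillion parameters, 16-bit weights
local_model_gb = weight_memory_gb(3e9, 4)    # 3 billion parameters, 4-bit weights

print(f"cloud-scale model: ~{cloud_model_gb:.0f} GB of weights")
print(f"on-device model:   ~{local_model_gb:.1f} GB of weights")
```

Under these assumptions the cloud-scale model needs on the order of terabytes for weights alone, while the small quantized model fits within a phone's RAM.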
Principles
- On-device AI models are significantly smaller than cloud models and are often quantized.
- Smartphone GPUs can outperform NPUs at general-purpose token generation.
- Distillation transfers capabilities from large to small models.
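To make the quantization principle concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is one common scheme chosen for illustration; the source does not say which quantization method any on-device model uses, and the function names are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [v * scale for v in quantized]

weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each weight now needs 8 bits instead of 32, at the cost of a small rounding error.
errors = [abs(a - b) for a, b in zip(weights, restored)]
print(q, max(errors))
```

The rounding error per weight is bounded by half the scale, which is why very small weights (such as 0.003 here) can collapse to zero.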
Method
Distillation trains a smaller model to mimic the outputs of a larger, resource-intensive model so that useful capabilities transfer to the smaller one. It is often combined with pruning, which removes less important weights.
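A minimal sketch of the mimicry objective, assuming the classic soft-target formulation: the student is penalized by the KL divergence between temperature-softened teacher and student output distributions. This is a generic textbook loss, not a description of how Apple or Google distill Gemini; the logits and the temperature value are made up for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 4.0, 1.0]    # prefers a different answer

print(distillation_loss(close_student, teacher))
print(distillation_loss(far_student, teacher))
```

Training minimizes this loss over many inputs, so a student that ranks outputs the way the teacher does scores lower than one that disagrees.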
In practice
- Use Gemini Nano for contextual on-device features like summarization.
- Implement confidential computing for cloud-based AI to enhance privacy.
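The hybrid pattern behind these two practices can be sketched as a simple router that keeps lightweight requests on the device and sends the rest to a confidential-computing backend. Everything here is hypothetical: the `Request` fields, the token budget, and the routing rule are illustrative assumptions, not Apple's or Google's actual design.

```python
from dataclasses import dataclass

# Hypothetical limit on how much text the small local model handles well.
LOCAL_TOKEN_BUDGET = 512

@dataclass
class Request:
    text: str
    needs_world_knowledge: bool = False  # hypothetical flag set by an upstream classifier

def route(request: Request) -> str:
    """Return 'on_device' or 'confidential_cloud' for a request."""
    approx_tokens = len(request.text.split())  # crude whitespace token estimate
    if request.needs_world_knowledge or approx_tokens > LOCAL_TOKEN_BUDGET:
        return "confidential_cloud"
    return "on_device"

print(route(Request("Summarize this note about tomorrow's meeting")))
print(route(Request("Plan a week-long trip to Japan", needs_world_knowledge=True)))
```

A real system would also have to weigh the latency of the cloud round trip, which is the trade-off noted in the key takeaway.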
Topics
- On-device AI
- Cloud AI
- Gemini Model
- Siri Integration
- Confidential Computing
- AI Model Distillation
Best for: AI Product Manager, Investor, CTO, AI Architect, Director of AI/ML, Tech Journalist
Editorial summary, takeaway, and curation by AIssential. Original article published by AI - Ars Technica.