Google's Gemini 3.5 Live Translate delivers real-time voice translation across 70+ languages
Summary
Google has released Gemini 3.5 Live Translate, a new real-time audio translation model supporting over 70 languages. This model automatically detects languages and, according to Google, maintains the speaker's tone, pace, and pitch while translating continuously without waiting for sentence completion. It is now available for developers through the Gemini Live API and Google AI Studio as a preview. Businesses can access it in Google Meet, where it expands language support from five to over 70 languages, enabling more than 2,000 language combinations. All users can also find it in the Google Translate app on Android and iOS. Ride-hailing service Grab is reportedly testing the model for driver-passenger communication, and all generated audio is tagged with an inaudible SynthID watermark.
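As a quick sanity check on the "more than 2,000 language combinations" figure, the short sketch below counts pairs among 70 languages. The source does not say how Google counts a combination, so treating it as an unordered pair is an assumption; counting each translation direction separately gives a larger number.

```python
from math import comb


def language_pairs(n_languages: int) -> tuple[int, int]:
    """Return (unordered pairs, ordered source->target pairs)."""
    unordered = comb(n_languages, 2)            # {A, B} counted once
    ordered = n_languages * (n_languages - 1)   # A->B and B->A counted separately
    return unordered, ordered


unordered, ordered = language_pairs(70)
print(unordered, ordered)  # 2415 4830
```

With exactly 70 languages, the unordered count (2,415) matches "more than 2,000"; the previous five-language support would give 10 unordered pairs under the same assumption.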
Key takeaway
For AI engineers building multilingual communication platforms, Gemini 3.5 Live Translate offers real-time voice translation across 70+ languages. The model is available in preview through the Gemini Live API, with automatic language detection and continuous translation that, according to Google, preserves the speaker's tone, pace, and pitch. These capabilities could improve applications that depend on cross-language conversation, such as ride-hailing services or international conferencing.
Key insights
Google's Gemini 3.5 Live Translate offers real-time, natural-sounding voice translation across 70+ languages, available via API and apps.
Principles
- Automatic language detection streamlines real-time translation.
- Preserving the speaker's tone, pace, and pitch makes the translated audio sound more natural.
- Continuous translation improves conversational flow.
Method
The model detects the spoken language automatically, translates speech to speech continuously without waiting for a sentence to finish, and preserves the speaker's vocal characteristics. For presentations, the speaker creates a session and attendees join it via a QR code or URL.
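The shared-session flow for presentations can be pictured with a small in-memory model. This is only an illustrative sketch: `TranslationSession`, `BASE_URL`, and every field name are hypothetical and are not part of the Gemini Live API or any Google product.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional

# Placeholder address for illustration only; not a real endpoint.
BASE_URL = "https://example.invalid/join"


@dataclass
class TranslationSession:
    """Hypothetical model of a speaker-created session that attendees join."""
    speaker: str
    source_language: Optional[str] = None  # None: rely on automatic detection
    session_id: str = field(default_factory=lambda: secrets.token_urlsafe(8))
    attendees: dict[str, str] = field(default_factory=dict)  # name -> target language

    @property
    def join_url(self) -> str:
        # The link a QR code would encode for attendees.
        return f"{BASE_URL}/{self.session_id}"

    def join(self, name: str, target_language: str) -> None:
        self.attendees[name] = target_language

    def target_languages(self) -> set[str]:
        # Distinct languages that need a translated audio stream.
        return set(self.attendees.values())


session = TranslationSession(speaker="Ana")
session.join("Bo", "de")
session.join("Chen", "zh")
session.join("Dana", "de")
print(session.join_url)
print(sorted(session.target_languages()))  # ['de', 'zh']
```

Grouping attendees by target language means one translated stream per language rather than one per attendee, which is a design choice of this sketch and not something the source describes.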
In practice
- Integrate real-time voice translation via Gemini Live API.
- Enable live dubbing for video or audio streams.
- Support multilingual presentations with shared sessions.
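Continuous translation implies that audio is streamed in small pieces and not as whole utterances. The sketch below shows only the client-side chunking step, assuming 16 kHz, 16-bit mono PCM and 20 ms frames; these values are illustrative assumptions, and the actual request format of the Gemini Live API is not described in the source and is left out.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 16_000  # assumed input sample rate
BYTES_PER_SAMPLE = 2     # 16-bit mono PCM (assumption)
FRAME_MS = 20            # assumed frame duration


def pcm_frames(pcm: bytes, frame_ms: int = FRAME_MS) -> Iterator[bytes]:
    """Yield fixed-duration frames so audio can be sent as it is captured."""
    frame_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * frame_ms // 1000
    for start in range(0, len(pcm), frame_bytes):
        # The final frame may be shorter than frame_bytes.
        yield pcm[start:start + frame_bytes]


one_second = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)  # one second of silence
frames = list(pcm_frames(one_second))
print(len(frames), len(frames[0]))  # 50 640
```

Each frame would then be handed to whatever streaming client the integration uses; smaller frames lower latency at the cost of more messages.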
Topics
- Gemini 3.5 Live Translate
- Real-time Voice Translation
- Speech-to-Speech AI
- Multilingual Communication
- Gemini API
- SynthID Watermark
Best for: Machine Learning Engineer, CTO, VP of Engineering/Data, AI Engineer, NLP Engineer, Tech Journalist
Editorial summary, takeaway, and curation by AIssential. Original article published by The Decoder.