Google's Gemini 3.5 Live Translate delivers real-time voice translation across 70+ languages

2026-06-09 · Source: The Decoder · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Intermediate, quick

Summary

Google has released Gemini 3.5 Live Translate, a new real-time audio translation model supporting over 70 languages. This model automatically detects languages and, according to Google, maintains the speaker's tone, pace, and pitch while translating continuously without waiting for sentence completion. It is now available for developers through the Gemini Live API and Google AI Studio as a preview. Businesses can access it in Google Meet, where it expands language support from five to over 70 languages, enabling more than 2,000 language combinations. All users can also find it in the Google Translate app on Android and iOS. Ride-hailing service Grab is reportedly testing the model for driver-passenger communication, and all generated audio is tagged with an inaudible SynthID watermark.
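The "more than 2,000 language combinations" figure can be sanity-checked with simple counting. The sketch below assumes exactly 70 languages and that every language can be paired with every other one; neither assumption is confirmed by the source, and the function name is illustrative only.

```python
from math import comb


def language_pairs(n: int) -> tuple[int, int]:
    """Return (unordered pairs, ordered source->target directions) for n languages."""
    unordered = comb(n, 2)   # {A, B} counted once
    ordered = n * (n - 1)    # A->B and B->A counted separately
    return unordered, ordered


print(language_pairs(5))   # (10, 20)
print(language_pairs(70))  # (2415, 4830)
```

Under these assumptions, 70 languages give 2,415 unordered pairs, which matches "more than 2,000" if each pair is counted once regardless of direction.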

Key takeaway

For AI engineers building multilingual communication platforms, Gemini 3.5 Live Translate is a candidate for real-time voice translation across 70+ languages. It is available in preview through the Gemini Live API, with automatic language detection and continuous translation that, according to Google, preserves the speaker's tone, pace, and pitch. Likely fits include driver-passenger communication in ride-hailing services and international conferencing.

Key insights

Google's Gemini 3.5 Live Translate offers real-time voice translation across 70+ languages. It is available in preview through the Gemini Live API and Google AI Studio, and in Google Meet and the Google Translate app.

Principles

Automatic language detection streamlines real-time translation.
Preserving the speaker's tone, pace, and pitch makes translated speech sound more natural.
Continuous translation improves conversational flow.

Method

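The model detects the spoken language automatically, translates speech to speech continuously rather than waiting for a sentence to finish, and, according to Google, preserves the speaker's tone, pace, and pitch. For presentations, a speaker creates a session and attendees join by QR code or URL.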
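The presentation flow described above can be pictured as a small session object. This is a minimal sketch, not Google's implementation: the class, its fields, the join URL, and the idea that each attendee picks a target language are all hypothetical, and no Gemini API call is made.

```python
import secrets
from dataclasses import dataclass, field

# Hypothetical base URL for illustration only; not a real Google endpoint.
JOIN_BASE_URL = "https://example.invalid/join"


@dataclass
class PresentationSession:
    speaker: str
    session_id: str = field(default_factory=lambda: secrets.token_urlsafe(8))
    # Maps attendee name -> requested target language code (assumed behavior).
    attendees: dict[str, str] = field(default_factory=dict)

    @property
    def join_url(self) -> str:
        """URL an attendee would open, or scan as a QR code, to join."""
        return f"{JOIN_BASE_URL}/{self.session_id}"

    def join(self, attendee: str, target_language: str) -> None:
        self.attendees[attendee] = target_language

    def target_languages(self) -> set[str]:
        """Distinct languages the speaker's audio would need to be translated into."""
        return set(self.attendees.values())


session = PresentationSession(speaker="Ana")
session.join("Kenji", "ja")
session.join("Lucie", "fr")
session.join("Marc", "fr")
print(session.join_url)
print(sorted(session.target_languages()))  # ['fr', 'ja']
```

Grouping attendees by target language means a real integration would need one translation stream per distinct language rather than one per attendee, though whether the actual product works this way is not stated in the source.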

In practice

Integrate real-time voice translation via Gemini Live API.
Enable live dubbing for video or audio streams.
Support multilingual presentations with shared sessions.

Topics

Gemini 3.5 Live Translate
Real-time Voice Translation
Speech-to-Speech AI
Multilingual Communication
Gemini API
SynthID Watermark

Best for: Machine Learning Engineer, CTO, VP of Engineering/Data, AI Engineer, NLP Engineer, Tech Journalist

Editorial summary, takeaway, and curation by AIssential. Original article published by The Decoder.