Introducing updated GPT Voice Models in Microsoft Foundry
Summary
Microsoft Foundry has released updated GPT voice models, now generally available via API and designed for production-ready real-time audio applications. "gpt-realtime-mini-2025-12-15" targets low-latency voice agents with improved prosody and voice fidelity, and supports voice cloning for trusted customers. "gpt-4o-mini-transcribe-2025-12-15" offers up to 50% lower word error rate (WER) on English benchmarks and produces up to 4x fewer hallucinations on silence. "gpt-4o-mini-tts-2025-12-15" provides multilingual speech synthesis with 35% fewer word errors on multilingual benchmarks and up to 3x lower WER in non-English languages, and also supports voice cloning. All of these enhancements are available at current pricing.
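To make the WER figures above concrete, here is a minimal sketch of how word error rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference transcript. The function name and the sample sentences are illustrative assumptions, not part of the Microsoft announcement, and the benchmarks cited above may apply text normalization that this sketch omits.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative helper (hypothetical name); only lowercases and splits
    on whitespace, with no punctuation or number normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


reference = "turn on the kitchen lights"
older = word_error_rate(reference, "turn on kitchen light")      # 2 errors / 5 words
newer = word_error_rate(reference, "turn on the kitchen light")  # 1 error / 5 words
print(f"older: {older:.2f}, newer: {newer:.2f}")
```

In this made-up example the second transcript has half the errors of the first, which is what a "50% lower WER" claim means at the level of a single utterance.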
Key takeaway
For AI Architects and NLP Engineers building real-time voice applications, these updated GPT models in Microsoft Foundry offer significant performance gains in accuracy, latency, and multilingual support without increased cost. You should evaluate "gpt-realtime-mini-2025-12-15" for voice agents, "gpt-4o-mini-transcribe-2025-12-15" for transcription, and "gpt-4o-mini-tts-2025-12-15" for speech synthesis to enhance your systems and maintain brand voice consistency through features like voice cloning.
Key insights
Microsoft Foundry's new GPT voice models offer enhanced real-time performance, accuracy, and multilingual capabilities at existing price points.
Principles
- Reliability and latency are critical for production voice models.
- Voice cloning ensures consistent brand voice.
- Cost-efficiency is maintained despite performance upgrades.
In practice
- Use "gpt-realtime-mini" for low-latency voice agents.
- Employ "gpt-4o-mini-transcribe" for robust real-time transcription.
- Apply "gpt-4o-mini-tts" for natural multilingual speech synthesis.
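The task-to-model guidance above can be captured in a small lookup table. This is only a sketch: the model identifiers come from the summary, while the task keys and the `model_for` helper are hypothetical names chosen for illustration. It also assumes you address models by these identifiers; if your Foundry resource uses custom deployment names, substitute those instead.

```python
# Model identifiers as named in the summary; task keys are illustrative.
VOICE_MODELS = {
    "voice_agent": "gpt-realtime-mini-2025-12-15",
    "transcription": "gpt-4o-mini-transcribe-2025-12-15",
    "speech_synthesis": "gpt-4o-mini-tts-2025-12-15",
}


def model_for(task: str) -> str:
    """Return the model identifier suggested for a task (hypothetical helper)."""
    try:
        return VOICE_MODELS[task]
    except KeyError:
        known = ", ".join(sorted(VOICE_MODELS))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}") from None


print(model_for("transcription"))
```

Keeping the mapping in one place makes it easier to swap in a newer dated snapshot later without touching the calling code.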
Topics
- GPT Voice Models
- Speech Synthesis
- Speech Transcription
- Voice Cloning
- Microsoft Foundry API
Best for: AI Architect, NLP Engineer, CTO, AI Engineer, Machine Learning Engineer, Software Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published on the Microsoft Foundry Blog.