Voxtral transcribes at the speed of sound
Summary
Mistral has released Voxtral Transcribe 2, a family of two speech-to-text models that follows the original Voxtral from July 2025. The first, `Voxtral-Mini-4B-Realtime-2602`, is open-weights under an Apache-2.0 license, available as an 8.87 GB download from Hugging Face, and supports real-time transcription. The second, `voxtral-mini-latest`, is closed-weights and available through the Mistral API at $0.003 per minute ($0.18 per hour). The Mistral API console now includes a speech-to-text playground for `voxtral-mini-latest`; it produces diarized transcripts that can be downloaded as text, SRT, or JSON.
Key takeaway
Engineering teams evaluating speech-to-text have two options in Voxtral Transcribe 2. Choose the open-weights `Voxtral-Mini-4B-Realtime-2602` if local deployment and cost control matter most, or the API-based `voxtral-mini-latest` if you want diarization and context biasing from a managed service. Either way, run a sample of your own audio through the API playground first to judge transcription quality.
Key insights
Voxtral Transcribe 2 pairs an open-weights model built for real-time transcription with an API-only model that adds diarization.
Principles
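- Open weights under a permissive license (Apache-2.0 here) let teams download, run, and adapt a model on their own hardware.
- A hosted API lets developers add transcription without operating inference infrastructure.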
Method
The Mistral API allows audio transcription with diarization, context biasing, and timestamp granularities via a POST request to `/v1/audio/transcriptions`.
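As a rough illustration of the request described above, here is a minimal Python sketch that builds a multipart POST to `/v1/audio/transcriptions` using only the standard library. Several details are assumptions rather than facts from the source: the base URL `https://api.mistral.ai`, the bearer-token header, and the form field names (`model`, `file`, `diarize`, `context_bias`, `timestamp_granularities`) and their example values. Check them against Mistral's API reference before relying on them. The helper names `build_transcription_request` and `transcribe` are hypothetical.

```python
import json
import os
import urllib.request
import uuid

# Assumed base URL; only the path /v1/audio/transcriptions comes from the article.
API_URL = "https://api.mistral.ai/v1/audio/transcriptions"


def build_transcription_request(audio: bytes, filename: str, api_key: str,
                                fields: dict) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text form fields, one multipart section each.
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode()
        )
    # The audio file itself ("file" is an assumed field name).
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode()
        + audio + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def transcribe(path: str, api_key: str) -> dict:
    """Send the request and return the parsed JSON response."""
    with open(path, "rb") as f:
        audio = f.read()
    request = build_transcription_request(
        audio,
        os.path.basename(path),
        api_key,
        {
            "model": "voxtral-mini-latest",
            "diarize": "true",                     # assumed field name and value
            "context_bias": "Datasette",           # assumed field name; example term
            "timestamp_granularities": "segment",  # assumed field name and value
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `transcribe("talk.m4a", os.environ["MISTRAL_API_KEY"])` would upload the file and return whatever JSON the service responds with; the sketch makes no assumption about the shape of that response.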
In practice
- Use `Voxtral-Mini-4B-Realtime-2602` for local, real-time transcription.
- Integrate `voxtral-mini-latest` via the Mistral API for diarized transcripts.
- Use the API console's playground for quick tests on uploaded files.
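If you work with diarized output in code rather than downloading it from the playground, turning segments into SRT is a short transformation. The sketch below assumes a hypothetical segment shape, a list of dicts with `start` and `end` in seconds, `text`, and an optional `speaker` label. The source does not document the actual JSON schema, so adapt the keys to what the API returns.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list) -> str:
    """Render diarized segments as SRT text.

    Each segment is assumed (hypothetically) to look like:
    {"start": 0.0, "end": 2.5, "speaker": "speaker_1", "text": "Hello"}
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        text = seg["text"].strip()
        speaker = seg.get("speaker")
        # Prefix the speaker label when diarization provided one.
        line = f"[{speaker}] {text}" if speaker else text
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{line}\n"
        )
    # SRT cues are separated by a blank line.
    return "\n".join(blocks)
```

Keeping the speaker label inside the cue text is one simple choice; SRT has no dedicated speaker field, so how to display speakers is up to the consumer.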
Topics
- Audio Transcription
- Mistral AI
- Open-weights Models
- Speech-to-Text API
- Real-time AI
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Engineer, Machine Learning Engineer, Software Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Simon Willison's Weblog.