We’ve released an updated Nova-3 Multilingual speech-to-text model that improves accuracy across all supported languages. Mean Word Error Rate (WER) drops by ~34% relative in batch mode and ~21% relative in streaming mode, with the largest gains in code-switching scenarios.
This update focuses on improving real-world multilingual speech recognition, especially for inputs that mix languages within a single utterance or conversation.
Key improvements include:
- Lower Word Error Rate (WER) across both batch and streaming inference
- Significantly improved handling of code-switching, reducing word drops when languages are mixed
- No API or configuration changes required: the updated model is live now
Why This Update Matters
Speech recognition in the real world is messy. People switch languages mid-sentence, mix vocabulary, speak with varied accents, and move fluidly between contexts. Solving for this complexity is one of the core challenges in multilingual speech-to-text and automatic speech recognition (ASR).
The difficulty is sharpest when languages are mixed within the same conversation, or even the same sentence. Consider a bilingual English/Spanish speaker saying:
“I was charged twice, pero solo hice una compra.” (Translation: I was charged twice, but I only made one purchase.)
In situations like this, models must correctly recognize words as speakers switch languages mid-sentence. Historically, these transitions have been challenging for multilingual systems.
Improving performance in these scenarios requires retraining and evaluation across datasets that include both monolingual and mixed-language audio.
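To make the cost of a dropped word concrete, here is a minimal sketch of how WER is commonly computed: word-level edit distance divided by the number of reference words. The normalization step and the example hypothesis (a transcript that drops "pero solo") are illustrative assumptions, not the scoring pipeline or output of any specific model.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase and keep word characters only (a simplistic normalization)."""
    return re.findall(r"\w+", text.lower())


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = tokenize(reference), tokenize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


reference = "I was charged twice, pero solo hice una compra."
# Hypothetical transcript that drops two words at the language switch.
hypothesis = "I was charged twice, hice una compra."

# 2 deleted words out of 9 reference words
wer = word_error_rate(reference, hypothesis)
```

In this made-up case, two dropped words at the switch point account for the entire error, which is why word drops weigh so heavily on code-switching WER.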
Supported Languages
Nova-3 Multilingual supports the following languages: English, Spanish, French, German, Hindi, Italian, Japanese, Dutch, Russian, and Portuguese. For more information about supported languages and model capabilities, visit our Models & Languages documentation page.
What Went Into This Update
This release reflects a retrained Nova-3 Multilingual model evaluated across a diverse set of multilingual benchmarks.
We made advances in:
- Curriculum: the order and types of data we show the model, so that it gets appropriate exposure to multilingual and code-switching data during training.
- Data Curation: how we filter and select data, so that the model trains on accurately labeled data, especially with respect to code-switching.
Overall Word Error Rate (WER) Improvement
Note: Mean WER is an unweighted average across datasets. Aggregate WER is weighted by total word count across datasets.
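The difference between the two averages in the note can be sketched as follows. The dataset names, error counts, and word counts are invented for illustration and are not results from this release; the sketch also assumes that "weighted by total word count" means total errors divided by total reference words.

```python
from dataclasses import dataclass


@dataclass
class DatasetResult:
    name: str
    errors: int     # substitutions + deletions + insertions
    ref_words: int  # total words in the reference transcripts

    @property
    def wer(self) -> float:
        return self.errors / self.ref_words


def mean_wer(results: list[DatasetResult]) -> float:
    """Unweighted average: every dataset counts equally, regardless of size."""
    return sum(r.wer for r in results) / len(results)


def aggregate_wer(results: list[DatasetResult]) -> float:
    """Word-count-weighted average: total errors over total reference words."""
    return sum(r.errors for r in results) / sum(r.ref_words for r in results)


def relative_reduction(old_wer: float, new_wer: float) -> float:
    """Fraction by which WER fell relative to the old value."""
    return (old_wer - new_wer) / old_wer


# Hypothetical numbers, chosen only to show how the two averages diverge.
results = [
    DatasetResult("large_monolingual_set", errors=100, ref_words=1000),  # WER 0.10
    DatasetResult("small_code_switching_set", errors=30, ref_words=100),  # WER 0.30
]

print(f"mean WER:      {mean_wer(results):.4f}")
print(f"aggregate WER: {aggregate_wer(results):.4f}")
print(f"relative reduction from 0.20 to 0.15: {relative_reduction(0.20, 0.15):.0%}")
```

With these invented inputs, the small, harder dataset pulls mean WER well above aggregate WER, so the two metrics can move differently between model versions.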
Code-Switching Word Error Rate (WER) by Language
The charts below show Word Error Rate (WER) by language on code-switching datasets, comparing the previous and current Nova-3 Multilingual models across both batch and streaming modes.
Multilingual Keyterm Prompting
Nova-3 Multilingual supports Keyterm Prompting. This allows developers to guide transcription toward domain-specific terminology, brand names, product names, and keywords, without retraining models or managing custom vocabularies.
Keyterm Prompting is applied dynamically at inference time, making customization fast and flexible across languages.
This capability is especially valuable for:
- Call centers and customer support systems
- Voice agents and IVR applications
- Industry-specific analytics and transcription workflows
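As a rough sketch of what inference-time keyterms might look like in a request, the helper below builds a transcription URL with one repeated query parameter per term. The endpoint, the `model`, `language`, and `keyterm` parameter names, and the `multi` language value are assumptions here and should be checked against the current API reference. `build_listen_url` is a hypothetical helper, not part of any SDK, and the example terms are made up.

```python
from urllib.parse import urlencode

# Assumed endpoint; confirm against the API reference before use.
BASE_URL = "https://api.deepgram.com/v1/listen"


def build_listen_url(
    keyterms: list[str],
    model: str = "nova-3",    # assumed model identifier
    language: str = "multi",  # assumed value selecting multilingual mode
) -> str:
    """Build a request URL with one `keyterm` query parameter per term."""
    params = [("model", model), ("language", language)]
    params += [("keyterm", term) for term in keyterms]
    return f"{BASE_URL}?{urlencode(params)}"


# Hypothetical domain terms for a bilingual billing-support workflow.
url = build_listen_url(["Deepgram", "cargo duplicado"])
print(url)
```

Because the terms travel with each request, they can be changed per call or per customer without touching the model.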
What This Means for Builders
For developers and enterprises building multilingual speech-to-text voice experiences, these improvements translate into:
- Fewer transcription errors
- Reduced manual correction
- More reliable downstream analytics
- Stronger performance on mixed-language, real-world audio
Build Globally with Deepgram and Unlock Enterprise-Grade Voice AI Today
The updated model is live now and serves as the default Nova-3 Multilingual production model, with no API or configuration changes required. Sign up free and unlock $200 in credits, enough to power over 750 hours of transcription or 200 hours of speech-to-text across Nova-3’s growing language suite. Explore details on our Models & Languages Overview page and experience Nova-3’s world-class adaptability for yourself.
