ASR LANGUAGE MODELS

Every voice. Heard and understood.

Deepgram provides the most accurate speech models in the industry, with dozens of languages and dialects to choose from, in numerous use case models, and model tiers. Find the best language/model fit for your need.

Dozens of Languages. Multiple Form Factors.

With our team of linguists, our AI researchers have an unfair advantage. We understand the nuances of every language we build, which in turn gives you higher accuracy. Since our language models are created exclusively with End-to-End Deep Learning, we can perform transfer learning from one language to another, and quickly support new languages and dialects to better meet your use case. Don’t see your language listed below? Contact us, as new languages and dialects are released frequently.

More Voices. Transcribed More Accurately.

Model Types Available:
=General, =Conversational AI, =Earnings, =Meeting, =Phone Call, =Videos, =Voicemail

Base and Enhanced model tiers available on most language and use case models. Model training available as well. View Documentation for details.

FREE* = Free to transcribe for a limited time on our hosted environment.

Dutch

Did you know that Dutch and Flemish, despite the different names, are varieties of the same language, just spoken in different countries? Our model works for both, regardless of where the speaker is from. Deepgram’s Dutch speech recognition model is suitable for general use cases.

 

  • Dutch—nl

    General model

English

Over 1.348 billion people speak English as a first or second language across the globe in hundreds of different dialects. That’s why we have different speech models for the most widely spoken varieties of English. Deepgram’s English speech recognition models are suitable for the following dialects and use cases:

 

  • English—en            
  • English—en-AU
  • English—en-US            
  • English—en-GB  
  • English—en-NZ  
  • English—en-IN  

French

French has a large number of homophones—words with the same pronunciation but different spellings and meanings—which is a challenge that our models can handle. Deepgram’s French speech recognition models are suitable for general audio use cases and the following dialects:

 

  • French—fr 
  • French Canadian—fr-CA

German

German and English are close cousins, but even with 60% of their vocabulary sharing a common origin, one model doesn’t work for both. Deepgram German speech recognition model is suitable for general use cases.

 

  • German—de

Hindi

Hindi has some 37 consonants, compared to only 24 in English. But that’s not a problem for Deepgram’s Hindi speech recognition models, which are suitable for general use cases. Our model can output Hindi in either Devanagari or Latin script.

 

  • Hindi—hi
  • Hindi— hi-Latn

Indonesian

You might be surprised to learn that you know two Indonesian words. Orangutan is a compound of orang ‘person’ and hutan ‘forest’. Deepgram’s Indonesian speech recognition model is suitable for general use cases.

 

  • Indonesian—id

Italian

Italian is spoken in a number of dialects up and down the peninsula. Our model was designed with this in mind, and works well with a range of standard Italian varieties. Deepgram’s Italian speech recognition model is suitable for general use cases.

 

  • Italian—it

Japanese

Japanese uses a combination of three different writing systems (four if you count romaji, or words written in Latin characters). We match pronunciation to the correct transcription so that it can be easily read by Japanese speakers. Deepgram’s Japanese speech recognition model is suitable for general use cases.

 

  • Japanese—ja

Korean

Korean’s own script, Hangul, is made of characters said to represent the shape of your mouth when you pronounce them. And rather than being written sound by sound, the sounds are stacked into blocks that represent syllables. Deepgram’s Korean speech recognition model is suitable for general use cases.

 

  • Korean—ko

Mandarin

Mandarin has the most native speakers of any language at 920 million—almost twice as many as the second language, Spanish. Deepgram’s Mandarin speech recognition models are suitable for general use cases, and can output transcripts in either Simplified or Traditional characters.

 

  • Mandarin, Simplified script—
    zh (cmn-Hans-CN)
  • Mandarin, Traditional script—
    zh-TW (cmn-Hant-TW)

Norwegian

Did you know English has borrowed several words from Norwegian, including krill, lemming, and ski? Deepgram’s Norwegian speech recognition model is suitable for general use cases.

 

  • Norwegian—no

Polish

Polish was a lingua franca from 1500 to 1700 in Central and parts of Eastern Europe, and is now spoken by 50 million people worldwide. It is also one of the most difficult languages for native English speakers to learn. Feel free to try our Polish model out and see if we accurately transcribe the language’s longest word: Dziewięćsetdziewięćdziesięci- odziewięcionarodowościowego.

 

  • Polish

Portuguese

European and Brazilian Portuguese are considerably different from each other than, say, US and British English, which  is why we have three distinct speech models (Mixed, Portugal and Brazil). Deepgram’s Portuguese speech recognition model is suitable for general use cases and dialects:

 

  • Portuguese—pt
  • Portuguese—pt-PT
  • Portuguese—pt-BR

Russian

Russian is one of the languages of space, so astronauts need to learn Russian as part of their training for the International Space Station. Luckily for Deepgram’s partnership with NASA, we can transcribe both English and Russian. Deepgram’s Russian speech recognition model is suitable for general use cases.

 

  • Russian—ru

Spanish

Spanish is an official language in 21 countries and has half a billion speakers worldwide, which means dialects matter.  We created our Spanish language model to recognize a variety of regional accents and dialects, making a great fit for the diverse variety of Spanish spoken throughout the Americas. Deepgram’s Spanish speech recognition models are suitable for general use cases. 

 

  • Spanish—es
  • Spanish Latin America—es-419

Swedish

Written Swedish is very close to Danish and Norwegian. The spoken languages are more distinct from each other, though, which is why we have a model specifically for Swedish. Deepgram’s Swedish speech recognition models are suitable for general use cases.

 

  • Swedish—sv

Turkish

In Turkish, one word can be equivalent to a whole sentence in another language. An example is gitmeyecekmişçesine ‘it’s as if (s)he won’t leave’. Luckily, our ASR model knows about all Turkish morphology to correctly transcribe roots and suffixes. Deepgram’s Turkish speech recognition model is suitable for general use cases.

 

  • Turkish—tr

Ukrainian FREE*

When we realized there were no high-quality Ukrainian speech recognition models available, we moved fast to fix it. We launched our Ukrainian model in just two weeks and have been offering it since. Deepgram’s Ukrainian speech recognition model is suitable for general use cases.

 

  • Ukrainian—uk