Different Environments Call for Different Speech Recognition Models

Every customer interaction is unique, but all-in-one speech recognition models don’t account for these differences. Deepgram’s models are highly accurate for your use case because we know that transcribing the conversation at an automated drive-thru doesn’t have the same requirements as transcribing an earnings call.


One Size Does Not Fit All

How many times have you seen speech recognition companies claim their all-in-one solution works great for all use cases and industries? How can they possibly work great for everything? At Deepgram, we know this is not possible.

We know that each customer uses unique words, jargon, and terminology. On top of that, some situations have a high level of environmental noise or crosstalk that needs to be filtered or separated to get the transcription accuracy you need. We have optimized our speech models for different situations so they can filter out unwanted audio, recognize unique terminology and jargon, and handle the noise and other factors specific to each use case.

Use Case Models That Get the Job Done Right

Don’t see a model below that matches your need? We can start with one of our base use case models and quickly, within weeks, train a tailored speech model for your use case.


Conversational AI

Created for conversational AI voicebots and IVR applications, where certain words matter more than others for determining intent.

Earnings Calls

Created for transcribing the audio or video presentation of earnings reports and follow-on Q&A sessions. The most important aspect of this model is the accurate transcription of financial terms.


General

This is our first and most general model, and it can be used for general transcription needs.


Meeting

Created for meetings that may have multiple speakers on one audio channel, crosstalk, and/or environmental noise.

Phone Call

Created for contact centers and other two-channel phone calls where each speaker is on a separate channel.


Video

Created for single or multi-speaker videos that may have background noise.


Voicemail

Created for voicemail transcription, where there is normally just one speaker on one channel with fairly clear audio.
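For developers, choosing one of the use case models described above could look roughly like the sketch below. It only builds a request URL and sends nothing. The endpoint, the `model` and `multichannel` query parameters, and the model identifier strings are assumptions for illustration and do not come from this page, so check the documentation for the exact names.

```python
from urllib.parse import urlencode

# Assumed endpoint: not stated on this page, confirm in the documentation.
BASE_URL = "https://api.deepgram.com/v1/listen"

# Hypothetical identifiers for the use case models described above.
USE_CASE_MODELS = {
    "conversational ai": "conversationalai",
    "earnings calls": "finance",
    "general": "general",
    "meeting": "meeting",
    "phone call": "phonecall",
    "video": "video",
    "voicemail": "voicemail",
}


def build_request_url(use_case: str, multichannel: bool = False) -> str:
    """Return a transcription request URL for the given use case.

    Unknown use cases fall back to the general model.
    """
    key = use_case.strip().lower()
    model = USE_CASE_MODELS.get(key, "general")
    params = {"model": model}
    if multichannel:
        # Two-channel phone calls keep each speaker on a separate channel.
        params["multichannel"] = "true"
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_request_url("Phone Call", multichannel=True))
    print(build_request_url("Voicemail"))
```

Falling back to the general model for an unrecognized use case mirrors the idea that the general model covers everyday transcription needs, while the specialized models are chosen deliberately.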

“This is really our first use case with Deepgram. It’s definitely been the best out of the box transcription accuracy I’ve seen and I’ve had to utilise a number of other solutions when working with our clients.”

James Iansek, Co-founder/COO of Operative Intelligence

“Our partnership with Deepgram and their models in conjunction with our internal models that are trained on case-specific data get well over 90% accuracy.”

Dion Millson, CEO,
