Speech recognition models
Flexible model options let you pick the best one for the job.
Nova
Unmatched performance and value
Our next-gen model beats all competitors on speed, accuracy, and cost. Compared to the nearest competitor, Nova is 22% more accurate, more than 20 times faster, and less than a third of the price.
Whisper
Improvements you can't miss
Our fully managed Whisper APIs are faster, more reliable, and cheaper than OpenAI's. They include built-in diarization, word-level timestamps, and a file size limit 80 times higher.
Custom
Boost performance using your data
Custom-trained speech models give accuracy a noticeable boost, especially on customer-specific jargon. High-throughput models are also available to meet enterprise scalability requirements.
Setting new benchmarks in ASR performance
Every automatic speech recognition (ASR) provider strives for the most accurate transcripts possible, but what about the other critical features you require? We advise running side-by-side comparisons on the real-world audio you'll use in production to determine the best speech solution for your needs.
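A side-by-side comparison needs a common yardstick. The sketch below assumes word error rate (WER), a widely used ASR accuracy metric that this page does not itself name, and assumes you have a human-verified reference transcript for each test clip. The function names and the provider labels are hypothetical, and the text normalization (lowercasing, whitespace splitting) is deliberately simplistic.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def rank_providers(reference: str, transcripts: dict[str, str]) -> list[tuple[str, float]]:
    """Return (provider, WER) pairs sorted from lowest to highest error."""
    scores = {name: word_error_rate(reference, text) for name, text in transcripts.items()}
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical transcripts of the same clip from two providers.
    reference = "please transfer fifty dollars to my savings account"
    transcripts = {
        "provider_a": "please transfer fifty dollars to my savings account",
        "provider_b": "please transfer fifteen dollars to savings account",
    }
    for name, wer in rank_providers(reference, transcripts):
        print(f"{name}: WER = {wer:.2%}")
```

Run this over a representative batch of your own production audio rather than a single clip, and weigh the scores alongside latency, price, and the features you need.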
Great, fast, affordable. Pick three.
No tradeoffs required between accuracy, cost, and speed.
Up to 40X faster
Transcribe in real time, or process an hour of pre-recorded audio in about 12 seconds.
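To put the 12-second figure in perspective, the short sketch below converts it into a multiple of real-time playback speed. The function name is hypothetical, and this ratio measures throughput against the audio's own duration, which is a different baseline from the "Up to 40X faster" headline (the page does not say what that headline is measured against).

```python
def speed_over_real_time(audio_seconds: float, processing_seconds: float) -> float:
    """How many seconds of audio are processed per second of wall-clock time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


if __name__ == "__main__":
    one_hour = 60 * 60  # 3600 seconds of pre-recorded audio
    factor = speed_over_real_time(one_hour, 12)
    print(f"{factor:.0f}x faster than real-time playback")
```

You can apply the same ratio to your own timing measurements when you benchmark batch transcription on production files.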
<300ms latency
The fastest real-time transcription speeds for human-like conversational AI experiences, real-time analytics, and enablement.
30+ languages
Choose from over 30 languages and dialects across numerous use-case models and model tiers. We understand the language nuances and needs of our global customers.
>90% accuracy
Deepgram leads the industry with the most accurate models on the market across use-case categories.
Trusted by startups and enterprises
Discover the power of our product through real stories.