Superior accuracy, speed, and cost compared to OpenAI Whisper

Innovators are switching from Whisper’s speech-to-text API to Deepgram to enable the future of intelligent voice applications. See how we compare.

COMPARE CAPABILITIES

Accurate transcription. Custom models.
Lightning speed.

Features and Capabilities            Deepgram    OpenAI Whisper
Batch processing (1hr of audio)      ~12.1s      ~48.5s (large model)
Streaming processing lag             <300 ms     Streaming not available
Word Error Rate (WER)                9.5%        16.2-12.4%
Diarization (separate per speaker)   Up to 10    Not available
Tailored speech models
Word level timestamps
Deep search (audio)
Paragraphs
Custom vocabulary (keyword boosting)
Redaction
Summarization
Punctuation
Profanity filter
Numeral formatting

Better Speed & Real-time

Generate a transcript in milliseconds.

The time it takes to generate a transcript can make or break your use case. Deepgram’s enhanced model is 82x faster than Whisper’s “Large” ASR model. That means you get a transcript of 1 hour of pre-recorded speech in seconds, versus hours. We also offer real-time processing with the lowest latency in the industry. Whisper only offers pre-recorded processing.

Higher Accuracy

Train speech models to fit your use case.

Deepgram offers a handful of models trained on data from various use cases, including phone call data, meetings data, earnings calls, and more. Plus, we offer the option to train a custom model on the specific words that matter to you. Any further improvements on OpenAI’s Whisper models would have to be made in-house by your own engineering and research teams.
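Accuracy in the comparison above is reported as Word Error Rate (WER). As a rough illustration of what that number means, here is a minimal sketch of the standard WER calculation: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the reference transcript. The function name and the whitespace tokenization are assumptions made for this example; it is not the benchmark code behind the figures on this page, and real evaluations usually normalize casing and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of five reference words gives a WER of 0.2 (20%).
print(word_error_rate("switching to deepgram is easy",
                      "switching to deepgram was easy"))
```

Lower is better: a WER of 9.5% means roughly one word in ten differs from the reference transcript.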

Flexible Deployment, Lower Cost to Scale

The not-so-hidden costs

With Deepgram, our hosted cloud service is included. Sure, you could deploy Whisper to a public cloud, but that will incur significant costs if you actually plan to grow. Benchmark tests showed that scaling up to just 10K hours of audio incurs over $5K in cloud computing costs for Whisper. With Deepgram, you’d save roughly 40%, with higher overall quality and faster turnaround for comparable audio.
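To make the cost comparison concrete, the arithmetic can be sketched as follows. The inputs are the rounded figures quoted above (10K hours, "over $5K", "roughly 40%"), so the results are ballpark values rather than a price quote; the function name and the returned field names are hypothetical.

```python
def cost_comparison(hours: float, whisper_total_usd: float, savings_fraction: float) -> dict:
    """Derive per-hour and total costs from the rounded figures quoted in the text."""
    deepgram_total = whisper_total_usd * (1 - savings_fraction)
    return {
        "whisper_per_hour_usd": whisper_total_usd / hours,
        "deepgram_total_usd": deepgram_total,
        "deepgram_per_hour_usd": deepgram_total / hours,
        "saved_usd": whisper_total_usd - deepgram_total,
    }


# 10K hours, $5K as the lower bound for Whisper, and a 40% saving (all approximate).
for name, value in cost_comparison(10_000, 5_000.0, 0.40).items():
    print(f"{name}: {value:.2f}")
```

With these inputs, Whisper works out to about $0.50 per hour of audio and the comparable Deepgram workload to about $0.30 per hour, a difference of roughly $2K at 10K hours.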

Unlike Whisper, we can also package the API for use in VPC or on-prem applications.

When is Whisper the right choice?

As an open-source software package, Whisper can be a great choice for hobbyists and researchers. But if your project involves real-time processing of streaming voice data, requires a custom-trained model, or has a variety of other business needs, Whisper might not be the right choice for you. If you’re just curious or want to dabble, you can try Whisper with Deepgram’s API. But when you need a robust business solution, give Deepgram a try.

Try Deepgram

Switching to Deepgram is easy.

APIs, SDKs, and docs? 
Why, yes we do!

We’ve made switching to Deepgram easy with APIs, detailed guides, and clear documentation. Go ahead. Take it for a spin with $200 in free credits, no credit card required.
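As a rough idea of what a first request could look like, here is a minimal sketch that transcribes a hosted audio file using only the Python standard library. The endpoint, the `Token` authorization scheme, the query options, and the response path reflect our understanding of Deepgram's pre-recorded audio API and should be treated as assumptions to confirm against the current documentation; the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint for pre-recorded audio; confirm against the current docs.
DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_transcription_request(api_key: str, audio_url: str, **options: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for a transcript of audio_url."""
    query = urllib.parse.urlencode(options)
    endpoint = f"{DEEPGRAM_LISTEN_URL}?{query}" if query else DEEPGRAM_LISTEN_URL
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


def transcribe(api_key: str, audio_url: str, **options: str) -> str:
    """Send the request and return the first transcript (assumed response shape)."""
    request = build_transcription_request(api_key, audio_url, **options)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["results"]["channels"][0]["alternatives"][0]["transcript"]
```

A call such as `transcribe("YOUR_API_KEY", "https://example.com/audio.wav", punctuate="true", diarize="true")` would then return the transcript text, assuming the option names match the features listed in the comparison above.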

Get a Free API Key

Deepgram is doing for speech what SpaceX did for space travel. Between aggressive pricing structures and increased accuracy, Deepgram is leading the charge in this space today.

View the case study