Superior accuracy, speed, and cost compared to OpenAI Whisper

Innovators are switching from Whisper’s speech-to-text API to Deepgram to enable the future of intelligent voice applications. See how we compare.

Try Deepgram
Deepgram
OpenAI Whisper
Fully managed by Deepgram
FEATURES AND CAPABILITIES
Batch process (1hr of audio): Deepgram ~30s | OpenAI Whisper ~230s (large model)
Accuracy (WER): Deepgram 8.4 | OpenAI Whisper 13.2
Diarization (separate per speaker): Deepgram up to 10 | OpenAI Whisper not available
Tailored speech models
Word level timestamps
Deep Search (audio)
Paragraphs
Custom Vocabulary (keyword boosting)
Redaction
Summarization
Punctuation
Profanity Filter
Numeral Formatting
PRICING
Pre-recorded per minute: Deepgram starting at $0.0043 | OpenAI Whisper starting at $0.0060
Streaming per minute: Deepgram starting at $0.0059

Generate a transcript in milliseconds.

The time it takes to generate a transcript can make or break your use case. In addition to being more than 45% more accurate on average, Deepgram’s Nova model is 13x faster than Whisper’s “Large” ASR model. That means you can transcribe 1 hour of pre-recorded speech in seconds rather than minutes. We also offer real-time processing with the lowest latency in the industry. Whisper only offers pre-recorded processing.
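If you want to check accuracy on your own audio, here is a minimal sketch of how word error rate (WER) is conventionally computed: the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The function name and the lowercase-and-split normalization are assumptions made for illustration; the benchmark behind the figures above may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
    print(f"{wer:.1%}")  # one substitution in six words -> 16.7%
```

A lower WER means fewer word-level mistakes, so it is worth running both services over the same reference transcripts before comparing numbers.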

Train speech models to fit your use case.

Deepgram offers a handful of models trained on data from various use cases, including phone call data, meeting data, earnings calls, and more. Plus, we offer the option to train a custom model on the specific words that matter to you. Any further improvements on OpenAI’s Whisper models would have to be made in-house by your own engineering and research teams.

The not-so-hidden costs

With Deepgram, our hosted cloud service is included. Sure, you could deploy Whisper to a public cloud, but that will incur significant costs if you actually plan to grow. Benchmark tests showed that scaling up to just 10K hours of audio incurs over $5K in cloud computing costs for Whisper. With Deepgram, you’d save roughly 40% with higher overall quality and faster turnaround for comparable audio.
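To run the numbers for your own volume, a rough sketch like the one below can help. It assumes the "starting at" per-minute list rates from the pricing comparison apply flat to every minute, which may not match your actual plan, and it does not model the self-hosted cloud compute costs from the benchmark mentioned above. The function name is hypothetical.

```python
from decimal import Decimal

MINUTES_PER_HOUR = 60


def transcription_cost(hours: int, rate_per_minute: str) -> Decimal:
    """Estimate cost in dollars for `hours` of audio at a flat per-minute rate.

    The rate is passed as a string so Decimal keeps it exact.
    """
    return Decimal(hours) * MINUTES_PER_HOUR * Decimal(rate_per_minute)


# List rates taken from the pricing comparison (assumed flat for all volume).
deepgram_prerecorded = transcription_cost(10_000, "0.0043")
whisper_api = transcription_cost(10_000, "0.0060")

if __name__ == "__main__":
    print(f"Deepgram pre-recorded, 10K hours: ${deepgram_prerecorded}")
    print(f"Whisper API, 10K hours: ${whisper_api}")
```

At those list rates, 10K hours (600,000 minutes) works out to $2,580 versus $3,600 before any volume discounts or infrastructure costs.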

Unlike Whisper, we can also package the API for use in VPC or on-prem applications.

When is Whisper the right choice?

As an open-source software package, Whisper can be a great choice for hobbyists and researchers. But if your project involves real-time processing of streaming voice data, if you need to train a custom model, or if you have a variety of other business needs (including OpEx and reliable performance at scale), Whisper might not be the right choice for you. If you’re just curious or want to dabble, you can try Whisper with Deepgram’s API. But when you need a robust business solution, give Deepgram a try.

Switching to Deepgram is easy.

APIs, SDKs, and docs? Why, yes, we have them!

We’ve made switching to Deepgram easy with APIs, detailed guides, and clear documentation. Go ahead. Take it for a spin with $200 in free credits, no credit card required.
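As a rough idea of what a first call might look like, here is a minimal sketch that builds a pre-recorded transcription request using only the Python standard library. The endpoint (`https://api.deepgram.com/v1/listen`), the `Token` authorization scheme, the query parameter names, and the response layout read by `extract_transcript` reflect our understanding of the public API and should be checked against the current documentation; the helper names themselves are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint for pre-recorded audio; verify against the current docs.
DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_transcription_request(api_key: str, audio_url: str, **options: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for a hosted audio file to be transcribed."""
    query = urllib.parse.urlencode(options)
    endpoint = f"{DEEPGRAM_LISTEN_URL}?{query}" if query else DEEPGRAM_LISTEN_URL
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_transcript(response_json: dict) -> str:
    """Pull the first transcript out of a response, assuming the documented layout."""
    return response_json["results"]["channels"][0]["alternatives"][0]["transcript"]


def transcribe(api_key: str, audio_url: str, **options: str) -> str:
    """Send the request and return the transcript text (performs a network call)."""
    request = build_transcription_request(api_key, audio_url, **options)
    with urllib.request.urlopen(request) as response:
        return extract_transcript(json.load(response))
```

Calling `transcribe("YOUR_API_KEY", "https://example.com/call.wav", model="nova", punctuate="true")` would send the request; the option names shown are assumptions, and the official SDKs wrap the same steps if you prefer not to build requests by hand.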

Get a Free API Key

Deepgram is doing for speech what SpaceX did for space travel. Between aggressive pricing structures and increased accuracy, Deepgram is leading the charge in this space today.

Todd Fisher

Founder, CallTrackingMetrics

View the case study