Introducing Nova-2: The Fastest, Most Accurate Speech-to-Text API

Learn more

Language AI models to power your apps

Power your apps with world-class speech-to-text and domain-specific language models (DSLMs). Effortlessly accurate. Blazing fast. Enterprise-ready scale. Unbeatable pricing. Everything developers need to build with confidence and ship faster.

Sign Up Free | Book a Demo
Based on 154+ reviews.
Trusted by the world’s top Enterprises, Conversational AI, & Startups

Try our speech-to-text & understanding API

Try transcribing sample audio files, or use our live streaming transcription demo. Explore how our audio understanding models work.

Step 1: Input Audio

NASA: First All Female Space Walk

POST https://api.deepgram.com/v1/listen
{
  "url":"nasa_demo"
}
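
The request above can be sketched in Python as follows. This is a minimal illustration, not an official client: it only builds the POST request to the endpoint shown in the demo and does not send it. The `Token` authorization scheme, the example audio URL, and the helper name `build_listen_request` are assumptions made for this sketch; check the API documentation before relying on them.

```python
import json
import urllib.request

API_URL = "https://api.deepgram.com/v1/listen"  # endpoint shown in the demo above


def build_listen_request(audio_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking the API to transcribe audio_url."""
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            # Assumption: token-style Authorization header; confirm in the API docs.
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Hypothetical audio URL and placeholder key, for illustration only.
    req = build_listen_request("https://example.com/audio.wav", "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To send it, pass req to urllib.request.urlopen and parse the JSON response.
```

Building the request separately from sending it makes the method, headers, and JSON body easy to inspect before any audio leaves your machine.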
Step 2: Transcription Output
The response will show here

Give it a try.

Click the mic to transcribe live in English or select another language.

Transcription

Audio Input
summarize=true&punctuate=true

Alright. I’m ready. Good evening. I’m Dr. Emmett Brown. I’m standing on the parking lot at Twin Pines Mall. It’s Saturday Morning October twenty sixth nineteen eighty five one eighteen AM. And this is temporal experiment number one. Come on Einey. Hey, boy. Get in there. At a boy. In you go. Sit down. Get your seatbelt on. That’s it. Okay. Please note, that Einstein’s clock is in precise synchronization with my control watch. Got it? Right. Check Doc. Good. Have a good trip Einstein. Watch your head. You got that thing hooked up to the car? Watch this. Yeah Ok. Not me the car, the car. If my calculations are correct. When this baby hits eighty eight miles per hour, you’re gonna see some serious s**t.

“summary”: “An experiment is being conducted. The speaker is Dr. Emmett Brown and he gives his location and the date and time. Someone is traveling by car and the experiment is about to begin.”

Watch this watch this. What did I tell you? Eighty eight miles per hour. The thermal displacement occurred exactly what? One O two AM and zero seconds. Jesus Christ. Jesus Christ, doc, you disintegrated Einstein. Calm down Marty. I didn’t disintegrate anything. The molecular structure of both Einstein and the car are completely intact. Then where hell are they? The appropriate question is, when the hell are they? You see, Einstein has just become the world’s first time traveler. I set him into the future. One minute into the future to be exact. Now precisely one twenty one AM and zero seconds we shall catch up with him and the time machine. Wait a minute. Wait a minute. Doc. Are you telling me that you built a time machine out of a Delorean?

“summary”: “There is concern over the traveler’s safety but everything is intact. The event is the world’s first time travel experiment made out of a Delorean.”
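
The `summarize=true&punctuate=true` options shown with the sample above are ordinary query parameters. As a small sketch, they can be appended to the endpoint URL like this; the helper name `listen_url` is hypothetical, and only the two flags that appear in the sample are used.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.deepgram.com/v1/listen"


def listen_url(**features: bool) -> str:
    """Append feature flags to the endpoint as a query string."""
    # The sample writes booleans in lowercase ("true"), so convert explicitly.
    params = {name: str(enabled).lower() for name, enabled in features.items()}
    return f"{BASE_URL}?{urlencode(params)}" if params else BASE_URL


print(listen_url(summarize=True, punctuate=True))
```

Keeping the flags in one helper avoids hand-assembling the query string each time a feature is toggled.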

 

Ready to get started?

Conversational & transcription intelligence on the world’s best speech AI platform.

Sign Up Free | Book a Demo

Unbeatable value, unmatched performance.

Extract the most value with speech-to-text and Language AI.

>90% accuracy

Deepgram leads the industry with the most accurate models on the market across use-case categories.

3X lower cost

Optimized speech recognition and domain-specific language models are precisely tuned on our end-to-end GPU-based infrastructure to give superior, tailored performance at the lowest cost.

Maximum flexibility

We provide flexible deployment options, robust features and multilingual support, and fit-for-purpose DSLMs easily adapted to your customers' unique needs.

20X faster

Transcribe in real time, or process an hour of pre-recorded audio in about 12 seconds.

Setting new benchmarks in ASR performance

All ASR providers strive to have the most accurate transcripts possible, but what about other critical features you require? We advise performing side-by-side comparisons and testing with the real-world audio you'll use in production to determine the best speech solution for your needs.

See The Full Comparison
Features and Capabilities      Deepgram    OpenAI Whisper    Google STT
Batch process (1hr of audio)   ~12 s       158 s             1443 s
Real-time streaming lag        <300 ms     Not available     1443 ms
Tailored speech models
Deep speech (search)
Diarization                    Up to 10    Not available     Up to 6
Noise reduction
Custom vocabulary
Redaction
Punctuation
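
To put the batch figures in the comparison above on a common scale, the short sketch below converts each processing time into a real-time factor, that is, seconds of audio handled per second of processing. It uses only the batch times listed in the comparison; the Deepgram value is given there as approximate (~12 s), so its result is approximate too.

```python
AUDIO_SECONDS = 60 * 60  # one hour of audio

# Batch processing times for one hour of audio, as listed in the comparison.
batch_seconds = {"Deepgram": 12, "OpenAI Whisper": 158, "Google STT": 1443}


def realtime_factor(processing_seconds: float) -> float:
    """Seconds of audio processed per second of wall-clock time."""
    return AUDIO_SECONDS / processing_seconds


for provider, seconds in batch_seconds.items():
    print(f"{provider}: {realtime_factor(seconds):.1f}x real time")
```

Running the same calculation on timings from your own production audio is a simple way to carry out the side-by-side testing recommended above.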

Essential building blocks for language AI.

Deepgram is a foundational AI company providing the speech-to-text and language understanding capabilities you need to make your data readable and actionable by humans…or machines.

Transcription

Create accurate, usable transcripts. It’s speech-to-text for developers, by developers.

  • Punctuation, Numerals, Redaction, Profanity Filtering

  • Utterances, Deep Search, Find & Replace, VAD, Keywords

  • Paragraphs, Interim Results

Explore More

Understanding

Accurately identify, extract, and summarize information from conversational audio, built on the industry’s most accurate speech-to-text.

  • Speaker Diarization, Entity Detection, Summarization

  • Topic Detection, Language Translation

  • Language Detection, Sentiment Analysis

Explore More

Trusted by startups and enterprises.

Discover the power of our product through real stories.

Unlock language AI at scale with an API call.

Get conversational intelligence with transcription and understanding on the world's best speech AI platform.

Sign Up Free | Book a Demo