Straightforward pricing that rewards you as you grow

Whether you’re just exploring or ready for commitment, our pricing plans make it easy to get started. All plans give you access to speech-to-text, audio intelligence, and text-to-speech models and endpoints.

Pay As You Go

Free $200 of credit

Then pay-as-you-go. No minimums. No expiration. No credit card required.

Sign Up Free
  • Access all endpoints and public models
  • Up to 100 concurrent requests for Deepgram speech-to-text models and 5 concurrent requests for Deepgram Whisper Cloud
  • For Deepgram Aura text-to-speech: up to 40 concurrent connections on the WebSocket API and up to 2 concurrent requests on the batch API (typically enough for 480 requests/min, or around 120 concurrent conversations). For higher rate limits, fill out this form.
  • Up to 10 concurrent requests for Deepgram Audio Intelligence
  • Discord and community help 
Growth

$4k+ / year

Save up to 20%

With pre-paid credits for the year. Credits are redeemed against actual usage.

Buy Now
  • Access all endpoints and public models at favorable discounts
  • Up to 100 concurrent requests for Deepgram speech-to-text models and 5 concurrent requests for Deepgram Whisper Cloud
  • For Deepgram Aura text-to-speech: up to 80 concurrent connections on the WebSocket API and up to 3 concurrent requests on the batch API (typically enough for 720 requests/min, or around 180 concurrent conversations). For higher rate limits, fill out this form.
  • Up to 10 concurrent requests for Deepgram Audio Intelligence
  • Discord and community help
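
The Aura batch API figures in the two plan lists above imply a simple ratio, which the sketch below works out. It only divides the published numbers; the function name `implied_ratios` is made up for illustration, and the seconds-per-request figure assumes every concurrent slot is kept fully busy, which real traffic may not do.

```python
from fractions import Fraction


def implied_ratios(concurrent_requests: int, requests_per_min: int, conversations: int) -> dict:
    """Derive per-slot and per-conversation ratios from the published limits."""
    per_slot = Fraction(requests_per_min, concurrent_requests)
    return {
        # Requests each concurrent slot handles per minute.
        "requests_per_min_per_slot": per_slot,
        # Average seconds per request if a slot is always busy (assumption).
        "seconds_per_request": Fraction(60) / per_slot,
        # Requests each conversation issues per minute.
        "requests_per_min_per_conversation": Fraction(requests_per_min, conversations),
    }


pay_as_you_go = implied_ratios(2, 480, 120)
growth = implied_ratios(3, 720, 180)
print(pay_as_you_go == growth)  # True: both plans imply the same ratios
print(float(pay_as_you_go["seconds_per_request"]))  # 0.25
```

Under these assumptions, both plans work out to 240 requests per minute per concurrent request and about 4 requests per minute per conversation.
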
    Enterprise

    $10k+ / year

    For businesses with large volumes, data or deployment requirements, or support needs.

    Contact Sales
    • Access all endpoints and public models with our best discounts
    • Access to custom-trained speech-to-text models
    • Priority access to new endpoints and models
    • Highest concurrency support
    • Self-hosted deployments
    • Paid Support plans available
    • Discord and community help
    Speech to Text

    Power your apps with world-class speech recognition in 30+ languages.

    Includes: Speaker Diarization, Smart Formatting, Automatic Language Detection, Deep Search, Keyword Boosting, Multichannel Support, and Callbacks.

    For detailed model, language, and feature availability, please refer to our Developer Documentation.

    Pre-Recorded rates are listed first, followed by Streaming rates.
    Model               Pay As You Go   Growth         Enterprise
    Nova-2              $0.0043/min     $0.0036/min    Contact Sales
    Nova-1              $0.0043/min     $0.0036/min
    Enhanced            $0.0145/min     $0.0115/min
    Base                $0.0125/min     $0.0095/min
                        $0.0048/min     $0.0048/min
                        $0.0042/min     $0.0035/min
                        $0.0038/min     $0.0032/min
                        $0.0033/min     $0.0027/min
                        $0.0035/min     $0.0028/min
    Custom
    Redaction           $0.0020/min     $0.0017/min
    Entity Detection    $0.0013/min     $0.0011/min

    The rates listed above apply when you opt in to the Model Improvement Program.

    Model               Pay As You Go   Growth         Enterprise
    Nova-2              $0.0059/min     $0.0049/min    Contact Sales
    Nova-1              $0.0059/min     $0.0049/min
    Enhanced            $0.0165/min     $0.0136/min
    Base                $0.0145/min     $0.0105/min
    Custom
    Redaction           $0.0020/min     $0.0017/min
    Entity Detection    $0.0013/min     $0.0011/min

    The rates listed above apply when you opt in to the Model Improvement Program.

    Text to Speech

    Responsive, natural-sounding text-to-speech to power your high-throughput voicebots and conversational AI applications. Billed per character.

    Model    Pay As You Go            Growth                   Enterprise
    Aura     $0.0150/1k characters    $0.0135/1k characters    Contact Sales

    The rates listed above apply when you opt in to the Model Improvement Program.
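
As a rough illustration of per-character billing, the sketch below estimates the cost of a piece of text at the Aura rates listed above. The function name `tts_cost` and the plan keys are made up for the example, and it assumes every character in the string (including spaces and punctuation) is billed, which this page does not spell out.

```python
from decimal import Decimal

# Aura rates from the table above, in USD per 1,000 characters.
AURA_RATES = {
    "pay_as_you_go": Decimal("0.0150"),
    "growth": Decimal("0.0135"),
}


def tts_cost(text: str, plan: str) -> Decimal:
    """Estimate the cost of synthesizing `text`, billed per character."""
    # Assumption: every character in the string counts toward billing.
    return AURA_RATES[plan] * Decimal(len(text)) / Decimal(1000)


sample = "a" * 2500  # stand-in for 2,500 characters of input text
print(tts_cost(sample, "pay_as_you_go"))
print(tts_cost(sample, "growth"))
```

Using `Decimal` rather than floats keeps the small per-character amounts exact when they are summed over many requests.
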

    Audio Intelligence

    Powered by task-specific language models.
    Works with or without transcription. Handles text or audio.

    Model                 Pay As You Go                                         Growth                                                  Enterprise
    Summarization         $0.0003/1k input tokens - $0.0006/1k output tokens    $0.00024/1k input tokens - $0.00048/1k output tokens    Contact Sales
    Topic Detection
    Sentiment Analysis
    Intent Recognition

    The rates listed above apply when you opt in to the Model Improvement Program.
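
Because Audio Intelligence is priced separately for input and output tokens, a request's cost is the sum of two terms. The sketch below shows that arithmetic using the Summarization rates listed above; the function name `summarization_cost` and the token counts are assumptions chosen for illustration, and how tokens are counted is not covered on this page.

```python
from decimal import Decimal

# Summarization rates from the table above, in USD per 1,000 tokens.
SUMMARIZATION_RATES = {
    "pay_as_you_go": {"input": Decimal("0.0003"), "output": Decimal("0.0006")},
    "growth": {"input": Decimal("0.00024"), "output": Decimal("0.00048")},
}


def summarization_cost(input_tokens: int, output_tokens: int, plan: str) -> Decimal:
    """Sum the input-token and output-token charges for one request."""
    rates = SUMMARIZATION_RATES[plan]
    total_per_1k = rates["input"] * input_tokens + rates["output"] * output_tokens
    return total_per_1k / Decimal(1000)


# Hypothetical request: 10,000 input tokens summarized into 500 output tokens.
print(summarization_cost(10_000, 500, "pay_as_you_go"))
print(summarization_cost(10_000, 500, "growth"))
```
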

    Voice AI at scale with an API Call

    Get conversational intelligence with transcription and understanding on the world's best speech AI platform.

    Sign Up
    Book a Free Demo

    Frequently Asked Questions

    When you opt into using the multichannel feature, each channel is transcribed and billed separately. The total cost when using multichannel is the single-channel cost multiplied by the number of channels. When you do not enable multichannel, Deepgram converts your multichannel audio into mono single-channel audio, and it is transcribed and billed as one channel. We especially recommend using the multichannel feature on multichannel audio in which there is cross-talk (voices overlapping or talking over each other) for the most accurate transcription and speaker detection.
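
The multichannel billing rule above can be sketched in a few lines. This is an illustration only: the function name `multichannel_cost` is made up, and the rate used is the Pay As You Go pre-recorded Nova-2 rate from the pricing table.

```python
from decimal import Decimal

# Pay As You Go pre-recorded Nova-2 rate from the pricing table (USD per minute).
NOVA2_PAYG_PER_MIN = Decimal("0.0043")


def multichannel_cost(minutes: Decimal, rate_per_min: Decimal,
                      channels: int, multichannel: bool) -> Decimal:
    """Single-channel cost multiplied by the number of billed channels."""
    # Without the multichannel feature the audio is mixed down to mono,
    # so only one channel is transcribed and billed.
    billed_channels = channels if multichannel else 1
    return minutes * rate_per_min * billed_channels


# A 10-minute stereo call, with and without the multichannel feature.
with_feature = multichannel_cost(Decimal(10), NOVA2_PAYG_PER_MIN, 2, True)
without_feature = multichannel_cost(Decimal(10), NOVA2_PAYG_PER_MIN, 2, False)
print(with_feature, without_feature)
```

In this example the stereo call costs twice as much with multichannel enabled, which is the trade-off for transcribing each channel separately.
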

    Nova is our newest and most powerful model, offering the best balance between accuracy and cost-effectiveness. Enhanced is a powerful ASR model that performs especially well with uncommon words. Base is our signature model, with a solid combination of accuracy and cost-effectiveness. Some languages are only supported by Enhanced and Base. See our Models Overview and Model docs.

    We support over 40 audio and video formats, documented here.

    You purchase credit upfront with a credit card. Credit will be deducted from your balance as you use our API. Pay As You Go credit never expires. Growth plan credit expires 1 year from purchase unless you renew or upgrade.

    Deepgram bills by the second of audio. For instance, if you transcribe 61 seconds of audio, we bill you for 61 seconds of usage, not 2 minutes (120 seconds).
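
To make the 61-second example concrete, the sketch below compares per-second billing with what rounding up to whole minutes would cost. The function name `transcription_cost` is made up for illustration, and the Pay As You Go pre-recorded Nova-2 rate is used only as an example rate.

```python
from decimal import Decimal

# Pay As You Go pre-recorded Nova-2 rate from the pricing table (USD per minute).
NOVA2_PAYG_PER_MIN = Decimal("0.0043")


def transcription_cost(seconds: int, rate_per_min: Decimal) -> Decimal:
    """Cost of `seconds` of audio billed by the second, with no rounding up."""
    return rate_per_min * Decimal(seconds) / Decimal(60)


# 61 seconds billed as 61 seconds, versus the same audio rounded up to 2 minutes.
billed_per_second = transcription_cost(61, NOVA2_PAYG_PER_MIN)
if_rounded_to_minutes = transcription_cost(120, NOVA2_PAYG_PER_MIN)
print(billed_per_second, if_rounded_to_minutes)
```

At this rate the per-second charge comes to a little over $0.0043, roughly half of the $0.0086 that two full minutes would cost.
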

    Definitely. In fact, we’ve got the fastest real-time transcription in the biz with latency times of under 300 milliseconds.

    Yes! Our streaming API is designed for low latency and will return incremental transcripts as a speaker’s sentence unfolds. You can stay up to date on our latest Conversational AI technology by subscribing to our newsletter.

    If you’re on the Growth plan and have saved a credit card, you can continue to use our API with a 10% overage fee billed at the start of each month. This is still less than Pay As You Go rates. You can renew or upgrade your plan at any point to prevent an overage fee.
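
The comparison with Pay As You Go rates can be checked against the pre-recorded speech-to-text table. The sketch below assumes the 10% overage fee is applied as a surcharge on the Growth per-minute rate, which is one reading of the answer above rather than a confirmed billing formula; the names are made up for illustration.

```python
from decimal import Decimal

OVERAGE_FEE = Decimal("0.10")  # 10% overage fee on the Growth plan

# (Pay As You Go, Growth) pre-recorded rates in USD per minute, from the table.
PRE_RECORDED_RATES = {
    "Nova-2": (Decimal("0.0043"), Decimal("0.0036")),
    "Nova-1": (Decimal("0.0043"), Decimal("0.0036")),
    "Enhanced": (Decimal("0.0145"), Decimal("0.0115")),
    "Base": (Decimal("0.0125"), Decimal("0.0095")),
}


def overage_rate(growth_rate: Decimal) -> Decimal:
    """Growth rate with the overage fee added (assumed to be a simple surcharge)."""
    return growth_rate * (1 + OVERAGE_FEE)


for model, (payg, growth) in PRE_RECORDED_RATES.items():
    effective = overage_rate(growth)
    print(model, effective, effective < payg)
```

For Nova-2, for example, the Growth rate with overage is $0.00396/min, still below the $0.0043/min Pay As You Go rate.
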

    We support 30+ languages with Speech to Text and English for Text to Speech. We are actively increasing the number of languages available for our products.

    Sure thing. You can get help from our community over at Discord or GitHub Discussions. If you are subscribed to our Enterprise plan, reach out to our support team via email or Slack for assistance.

    We bill based on usage, not users. Add as many team members and collaborators as you wish!

    Yes, our Enterprise plan offers self-hosted deployments of our voice AI products, for use in cloud environments or on-premises datacenters. Contact us about getting on an Enterprise plan to expand your deployment capabilities.

    Our Audio Intelligence features such as topic detection and summarization are not dependent on the selected model. However, audio intelligence features are primarily affected by the transcript, so you should use the most accurate model available for your use case for best results.

    Audio output streaming is available in our current TTS API. Text input streaming through WebSockets will be available soon.

    Higher concurrency and requests-per-minute limits are available. Please contact Sales.

    Voice cloning is not currently available for Aura, but we are looking into offering it in the future.