Deepgram Changelog

Stay up to date with enhancements to our AI Speech Platform and ecosystem

OnPrem

Deepgram On-premises September Release (230920)

Brent George
Sep 20, 2023
Container Images (release 230920)

- deepgram/onprem-api:release-230920 (equivalent image tag to deepgram/onprem-api:1.102.1)
- deepgram/onprem-engine:release-230920 (equivalent image tag to deepgram/onprem-engine:3.58.1)
- deepgram/onprem-license-proxy:release-230920 (equivalent image tag to deepgram/onprem-license-proxy:1.4.2)
- deepgram/onprem-billing:release-230920 (equivalent image tag to deepgram/onprem-billing:1.7.2)
- deepgram/onprem-metrics-server:release-230920 (equivalent image tag to deepgram/onprem-metrics-server:2.0.6)
- deepgram/onprem-dgtools:release-230920 (equivalent image tag to deepgram/onprem-dgtools:2.1.5)

This Release Contains The Following Changes

- Support for Deepgram Nova-2. Please contact Deepgram Customer Success to request access to this new model architecture.
- Significant improvements in diarization quality for batch requests.
- Addresses a memory leak in onprem-engine that originated in an upstream dependency. This memory leak was only present in the August (230804) release.
- onprem-dgtools now accepts licensing information passed via the DEEPGRAM_API_KEY environment variable, similar to onprem-api and onprem-engine.
- Other stability improvements & bug fixes. 🐛
SpeechModel

Introducing Nova-2 Early Access

Natalie Rutgers
Sep 19, 2023
Deepgram is excited to announce early access to our next-gen speech-to-text model, Nova-2. As shared in our Marketing Announcement, Nova-2:

- Outperforms all alternatives in terms of accuracy, speed, and cost (starting at $0.0043/min).
- Is 18% more accurate than our previous Nova model and offers a 36% relative WER improvement over OpenAI Whisper (large).

Pay as You Go and Growth users may access this model immediately in the API Playground or by requesting model=nova-2-ea in their API requests. Enterprise customers can reach out to their account representative or Contact Us for access.

Nova-2 Early Access supports hosted and on-prem transcription of pre-recorded and streaming English audio. Read more about Nova-2 in the Deepgram Documentation.
OnPrem

Deepgram On-premises August Release (230804)

Pankaj Trivedi
Aug 4, 2023
Container Images (release 230804)

- deepgram/onprem-api:1.97.1
- deepgram/onprem-engine:3.53.6
- deepgram/onprem-license-proxy:1.4.2
- deepgram/onprem-billing:1.7.2
- deepgram/onprem-metrics-server:2.0.6
- deepgram/onprem-dgtools:2.1.4

Deepgram On-premises Release Tags

- deepgram/onprem-api:release-230804
- deepgram/onprem-engine:release-230804
- deepgram/onprem-license-proxy:release-230804
- deepgram/onprem-billing:release-230804
- deepgram/onprem-metrics-server:release-230804
- deepgram/onprem-dgtools:release-230804

This Release Contains The Following Changes

- Summarization efficiency improvements for broader GPU compatibility.
- Summarization-related errors and warnings produced by API calls have been expanded and made more detailed; please see our docs on this topic.
- Opus compatibility improvements with multichannel audio.
- Added a configuration parameter for batch sizes specifically for Whisper models. Please contact your account manager for more details.
- Added additional error reporting for streaming-related failures when the initial request includes the debug=true query parameter.
- Stability improvements and bug fixes. 🐛
Feature

Introducing New Summarization

Pankaj Trivedi
Jul 19, 2023
We're excited to announce the release of our first domain-specific language model (DSLM) for speech summarization of call center interactions.

You can request our new Summarization API endpoint by adding a summarize parameter set to v2 in the API call. It will then return a summary object in the response body of the output. The summary object includes status and a concise summary of the entire conversation.

The URL query to call the DSLM-powered Summarization API might look like this:

    https://api.deepgram.com/v1/listen?summarize=v2

Example curl request:

    curl --location --request POST 'https://api.deepgram.com/v1/listen?summarize=v2' \
      --header 'Authorization: Token <Your API KEY>' \
      --header 'Content-Type: audio/wave' \
      --data-binary '@/Path to file'

You can send requests to the API with an Authorization header that references your project's API key:

    Authorization: Token YOUR_DEEPGRAM_API_KEY

The output response will contain the generated summary based on the provided audio. Summarization V2 supports English and Pre-Recorded audio.

Primary differences between V1 (summarize=true) and V2 (summarize=v2):

- V1 provides summaries per channel. V2 provides a single summary across all the channels.
- V1 response contains summary objects (with summary, start, and end word). V2 response contains a single object with result and short key.

V2 of our Summarization offers improved performance in terms of quality, content, and readability of generated summaries. For the best results moving forward, we recommend leveraging V2 of our summarization.

Learn more about using our new Summarization V2 feature. Test Summarization V2 using our API Playground.

We are thrilled to get this feature into your hands and await your feedback. Please share it with us at Product Feedback or your dedicated support channel.
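As a rough illustration of reading the V2 summary object described above, here is a minimal Python sketch. The `result` and `short` keys come from the entry itself. Their location under `results.summary` and the `"success"` status value are assumptions, as is the helper name `extract_summary_v2`; check them against a real response before relying on them.

```python
import json
from typing import Optional


def extract_summary_v2(response_body: str) -> Optional[str]:
    """Return the short summary text, or None if summarization did not succeed."""
    payload = json.loads(response_body)
    # Assumption: the V2 summary object is nested under results.summary.
    summary = payload.get("results", {}).get("summary", {})
    # Assumption: a successful summarization reports result == "success".
    if summary.get("result") != "success":
        return None
    return summary.get("short")


# Hand-written sample body, not real API output.
sample_body = json.dumps(
    {"results": {"summary": {"result": "success", "short": "Caller asked about a refund."}}}
)
print(extract_summary_v2(sample_body))
```

Returning None on any non-success status keeps callers from treating a missing or failed summary as an empty one.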
OnPrem

Deepgram On-premises July Release (230705)

Evan Henry
Jul 7, 2023
Container Images (release 230705)

- deepgram/onprem-api:1.95.0
- deepgram/onprem-engine:3.53.0
- deepgram/onprem-license-proxy:1.4.1
- deepgram/onprem-billing:1.7.1
- deepgram/onprem-metrics-server:2.0.6
- deepgram/onprem-dgtools:2.1.4

Deepgram On-premises Release Tags

- deepgram/onprem-api:release-230705
- deepgram/onprem-engine:release-230705
- deepgram/onprem-license-proxy:release-230705
- deepgram/onprem-billing:release-230705
- deepgram/onprem-metrics-server:release-230705
- deepgram/onprem-dgtools:release-230705

This Release Contains The Following Changes

- Support for license keys created and managed from Deepgram Console.
- Support for new Domain-Specific Language Model powered summarization. Learn more.
- The minimum supported CUDA runtime version for onprem-engine has changed from 11.0.3 to 11.3.1. Systems using NVIDIA drivers before version 450.80.02 might encounter errors when attempting to start this release of onprem-engine. Deepgram recommends installing the latest NVIDIA drivers for maximum compatibility, stability, and performance.
- The onprem-engine container size has been significantly reduced.
- Reduction in frequency of hallucinations when using Deepgram enhanced models.
- Improvements to accuracy of reported word times when using existing Whisper models.
- Duration values specified in the onprem-api configuration file can now include unit suffixes. For example, instead of writing 480 it is now possible to write 8m. Values with no suffix are assumed to be seconds.
- Stability improvements and bug fixes. 🐛
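To make the duration-suffix rule concrete, the sketch below converts a configuration value to seconds. It is not Deepgram's parser: the release note only shows a minutes suffix and states that bare numbers are seconds, so the `s` and `h` suffixes and the function name `parse_duration_seconds` are assumptions made for illustration.

```python
import re

# Assumption: only "m" is shown in the release note; "s" and "h" are illustrative.
_UNIT_SECONDS = {"": 1, "s": 1, "m": 60, "h": 3600}


def parse_duration_seconds(value: str) -> int:
    """Convert '90' or '2m' to a number of seconds; bare numbers are seconds."""
    match = re.fullmatch(r"(\d+)([smh]?)", value.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]


print(parse_duration_seconds("90"))  # no suffix, read as seconds
print(parse_duration_seconds("2m"))  # two minutes, in seconds
```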
Languages
SpeechModel

Introducing Nova support for the Spanish language

John Vajda
Jun 30, 2023
Nova is Deepgram's most powerful and affordable speech-to-text model. Training on this model spans over 100 domains and 47 billion tokens, making it the deepest-trained automatic speech recognition (ASR) model to date. Nova doesn't just excel in one specific domain; it is ideal for a wide array of voice applications that require high accuracy in diverse contexts.

This model now supports the Spanish language codes es and es-419.

Learn more about using our new Nova Model. Quickly test out this new Model using our API Playground. Learn more about Deepgram Language Support.
Feature
Notices
OnPrem

Deepgram On-premises June Release (230606)

Evan Henry
Jun 6, 2023
Container Images (release 230606)

- deepgram/onprem-api:1.92.2
- deepgram/onprem-engine:3.48.2
- deepgram/onprem-license-proxy:1.4.1
- deepgram/onprem-billing:1.7.1
- deepgram/onprem-metrics-server:2.0.6
- deepgram/onprem-dgtools:2.1.4

Deepgram On-premises Release Tags

This release marks the first official Deepgram On-premises release to include support for a release tag. Instead of specifying a specific version tag for the individual container images, all of the images now support the release-230606 image tag.

- deepgram/onprem-api:release-230606
- deepgram/onprem-engine:release-230606
- deepgram/onprem-license-proxy:release-230606
- deepgram/onprem-billing:release-230606
- deepgram/onprem-metrics-server:release-230606
- deepgram/onprem-dgtools:release-230606

This Release Contains The Following Changes

- New, easy-to-use deployments with embedded default configurations in the container images. Simply add the container environment variable DEEPGRAM_API_KEY to the docker-compose.yml stanzas for the api and engine container images. For more information, refer to the on-prem deployment documentation for your specific deployment OS.
- Deepgram's new Speaker Diarization architecture, with 53.1% improved accuracy overall from the previous version, a 10X faster turnaround time, and language-agnostic support, unlocking accurate speaker labeling for transcription use cases around the globe. Currently only pre-recorded audio is supported. We are ending support for our legacy Diarization model. Please reach out to Deepgram Customer Success to ensure you have the latest supported Diarization model.
- Deepgram's revamped automatic language detection feature, which enables users to automatically detect the dominant language in an audio file and transcribe the output in the detected language, providing unparalleled accuracy in detecting and transcribing audio data in 15+ languages and dialects, including English, Spanish, Hindi, Dutch, French, and German. Currently only pre-recorded audio is supported.
- Addresses an issue where onprem-license-proxy was inappropriately coloring logs when directed to output to a file.
- Addresses CVE-2020-26235 in onprem-license-proxy.
- Other stability improvements & bug fixes. 🐛

Stream KeepAlive

Shir Goldberg
Jun 2, 2023
Customers can now keep Deepgram streaming connections open during periods where no audio data is being sent. Previously, if no audio was being sent over the websocket, connections would close after a short window of time. We've introduced a new KeepAlive WebSocket message that clients can use to indicate to Deepgram that the WebSocket should be kept open even though no data is being sent through. For more information, visit our documentation.
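One way a client might schedule KeepAlive messages is sketched below. The entry does not give the wire format, so the JSON shape `{"type": "KeepAlive"}` and the 5-second interval are assumptions to verify against the documentation; `KeepAliveTimer` is a hypothetical helper, and no real WebSocket library is used here.

```python
import json
import time

# Assumption: choose an interval shorter than the server's idle timeout.
KEEPALIVE_INTERVAL_SECONDS = 5.0


def keepalive_message() -> str:
    """Build the KeepAlive payload (assumed to be a JSON text frame)."""
    return json.dumps({"type": "KeepAlive"})


class KeepAliveTimer:
    """Tracks when the next KeepAlive is due; the clock is injectable for testing."""

    def __init__(self, interval: float = KEEPALIVE_INTERVAL_SECONDS, clock=time.monotonic):
        self._interval = interval
        self._clock = clock
        self._last_sent = clock()

    def mark_sent(self) -> None:
        # Call after sending audio or a KeepAlive message.
        self._last_sent = self._clock()

    def due(self) -> bool:
        return self._clock() - self._last_sent >= self._interval
```

In a send loop, a client would check `timer.due()` while no audio is available, send `keepalive_message()` over its own WebSocket connection, and then call `timer.mark_sent()`.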

Improved Language Detection Capabilities

Pankaj Trivedi
May 11, 2023
We are thrilled to release the enhanced version of our Automatic Language Detection feature (detect_language=true), now supporting 15+ languages. Deepgram's Enhanced Language Detection is adept at identifying the primary language in an audio file and providing transcriptions in the detected language.

To utilize the Language Detection API, simply use the following URL query format:

    https://api.deepgram.com/v1/listen?detect_language=true&punctuate=true

The API response will output the detected language. For example: "detected_language": "es"

To learn more about the various parameters you can use to customize your transcriptions with Deepgram, check out the list of Deepgram's features in our documentation. If you have any questions, please reach out to us through your dedicated support channel.
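The query format and the detected_language field can be exercised with a short Python sketch. The URL and parameter names are taken from the entry. The assumption that detected_language is reported per channel under `results.channels` is not stated here and should be checked against a real response; both helper names are hypothetical.

```python
import json
from typing import List, Optional
from urllib.parse import urlencode

LISTEN_ENDPOINT = "https://api.deepgram.com/v1/listen"


def language_detection_url(punctuate: bool = True) -> str:
    """Build the listen URL with language detection switched on."""
    params = {"detect_language": "true"}
    if punctuate:
        params["punctuate"] = "true"
    return LISTEN_ENDPOINT + "?" + urlencode(params)


def detected_languages(response_body: str) -> List[Optional[str]]:
    """Collect detected_language for each channel (assumed response layout)."""
    payload = json.loads(response_body)
    channels = payload.get("results", {}).get("channels", [])
    return [channel.get("detected_language") for channel in channels]


# Hand-written sample body, not real API output.
sample_body = json.dumps({"results": {"channels": [{"detected_language": "es"}]}})
print(language_detection_url())
print(detected_languages(sample_body))
```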

Improved Speaker Diarization

Pankaj Trivedi
May 11, 2023
We've recently released an improved version of Diarization (for PRE-RECORDED). Its advanced speaker separation accurately identifies speakers in complex audio streams, reducing errors where two speakers are identified as one. Additionally, the new diarizer can identify and count speakers more accurately, reducing instances where one speaker is split between two labels. The result is more readable transcripts.

Please note that with the release of improved diarization, we will be deprecating the diarize_version parameter and will be retiring the old diarizer. Currently, you can call the old diarizer using the below URL:

    https://api.deepgram.com/v1/listen?tier=enhanced&diarize=true&diarize_version=2021-07-14.0

To access the latest diarizer, all you need to do is add diarize=true to your URL:

    https://api.deepgram.com/v1/listen?diarize=true

We encourage you to switch to the improved Diarizer as soon as possible to ensure that you are taking advantage of the latest advancements in our technology.

To learn more about the various parameters you can use to customize your transcriptions with Deepgram, check out the list of Deepgram's features in our documentation. If you have any questions, please reach out to us through your dedicated support channel.
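For clients that build request URLs programmatically, the migration amounts to dropping the diarize_version parameter. The sketch below does that with Python's standard library; the function name `migrate_diarize_url` is hypothetical, and leaving other parameters such as tier untouched is a design choice of this example rather than guidance from the entry.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def migrate_diarize_url(url: str) -> str:
    """Remove the deprecated diarize_version parameter, keeping everything else."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "diarize_version"
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


old_url = (
    "https://api.deepgram.com/v1/listen"
    "?tier=enhanced&diarize=true&diarize_version=2021-07-14.0"
)
print(migrate_diarize_url(old_url))
```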
Api
SpeechModel
Notices
Feature
Languages
OnPrem

Deepgram On-premises April Release (230413)

Evan Henry
Apr 13, 2023
On-Premises April Release 230413

Docker Hub Images

- deepgram/onprem-api:1.88.0
- deepgram/onprem-engine:3.45.3
- deepgram/onprem-license-proxy:1.4.0
- deepgram/onprem-metrics-server:2.0.4
- deepgram/onprem-billing:1.7.0
- deepgram/onprem-dgtools:2.1.4

This Release Contains The Following Changes

- Support for Deepgram Nova & Deepgram Whisper. Please contact Deepgram Customer Success to request access to these new model architectures. For more information, refer to the Model and Tier documentation.
- Support for a faster version of the Diarizer. This release is not backwards compatible with the old Diarizer models. If you are using Diarization, please reach out to Deepgram Customer Success for the latest Diarization model to ensure continuity of Diarization support with this release.
- Support for a new Prometheus /metrics endpoint in onprem-engine. Please contact Deepgram Customer Success or refer to the On-prem Metrics and Prometheus Integration guides for more information.
- deepgram/onprem-metrics-server is now deprecated. We recommend that you use the Prometheus /metrics endpoint instead.
- Resolves an issue where <unk> was incorrectly appearing in transcription results.

Introducing Deepgram Nova & Deepgram Whisper Cloud and On-Prem

Natalie Rutgers
Apr 13, 2023
We are pleased to announce the latest model releases to our speech recognition services.

Deepgram Nova

Deepgram Nova presents the new state-of-the-art in speech recognition. Read more about it in our announcement.

Nova is available with our general and phonecall models. To access either, please use the following syntax in your request:

- General: model=nova or model=general&tier=nova
- Phonecall: model=phonecall&tier=nova

Support for Deepgram Nova includes:

- English language support.
- Pre-recorded and live streaming audio transcription.
- Use through Deepgram's Hosted API or On-Prem Deployments.

Please view pricing at deepgram.com/pricing.

Deepgram Whisper Cloud and Whisper On-Prem

Deepgram Whisper Cloud and Whisper On-Prem integrate OpenAI's Whisper models with Deepgram's powerful API and feature set.

Deepgram Whisper Cloud and Whisper On-Prem can be accessed with the following API parameters: model=whisper or model=whisper-SIZE

Available sizes include:

- whisper-tiny
- whisper-base
- whisper-small
- whisper-medium (default)
- whisper-large (defaults to OpenAI's large-v2)

Note: You should not specify a tier when using Whisper models.

Use of Deepgram Whisper Cloud is subject to a rate limit of 50 requests per minute or 15 concurrent requests.

Support for Deepgram Whisper Cloud and Whisper On-Prem includes:

- A selection of Deepgram's transcription features, including:
  - Diarization
  - Word-level time stamps
  - Language detection
  - Redaction
  - Smart Formatting
  - Punctuation, Numeral Formatting, Find and Replace, Paragraphs, Utterances
  - Multichannel Support
  - Callback Support
  - Summarization
  - Topic Detection
- OpenAI's list of supported languages.
- Pre-recorded transcription.
- Use through Deepgram's Hosted API or On-Prem Deployments.

Please view pricing at deepgram.com/pricing.

To learn more about the various parameters you can use to customize your transcriptions with Deepgram, check out the list of Deepgram's features in our documentation.
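The model and tier rules above can be captured in a small client-side check. This Python sketch is illustrative only: the size list and the "no tier with Whisper" rule come from the entry, while the function name `model_params` and the decision to raise ValueError are assumptions of the example, not behavior of the Deepgram API.

```python
from typing import Dict, Optional

# Sizes listed in the announcement; plain "whisper" selects the default size.
WHISPER_SIZES = ("tiny", "base", "small", "medium", "large")


def model_params(model: str, tier: Optional[str] = None) -> Dict[str, str]:
    """Build model/tier query parameters, rejecting a tier on Whisper models."""
    if model.startswith("whisper"):
        size = model[len("whisper-"):] if model.startswith("whisper-") else ""
        if model != "whisper" and size not in WHISPER_SIZES:
            raise ValueError(f"unknown Whisper model: {model!r}")
        if tier is not None:
            raise ValueError("do not specify a tier when using Whisper models")
    params = {"model": model}
    if tier is not None:
        params["tier"] = tier
    return params


print(model_params("phonecall", tier="nova"))
print(model_params("whisper-large"))
```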

Endpointing

Shir Goldberg
Mar 15, 2023
Deepgram is excited to announce a new version of the Endpointing feature. Endpointing is designed to get customers the fastest and most accurate transcripts. It can also help determine when someone has finished talking. Users can specify the amount of silence they want to wait for when someone finishes talking before Deepgram finalizes the transcript and returns it with the flag speech_final.

The Endpointing feature utilizes a powerful Voice Activity Detection algorithm to determine when someone has stopped speaking. Once speech has finished, Deepgram is able to quickly finalize a transcription of the speech and return it, along with a speech_final flag marking that an Endpoint was detected. This algorithm can be configured through our Endpointing feature. To find out more, head to our Endpointing documentation.
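A client consuming streaming results might use the speech_final flag as sketched below. Only the flag name comes from the entry. The message layout (a transcript under `channel.alternatives[0].transcript`) is an assumption to confirm against the Endpointing documentation, and `finalized_transcripts` is a hypothetical helper.

```python
import json
from typing import Iterable, Iterator


def finalized_transcripts(messages: Iterable[str]) -> Iterator[str]:
    """Yield the transcript of each message flagged with speech_final."""
    for raw in messages:
        message = json.loads(raw)
        if not message.get("speech_final"):
            continue  # speaker has not reached an endpoint yet
        # Assumption: the transcript lives under channel.alternatives[0].
        alternatives = message.get("channel", {}).get("alternatives", [])
        if alternatives:
            yield alternatives[0].get("transcript", "")


# Hand-written sample messages, not real API output.
sample_messages = [
    json.dumps({"speech_final": False, "channel": {"alternatives": [{"transcript": "hello"}]}}),
    json.dumps({"speech_final": True, "channel": {"alternatives": [{"transcript": "hello there"}]}}),
]
print(list(finalized_transcripts(sample_messages)))
```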
Realtime
Smart formatting
SpeechModel
Notices
Feature
OnPrem

Deepgram On-premises Release 230228

Evan Henry
Feb 28, 2023
Deepgram released a new version of its on-premises solution.

On-Premises Release 230228: Docker Hub Images

- deepgram/onprem-api:1.83.0
- deepgram/onprem-engine:3.43.4
- deepgram/onprem-license-proxy:1.3.0
- deepgram/onprem-metrics-server:2.0.3
- deepgram/onprem-billing:1.5.3
- deepgram/onprem-dgtools:2.1.3

Changes

- Improvements to Smart Formatting.
- Improved usage reporting for multichannel, streaming, and NLU features.
- TLS support for streaming callbacks.
- Resolves an issue when parsing μ-law or A-law encoded audio data.
- Resolves an issue which impacted Language Detection.
- Migrated to the Rocky Linux base images to resolve CVEs for the following components:
  - deepgram/onprem-api
  - deepgram/onprem-engine
  - deepgram/onprem-license-proxy
  - deepgram/onprem-metrics-server
  - deepgram/onprem-dgtools
- Due to the base image change, it is recommended that users prune their container images with the following command: docker image prune -a
- Other minor stability, error handling, and performance improvements.
Languages

Updates to General Model (en-US) Base and Enhanced Tiers

Nick Martin
Feb 27, 2023
New versions of our English (en-US) General model's Base and Enhanced tiers have greatly improved accuracy and throughput. We are incredibly excited to release these enhancements to our customers. We are making the new tiers available for testing over the next two weeks so you can compare them to your current results. If you are interested in testing the new tiers, please reach out to Deepgram support to enable access.

We will be deploying the new tiers as defaults on March 14th, 2023. If you do not wish to use the new versions that will be deployed as the default on March 14, 2023, you may pin to previous versions by specifying the desired version as version={desired_version}. If no version is specified, we will use our latest version by default.

Most recent previous versions of these tiers:

- Base (model=general): 2022-01-18.1
- Enhanced (model=general): 2022-05-18.0

Deepgram support is always available to walk customers through updates, as well as resolve any issues that arise when upgrading.
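Pinning a version comes down to adding one query parameter. The sketch below builds such a URL in Python; the version parameter and the version strings are from the entry, while the function name `general_model_url` and passing the tier as tier=base or tier=enhanced are assumptions of this example.

```python
from typing import Optional
from urllib.parse import urlencode

LISTEN_ENDPOINT = "https://api.deepgram.com/v1/listen"


def general_model_url(tier: str, version: Optional[str] = None) -> str:
    """Build a listen URL for the general model, optionally pinned to a version."""
    # Assumption: the tier is selected with a tier query parameter.
    params = {"model": "general", "tier": tier}
    if version is not None:
        # Omitting version lets the service use its latest version by default.
        params["version"] = version
    return LISTEN_ENDPOINT + "?" + urlencode(params)


print(general_model_url("enhanced", version="2022-05-18.0"))
print(general_model_url("base"))
```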