Deepgram Pioneers Novel Training Approach Setting New Standard for AI Companies

By Scott Stephenson
Published Aug 27, 2020
Updated Jun 13, 2024

Artificial intelligence has made astonishing technological advances in recent years, and more companies are turning to AI to improve internal functions and unlock the potential of enterprise datasets. IDC has characterized AI as "inescapable" and estimates that by 2025, at least 90% of new enterprise apps will embed AI. But getting to the right models to effectively power AI is hard, and especially hard for speech. Building a model is tedious, requiring multiple stages of training and refinement, and deep learning expertise is hard to find. Compound that with the endless variations of speech, a lack of high-quality training data, and astronomical computing costs, and it's no wonder that homegrown and off-the-shelf speech recognition has been slow to succeed.

Introducing the first AutoML Model Training for Speech Recognition

That's why today we're excited to announce Deepgram AutoML, a new training capability that streamlines AI model development, reducing manual cycles for data scientists while giving them the best accuracy humanly possible. With our approach, organizations can deploy not just one model but tens or thousands of models, trained in an automated way to the needs of their specific company, target industries or largest customers.

Why AutoML?

AutoML is often referred to as "AI creating other AI." Rather than relying on humans to painstakingly create and hand-tune a wide variety of AI models, AutoML is a mechanism by which new AI models can be constructed and tuned automatically. While AutoML exists for NLP, image and vision, it has never been deployed for automatic speech recognition (ASR) until now. As the first company to offer this innovative technology for ASR, we're furthering our mission to be the de facto speech company, offering the world's fastest, most accurate and most scalable speech solution. AutoML training capabilities are one of many ways Deepgram enables customers to extract value from their audio and deliver on the vision of an AI-enabled enterprise.

Our AutoML model training functionality is another proof point of how we continue to innovate and offer advanced solutions that far surpass what our competitors provide. With it, we're solving the challenges of building and training effective AI models while delivering over 90% accuracy and 120 times faster delivery at half the cost of Big Tech solutions. You can now get the best ASR solution with less hassle, in less time and for less money.

About Deepgram AutoML

Our state-of-the-art AutoML for speech recognition is now available to engineers, data scientists and others looking to implement speech recognition or replace clunky ASR models that haven't worked. Deepgram AutoML uses GPU resources more effectively and automates processes so that a speech recognition model becomes more effective with less effort. With Deepgram AutoML, data scientists no longer have to:

  • Select input audio features to denoise audio

  • Tune hyperparameters of Hidden Markov Models or Neural Networks

  • Modify underlying algorithms or architectures to maintain a custom vocabulary list

  • Apply model ensembling with keyword boosting or stacking

Deepgram AutoML reduces the time and effort needed to deploy speech recognition, enabling humans to spend more cycles on the overall strategy and processes needed to successfully integrate AI into their organization. Humans have been, and always will be, an essential part of automating speech recognition, as they are the only ones who can define what accuracy means, derive intuitions about their data, and create or curate new training data. Deepgram AutoML pushes the frontier of how AI helps humans evolve next-generation AI.

How Deepgram AutoML works

Customers begin by selecting a specific audio source. Next, they select a Deepgram base model to use: general, phone call or meeting. Then, customers select a training method and submit their model for training. After training completes, customers review model performance (e.g., accuracy improvement). If additional gains are required, further training teaches models to recognize specific audio examples. Finally, customers select the top-performing model and deploy it to the cloud with one click.
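
To make the review-and-select step concrete, here is a minimal Python sketch of how trained candidates might be compared on a held-out transcript. It is an illustration only, not Deepgram's API or implementation: the names `word_error_rate`, `Candidate` and `pick_best` are hypothetical, and word error rate (WER) is assumed as the accuracy measure because it is a common ASR metric. The article does not say which metric Deepgram AutoML reports.

```python
from dataclasses import dataclass

# Base models named in the article; everything else here is hypothetical.
BASE_MODELS = ("general", "phone call", "meeting")


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / len(ref)


@dataclass
class Candidate:
    """A trained model candidate (hypothetical structure)."""
    name: str
    base_model: str
    transcript: str  # the candidate's output on held-out audio


def pick_best(reference: str, candidates: list[Candidate]) -> Candidate:
    """Return the candidate with the lowest WER against the reference."""
    for candidate in candidates:
        if candidate.base_model not in BASE_MODELS:
            raise ValueError(f"unknown base model: {candidate.base_model}")
    return min(candidates, key=lambda c: word_error_rate(reference, c.transcript))


reference = "please transfer me to billing"
candidates = [
    Candidate("base", "phone call", "please transfer me to building"),
    Candidate("trained", "phone call", "please transfer me to billing"),
]
best = pick_best(reference, candidates)
print(best.name, word_error_rate(reference, best.transcript))
```

In this toy example the untrained candidate gets one word out of five wrong (a WER of 0.2), so the trained candidate is selected. In practice the comparison would run over many held-out recordings rather than a single sentence.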

AutoML is the next frontier for artificial intelligence, allowing teams to reach the unprecedented levels of accuracy needed to solve business problems. We could not be more excited to be the first to provide AutoML for ASR.

Get started with Deepgram Beginner and Intermediate models by creating a free account, or contact us to get started with AutoML!
