Deepgram and PyTorch: The Origins of Our Foundational AI Company

At Deepgram, we love PyTorch. Our ASR and NLU models run on PyTorch. In this post, we explain why.


By Jose Nicholas Francisco, Machine Learning Developer Advocate

Last Updated: Jun 27, 2024

First things first: What’s a foundational AI company?

Well, many businesses out there call themselves “AI companies” because they folded AI into their products on the side. Maybe a publishing company has Grammarly somewhere in its editing pipeline. Maybe a marketing firm plays with a GPT model to generate snappy headlines. Sure, AI is involved, but at their core these are still publishing and marketing companies, not AI companies.

When you hear the words “foundational AI company,” think of a business whose primary goal is to discover, develop, and apply frontier technologies in machine learning to the real world. Think of businesses like OpenAI or Deepgram.

In the case of Deepgram, our main product is a Speech-to-Text API. Behind the scenes, this API calls deep neural networks, the leading edge of AI. In fact, from a phonetic and technological perspective, any speech-to-text software needs to have AI running under the hood.

Why?

Well, long story short, audio waveforms are like fingerprints or snowflakes: no two are exactly alike. The same person can say the same word, with the same voice, in the same tone, and the spectrograms would still come out different every time.

As a result, it’s essentially impossible to hand-write a fixed mapping from spectrograms or waveforms to words, because an effectively infinite number of valid waveforms can map to the word “Deepgram.” Or the word “computer.” Or the word “supercalifragilisticexpialidocious.”
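
To make that concrete, here is a toy sketch using only Python's standard library. Everything in it is an assumption for illustration: the `render_utterance` helper, the 440 Hz tone standing in for a spoken word, and the jitter values are all made up. It simply shows that two takes of the "same" sound produce different samples, so an exact-match lookup table misses.

```python
import math
import random


def render_utterance(seed, n_samples=8, sample_rate=16000):
    """Fake one 'take' of a word: a 440 Hz tone with random phase and noise."""
    rng = random.Random(seed)
    phase = rng.uniform(0.0, 0.1)  # the speaker never starts at the same instant
    samples = []
    for t in range(n_samples):
        clean = math.sin(2 * math.pi * 440 * t / sample_rate + phase)
        noisy = clean + rng.gauss(0.0, 0.01)  # a little microphone/room noise
        samples.append(round(noisy, 4))
    return samples


take_one = render_utterance(seed=1)
take_two = render_utterance(seed=2)

# A naive "dictionary" that only recognizes the exact samples it has seen.
lookup = {tuple(take_one): "Deepgram"}

print(lookup.get(tuple(take_one)))  # the memorized take is found
print(lookup.get(tuple(take_two)))  # a fresh take of the same word is not
```

A model that generalizes across takes, rather than memorizing them, is what the lookup table is missing.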

And so, Deepgram’s main product is, in essence, an artificially intelligent language model. Or, to use the trendier term: a Large Language Model (LLM).

But how did we get here? Surely no company is born with an already-impressive, multibillion-parameter AI in its hands. We have to start somewhere…

That’s where PyTorch comes in.

See, in Deepgram’s early days, we had to find a framework that could handle massive amounts of intense, vector-matrix-tensor calculations in an easy-to-use yet efficient way. PyTorch is what we landed on.

When asked about PyTorch, members of our research team revealed that PyTorch was especially useful for both automated and manual model training. That is, whenever we give our models a new task to learn—such as summarization or topic detection—PyTorch is the engine under the hood that makes the magic happen.

But let’s get into some technical specifics:

Let’s say we wish to implement a topic-detection AI. This could be useful for meetings or conference calls. For example, if you had an AI transcription model listening in on a meeting for your business, you could easily pass the output transcription into the topic detection AI and instantly have a table of contents for what was discussed.

One of the ways to train that model is through PyTorch!

Without giving away company secrets, an example of a training input could look like this:
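
As a purely hypothetical illustration (the topic list, the field names, and the `to_multi_hot` helper are our own assumptions here, not Deepgram's actual data format), a single topic-detection training example could pair a transcript snippet with the topics an annotator assigned, then encode those topics as a 0/1 target vector:

```python
# Hypothetical label set; a real topic inventory would be far larger.
TOPICS = ["budget", "hiring", "product roadmap", "customer feedback"]

# One hypothetical training example: a transcript snippet plus its topic labels.
example = {
    "transcript": "Let's review Q3 spending before we open the two new engineering roles.",
    "topics": ["budget", "hiring"],
}


def to_multi_hot(labels, vocabulary):
    """Return a 0/1 list with a 1 for every vocabulary entry present in labels."""
    present = set(labels)
    return [1 if topic in present else 0 for topic in vocabulary]


target = to_multi_hot(example["topics"], TOPICS)
print(target)  # [1, 1, 0, 0]
```

The multi-hot target lets one snippet carry several topics at once, which is what you want when a single stretch of a meeting touches on more than one subject.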

And while Deepgram doesn’t rely solely on default, out-of-the-box training functions, we do implement our own (proprietary) training code. And that training code is built on PyTorch.
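
For readers who have never written their own training code, here is a minimal sketch of what a hand-rolled PyTorch training loop can look like. To be clear, this is not Deepgram's code: the random toy data, the single `nn.Linear` layer, the learning rate, and the `train_step` helper are all assumptions chosen to keep the example tiny.

```python
import torch
from torch import nn

torch.manual_seed(0)  # make the toy run repeatable

# Hypothetical toy data: 64 "documents" as 16-dim feature vectors,
# each tagged with any subset of 3 topics (multi-label targets).
features = torch.randn(64, 16)
true_weights = torch.randn(16, 3)
targets = (features @ true_weights > 0).float()

model = nn.Linear(16, 3)
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)
loss_fn = nn.BCEWithLogitsLoss()  # one independent yes/no decision per topic


def train_step(batch_x, batch_y):
    """Run one forward pass, one backward pass, and one weight update."""
    optimizer.zero_grad()
    loss = loss_fn(model(batch_x), batch_y)
    loss.backward()
    optimizer.step()
    return loss.item()


epoch_losses = []
for epoch in range(50):
    batch_losses = []
    for start in range(0, len(features), 16):  # mini-batches of 16
        batch_x = features[start:start + 16]
        batch_y = targets[start:start + 16]
        batch_losses.append(train_step(batch_x, batch_y))
    epoch_losses.append(sum(batch_losses) / len(batch_losses))

print(f"first epoch loss: {epoch_losses[0]:.3f}")
print(f"last epoch loss:  {epoch_losses[-1]:.3f}")
```

Writing the loop by hand means every step (batching, the loss, the update rule) is a line of code you can swap out, which is the kind of flexibility custom training code buys you.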

And even outside of these language-specific tasks, it’s important to note that our transcription models—again, built in part on PyTorch—have led to some incredible projects.

Now, you may be asking, “Why did Deepgram start off with PyTorch specifically, instead of alternatives like TensorFlow?”

While we love TensorFlow and indeed listed it as one of the best Python ML frameworks in a previous post, Deepgram stuck with PyTorch because it gives us low-level control of the underlying graphs in our models. And again, without giving away any secret sauce, it’s important to note that this low-level flexibility led us to some of the incredible ASR and NLU models we have available today.
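
What does "low-level control" look like in practice? One small, generic example (our own illustration, not anything from Deepgram's models) is a custom `torch.autograd.Function`, which lets you decide exactly what happens on the backward pass of the graph. The hypothetical `ClampGrad` below leaves values untouched going forward but clamps gradients to the range [-1, 1] coming back:

```python
import torch


class ClampGrad(torch.autograd.Function):
    """Identity on the forward pass; clamps gradients on the backward pass."""

    @staticmethod
    def forward(ctx, x):
        # Return a view so autograd treats the output as a new graph node.
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Rewrite the gradient flowing back through this node.
        return grad_output.clamp(-1.0, 1.0)


x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Without clamping, d/dx of 10 * sum(x^2) would be 20 * x = [20, 40, 60].
y = (ClampGrad.apply(x) ** 2).sum() * 10
y.backward()

print(x.grad)  # tensor([1., 1., 1.])
```

The forward computation is unchanged, yet the gradient that reaches `x` is whatever the custom node says it should be.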

PyTorch’s intuitive interface and well-written docs exactly suited the needs of a budding, growing Deepgram five years ago, and it equally suits our needs today.

Long story short, if you want to start your own foundational AI company, building a product on PyTorch could lead you to incredible places!

