Article·AI Engineering & Research·Jun 13, 2024

From Zero to AI Hero: How Python Notebooks Empower Your Deepgram Journey

Learn how to work with voice data without knowing how to code using these Python notebooks from Deepgram.

Transcribe pre-recorded audio · Transcribe live audio streams in real-time · Create translated subtitles for any audio stream · Label every speaker in your transcripts (Diarization) · Conclusion


By Jose Nicholas Francisco, Machine Learning Developer Advocate

Updated Jun 13, 2024

Published Jun 8, 2023


Listen, we’ve all been there.

If you’re a software engineer, a data scientist, or a coder in general, it can be a hassle to learn a new tool. Every API comes with a learning curve. You’ve gotta figure out new syntax, learn new parameter names, set new environment variables, download new dependencies, and navigate through forests of documentation of varying quality.

Here at Deepgram, we understand your struggle.

Or, if we’re getting personal, I understand your struggle. Me. The writer of this blog post.

As a result, I wrote some code so that you don’t have to. Below are four Python Notebooks that allow you to use Deepgram without having to write a single line of code. All you need to do is grab your API key, modify a couple of variables, and go!

Transcribe pre-recorded audio

If you have an audio file that you want to transcribe, this is the notebook for you! Whether it’s a 30-second .wav file or an hour-long .mp3, we’ve got you covered. Transcribing pre-recorded audio is the most basic functionality Deepgram has to offer.
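For the curious, here is roughly what a pre-recorded transcription request can look like in plain Python. This is a minimal sketch and not the notebook's actual code: it assumes your audio is hosted at a public URL, it talks to Deepgram's `/v1/listen` REST endpoint using only the standard library, and the helper names (`build_request`, `extract_transcript`, `transcribe`) are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

DEEPGRAM_URL = "https://api.deepgram.com/v1/listen"


def build_request(api_key: str, audio_url: str, **options: str) -> urllib.request.Request:
    """Build a POST request asking Deepgram to transcribe a hosted audio file."""
    # Extra options (for example punctuate="true") go in the query string.
    query = urllib.parse.urlencode(options)
    endpoint = f"{DEEPGRAM_URL}?{query}" if query else DEEPGRAM_URL
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
    )


def extract_transcript(response: dict) -> str:
    """Return the top transcript for the first audio channel."""
    return response["results"]["channels"][0]["alternatives"][0]["transcript"]


def transcribe(api_key: str, audio_url: str, **options: str) -> str:
    """Send the request and return the transcript (needs network access)."""
    request = build_request(api_key, audio_url, **options)
    with urllib.request.urlopen(request) as reply:
        return extract_transcript(json.load(reply))
```

With a real key, a call would look something like `transcribe("YOUR_API_KEY", "https://example.com/audio.wav", punctuate="true")`, where both the key and the URL are placeholders. Keeping the request building and the response parsing in separate functions makes each piece easy to check without touching the network.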

Here’s the notebook. All the instructions you need are inside. But if you want a video guide, check out the one linked below:

Transcribe live audio streams in real-time

Here’s one spot where Deepgram shines while others don’t. Very few speech-to-text providers offer their users live transcription, and even fewer do it well. In fact, at the time of writing, the mighty OpenAI doesn’t even offer real-time transcription in the first place.

And so, here comes Deepgram to the rescue with this notebook! By default, we set up this notebook so that it transcribes a BBC Radio livestream for 30 seconds. However, if you wish to change the stream or the length of time the program runs, you just have to change a couple of variables.

Don’t worry, all instructions on how to change up the code to fit your current needs are written inside the notebook.

And of course, we always strive to go the extra mile, so here’s another (optional) video tutorial to accompany the notebook if that fits your learning style:
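To give a feel for the moving parts, here is a small sketch of the message-handling side of a live transcription. It is not the notebook's code. It assumes you already have a WebSocket connection to Deepgram's streaming endpoint (not shown here, since that needs a third-party WebSocket client) and that each incoming message is a JSON string carrying an `is_final` flag and a `channel.alternatives[0].transcript` field. The function name `final_transcripts` and the `max_seconds` parameter are hypothetical; the 30-second default simply mirrors the notebook's default run time.

```python
import json
import time
from typing import Iterable, Iterator


def final_transcripts(messages: Iterable[str], max_seconds: float = 30.0) -> Iterator[str]:
    """Yield finalized transcript text from raw JSON messages until time runs out."""
    deadline = time.monotonic() + max_seconds
    for raw in messages:
        if time.monotonic() >= deadline:
            break  # stop once the time budget is spent
        message = json.loads(raw)
        if not message.get("is_final"):
            continue  # skip interim results and non-transcript messages
        alternatives = message.get("channel", {}).get("alternatives", [])
        text = alternatives[0]["transcript"] if alternatives else ""
        if text:  # silence produces empty transcripts; ignore those
            yield text
```

In practice, `messages` would be whatever iterable your WebSocket client gives you for incoming frames, and changing the run length comes down to passing a different `max_seconds`.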

Create translated subtitles for any audio stream

That’s right. Not only can you transcribe audio streams, but we’ve written Python notebooks that let you translate those streams as well. Again, we’re using BBC Radio as an example, but you can swap that audio source out with whatever you want.
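If you're wondering how a transcript turns into subtitles, here is a minimal sketch of the last step: taking timed chunks of text, translating each one, and writing them out in the common SRT subtitle format. This is an illustration and not the notebook's code. The `translate` argument is a stand-in for whatever translation function or service you choose, and the segment shape `(start, end, text)` is an assumption made for this example.

```python
from typing import Callable, Iterable, Tuple

Segment = Tuple[float, float, str]  # (start seconds, end seconds, text)


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: Iterable[Segment], translate: Callable[[str], str]) -> str:
    """Translate each segment's text and render all segments as SRT."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{translate(text)}\n")
    # SRT separates subtitle blocks with a blank line.
    return "\n".join(blocks)
```

Passing the translator in as a function keeps the subtitle formatting independent of any particular translation provider, so you can swap one for another without touching the rest.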

Again, here’s the notebook.

And here’s the accompanying video tutorial:

Label every speaker in your transcripts (Diarization)

Finally, let’s say you have a pre-recorded audio file. Some .mp3 or .wav file that’s lying around. Maybe it’s a podcast you recorded. Maybe it’s an important Zoom call from work. Or perhaps even an earnings call. Whatever it is, there’s a good chance you have multiple people speaking.

Well, it seems rather important to have a transcript that distinguishes between each of those speakers. So instead of having to parse through this…

… you can simply read this:

Labeling speakers—otherwise known as diarization—is yet another domain where Deepgram shines while others don’t. Diarization is an incredibly difficult task to achieve. But luckily for you, most of the work is abstracted away in this notebook. Just fill in your API key and run!
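Once the hard part is done, turning a diarized result into a readable transcript is mostly bookkeeping. Here is a minimal sketch, not the notebook's code. It assumes the request was made with diarization enabled, so that each entry in the response's word list carries a `word` string and an integer `speaker` field (plus `punctuated_word` when punctuation is on). The function name `label_speakers` is made up for illustration.

```python
from typing import Dict, List


def label_speakers(words: List[Dict]) -> List[str]:
    """Group consecutive words by speaker into 'Speaker N: ...' lines."""
    lines: List[str] = []
    current_speaker = None
    current_words: List[str] = []
    for word in words:
        speaker = word.get("speaker", 0)
        # A change of speaker closes the line being built.
        if speaker != current_speaker and current_words:
            lines.append(f"Speaker {current_speaker}: {' '.join(current_words)}")
            current_words = []
        current_speaker = speaker
        # Prefer the punctuated form when the response includes it.
        current_words.append(word.get("punctuated_word", word["word"]))
    if current_words:
        lines.append(f"Speaker {current_speaker}: {' '.join(current_words)}")
    return lines
```

Each returned line covers one uninterrupted turn, so a speaker who talks, gets interrupted, and talks again shows up as two separate lines.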

And, of course, the video tutorial:

Conclusion

Note, this is just the beginning. I have loads of Notebook ideas in my head, waiting to come to life.

Also, if you don’t want to deal with coding whatsoever—like, if you don’t even want to look at code, let alone read it—check out our API Playground. There, you can see exactly what Deepgram is capable of before even creating an API key.

Check out this video to see what our Playground entails.

Anyway, keep an eye out for further notebooks. And if you have any questions or ideas for future notebooks we can write, feel free to reach out to us on social media! Our DMs are always open.

Check out our GitHub: https://github.com/deepgram
