Over the last ten years, the number of people who listen to podcasts has doubled. With this increase comes more ad spending. Companies must monitor media mentions from podcast ads using AI and Python more than ever to identify which companies are mentioned, either theirs or a competitor.
For example, the podcasts I listen to occasionally include ads from multiple sponsors. What if you’re a company that needs to monitor media mentions in podcasts for your competitors? You need to identify what was said about these companies versus what was paid to be said. This differentiation is an important distinction.
There are a few ways to monitor media mentions in podcasts using AI speech-to-text and Python. Let’s look at a method using diarization (FYI, there is a better way further down in this post).
Method 1: Monitor Media Mentions in Podcasts Using Diarization with AI Speech Recognition
This method is interesting but not as effective as I’ll show later in this post. As a quick review, Deepgram’s diarization feature recognizes speaker changes in a transcript. For example, if there are multiple speakers and diarization is set to True, a word will be assigned to each speaker in the transcript.
A readable formatted transcript with the speech-to-text diarize feature may look something like this:
In a podcast, there’s usually an even split time between the speakers or the hosts. The way diarization is used to monitor media mentions in podcasts is to determine if one person is a speaker for a more extended time than the other. In our above transcript example, you’ll notice that Speaker 1 talks the longest during that segment. This could indicate that’s where the ad is read on behalf of the sponsor.
I promised you a better way to monitor mentions in a podcast. Let’s look at how that would work with Python, Deepgram’s AI speech-to-text Search feature, and entity detection with SpaCy.
Method 2: Monitor Media Mentions in Podcasts Using Search and Entity Detection
I was curious how to come up with a way to monitor media mentions in podcasts that would do the following:
Search for terms in the podcast transcript like “sponsor” or “paid” that indicate an ad segment Identify the organizations that are talked about in the ad to determine the company sponsoring that segment And overall, not cause a bigger headache for me
I needed to use an AI voice recognition API that would transcribe the podcast audio. That part was easy to figure out. Use the Deepgram Python SDK. I used the prerecorded option in this scenario to transcribe the already recorded audio. I also grabbed a Deepgram API key from our console, which has gamified missions you can try to get up to speed quicker.
Deepgram is nice because it has high accuracy, and the transcript gets returned quickly. Both are important in this case. I needed accuracy to correctly flag the organizations (I’ll show you in the code), and speed is an advantage, so I didn’t have to wait long for the transcribed audio.
The Search feature from Deepgram was a lifesaver when working on this project. It searches for terms or phrases by matching acoustic patterns in audio, then returns the result as a JSON object.
I added the Search feature as a parameter in the Python code like this:
Since I wanted to find where the podcast hosts mentioned sponsorships, searching for the world sponsor made sense. Imagine them saying something like, “Now a word from our sponsor”.
After printing the results, I received a response similar to this:
The response is a list of dictionaries with the closest match for my search term indicated by the confidence. The higher the confidence, the more likely it matches the search. This feature helped tremendously since all I had to do was pass in a word to search for in the transcript to the speech-to-text Python SDK and spit out a result.
Next, I used SpaCy to handle the entity detection. SpaCy is a Python library used for Machine Learning and Natural Language Processing. I was looking for a way to tag the entities in the transcribed audio as an organization.
SpaCy labels the recognized company entities as ORG, but I also used EntityRuler to identify lesser-known organizations. You’ll see how that works in the next section when I break down the code.
Python Code Breakdown With AI Deepgram Speech-to-Text and SpaCy
The first thing I did was pip install the following Python libraries:
If you want to see the Python code that I wrote for this podcast media mentions project, please look below:
In the transcribe_ with_deepgram method, you initialize the Deepgram API and open our .mp3 podcast file to read it as audio. Then you use the prerecorded transcription option to transcribe a recorded file to text.
In the get_media_mentions method, I’m loading up the SpaCY medium model and creating an EntityRuler. This EntityRuler allowed me to create a pattern Dream Host with a corresponding label ORG. In this example, Dream Host is not a recognized company. Still, it is mentioned in the transcript, so I wanted to ensure the code picked it up as I monitored the media mentions in the podcast.
Finally, I extracted the entities and printed out the text or name of the company mentioned in the sponsored segment of the podcast and all the labels with ORG, identifying it as an organization.
Here’s what it looked like in my terminal:
As you can see, the podcast hosts mentioned the companies Google and Dream Host.
Conclusion
That wraps up this blog post on how to monitor media mentions in podcasts with Python. I hope you found this tutorial helpful. If you did or have any questions, please feel to tweet me at @DeepgramAI.
If you have any feedback about this post, or anything else around Deepgram, we'd love to hear from you. Please let us know in our GitHub discussions .
Unlock language AI at scale with an API call.
Get conversational intelligence with transcription and understanding on the world's best speech AI platform.