In this post, we'll cover how to check podcast episodes for mentions of your brand. This can be particularly useful for confirming that sponsorship obligations are met, or for tracking when competitors are mentioned.
Given an input of a podcast feed URL, start/end dates, and a brand name, the script will generate a report of all mentions as detected by Deepgram's fast and accurate speech recognition API.
Before You Start
You must have Python installed on your machine - I'm using Python 3.10 at the time of writing. You will also need a Deepgram API Key - get one here.
Create a new directory and navigate to it in your terminal. Create a virtual environment with python3 -m venv virtual_env and activate it with source virtual_env/bin/activate. Install dependencies with pip install deepgram_sdk python-dotenv feedparser. asyncio is part of Python's standard library, so it does not need to be installed separately.
Open the directory in a code editor, and create an empty .env file. Take your Deepgram API Key, and add the following line to .env:
Dependency and File Setup
Create an empty script.py file and import the dependencies:
Load values from the .env file and initialize the Deepgram Python SDK:
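A minimal sketch of the setup so far, under two assumptions: the .env entry is named DEEPGRAM_API_KEY (for example, a line reading DEEPGRAM_API_KEY=your-key-here), and the installed SDK is a pre-3.0 release that exposes a Deepgram class. The helper names read_api_key() and create_client() are hypothetical. The third-party imports sit inside create_client() so the key-handling logic can be read and run on its own.

```python
import os


def read_api_key(env=os.environ, name="DEEPGRAM_API_KEY"):
    """Return the API key, failing early with a clear message if it is missing."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return key


def create_client():
    """Load .env and build the SDK client (assumes a pre-3.0 deepgram-sdk)."""
    from dotenv import load_dotenv  # provided by python-dotenv
    from deepgram import Deepgram   # assumption: pre-3.0 SDK class name

    load_dotenv()  # copies values from .env into os.environ
    return Deepgram(read_api_key())
```

Calling deepgram = create_client() once near the top of script.py would give the rest of the script a ready-to-use client.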
Finally, set up a main() function that is executed automatically when the script is run:
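One way this entry point might look, as a sketch: main() is declared async so that later steps can await the transcription request, and the print() call is only a placeholder until the real work is added.

```python
import asyncio


async def main():
    # Placeholder body; later steps replace this print() with the real work.
    print("Hello world")


if __name__ == "__main__":
    # Runs main() only when the file is executed directly, not when imported.
    asyncio.run(main())
```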
Define Parameters
Above the main() function, create a set of variables with settings for your report:
Each time Deepgram returns a search result, it comes with a confidence score between 0 and 1. Only results with a confidence above required_confidence are included in the report.
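As an illustration, the settings could look like the block below. The brand and dates match the report filename used later in this post; the feed URL is a placeholder and the 0.9 threshold is an assumed starting point, not a recommended value.

```python
# Hypothetical feed URL; substitute the RSS feed of the podcast you want to check.
url = "https://example.com/podcast/feed.rss"
start_date = "2022-05-01"  # inclusive, written as YYYY-MM-DD
end_date = "2022-06-27"    # inclusive, written as YYYY-MM-DD
brand = "stellar"
required_confidence = 0.9  # assumed threshold; raise it to cut false positives
```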
Fetch Episodes with Feedparser
Remove the print() statement in the main() function, fetch the podcast, and take a look at the returned data by pretty-printing it:
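A sketch of this step, split into two hypothetical helpers. feedparser.parse() and the entries list are part of feedparser; the function names fetch_entries() and preview() are made up for this example, and the import json line belongs with the other imports at the top of script.py.

```python
import json


def fetch_entries(feed_url):
    """Download and parse the feed, returning one entry per episode."""
    import feedparser  # third-party; imported here so preview() works without it

    rss = feedparser.parse(feed_url)
    return rss.entries


def preview(entries, limit=2):
    """Pretty-print the first few entries as JSON.

    default=str is a guard for any value json cannot serialize natively.
    """
    return json.dumps(entries[:limit], indent=2, default=str)
```

Inside main(), the inline equivalent is rss = feedparser.parse(url) followed by print(json.dumps(rss.entries, indent=2)), which prints every episode rather than the first two.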
Try it out! In your terminal, run the file with python3 script.py and you should see a bunch of data for each episode.
Filter Episodes Within Date Range
Feedparser accepts many different date formats for when an RSS entry is published or updated and normalizes them to a standard format. Using the standardized output, create a helper function just below the main() function:
This function takes in an episode, gets the date (without time), and returns True if it is within the range between and including start_date and end_date.
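A possible version of the helper is sketched here. Feedparser exposes the normalized publish date as published_parsed, a time.struct_time; the function name is_in_date_range and the choice to pass the date range as arguments (rather than reading the settings variables directly) are assumptions made so the example runs on its own.

```python
import time
from datetime import date


def is_in_date_range(episode, start_date, end_date):
    """True if the episode was published between start_date and end_date, inclusive."""
    # published_parsed is a time.struct_time; its first three fields are year, month, day.
    year, month, day = episode["published_parsed"][:3]
    published = date(year, month, day)
    return date.fromisoformat(start_date) <= published <= date.fromisoformat(end_date)


# Stand-in entries shaped like feedparser's output, for illustration only.
entries = [
    {"title": "Too early", "published_parsed": time.strptime("2022-04-30", "%Y-%m-%d")},
    {"title": "First day", "published_parsed": time.strptime("2022-05-01", "%Y-%m-%d")},
    {"title": "Last day", "published_parsed": time.strptime("2022-06-27", "%Y-%m-%d")},
]
episodes = [e for e in entries if is_in_date_range(e, "2022-05-01", "2022-06-27")]
print([e["title"] for e in episodes])  # prints ['First day', 'Last day']
```

Comparing date objects rather than full datetimes is what makes both boundary days count as inside the range.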
Remove print(json.dumps(rss.entries, indent=2)) and, in its place, filter rss.entries through the helper function, storing the result in an episodes list.
The episodes list now contains only episodes within the date range.
Transcribe Episodes with Keyword Boosting and Search
Inside of the main() function, extract the podcast media URL, set transcription options, and request a Deepgram transcription. Finally, extract search results from the result:
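The sketch below shows one way to structure this step for a single episode. It assumes a pre-3.0 SDK, where transcription.prerecorded() is awaitable and returns a plain dict, and it uses Deepgram's keywords and search request options. The helper names build_request() and find_mentions() are hypothetical.

```python
def build_request(episode, brand):
    """Return the (source, options) pair for one episode."""
    # feedparser lists attached media files under "enclosures".
    source = {"url": episode["enclosures"][0]["href"]}
    options = {
        "punctuate": True,
        "keywords": brand,  # boost recognition of the brand name
        "search": brand,    # ask for timestamped hits for the brand
    }
    return source, options


async def find_mentions(deepgram, episode, brand):
    """Transcribe one episode and return the raw search hits."""
    source, options = build_request(episode, brand)
    # Assumption: pre-3.0 SDK, where prerecorded() is awaitable and returns a dict.
    response = await deepgram.transcription.prerecorded(source, options)
    # Search results sit on the first channel; one query was sent, so take index 0.
    return response["results"]["channels"][0]["search"][0]["hits"]
```

Inside main(), this would be called once per episode, for example search_results = await find_mentions(deepgram, episode, brand) within a loop over episodes.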
Filter Only High Confidence Results
Add the following line below search_results to filter out any values which are below the required confidence:
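As a sketch, the filter can be a single list comprehension. The sample hits are stand-ins shaped like Deepgram's search results (confidence, start, end, snippet) and the threshold value is assumed.

```python
required_confidence = 0.9  # assumed threshold from the settings step

# Stand-in hits shaped like Deepgram's search results, for illustration only.
search_results = [
    {"confidence": 0.97, "start": 12.4, "end": 12.9, "snippet": "stellar"},
    {"confidence": 0.62, "start": 80.1, "end": 80.6, "snippet": "cellar"},
]

# Keep only hits whose confidence is strictly above the threshold.
strong_results = [hit for hit in search_results if hit["confidence"] > required_confidence]
```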
Save Mentions Report
Below strong_results, take each episode (and each result within the episode), and add it as a new line in a report file:
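One way to write the report is sketched below. The filename pattern follows the report name used in this post; the helper names and the exact line format are illustrative assumptions.

```python
def report_filename(brand, start_date, end_date):
    """Build a filename that encodes the brand and the date range."""
    return f"{brand}-mentions-{start_date}-to-{end_date}.txt"


def append_mentions(path, episode_title, strong_results):
    """Append one line per mention; mode "a" keeps lines from earlier episodes."""
    with open(path, "a", encoding="utf-8") as report:
        for hit in strong_results:
            # Line format is illustrative, not taken from the article.
            report.write(
                f'"{episode_title}" at {hit["start"]:.1f}s '
                f'(confidence {hit["confidence"]:.2f})\n'
            )
```

Because the file is opened in append mode, rerunning the script for the same brand and range adds duplicate lines, so delete the old report first if that matters.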
That's it! Rerun the script with python3 script.py and, once completed, you should see a new file called stellar-mentions-2022-05-01-to-2022-06-27.txt. Because the filename includes the brand and date range, this works well if you want to run several reports.
Extending This Project
This project should equip you with the information you need to understand if and how often brand mentions occur throughout several podcast episodes. You could extend this further by creating more complex or graphical reports, allowing several brands to be searched for in one request, or building a UI around this logic.
As ever, if you have any questions please feel free to get in touch or post in our community discussions.
The final code is as follows:
If you have any feedback about this post, or anything else around Deepgram, we'd love to hear from you. Please let us know in our GitHub discussions.