Simone Giertz has a great TED Talk where she extols the virtues of building useless things. I often find myself building useless things to teach others about new technologies and development practices. So when I started picking up Python, building another useless thing seemed like the best way to start.
Since Python is an object-oriented language, I expected to pick it up quickly. After decades of .NET and JavaScript, OOP languages are my safe space. But beyond the syntax, what kinds of things did I need to know? I made a list:
Loops and conditions
File access
HTTP requests
Then there were questions like "could I build an API?" and "what do Python developers do for front-ends?" Of course, Deepgram has a Python SDK so gaining experience using it would be beneficial and I could even provide feedback to the folks that are building it. That meant I needed to do something with audio. HTTP requests, audio, files, loops, and conditions... clearly, I needed to build a search engine for podcasts.
Since I'm still learning Python, I leaned on our team at Deepgram to help me speed up the process. First up, accessing a podcast's RSS feed and identifying the URL to the mp3 files.
Pulling Podcast Episodes from an RSS Feed
Neil deGrasse Tyson has a great podcast called StarTalk Radio that would provide tons of searchable words. Its RSS feed is located at https://feeds.simplecast.com/4T39_jAj, so I needed to read that and pull in individual episodes. Originally, I planned to save the data from the feed into a PostgreSQL or MySQL database but decided to keep it simple by just saving the info from one episode to a Python file.
I created a file named load.py to get the episode information and transcribe the audio. The TL;DR is that it:
Downloads the RSS feed locally to a file called theRSSfeed.xml
Parses the XML in that file to find all the mp3 URLs
Takes the first episode and feeds its mp3 to Deepgram for transcription
Saves the transcript that Deepgram generates to a file named podcast_data.py
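The parsing and saving steps above can be sketched with nothing but the standard library. This is not the original load.py: the function names extract_mp3_urls and save_as_python_module are hypothetical, the sample feed is made up, and the transcription call to Deepgram is left out entirely. It also assumes the transcript is stored as a list of word dictionaries, which the source does not confirm.

```python
import xml.etree.ElementTree as ET

# Made-up feed standing in for the downloaded theRSSfeed.xml (newest episode first).
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <item>
      <title>Episode 2</title>
      <enclosure url="https://example.com/episode-2.mp3" type="audio/mpeg" length="0"/>
    </item>
    <item>
      <title>Episode 1</title>
      <enclosure url="https://example.com/episode-1.mp3" type="audio/mpeg" length="0"/>
    </item>
  </channel>
</rss>"""


def extract_mp3_urls(rss_xml):
    """Return the enclosure URL of every <item> in the feed, in feed order."""
    root = ET.fromstring(rss_xml)
    urls = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is not None and enclosure.get("url"):
            urls.append(enclosure.get("url"))
    return urls


def save_as_python_module(words, path="podcast_data.py"):
    """Write the transcript words to a Python file that can be imported later.

    Assumption: `words` is a list of dicts such as {"word": "moon", "start": 4.0}.
    """
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(f"words = {words!r}\n")


mp3_urls = extract_mp3_urls(SAMPLE_FEED)
first_episode_url = mp3_urls[0]  # the episode that would be sent for transcription
print(first_episode_url)
```

Writing the data out as a Python file keeps the later steps simple, because the API can load the transcript with a plain import instead of a database query.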
Then I ran python load.py and BAM! I had a podcast_data.py with the transcript of the episode. Next up: building an API that I could send search terms to.
Building the Podcast Search Engine
I spent some time reading Tonya's blog posts on FastAPI and Django, but eventually decided on Flask to build the back-end API.
Receiving and Responding to Requests
Before I could receive search terms and respond, I had to figure out how to receive any request and return a response. Luckily, the Flask documentation provides several good examples of doing that. I started by installing Flask with pip.
Flask's documentation told me that if I named my file app.py, I could start the server by running flask run in the terminal with no extra configuration. I started with a very basic app.py to see if I could return anything.
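A bare-bones app.py along those lines might look like the sketch below. The route and the message text are assumptions for illustration, not the original file.

```python
from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/", methods=["GET"])
def index():
    # jsonify builds a JSON response from the keyword arguments.
    return jsonify(message="Hello from the podcast search API")
```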
That little bit of Python returns a JSON object. Visiting http://127.0.0.1:5000 confirmed that it responded with the JSON I expected. Now I could receive a request and respond to it.
Next, I needed to be able to receive data that is sent via an HTTP POST request. Again, I was saved by the Flask documentation. I knew that I would eventually be sending a form field named search, so I added a new method to the app.py file:
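A sketch of what such a route could look like, assuming it simply echoes the submitted value back as JSON. The function name search_post is hypothetical; request.form is Flask's accessor for submitted form fields.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/", methods=["GET"])
def index():
    return jsonify(message="Hello from the podcast search API")


@app.route("/", methods=["POST"])
def search_post():
    # Read the "search" form field, falling back to an empty string if it is missing.
    phrase = request.form.get("search", "")
    return jsonify(search=phrase)
```

Using request.form.get with a default avoids a 400 error when the field is absent, which is handy while poking at the endpoint by hand.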
A quick test confirmed that I could pass in form values and respond with them. With those wins under my belt, I was ready to tackle the job of searching through the transcript.
Searching the Podcast Transcript
To make sure I'm comparing apples to apples, I needed some basic text normalization. The text_normalize function lowercases everything, removes common punctuation, removes unnecessary whitespace, and flattens the string to ASCII.
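One way to write a function with that behavior, using only the standard library, is sketched here. It follows the description above but is not necessarily how the original text_normalize was implemented; in particular, "common punctuation" is assumed to mean the ASCII punctuation in string.punctuation.

```python
import string
import unicodedata


def text_normalize(text):
    """Lowercase, strip punctuation, collapse whitespace, and flatten to ASCII."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    )
    # Lowercase and remove ASCII punctuation characters.
    no_punct = ascii_text.lower().translate(str.maketrans("", "", string.punctuation))
    # Collapse runs of whitespace and trim both ends.
    return " ".join(no_punct.split())


print(text_normalize("  Héllo,   World!  "))
```

Note that apostrophes are removed too, so "don't" becomes "dont" on both sides of the comparison.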
Once I knew I could compare strings relatively well, it was time to look through the transcript to find a search phrase. All the magic of the search engine takes place in the search function. First, it normalizes the phrase I'm searching for and then looks through all the words in the transcript for a match.
For any matches, it creates a string containing the matching word and the five words that precede and follow it. The match and its corresponding phrase are appended to a list called query_results, which is returned at the end.
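The search described above can be sketched as follows. The shape of the transcript (a list of dictionaries with word and start keys), the context parameter, and the sample words are all assumptions made for illustration; the normalizer is a compact stand-in for text_normalize.

```python
import string
import unicodedata


def text_normalize(text):
    """Compact stand-in for the normalizer described earlier."""
    ascii_text = (
        unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    )
    cleaned = ascii_text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(cleaned.split())


def search(phrase, words, context=5):
    """Return every transcript word matching `phrase`, with surrounding context."""
    target = text_normalize(phrase)
    query_results = []
    if not target:
        return query_results
    for i, entry in enumerate(words):
        if text_normalize(entry["word"]) == target:
            # Up to `context` words on either side of the match.
            window = words[max(0, i - context): i + context + 1]
            query_results.append({
                "word": entry["word"],
                "start": entry["start"],
                "phrase": " ".join(w["word"] for w in window),
            })
    return query_results


# Made-up transcript: one word per second.
WORDS = [
    {"word": w, "start": float(i)}
    for i, w in enumerate("we talked about the moon and then the sun came up".split())
]

results = search("Moon!", WORDS)
```

Because each transcript word is compared on its own, this sketch only finds single-word searches; a multi-word phrase would need a sliding window over the word list.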
With the search function ready, it was time to update the POST route of my API. I passed the search phrase submitted in the POST request to my search function and then returned the result.
Just like magic, I could send requests to my API and have it return matches in the podcast. But no one wants to send cURL requests all day. It was time to build the worst user interface for a search engine ever.
Building the Ugliest User Interface
The last step was to build a user interface. Fortunately, since I was building the ugliest search engine, the bar was low on how it looked. In fact, it was a bit of a challenge to not try and improve the interface. 😁
The Search Interface
One of the reasons I chose to use Flask on the back-end was the fact that it supported Jinja2 out of the box. I had never used Jinja2, but when someone mentioned it in our Slack, I noticed how similar it was to Handlebars for JavaScript developers.
My goal was to create one HTML file that could display the search box and results. To separate it from my Python code, I created a new HTML file at templates/index.html. It was very basic, with an H1 tag and a form that would send a POST request back to the same route.
Once the HTML file was in place, I updated the original HTTP GET route to serve it. Because the template expects a search parameter, I needed to pass it an empty string on the initial page load.
A quick flask run in the terminal served up my ugly podcast search engine. To my surprise, it was technically already working. When I entered a search phrase and pressed the 'Search' button, it sent the search phrase to the API, which returned the results as JSON. Of course, that's not what I want it to do in the end, but it was a great feeling to know I was close to the end.
Displaying the Search Results
While a JSON response would be pretty ugly, I was enjoying Jinja2 too much to not build an interface to display the results of the search. After the form in my templates/index.html file, I added an H2 and a UL to list the results. If there is a search phrase, the template shows any results in that list.
Once the template was ready, I needed to update my API to return the HTML. Rather than returning the results as JSON, I returned the output of render_template, passing it the search phrase and the query results.
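Put together, the two routes and the template might look roughly like this. To keep the sketch in a single file it uses Flask's render_template_string with an inline template instead of render_template and templates/index.html; the markup, the sample transcript, and the simplified search function are all stand-ins rather than the original code.

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)

# Inline stand-in for templates/index.html: form first, results only after a search.
INDEX_TEMPLATE = """
<h1>Ugly Podcast Search</h1>
<form method="post">
  <input type="text" name="search" value="{{ search }}">
  <button type="submit">Search</button>
</form>
{% if search %}
<h2>Results for {{ search }}</h2>
<ul>
  {% for result in results %}
  <li>{{ result.phrase }}</li>
  {% else %}
  <li>No matches found.</li>
  {% endfor %}
</ul>
{% endif %}
"""

# Made-up transcript words.
WORDS = [
    {"word": "the", "start": 0.0},
    {"word": "moon", "start": 0.4},
    {"word": "rises", "start": 0.9},
]


def search(phrase):
    """Simplified stand-in for the search function: exact, lowercased word match."""
    target = phrase.strip().lower()
    context = " ".join(w["word"] for w in WORDS)
    return [
        {"word": w["word"], "start": w["start"], "phrase": context}
        for w in WORDS
        if w["word"] == target
    ]


@app.route("/", methods=["GET"])
def index():
    # No search yet, so pass an empty phrase and no results.
    return render_template_string(INDEX_TEMPLATE, search="", results=[])


@app.route("/", methods=["POST"])
def search_post():
    phrase = request.form.get("search", "")
    return render_template_string(
        INDEX_TEMPLATE, search=phrase, results=search(phrase)
    )
```

Passing an empty search string on the GET route lets one template serve both states: the if block hides the results section until a phrase has been submitted.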
There you have it. Searching works and shows every place where a word was spoken. The phrases are a nice touch because they give context to what is being said at that moment. That should be the end, right? Oh no. I'm nothing if not a little extra. It was time to add a little pizzazz.
Getting a Little Fancy
We're searching through podcasts. By their nature, they are meant for audio. While I could have stopped by showing the phrase the user was looking for, I thought it would be cooler if we could play that section of audio. I started by adding an audio player to the HTML file with the podcast episode I'm searching through. Users can press play and listen to the podcast if they like, but the real fun will happen once they search.
Next, I updated the result LI elements to include an anchor tag that will call a JavaScript function. (You know I wouldn't get through all this work without using a touch of JavaScript.) When it calls the upcoming seek function, it supplies it with the timestamp of the start of the found word.
Finally, I added a JavaScript function called seek to the head of the page. It expects a timestamp parameter. It grabs the audio player, pauses playback, seeks to the timestamp minus eight-tenths of a second, and plays. Why eight-tenths? I found that it starts the audio a few words before the searched word, so you can better hear the word in context.
Final Results
Overall, I really enjoyed dipping my toes into the Python world. I learned several things that are universal to all languages and I'm excited to learn more. If you want to build this fun, but completely useless project, the full Python and HTML files are below. Enjoy!
Learn more about Deepgram
Sign up for a Deepgram account and get $200 in Free Credit (up to 45,000 minutes), absolutely free. No credit card needed!