The team behind TomScottPlus used Deepgram to analyze YouTube videos in real time and provide contextual overlays with Wikipedia links for further reading. I sat down with Gwendolen Sellers, Harry Langford, Maxwell Pettett, and Tim McGilly to ask them about their project.
Tom is an English YouTuber who mostly makes videos about geography, history, science, technology, and linguistics. His style is 'talk to camera' as he explains various nerdy topics, sometimes with cutaways to other experts explaining a concept.
The team took their inspiration from Tom's YouTube videos, where he shares interesting facts that inspire viewers to learn more. As they talked about learning through YouTube, they all agreed that following up on topics mentioned in a video was cumbersome. They often found themselves pausing a video, opening a browser tab, and searching for a topic to read further. That's how the idea for TomScottPlus was born. TomScottPlus is a Chrome extension that aims to make this as seamless as possible by showing a clickable overlay on the video, with links to relevant Wikipedia articles appearing as topics are mentioned.
When a YouTube video is visited, the Chrome extension sends a request to a Python application which downloads the audio and gets a high-quality transcript using the Deepgram Python SDK and our utterances feature.
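To make the transcription step more concrete, here is a minimal sketch of requesting utterances and reducing the response to timed text segments. It is not the team's code: it calls Deepgram's REST endpoint directly with the standard library rather than going through the Python SDK, and the function names (`build_request`, `extract_utterances`, `transcribe`) and the default `audio/mpeg` content type are assumptions made for illustration.

```python
import json
import urllib.request

# Pre-recorded transcription endpoint, with utterances switched on.
DEEPGRAM_URL = "https://api.deepgram.com/v1/listen?utterances=true"


def build_request(audio_bytes, api_key, mimetype="audio/mpeg"):
    """Build the POST request carrying raw audio (hypothetical helper)."""
    return urllib.request.Request(
        DEEPGRAM_URL,
        data=audio_bytes,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": mimetype,
        },
        method="POST",
    )


def extract_utterances(response):
    """Keep only the timing and text of each utterance in a response dict."""
    utterances = response.get("results", {}).get("utterances", [])
    return [
        {"start": u["start"], "end": u["end"], "text": u["transcript"]}
        for u in utterances
    ]


def transcribe(audio_bytes, api_key):
    """Send audio to Deepgram and return timed utterances (needs network)."""
    with urllib.request.urlopen(build_request(audio_bytes, api_key)) as resp:
        return extract_utterances(json.load(resp))
```

The start and end times on each utterance are what would let an overlay appear at the moment a topic is spoken.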
The Python application then performs basic Natural Language Processing to find contextually relevant words and looks up matching data points on Wikipedia (this takes several API requests, which makes it quite computationally expensive even with batching). Data points are filtered by relevance and returned to the Chrome extension, which displays them over the video.
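The lookup step could be sketched roughly as follows. This is an illustration under stated assumptions, not the team's implementation: the capitalized-phrase heuristic in `candidate_terms` is a crude stand-in for their NLP, the small `STOPWORDS` set is made up for the example, and `batched_query_urls` is a hypothetical helper. It relies on the MediaWiki `action=query` API, which accepts up to 50 pipe-separated titles per request for ordinary clients.

```python
import re
from urllib.parse import urlencode

WIKI_API = "https://en.wikipedia.org/w/api.php"

# Tiny illustrative stopword list (assumption, not from the project).
STOPWORDS = {"the", "a", "an", "this", "that", "it", "in", "on", "and", "but"}


def candidate_terms(text):
    """Return capitalized phrases in order of first appearance."""
    terms = []
    for run in re.findall(r"\b[A-Z][a-z]+(?:\s[A-Z][a-z]+)*", text):
        words = run.split()
        # Drop sentence-initial words such as "The" from the front of a run.
        while words and words[0].lower() in STOPWORDS:
            words.pop(0)
        term = " ".join(words)
        if term and term not in terms:
            terms.append(term)
    return terms


def batched_query_urls(titles, batch_size=50):
    """Build one Wikipedia query URL per batch of titles."""
    urls = []
    for i in range(0, len(titles), batch_size):
        params = {
            "action": "query",
            "format": "json",
            "prop": "info",
            "inprop": "url",  # asks for each page's full URL
            "titles": "|".join(titles[i:i + batch_size]),
        }
        urls.append(WIKI_API + "?" + urlencode(params))
    return urls


terms = candidate_terms("The Eiffel Tower is in Paris. The tower is tall.")
urls = batched_query_urls(terms)
```

Grouping titles this way cuts the number of round trips, although a long video can still produce enough candidate terms to need several requests.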
You can check out the code for this project on GitHub.
If you have any feedback about this post, or anything else around Deepgram, we'd love to hear from you. Please let us know in our GitHub discussions.