Fifteen percent of you are watching this video with subtitles enabled. But if you talk to me in real life, there isn’t a subtitle in sight. This will not stand. The planet might be boiling, cancer might be uncured, but I chose to build a hoodie that subtitles my speech in real time. Let’s hook some AI to a wearable Pi and see how yours truly transformed himself into the world’s most narcissistic Teletubby. Ladies, gentlemen, cyborgs, and Mark Zuckerberg’s virtualized connectome, welcome to Voidstarlab. I don’t hone in on project ideas right from the gecko. For all intensive purposes, I come up with my ideas before the time is ripe to build them. I made myself a dumping ground for the project ideas too zesty to discard, but too * to use right away. And I call it big * list of * ideas. It’s a gray area of *, which technically makes it a beige area. And it’s where ideas can ferment until some X factor transforms them from half-assed to fully assed.
This little nugget popped out of a chance encounter that only happened because I am a noodle-armed soy boy. When I built the Snapmaker, I needed something to put it on, so I went to Home Depot to grab myself a nice solid workbench. Shortly after buying the workbench, I realized it was too solid. It didn’t fold or break down or anything. I tracked down a real man and tried to explain the situation, but he gestured to this little badge on his apron that read “hard of hearing.” My face was diapered and my American Sign Language is mostly profanity, so we were forced to communicate via eyebrows alone. The whole encounter made me realize how much it must suck when you rely on reading lips, but everyone’s either covering their mouth or a Karen. I mean, like, they are a Karen, not like they’re wearing a Karen on their face like a mask. Although, if I were to hypothetically mount a display at nip level, add a mic and some speech-recognizing AI, then everyone, no matter who they are, could receive my every word, whether they want it or not. But that would involve writing a lot of code, and I hate writing code, especially a lot of it. Onto the video list it went.
A few months later, a startup named Deepgram reached out and commissioned the royal us to showcase their technology by building it into a silly project. Deepgram is a cloud-hosted speech-to-text service that provides two APIs, one for prerecorded audio, one for streaming. You can upload an audio file in almost any format and Deepgram’s typewriter monkeys will transcribe it. Or you can stream audio as it arrives and get a continuous conversion. I am going to be completely honest. I mixed them up with a different startup, and by the time I realized that I was working with the speech-to-text startup and not the 3D scanning startup, I’d already signed a contract. This isn’t even the first time this has happened. I once screwed up a hackathon project because I mistook the client for, and I am being absolutely serious here, their identical twin.
I’m drinking Stranahan’s in the bath. Don’t judge me. And I realized this idea had ripened like a fine cheese. It’s a process I call parmesanovation. See, Deepgram already wrote the hardest part of this idea, the speech recognition. As for putting electronics in a hoodie, I struggle to not put electronics in my hoodies. The points I’m making here are: A, ideas can wait till the right time to build them. B, I came up with this idea well before I got in touch with Deepgram, so I’m not a shill. And C, local single malt is a business expense. This stretch bar display is a leftover from the Data Blaster cyberdeck, and it is the perfect aspect ratio for dropping a sentence or two atop my man cleavage. For input, we’re gonna use a lavalier mic from the early days of the channel. It’s tiny, easy to hide, and it’s got a directional pickup, so there’s less interference from other people in the conversation, if I ever give them an opening.
The brains shall be the Raspberry Pi 3B+. It’s got onboard graphics, WiFi, and Python. And most importantly, I’ve already used them in like a brazilion products. Projects. Blah. I grabbed myself a hoodie at Goodwill. It doesn’t need to be anything special, it just needs to fit a little more snugly than usual so the display doesn’t flop around. Also, Raspberry Pis do not have an audio input, so I hit up Micro Center for a Codec Zero, a shield-style sound card with pirst-farty support on PiOS. I gotta shout out Micro Center for not only sponsoring a lot of the parts for this project, but also for providing VoidStarLab with a DJI RSC2 gimbal so we can get less nauseating handheld shots. Micro Center, if you’re not aware, is the last meatspace electronics retailer on Earth, and you can restock your filament, grab an individually addressable RGB pro gamer power cable, and impulse buy a two thousand dollar graphics card in person after touching it and reading the text on the back of the box. Porch pirates can’t swashbuckle your little Raspberry Pi gaming PC cooling tower if you bring it home yourself. Micro Center allowed me to wear this project to their store and film consenting customers and goldbricking employees. But let’s not lick the wrong end of the ice cream cone. We have to build the project first.
I jumped straight into modeling and printing enclosures. I figured there’s a higher chance that the electronics are gonna short out on desktop clutter than of me switching them. After all, if I have to order parts, this project is gonna take forever. These oversized holes are designed for special fabric fasteners called Chicago nuts. More on deez nuts later. After a little screwing, fnar fnar, the project looked feature complete. I talked about this in my Finish Your Projects video: sequencing the project this way gives us an emergency exit. We can end the project right here. I can just hang this all around my neck with a string, display some random gibberish on it, and I have something to show. Don’t put yourself in a position where you have to finish the whole project or you got nothing. This is the first project in a very long time that was just a bunch of off-the-shelf products being used for their intended purposes. This thing came together way too easily and it is creeping me out. My comfort zone is barely contained existential dread, so let’s return to it by writing some code.
Warning. Following section contains programming. Viewers who find computer words boring are advised to skip to the next chapter. Unless said viewers also like watching my sanity visibly decrease with each passing minute, in which case, grab a box of Junior Mints and dim the lights.
I like to start a project’s code by writing separate tests for each component, ideally in Python, because it’s the only language that’s better than Widget Workshop.
I installed PyAudio and confirmed the Codec Zero can, you know, work. I found myself an old-school on-screen display font, and I made sure the Pygame graphics library could do the job. Finally, I loaded up Deepgram’s Python SDK, provisioned myself an API key, and I tested that too. Their example code transcribes BBC Radio à la minute, which I gotta say is a pretty clever source of demo dialogue, even if those Brits don’t know how to spell the word dialogue. Once we have known working code running on known working hardware, we can say “now kiss,” combine our scripts, stir in some business logic, and watch everything go to hell in a controlled manner.
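One small piece of that business logic is pulling the text out of each transcription result so it can be drawn on the display. The sketch below is not the code from the video. It assumes each streaming result arrives as a JSON message with the transcript under `channel.alternatives[0].transcript` and an `is_final` flag, and the function name `transcript_from_message` is made up for illustration.

```python
import json


def transcript_from_message(raw):
    """Return (text, is_final) from one streaming result, or None if it has no speech.

    Assumed message shape (not taken from the video):
    {"is_final": bool, "channel": {"alternatives": [{"transcript": str}]}}
    """
    msg = json.loads(raw)
    alternatives = msg.get("channel", {}).get("alternatives", [])
    if not alternatives:
        # Messages without a "channel" block (e.g. metadata) carry no text.
        return None
    text = alternatives[0].get("transcript", "").strip()
    if not text:
        # Silence produces empty transcripts; skip them.
        return None
    return text, bool(msg.get("is_final", False))
```

Testing this on canned JSON before plugging in the microphone fits the test-each-component approach: the parser can be checked without any hardware attached.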
For this project, I am the content. So instead of streaming audio from the web, we gotta stream audio from the face. I’ve never actually done any audio input on a Pi before, so I expected the worst, but it turned out to actually be the easiest part of the project. The PyAudio module lets you access the audio driver, and that uncorks a torrent of raw wave data. We dump that straight into Deepgram’s API, zero processing required. It’s like putting the end of the toilet paper roll in the toilet and flushing it. Parents, that is what you get for letting your kids watch my show. I’m not a babysitter.
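The mic-to-cloud plumbing described here can be sketched roughly as follows. This is a minimal illustration, not the code from the video. The sample rate, chunk size, and the helper names `listen_url`, `open_mic`, and `pump` are all assumptions, and the `send` callable stands in for whatever websocket or SDK connection actually carries the audio to Deepgram.

```python
from urllib.parse import urlencode

SAMPLE_RATE = 16000   # assumed; use whatever rate the sound card supports
CHUNK_FRAMES = 1024   # assumed number of frames per read


def listen_url(sample_rate=SAMPLE_RATE, channels=1):
    """Build a streaming URL that declares raw 16-bit PCM audio."""
    params = {"encoding": "linear16", "sample_rate": sample_rate, "channels": channels}
    return "wss://api.deepgram.com/v1/listen?" + urlencode(params)


def open_mic(sample_rate=SAMPLE_RATE, channels=1):
    """Open the default input device with PyAudio and return a read() callable."""
    import pyaudio  # imported lazily so the rest of the sketch runs without audio hardware

    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paInt16, channels=channels, rate=sample_rate,
                     input=True, frames_per_buffer=CHUNK_FRAMES)
    # Each call blocks until CHUNK_FRAMES frames of raw PCM bytes are available.
    return lambda: stream.read(CHUNK_FRAMES, exception_on_overflow=False)


def pump(read_chunk, send, max_chunks=None):
    """Forward raw chunks from read_chunk() to send() with zero processing."""
    sent = 0
    while max_chunks is None or sent < max_chunks:
        chunk = read_chunk()
        if not chunk:
            break  # an empty read ends the stream
        send(chunk)
        sent += 1
    return sent
```

On the Pi this would look something like `pump(open_mic(), connection.send)`, where `connection` is a hypothetical open websocket to `listen_url()` authenticated with the API key. Keeping `pump` independent of both PyAudio and the network means it can be exercised with fake data first.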