How I coded a website using only my voice

Developer Filip uses Deepgram to build a website with his voice.

TRANSCRIPT

I can do that. What’s up, everyone, and welcome to the New Year. I really hope you guys made New Year’s resolutions and you’re following along with them quite well. If for some reason you didn’t manage to, welcome to the club.

Now, what I have gotten in the New Year is an extremely huge boost in confidence as a web developer when it comes to developing websites and testing my algorithmic skills. And obviously, when I saw that post where someone asked, could you actually build a website with your voice? I said, yeah, I can. We’re going to dive into this project and actually try to implement something that works as a proof of concept.

Now, of course, I did my research. I made sure that no other service and no other competition is out there that actually does a similar thing. And after billions of Google searches, I haven’t actually found anything in particular that would kind of tickle me and tell me, yeah, you shouldn’t do that because obviously they do it better.

Of course, there are a lot of other things that need to be done to make sure that this project actually works. And for that, we need to do a bit of planning. So I suggest we don’t waste any more time and we jump right into it.

Hello, moderator Filip here. Look at that new beautiful MacBook. Anyway, it’s important to know the plan for this project, so let me give you a super fast rundown. Now, I almost wanted to kill myself while trying to research how to do this and how to do that, and how to implement a website that uses voice this way and that way, because Stack Overflow told me this and some other website told me that, and I was getting absolutely furious. But the most important thing is that I knew that in order to start this project, all my faith was in Deepgram, and Deepgram was an absolute godsend.

Now, no surprise I found it hard. With a brain smaller than a walnut, it’s hard to achieve anything in life, especially a website that supposedly is going to work with the use of one’s voice. So how about you just start packing a box out of...

So here we are, a few days later and countless hours of me trying to educate myself on how Deepgram works, to make sure that I’ve understood all the concepts, and to make sure that I write some basic code that will actually allow us to interact with Deepgram and get some sort of result. That is the cornerstone of this whole project. So I think the best thing to do is just for me to show you the code, explain what we did, and go from there. So let’s go.

Now, to get straight to it and not bore you too much, let me just race through the whole code and make sure that you get everything in a very quick time and you understand how incredibly fantastic Deepgram actually is. Now a quick fun fact: Deepgram is used by the International Space Station for their space-to-ground communication, and I will also be working with Deepgram to use my voice to hopefully build a website. Now, before we even get started with sending the request to Deepgram, we need to capture our voice, and in order to do that, we need to make use of the Media Capture and Streams API. Rather than giving you the ridiculously overcomplicated and stupid definition, let me put it into my own words.

It’s an API that allows you to capture video and audio from web browsers. Yep. That’s it. So obviously, we need to get that working. Write the code and test.

Now, before we dive into the code, I want to show you something totally unnecessary that just made me really excited, and I had to do it: a configuration of my development environment in webpack. I fetch my IP address that’s private to my local network, so then I can host my local application on a specific IP, meaning that I can simultaneously access it in my browser and on my phone and see the application hot reload on both. So let me show you. Now, if we just launch our local application with npm run dev, what you’ll see is that the website opens up in the browser, and the address of that website is the local IP that has been assigned for our local project. Now, I can go on to the web and I can access this IP from my phone. Well, I can make anything.
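The local-IP trick described above could look something like this. It is only a sketch, not the configuration from the video: `getLocalIPv4` and `devServerConfig` are made-up names, the port is arbitrary, and it assumes webpack-dev-server is what serves the project when `npm run dev` runs.

```javascript
// Sketch: find the machine's private IPv4 address so the dev server can bind to it.
const os = require("os");

// Returns the first non-internal IPv4 address, or "localhost" if none is found.
// `interfaces` is a parameter (defaulting to the real interfaces) so the
// function can also be tried out with fake data.
function getLocalIPv4(interfaces = os.networkInterfaces()) {
  for (const addresses of Object.values(interfaces)) {
    for (const addr of addresses || []) {
      // Most Node versions report family as the string "IPv4"; a few report the number 4.
      const isIPv4 = addr.family === "IPv4" || addr.family === 4;
      if (isIPv4 && !addr.internal) return addr.address;
    }
  }
  return "localhost";
}

// Hypothetical webpack.config.js fragment: serve on the local IP with hot reload,
// so a phone on the same network can open the same address.
const devServerConfig = {
  devServer: {
    host: getLocalIPv4(),
    port: 8080, // assumed port
    hot: true,
    open: true,
  },
};
```

With something like this in place, the browser on the laptop and the browser on the phone both load the same address and both pick up hot reloads.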

Hello. This is Filip. Nothing happens. Nothing happens. Watch the phone. Watch the phone. Bam. It’s there. Oh, it makes me so excited. It makes me so excited. I’m so excited. I just can’t hide it. Hold on. Hold on. We need to take this excitement somewhere else.

Yes. I am in a different location and I have my code set up for the web browser to capture my voice. So let’s run the actual project with npm run dev. And yes, success, it works. It actually is requesting to use the microphone, meaning it wants to listen to our voice. So I’m just going to allow.

Now that we have our media stream set up, we must provide a media recorder, which will prepare the captured data. This is what it looks like. And once it’s available, we can emit that data. Here is where the beautiful functionality of Deepgram comes in. To send the prepared data, we need to open a secure WebSocket connection to establish the connection with Deepgram. Once that is complete, we magically get four WebSocket event triggers, so we can obviously control the events.
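Roughly, the capture-and-send step could be wired up like this in the browser. This is a sketch and not the code from the video: `startStreaming` and `shouldSend` are made-up names, the 250 ms chunk size is an arbitrary choice, and the endpoint and the `["token", apiKey]` subprotocol are assumptions based on Deepgram's browser examples.

```javascript
// Assumed endpoint for Deepgram's live transcription.
const DEEPGRAM_URL = "wss://api.deepgram.com/v1/listen";

// A chunk is worth sending only if it holds data and the socket is open
// (readyState 1 means OPEN for WebSocket).
function shouldSend(chunk, socket) {
  return Boolean(chunk) && chunk.size > 0 && socket.readyState === 1;
}

// Browser-only: asks for the microphone, records it, and streams chunks to Deepgram.
async function startStreaming(apiKey, onMessage) {
  // Media Capture and Streams API: prompts the user for microphone access.
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  // MediaRecorder prepares the captured audio as chunks.
  const recorder = new MediaRecorder(stream);
  // Secure WebSocket connection; the key is passed as a subprotocol (assumption).
  const socket = new WebSocket(DEEPGRAM_URL, ["token", apiKey]);

  // The four WebSocket event triggers: open, message, close, error.
  socket.onopen = () => {
    recorder.addEventListener("dataavailable", (event) => {
      if (shouldSend(event.data, socket)) socket.send(event.data);
    });
    recorder.start(250); // emit a chunk every 250 ms
  };
  socket.onmessage = onMessage;
  socket.onclose = () => {
    if (recorder.state !== "inactive") recorder.stop();
  };
  socket.onerror = (error) => console.error("WebSocket error", error);

  return { recorder, socket };
}
```

Putting an API key straight into browser code exposes it to anyone who opens the page, so outside of a local experiment a short-lived key handed out by a server is the safer route.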

Now, in our open WebSocket event, we need to send the data to Deepgram. And in our on-message socket event, we need to handle that data and work only with data that’s returned as final from Deepgram, to ensure accuracy. Then we can add a cheeky little title tag with an ID, reference that ID in our JavaScript, append the transcript, and here is the result. And now I’m going to say something into the microphone. Yes. What you can see happening is that the audio that’s being captured from the web browser is being sent over to Deepgram. Deepgram is sending it back to us, and I’m taking the transcript and injecting it directly into the HTML page. Now how cool is that?
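The "only work with final data" part of the message handler could be sketched like this. The function names are made up, and the message shape (`is_final` plus `channel.alternatives[0].transcript`) is an assumption about what Deepgram's live responses look like, so check it against the actual payload.

```javascript
// Returns the transcript text if the message is marked final and non-empty,
// otherwise null. `rawMessage` is the string received in the socket's message event.
function extractFinalTranscript(rawMessage) {
  let data;
  try {
    data = JSON.parse(rawMessage);
  } catch {
    return null; // not JSON, ignore it
  }
  if (!data || !data.is_final) return null; // skip interim results
  const alternative =
    data.channel && data.channel.alternatives && data.channel.alternatives[0];
  const transcript =
    alternative && typeof alternative.transcript === "string"
      ? alternative.transcript.trim()
      : "";
  return transcript.length > 0 ? transcript : null;
}

// Browser-only: append the transcript to the element with the given ID
// (for example the title tag mentioned above).
function appendTranscript(elementId, transcript) {
  const element = document.getElementById(elementId);
  element.textContent += transcript + " ";
}
```

In the message event this would be used as `const text = extractFinalTranscript(event.data); if (text) appendTranscript("title", text);`, where `"title"` is a placeholder ID.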

Oh, also, make sure to like and subscribe if you’re enjoying this video.

Now that we are at the point where our voice is captured and we get a transcript back of what we said, we can go ahead and manipulate that transcript by first converting it into an ordered array of keywords, to then be able to use those keywords in a given order to categorize and subcategorize events and actions. The top level will be action keywords such as add, delete, modify, save, structure, and so on. These can be followed by keywords describing elements, such as button, title, paragraph, input, or keywords that add more value to the actions, such as all, at index. Finally, for now at least, to keep it simple, subcategory keywords such as name, placeholder, default. This list will keep expanding, but we need to prove the premise first.
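To make the keyword idea concrete, here is a minimal sketch of turning a transcript into an ordered array of words and tagging each one with a category. The keyword lists come from the ones named above; the function names, the category labels, and treating "at index" as the single keyword `index` are assumptions made for illustration.

```javascript
const ACTIONS = ["add", "delete", "modify", "save", "structure"];
const ELEMENTS = ["button", "title", "paragraph", "input"];
const MODIFIERS = ["all", "index"]; // "at index" reduced to "index" (assumption)
const SUBCATEGORIES = ["name", "placeholder", "default"];

// Lowercases the transcript, strips punctuation, and splits it into ordered words.
function toKeywords(transcript) {
  return transcript
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .split(/\s+/)
    .filter(Boolean);
}

// Tags every word with its category, keeping the original order.
function categorize(words) {
  return words.map((word) => {
    if (ACTIONS.includes(word)) return { word, category: "action" };
    if (ELEMENTS.includes(word)) return { word, category: "element" };
    if (MODIFIERS.includes(word)) return { word, category: "modifier" };
    if (SUBCATEGORIES.includes(word)) return { word, category: "subcategory" };
    return { word, category: "other" };
  });
}
```

For example, `categorize(toKeywords("Add a button with the name Submit."))` tags `add` as an action, `button` as an element, `name` as a subcategory, and everything else as `other`.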

So now onto the fun part. Let’s code all of this up and see how much of a fail this will be. Oh my god. That’s right. What can I say, ladies and gentlemen? Coding ain’t easy when there is nothing on Stack Overflow. It’s actually what causes you to start thinking. And here I am, thinking, thinking a lot, too much, one would say. And it brought me pain. A lot of pain. But somehow, finally, I had something, something that worked, something that I was willing to share with the world. May the UI gods be with me. Add title of the name, we are going to create a simple login screen. Add a paragraph of the name, enter username. Add an input. Add a paragraph with the name, enter password. Add an input. Add a break. Add a break. Add a button with the name submit. There you go.

Now, if you guys wanna see and make sure that it actually is in the DOM, I can prove that to you. If we go onto the console, we’ll just clear everything, and I’ll say the keyword structure. And right there, you can see an HTMLCollection with all the elements that we have added to the webpage. Now, if I don’t like what I have on the screen and I just wanna start from scratch, all I need to do is just say, delete everything. And there we go. Now, the fun doesn’t end here. Every single element that I add to the page has a specific ID associated with it, meaning that we can target that element specifically. So to give you an example, I can say: add a button of the name Filip. Add a button of the name subscribe. Add button of the name walnut. Add a button of the name peanuts. Delete the button with the ID subscribe. There you go. Delete button with the ID walnut. And there you go.
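The add, delete, and structure commands from the demo can be modeled without a browser. What follows is a sketch and not the project's actual code: `createPage` and `run` are made-up names, the page is a plain array instead of the DOM, and deriving each ID from the spoken name is an assumption based on how the delete commands in the demo refer to elements.

```javascript
const ELEMENT_TYPES = ["button", "title", "paragraph", "input", "break"];

// Creates an in-memory page that understands a few spoken commands.
function createPage() {
  const elements = []; // ordered list of { type, id, name }
  const snapshot = () => elements.map((element) => ({ ...element }));

  function run(command) {
    const words = command
      .toLowerCase()
      .replace(/[^a-z0-9\s]/g, " ")
      .split(/\s+/)
      .filter(Boolean);
    const action = words[0];

    if (action === "delete") {
      if (words.includes("everything") || words.includes("all")) {
        elements.length = 0; // start from scratch
        return snapshot();
      }
      // "delete the button with the id subscribe": the ID is what follows "id".
      const idAt = words.indexOf("id");
      const id = idAt >= 0 ? words.slice(idAt + 1).join("-") : "";
      const position = elements.findIndex((element) => element.id === id);
      if (position >= 0) elements.splice(position, 1);
      return snapshot();
    }

    if (action === "add") {
      // The element type is looked for before "name"; the name is what follows it.
      const nameAt = words.indexOf("name");
      const head = nameAt >= 0 ? words.slice(0, nameAt) : words;
      const type = head.find((word) => ELEMENT_TYPES.includes(word));
      if (!type) return snapshot(); // unknown element, ignore the command
      const name = nameAt >= 0 ? words.slice(nameAt + 1).join(" ") : "";
      // Assumed ID scheme: the name with dashes, or type plus position if unnamed.
      const id = name ? name.replace(/\s+/g, "-") : `${type}-${elements.length}`;
      elements.push({ type, id, name });
      return snapshot();
    }

    // "structure" and anything unrecognized just report the current elements.
    return snapshot();
  }

  return { run };
}
```

Replaying the button commands from the demo against `createPage()` and then running `structure` leaves only the `filip` and `peanuts` buttons. In the browser, the same parsing would drive `document.createElement` and `element.remove()` instead of an array.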

So you can see, whatever is in the DOM, we can actually target specifically with the delete command and do certain things to it, because it has a unique identifier. Now, this project doesn’t end here. This is just the beginning of what we’re going to make out of it. There’s going to be so much more to come in part two. That’s why you definitely want to subscribe. So in the next part, we’re going to do some crazy things, or maybe I have already done so. Who knows? It’s for you to find out and wait for the next video. But some of those crazy things are going to be like positioning stuff, and finally being able to use our voice to position things in the center of a div, horizontally and vertically. The problem that developers always struggle with, we’re just going to solve with one voice command.

The other thing we’re going to work on is styling. We’re going to give each of the elements styling. We’re going to be able to make them pretty and actually make those websites come to life. And the final part is I’m going to use my phone: speak into my phone, for it to pick up my voice and make the website that way, meaning that I can be a super chill developer and just talk to my phone and get all the work done. So those are the things that are coming up in the next video. That’s why you should make sure you subscribe. Make sure you like the video, and as always, I’ll see you in the next video. And in this case, in part two. Peace.
