This is the final part in a series on P5.js (from here 'P5') - a creative coding library that makes working with the Canvas API much easier. In part one, we covered how to draw elements on the screen and react to keyboard and mouse input. We learned how to create common game features in part two - collision detection, entity management, and state management.
In today's tutorial, we'll bring together everything we know to create a voice-controlled game - try the game out now. Every few seconds, a new enemy appears from one of four directions and begins moving towards you. Each direction has a random word associated with it, and if you say that word correctly, a bullet will fly in that direction. If an enemy reaches you, the game is over.
The final code for today's project can be found on GitHub.
Before We Start
You will need a Deepgram API Key - get one here.
Setting Up State
On your computer, create a new directory and open it in your code editor. Create an index.html file and add the following to it:
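A minimal sketch of the script that could sit inside index.html, assuming p5.js has already been loaded by an earlier <script> tag. The variable names (playerSize, score, gameOver), the 1000 x 1000 canvas, and the colors and text positions are assumptions for illustration:

```javascript
// Global variable section (names and values are illustrative assumptions)
let playerSize = 50
let score = 0
let gameOver = false

function setup() {
  createCanvas(1000, 1000) // assumed canvas size
  frameRate(30)
}

function draw() {
  background('black')
  // Move the origin (0, 0) from the top-left corner to the center
  translate(width / 2, height / 2)

  // Score in the bottom-right corner
  fill('white')
  textSize(24)
  textAlign(RIGHT)
  text(`Score: ${score}`, width / 2 - 20, height / 2 - 20)

  if (!gameOver) {
    // The player sits at the new origin
    circle(0, 0, playerSize)

    // Game logic section: enemy, bullet, and word code goes here
  } else {
    textSize(36)
    textAlign(CENTER)
    text(`Game over! Score: ${score}`, 0, 0)
  }
}
```

The comments mark a global variable section and a game logic section, which are the two places later steps add code to.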
In the second post in this series, you learned how to keep score and show a game over screen - we are using both approaches here.
The only new thing here is translate(width/2, height/2), which moves the origin (0, 0) to the center of the canvas. On a 1000 x 1000 canvas, this means the top-left is now (-500, -500), and the bottom-right is (500, 500). It makes sense to do this when entities often need to refer to the center position.
Create Enemies
At the bottom of your <script>, create a new Enemy class:
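One way such a class could look, as a sketch rather than the definitive implementation. The enemy size and color are assumptions, and playerSize and gameOver are assumed to already exist in your global variable section:

```javascript
// Assumed to already exist in the global variable section;
// repeated here only so the snippet stands alone.
let playerSize = 50
let gameOver = false

class Enemy {
  constructor(direction, distance) {
    this.direction = direction
    this.size = 25 // illustrative size
    this.x = 0
    this.y = 0

    // Place the enemy `distance` pixels from the center along one axis
    if (direction === 'UP') this.y = -distance
    if (direction === 'DOWN') this.y = distance
    if (direction === 'LEFT') this.x = -distance
    if (direction === 'RIGHT') this.x = distance
  }

  move() {
    // Step one pixel towards the center (0, 0)
    if (this.direction === 'UP') this.y++
    if (this.direction === 'DOWN') this.y--
    if (this.direction === 'LEFT') this.x++
    if (this.direction === 'RIGHT') this.x--
  }

  touchedPlayer() {
    // Approximate check: treats the enemy as a circle of diameter `size`
    const d = dist(this.x, this.y, 0, 0)
    if (d < playerSize / 2 + this.size / 2) gameOver = true
  }

  show() {
    fill('rgba(255, 0, 0, 0.5)')
    rect(this.x, this.y, this.size) // three arguments draw a square
  }
}
```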
When an instance is created, you must provide two arguments: direction, which is one of 'UP', 'DOWN', 'LEFT', or 'RIGHT', and distance, which dictates how far away from the center point the enemy should spawn.
In the constructor, the enemies are initially placed, and in move() they move one pixel closer to the center. touchedPlayer() uses collision detection -- we learned about that last week -- to set gameOver to true if an enemy touches the player in the center of the canvas. Finally, the enemy is drawn at its new (x, y) position.
In your global variable section, add these lines:
At the bottom of your setup() function, begin spawning enemies randomly every 2-5 seconds:
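A sketch of how the spawning could work, assuming the Enemy class above and two global variables named directions and enemies. The startSpawning() helper and the width / 4 spawn distance are hypothetical; the body could equally be placed directly at the bottom of setup():

```javascript
// Assumed global variable section entries
let directions = ['UP', 'DOWN', 'LEFT', 'RIGHT']
let enemies = []

// Hypothetical helper: call it once at the bottom of setup()
function startSpawning() {
  // p5's random(min, max) returns a number between min and max,
  // so this picks a delay between 2 and 5 seconds
  const delay = random(2000, 5000)

  return setInterval(() => {
    // random(array) returns one element of the array
    enemies.push(new Enemy(random(directions), width / 4))
  }, delay)
}
```

Note that setInterval() fixes the delay once, when it is called. If you want a fresh 2-5 second gap before every enemy, you could reschedule with setTimeout() after each spawn instead.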
The first argument will be randomly chosen from the directions array you just created. The final step is to loop through all existing enemies and run their methods in draw(). In your game logic section, add this code:
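The loop itself can be sketched like this. The updateEnemies() wrapper is a hypothetical helper that keeps the snippet self-contained; in the layout described here, the loop would sit directly in the game logic section of draw():

```javascript
// Assumed to already exist in the global variable section
let enemies = []

// Hypothetical helper wrapping the loop
function updateEnemies() {
  for (const enemy of enemies) {
    enemy.move() // step towards the player
    enemy.touchedPlayer() // may set gameOver
    enemy.show() // draw at the new position
  }
}
```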
Open index.html in your browser, and it should look like this:
Create Bullets
Currently, there's no way to defend yourself. To fix that, a new bullet will be created whenever the player presses an arrow key, travelling in the direction of that key.
At the bottom of your <script>, create a new Bullet class. It should look familiar as it works largely the same as the Enemy class:
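A possible shape for the class, as a sketch. The size, speed, and color values are assumptions, and enemies and score are assumed to already exist in your global variable section:

```javascript
// Assumed to already exist in the global variable section
let enemies = []
let score = 0

class Bullet {
  constructor(direction) {
    this.direction = direction
    this.size = 5 // illustrative
    this.speed = 6 // illustrative
    this.x = 0
    this.y = 0
    this.spent = false
  }

  move() {
    // Travel away from the center, `speed` pixels per frame
    if (this.direction === 'UP') this.y -= this.speed
    if (this.direction === 'DOWN') this.y += this.speed
    if (this.direction === 'LEFT') this.x -= this.speed
    if (this.direction === 'RIGHT') this.x += this.speed
  }

  touchedEnemy() {
    for (const enemy of enemies) {
      const d = dist(enemy.x, enemy.y, this.x, this.y)
      if (d < this.size / 2 + enemy.size / 2) {
        // Drop the enemy that was hit and mark this bullet as used
        enemies = enemies.filter((e) => e !== enemy)
        this.spent = true
        score++
      }
    }
  }

  show() {
    fill('yellow')
    circle(this.x, this.y, this.size)
  }
}
```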
If an enemy is hit, it is removed from the enemies array, and the bullet's this.spent value becomes true. In the global variable section, add a new array for bullets:
Underneath our enemies loop in draw(), add a loop for bullets:
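The bullet loop could be sketched like this. As with the enemies loop, updateBullets() is a hypothetical wrapper, and the loop body would normally sit directly in draw():

```javascript
// The array added to the global variable section
let bullets = []

// Hypothetical helper wrapping the loop
function updateBullets() {
  for (const bullet of bullets) {
    // A spent bullet has already hit an enemy: skip it entirely
    if (bullet.spent) continue
    bullet.move()
    bullet.touchedEnemy()
    bullet.show()
  }
}
```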
If the bullet has been spent, it won't be shown or run its collision detection logic. This means a bullet can only successfully hit an enemy once.
So far, you have used the P5 preload(), setup(), and draw() functions, but there are a host more that are triggered based on user input.
Unlike the keyIsPressed variable which is true every frame that a key is pressed, the built-in keyPressed() function is triggered only once when a user presses a key on their keyboard. In order to trigger the function twice, two distinct presses need to be made - much better for bullet firing. After you end the draw() function, add this:
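A sketch of the handler, assuming the Bullet class and bullets array from the previous steps. It uses P5's keyCode variable together with the built-in arrow key constants:

```javascript
// Assumed to already exist in the global variable section
let bullets = []

// P5 calls keyPressed() once for each distinct key press
function keyPressed() {
  if (keyCode === UP_ARROW) bullets.push(new Bullet('UP'))
  if (keyCode === DOWN_ARROW) bullets.push(new Bullet('DOWN'))
  if (keyCode === LEFT_ARROW) bullets.push(new Bullet('LEFT'))
  if (keyCode === RIGHT_ARROW) bullets.push(new Bullet('RIGHT'))
}
```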
That's the core game finished. Here's how it looks (recording is sped up):
Add Word Prompts
Create a new file called words.js, and copy and paste the content from this file on GitHub. This is a slight reformatting of the repository of over 42,000 English words.
As a note, this word list is long and includes some lengthy, complex words. You may want to experiment with the word selection you use to alter the difficulty.
Just before the <script> tag with our P5 logic, include the words.js file:
Then, in your main <script> tag with our P5 logic, add the following:
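A minimal sketch of such a function, assuming words.js exposes a global array named words. The short array below is only a stand-in so the snippet runs on its own; leave it out when words.js is loaded:

```javascript
// Stand-in for the array that words.js is assumed to define as `words`
const words = ['apple', 'river', 'window', 'yellow']

function randomWord() {
  // Math.random() is in [0, 1), so the index is always within bounds
  return words[Math.floor(Math.random() * words.length)]
}
```

Using words.length rather than a hard-coded count means the function keeps working if you trim the list to change the difficulty.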
This function gets one word at random and returns the string. You can add it anywhere, but I tend to add these utility functions to the very bottom of my <script>.
In your global variable section, store four random words:
Just after your bullet loop in the game logic section, draw the four random words to the canvas:
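One way to draw them, as a sketch. It assumes the four words are stored in an object named currentWords keyed by direction; the showWords() wrapper, the font size, and the 48-pixel margin are all illustrative choices:

```javascript
// Assumed shape of the four stored words, keyed by direction.
// In the game, each value would come from randomWord().
let currentWords = { UP: 'apple', DOWN: 'river', LEFT: 'window', RIGHT: 'yellow' }

// Hypothetical helper; the body could sit right after the bullet loop in draw()
function showWords() {
  fill('white')
  textSize(24)

  // The origin is the canvas center, so the edges are at +/- width/2 and height/2
  textAlign(CENTER)
  text(currentWords.UP, 0, -height / 2 + 48)
  text(currentWords.DOWN, 0, height / 2 - 48)

  textAlign(RIGHT)
  text(currentWords.RIGHT, width / 2 - 48, 0)

  textAlign(LEFT)
  text(currentWords.LEFT, -width / 2 + 48, 0)
}
```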
Finally, in the Bullet.touchedEnemy() function, where we increment the score, replace a word when an enemy is hit:
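The change can be sketched in isolation like this. The onEnemyHit() helper is hypothetical and stands in for the lines inside Bullet.touchedEnemy() that run when a hit is detected; currentWords and the stand-in randomWord() are assumptions so the snippet runs on its own:

```javascript
// Assumed to already exist in your sketch; stand-in values for illustration
let score = 0
let currentWords = { UP: 'apple', DOWN: 'river', LEFT: 'window', RIGHT: 'yellow' }
const randomWord = () => 'lantern' // stand-in for the real randomWord()

// Hypothetical helper: the lines that run when a bullet hits `enemy`
function onEnemyHit(enemy) {
  score++
  // Swap out only the word for the direction the enemy came from
  currentWords[enemy.direction] = randomWord()
}
```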
Shoot Bullets With Your Voice
It's time to create bullets with your voice! A persistent WebSocket connection will be made with Deepgram, allowing Deepgram to constantly listen to your mic to hear what you say.
This part of the tutorial will assume you know how to do live browser transcription with Deepgram. If not, we have a written and video tutorial available that explains every step in more detail.
In your global variable section, create one final value so we can display to the user what was heard:
At the very bottom of your <script>, add this:
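A sketch of how this could be wired up, assuming the Bullet class and bullets array from earlier, the four words stored in an object named currentWords, and a heard variable for the last transcript. The handleMessage() and listen() names are hypothetical, and the whole-word matching is one possible way to do the check:

```javascript
// Assumed to already exist in your sketch; stand-in values for illustration
let bullets = []
let currentWords = { UP: 'apple', DOWN: 'river', LEFT: 'window', RIGHT: 'yellow' }
let heard = ''

// Hypothetical helper: parse one Deepgram message and fire matching bullets
function handleMessage(data) {
  const received = JSON.parse(data)
  const transcript = received.channel?.alternatives?.[0]?.transcript
  // Ignore empty results and interim (not yet final) transcripts
  if (!transcript || !received.is_final) return

  heard = transcript
  // Whole-word match; assumes transcripts arrive without punctuation
  const spoken = transcript.toLowerCase().split(/\s+/)
  for (const direction in currentWords) {
    if (spoken.includes(currentWords[direction].toLowerCase())) {
      bullets.push(new Bullet(direction))
    }
  }
}

// Browser-only: ask for the mic, then stream audio to Deepgram
function listen(apiKey) {
  navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
    if (!MediaRecorder.isTypeSupported('audio/webm')) return alert('Browser not supported')
    const mediaRecorder = new MediaRecorder(stream, { mimeType: 'audio/webm' })
    const socket = new WebSocket('wss://api.deepgram.com/v1/listen', ['token', apiKey])

    socket.onopen = () => {
      mediaRecorder.addEventListener('dataavailable', (event) => {
        if (event.data.size > 0 && socket.readyState === WebSocket.OPEN) socket.send(event.data)
      })
      mediaRecorder.start(250) // emit an audio chunk every 250 ms
    }

    socket.onmessage = (message) => handleMessage(message.data)
  })
}
```

Calling listen('YOUR-DEEPGRAM-API-KEY') at the bottom of the script would start the connection. Keeping the matching logic in its own function makes it easy to try out without a microphone.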
Remember to provide your Deepgram API Key when creating the socket. At the bottom of this code, a check determines whether any of the directional words were heard and, if so, creates a bullet in that direction.
Finally, show the user what was heard just under all of the text() statements in draw():
In Summary
Integrating voice control into this game took very little code, which is a testament to how easy Deepgram's Speech Recognition API is to use.
Once again, a live version of the game can be found here and the final codebase on GitHub.
If you want to deploy your own, I encourage you to also read how to protect your API Key when doing live transcription directly in your browser.
If you have any questions, please feel free to reach out to us on Twitter at @DeepgramDevs.
If you have any feedback about this post, or anything else around Deepgram, we'd love to hear from you. Please let us know in our GitHub discussions.