Back in January, we supported Hack Cambridge -- a 24-hour student hackathon. The team behind Triolingo wanted to help language learners gain confidence in conversation by practicing with a bot. I sat down with Alba Navarro Rosales, Anoushka Kamazumdar, Max Johnson, and Megan Elisabeth Finch to ask them about their project.
"We were inspired by Deepgram's sponsor challenge to create something cool using speech recognition and were excited to see that it supported several foreign languages," the team told me. "Over the past three years, the coronavirus pandemic has had a significant impact on schools, and travel restrictions have limited opportunities for foreign language learning abroad. During this time, the use of online language learning platforms such as Duolingo has soared, but these platforms cannot provide practice for speaking and listening skills. We created Triolingo to cater to this niche, allowing language learners to gain confidence in conversation through practice."
How It Works
Users select a conversation topic and target language, and the Triolingo bot opens the conversation with several topic-appropriate prompts. Users then record a spoken response, which is sent for transcription using the Deepgram Python SDK.
No two chats are the same, because the multilingual chatbot is powered by OpenAI's GPT-3, which responds dynamically to prompts. Finally, responses are spoken back to users using a text-to-speech API.
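The flow described above (speech in, transcript, generated reply, speech out) can be sketched roughly as follows. This is a minimal illustration, not the team's actual code: the class name `ConversationTurnPipeline` and its three callables are hypothetical stand-ins for Deepgram, GPT-3, and the text-to-speech API, and none of them are real SDK calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# A history entry is (speaker, text), e.g. ("user", "hola").
History = List[Tuple[str, str]]


@dataclass
class ConversationTurnPipeline:
    """One user turn: speech -> text -> reply text -> speech.

    The three callables are hypothetical placeholders for the real
    services (speech recognition, language model, text-to-speech).
    """

    transcribe: Callable[[bytes, str], str]       # (audio, language) -> transcript
    generate_reply: Callable[[History, str], str]  # (history, topic) -> reply text
    synthesize: Callable[[str, str], bytes]       # (text, language) -> audio
    language: str = "es"                           # assumed example language
    topic: str = "climate change"                  # one of the topics named in the post
    history: History = field(default_factory=list)

    def take_turn(self, audio: bytes) -> bytes:
        # 1. Transcribe the user's recorded response.
        user_text = self.transcribe(audio, self.language)
        self.history.append(("user", user_text))
        # 2. Generate a reply from the conversation so far.
        reply = self.generate_reply(self.history, self.topic)
        self.history.append(("bot", reply))
        # 3. Speak the reply back to the user.
        return self.synthesize(reply, self.language)
```

Keeping the three services behind plain callables means each one can be swapped for a fake during testing, or replaced with a different provider, without touching the turn logic.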
The team took care to include topics beyond everyday and tourism-focused conversation: prompts covered culture, climate change, and politics.
Hackathon Experience
The Triolingo team had only participated in online hackathons before, so this was a new experience. As a large group of twelve people, they used a random name picker to split into three teams that fit within the event's team size limit. "I've never used a GPT-3 API before, and it was both super cool and very impressive," said Alba.
I asked the team about their experience using Deepgram, and Max said that "the performance was really good and accurate, even with background noise." As their project progressed, they were visited by other teams who had fun trying it out.
Future Development
Given more time, the team would use additional Deepgram functionality, such as the confidence values in a Deepgram response, to assess the user's pronunciation. Our keywords feature could also boost recognition of words related to the conversation topic, further improving the reliability of the speech recognition.
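As a rough sketch of both ideas: the first function below reads per-word confidence values out of a transcription response and lists the words that fall under a threshold, and the second builds a query string that boosts topic keywords. The function names, the 0.8 threshold, and the boost value of 2 are assumptions made for illustration, and the response layout should be checked against the current Deepgram documentation. Low recognition confidence is only a loose proxy for pronunciation, since background noise or rare words can lower it too.

```python
from typing import Any, Dict, List
from urllib.parse import urlencode


def low_confidence_words(response: Dict[str, Any], threshold: float = 0.8) -> List[str]:
    """Return words whose recognition confidence is below `threshold`.

    Assumes the response is a dict with per-word entries under
    results -> channels[0] -> alternatives[0] -> words.
    The threshold is an arbitrary illustrative value.
    """
    alternative = response["results"]["channels"][0]["alternatives"][0]
    words = alternative.get("words", [])
    return [w["word"] for w in words if w["confidence"] < threshold]


def listen_query(language: str, keywords: List[str], boost: int = 2) -> str:
    """Build a query string with one `keywords=word:boost` entry per keyword.

    The boost value is an assumed example, not a recommendation.
    """
    params = [("language", language)]
    params += [("keywords", f"{kw}:{boost}") for kw in keywords]
    return urlencode(params)
```

A learner-facing feature might show the flagged words back to the user and invite them to repeat just those, rather than scoring the whole sentence.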
In terms of user interaction, the team would like to set contextual "challenges" or tasks to complete instead of just conversing without direction. For example, the user could be presented with the scenario, "You are planning to watch a movie with a friend. Decide what movie you're going to watch and when and where you're going to meet." The system would keep track of whether the user and bot had agreed on these three things, and then congratulate the user once the challenge was complete.
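One way the tracking side of such a challenge might be modeled is shown below. This is a hypothetical sketch: the `Challenge` class and its goal names are invented for illustration, and it deliberately leaves out the hard part, which is deciding from the conversation itself that something has been agreed.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Challenge:
    """A scenario plus the things the user and bot must agree on."""

    scenario: str
    # Goal name -> agreed value, or None while still undecided.
    goals: Dict[str, Optional[str]] = field(default_factory=dict)

    def agree(self, goal: str, value: str) -> None:
        """Record that the user and bot settled on `value` for `goal`."""
        if goal not in self.goals:
            raise KeyError(f"unknown goal: {goal}")
        self.goals[goal] = value

    @property
    def completed(self) -> bool:
        # Complete only when there is at least one goal and all are settled.
        return bool(self.goals) and all(v is not None for v in self.goals.values())


movie_night = Challenge(
    scenario="You are planning to watch a movie with a friend.",
    goals={"movie": None, "time": None, "place": None},
)
```

The bot would call `agree` as each detail is settled and congratulate the user once `completed` becomes true.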
There are different grammatical constructs in some languages depending on who you are talking to, such as different pronouns or verb endings. As a final idea, the bot could adopt the appropriate type of language according to the situation.
You can try out a hosted version of Triolingo, and check out the code on GitHub.
If you have any feedback about this post, or anything else around Deepgram, we'd love to hear from you. Please let us know in our GitHub discussions.