NASA uses Deepgram to power the next generation of space tech
The Challenge
Noisy Audio Made Transcription Difficult
If you’ve ever listened to audio from space, you know how hard it can be to clearly understand what’s being said. NASA wanted to use speech-to-text to improve their existing processes and workflows. But even more than that, they wanted to expand into use cases that simply weren’t possible without fast, accurate, automatic transcription of what was being said.
NASA tried all the major speech-to-text providers, and even built their own solution using an open-source tool. But nothing reached the 80% word recognition rate (WRR) needed for the transcripts to be useful, until they tried Deepgram’s speech-to-text API.
Before being selected, Deepgram went head-to-head with the major providers on real NASA audio.
The Solution
Flexible, Tailored Models
Deepgram’s flexibility made it possible to quickly train a new model on the kind of audio it would be transcribing: space-to-ground communications. The resulting model outperformed every option NASA had tried and reached the accuracy threshold they needed for their work.
For NASA, having an accurate, easily deployable speech-to-text (STT) system has been groundbreaking, enabling things that weren’t previously possible. NASA currently uses Deepgram’s speech-to-text API for four different use cases, described below.
Space-to-Ground Communications
When the ISS and Mission Control are communicating, three people write down by hand what is being said to reduce the chance of errors. NASA wanted a fourth set of eyes: an AI system giving input on what’s being said as well. Deepgram used space-to-ground audio to tailor a model for NASA, producing transcripts that are now up to 89.6% accurate.
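To give a concrete sense of the integration, here is a minimal sketch of a pre-recorded transcription request against Deepgram’s /v1/listen endpoint in Python. The file name and custom model ID are placeholders, not NASA’s actual configuration:

```python
# Minimal sketch of a Deepgram pre-recorded transcription request.
# CUSTOM_MODEL_ID is a placeholder; a tailored model like NASA's would be
# referenced by the ID Deepgram assigns when the model is trained.
import requests

DEEPGRAM_API_KEY = "YOUR_API_KEY"  # assumption: set to a real key

with open("space_to_ground.wav", "rb") as audio:  # illustrative file name
    response = requests.post(
        "https://api.deepgram.com/v1/listen",
        params={"model": "CUSTOM_MODEL_ID"},  # placeholder model ID
        headers={
            "Authorization": f"Token {DEEPGRAM_API_KEY}",
            "Content-Type": "audio/wav",
        },
        data=audio,
    )

transcript = response.json()["results"]["channels"][0]["alternatives"][0]["transcript"]
print(transcript)
```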
The transcripts generated by Deepgram will help NASA address some of the most common issues that come up in communications with the ISS. These include otherwise undetected readback errors: when an astronaut reads an instruction back to Mission Control to confirm they’re about to do the right thing, but says the wrong thing, and no one at Mission Control notices.
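As an illustrative sketch (not NASA’s actual system), a readback check could compare the transcript of an instruction against the transcript of the astronaut’s readback and flag divergence for human review:

```python
# Illustrative sketch: flag a possible readback error by comparing the
# transcript of an instruction with the transcript of the readback.
from difflib import SequenceMatcher

def readback_mismatch(instruction: str, readback: str, threshold: float = 0.9) -> bool:
    """Return True if the readback diverges enough to warrant review."""
    similarity = SequenceMatcher(
        None, instruction.lower().split(), readback.lower().split()
    ).ratio()
    return similarity < threshold

instruction = "close valve three on panel alpha"
readback = "close valve two on panel alpha"
print(readback_mismatch(instruction, readback))  # True: worth a human look
```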
Deepgram’s ability to search through audio also makes it possible for NASA to parse historical records and pinpoint specific incidents from previous missions. For example, using Deepgram, NASA was able to search the four days of mission audio from Gemini 4 for the moment when flight controllers commanded James McDivitt to tell Ed White to “get back in!” the spacecraft before it passed into Earth’s shadow.
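Deepgram exposes this kind of lookup through a search query parameter, which returns time-stamped, confidence-scored hits for a phrase. A minimal sketch, with an illustrative file name and phrase:

```python
# Sketch: searching archival mission audio for a phrase with Deepgram's
# `search` parameter. File name and phrase are illustrative.
import requests

DEEPGRAM_API_KEY = "YOUR_API_KEY"  # assumption: set to a real key

with open("gemini4_air_to_ground.wav", "rb") as audio:
    response = requests.post(
        "https://api.deepgram.com/v1/listen",
        params={"search": "get back in"},
        headers={
            "Authorization": f"Token {DEEPGRAM_API_KEY}",
            "Content-Type": "audio/wav",
        },
        data=audio,
    )

# Each hit carries a confidence score and start/end timestamps in seconds.
for result in response.json()["results"]["channels"][0]["search"]:
    for hit in result["hits"]:
        print(f"{hit['start']:.1f}s-{hit['end']:.1f}s conf={hit['confidence']:.2f}")
```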
We found the one piece of data we needed in moments, instead of trying to find it blindly. Give it a listen here.
The Neutral Buoyancy Lab
The Neutral Buoyancy Lab at NASA allows astronauts to train in full spacesuits, using the water to create a close analog to microgravity here on Earth. The audio from these training missions is low quality and noisy, like the audio from space, with the sounds of bubbles and breathing gear degrading it further.
With an accurate ASR model from Deepgram, though, NASA will be able to overcome the noise to create transcripts and search through audio of previous training runs. Deepgram’s latest trained model has achieved ~87% WRR on multiple NASA validation sets, including audio from the Neutral Buoyancy Lab.
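For context, here is one common way WRR can be computed: the fraction of reference words the system gets right after a word-level alignment. Definitions vary, and this sketch is not necessarily the exact metric NASA uses:

```python
# Sketch of one common word recognition rate (WRR) definition:
# WRR = 1 - (word-level edit distance / number of reference words).
def wrr(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # Standard word-level edit distance (Levenshtein) DP table.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution / match
            )
    # WRR reported as the complement of the word error rate.
    return 1.0 - d[len(ref)][len(hyp)] / len(ref)

print(wrr("go for launch", "go for lunch"))  # ~0.67: one word of three wrong
```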
Medical Interactive Response Intelligence System (IRIS) Chatbot
NASA’s Medical Interactive Response Intelligence System (IRIS) is designed to provide guidance during potential medical emergencies on the International Space Station (ISS). Using a chatbot powered by Deepgram, IRIS will be able to field questions from a crew member about the health of another, helping them triage and treat emergent situations.
IRIS has been built to run on a Raspberry Pi with an external NVIDIA GPU powering Deepgram’s speech-to-text, a form factor that will make deployment on the ISS possible in the future.
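In a self-hosted deployment like this, client code can stay essentially the same and simply point at the local instance instead of Deepgram’s cloud. A minimal sketch, with placeholder host, port, and file name:

```python
# Sketch of the on-device pattern: the same /v1/listen-style call, pointed
# at a self-hosted Deepgram instance instead of the cloud. Host, port, and
# auth details here are placeholders, not IRIS's actual configuration.
import requests

LOCAL_DEEPGRAM_URL = "http://localhost:8080/v1/listen"  # placeholder address

with open("crew_question.wav", "rb") as audio:  # illustrative file name
    response = requests.post(
        LOCAL_DEEPGRAM_URL,
        headers={"Content-Type": "audio/wav"},
        data=audio,  # auth requirements depend on the deployment's config
    )

print(response.json()["results"]["channels"][0]["alternatives"][0]["transcript"])
```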