Article·Voice AI Trends·Sep 4, 2024
5 min read

Transforming Healthcare and Delivering ROI with the Nova-2 Medical Transcription API

By Josh Fox
Published Sep 4, 2024
Updated Sep 5, 2024

Healing the healthcare industry with voice AI

Leading healthtech companies are using AI to introduce new tools aimed at enhancing patient care, boosting clinical efficiency, and cutting costs. Powered by large language models (LLMs) and voice AI, these innovators are positioned to capture a meaningful share of the estimated $1 trillion opportunity that increased productivity and cost savings represent across the industry today.

In the United States alone, more than $4 trillion is spent on healthcare annually, with approximately one-quarter allocated to administrative expenses and around three-quarters dedicated to the actual delivery of care. Studies estimate that clinicians spend up to 50% of their workday on documentation tasks such as note-taking and manual entry into electronic health record (EHR) systems. This is the kind of repetitive, mundane work that often leads to burnout and keeps clinicians from doing what they do best: caring for patients.

McKinsey estimates that AI tools have the potential to automate nearly 50% of administrative tasks in the healthcare sector, leading to hundreds of billions of dollars in annual savings. Deepgram’s Nova-2 speech-to-text (STT) medical model is a prime example of such a tool, and many of our customers already use it to deliver significant benefits across use cases including medical scribes, healthcare bots, and sector-specific solutions for customer service, quality assurance (QA), and regulatory compliance enforcement.

Call center quality assurance is traditionally a tedious, manual process that involves call monitoring, call scoring, and using analytics software to track qualitative metrics related to agent sentiment and adherence to call scripts and standard operating procedures (SOPs). In healthcare, these tasks can be especially problematic because conversations often involve protected health information (PHI) that requires strict handling to ensure compliance with the Health Insurance Portability and Accountability Act (HIPAA). StackAI recently helped their client deploy a HIPAA-compliant AI solution using the Nova-2 medical model to automate their entire QA process. With Nova-2, they reduced their error rate by 24% and increased call processing by nearly 7x, resulting in overall cost savings of 67%. That is exactly the type of transformative outcome McKinsey and others have predicted AI can deliver.


Unmatched medical transcription performance

Unlike other medical transcription tools that tend to be slow and prone to errors, Deepgram's Nova-2 medical model has been specifically fine-tuned to transcribe medical terminology accurately in real time. It captures symptoms, diagnoses, treatments, medications, and clinical jargon with far greater precision and speed than general-purpose models, making it a superior choice for healthcare applications.

Nova-2 Medical offers a 16% relative improvement in word recall rate (WRR) for medical terminology compared to its predecessor, and a 20.5% average improvement over leading competitors. WRR measures the percentage of words in the ground truth text that were correctly predicted or matched (i.e., true positives), with higher scores indicating better accuracy. This boost in WRR means Nova-2 Medical makes fewer errors and captures critical patient care details more precisely, which can improve the accuracy and reliability of medical documentation and communication.
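To make the WRR definition concrete, here is a minimal Python sketch. It is an illustration only: it matches words by count and ignores word order, which is a simplifying assumption rather than a description of how the benchmark figures above were produced. The function name and the sample sentences are hypothetical.

```python
from collections import Counter


def word_recall_rate(reference: str, hypothesis: str) -> float:
    """Fraction of ground-truth words that also appear in the transcript.

    Words are matched by count (bag of words) and order is ignored.
    This is a simplification chosen for brevity, not a documented
    scoring procedure.
    """
    ref_counts = Counter(reference.lower().split())
    hyp_counts = Counter(hypothesis.lower().split())
    total = sum(ref_counts.values())
    if total == 0:
        raise ValueError("reference must contain at least one word")
    # Counter & Counter keeps the minimum count per word: the true positives.
    matched = sum((ref_counts & hyp_counts).values())
    return matched / total


# Hypothetical example: the drug name is misspelled in the transcript.
reference = "patient was prescribed metoprolol for hypertension"
hypothesis = "patient was prescribed metropolol for hypertension"
wrr = word_recall_rate(reference, hypothesis)
print(f"WRR: {wrr:.1%}")
```

In this toy case, five of the six ground-truth words are recovered, and the single miss is the medication name, which is the kind of term a medical model is tuned to get right.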

Nova-2 Medical also achieves an 11% improvement in overall word error rate (WER) for pre-recorded (batch) transcription compared to the previous version and outperforms leading medical alternatives by 42.8% on average. With Nova-2 Medical, you will achieve outstanding accuracy for medical terminology without sacrificing general speech recognition performance. This creates a balanced and reliable solution for transcribing clinical interactions, while also enabling use cases such as customer support bots, QA/compliance analysis, and virtual clinical agents.
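For readers less familiar with WER, the sketch below shows the standard textbook calculation: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the ground truth, where lower is better. The function name and sample sentences are hypothetical, and text normalization is reduced to lowercasing and whitespace splitting, which is an assumption rather than the method behind the figures above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion (reference word missing)
                    curr[j - 1] + 1,     # insertion (extra hypothesis word)
                    prev[j - 1] + cost,  # substitution or exact match
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: the transcript drops the word "of".
reference = "administer five milligrams of lisinopril daily"
hypothesis = "administer five milligrams lisinopril daily"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.1%}")
```

Because WER counts every word equally, it reflects general transcription quality, while WRR on medical terms focuses on the vocabulary that matters most clinically. That is why the two metrics are reported side by side.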

Deepgram also enables healthcare organizations to further enhance the accuracy of uncommon keywords, such as new drug names, through rapid custom model training services. These services build on the already outstanding performance of the Nova-2 medical model, boosting its precision in handling specific, domain-related terminology.

Nova-2’s groundbreaking architecture offers a significant speed advantage over other speech recognition solutions, providing transcriptions 5 to 40 times faster than alternative vendors, with a median inference time of just 29.8 seconds per hour of audio. It is also one of the few speech recognition models on the market capable of meeting real-time application demands. With Nova-2 Medical, there’s no trade-off between accuracy, speed, and cost, making it the top choice for streaming voice applications.
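To put the 29.8-second figure in perspective, the short calculation below converts it into a speed-up over real time and a rough processing estimate for a shorter recording. The 15-minute visit is a hypothetical example, and the estimate assumes processing time scales linearly with audio length, which is our simplification and not a claim made above.

```python
SECONDS_PER_HOUR = 3600
MEDIAN_INFERENCE_SECONDS = 29.8  # per hour of audio, as quoted above

# How many times faster than real time the median batch job runs.
speedup = SECONDS_PER_HOUR / MEDIAN_INFERENCE_SECONDS


def estimated_processing_seconds(audio_seconds: float) -> float:
    """Rough estimate, assuming time scales linearly with audio length."""
    return audio_seconds * MEDIAN_INFERENCE_SECONDS / SECONDS_PER_HOUR


# Hypothetical 15-minute patient visit.
visit_seconds = 15 * 60
estimate = estimated_processing_seconds(visit_seconds)
print(f"About {speedup:.0f}x faster than real time")
print(f"Estimated time for a 15-minute recording: {estimate:.2f} s")
```

Under these assumptions, an hour of audio is processed roughly 120 times faster than it takes to play back.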

More than 20% of the U.S. population faces mental health challenges each year, but long waitlists and high costs make therapy inaccessible for far too many. In a crisis, you shouldn’t have to wait months or spend hundreds of dollars to get support. Sonia, a healthtech provider, is on a mission to make mental health care available to everyone, anytime and anywhere. Its mobile application delivers AI-driven voice therapy sessions on demand and at a fraction of the cost of today’s traditional services. To do so, Sonia relies on Nova-2 Medical’s speech-to-text capabilities, which provide the precision, speed, and cost efficiency required to deliver affordable, high-quality mental health support to anyone in need. The result is a virtual therapist in your back pocket that you can talk to at a moment’s notice.


Deepgram offers flexible deployment options, allowing you to choose between a managed service and a secure self-hosted deployment in your own virtual private cloud (VPC) or on-premises infrastructure. This flexibility ensures patient information remains confidential and secure, enabling compliance with strict confidentiality regulations and HIPAA standards. These options are especially important for enterprise organizations deploying voice AI into production, a challenge we’re here to help with via our recently launched Enterprise Voice AI Accelerator Program.

We invite you to explore our API Playground or sign up to try the Nova-2 medical model firsthand. To get started, simply include model=nova-2-medical in your API calls. For more information, visit our API Documentation. Join us in transforming healthcare with cutting-edge speech recognition technology tailored to your needs. To see how the Nova-2 medical model can integrate seamlessly into your system, contact us today!
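As a rough illustration of what including model=nova-2-medical looks like in practice, here is a minimal Python sketch that uses only the standard library. Only the model parameter comes from this post. The endpoint URL, the Token authorization scheme, the JSON body with a url field, and the helper names are assumptions based on our understanding of the public pre-recorded audio API, so please confirm them against the API Documentation before relying on them.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint for pre-recorded audio; verify in the API documentation.
DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_request(
    audio_url: str, api_key: str, model: str = "nova-2-medical"
) -> urllib.request.Request:
    """Build (but do not send) a transcription request for a hosted file."""
    query = urllib.parse.urlencode({"model": model})
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        f"{DEEPGRAM_LISTEN_URL}?{query}",
        data=body,
        headers={
            "Authorization": f"Token {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def transcribe(audio_url: str, api_key: str) -> dict:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(build_request(audio_url, api_key)) as response:
        return json.load(response)


# Placeholder values; nothing is sent over the network here.
request = build_request("https://example.com/visit.wav", api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Calling transcribe with a real API key and a reachable audio URL would send the request. Keeping request construction separate from sending makes it easy to inspect exactly what will be transmitted, which is helpful when audio may contain PHI.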


If you have any feedback about this post, or anything else around Deepgram, we'd love to hear from you. Please let us know in our GitHub discussions or contact us to talk to one of our product experts for more information today.
