Introducing Nova-3 Medical: The Future of AI-Powered Medical Transcription


TL;DR
Nova-3 Medical is Deepgram's latest medical speech-to-text model, designed for clinical environments.
It has unmatched accuracy in healthcare settings, filtering out irrelevant noise and capturing critical details such as medication names, diagnostic terms, and procedure details.
It offers flexible, self-service customization, with Keyterm Prompting that allows developers to tailor the model by adding up to 100 custom terms, without retraining.
It has a HIPAA-compliant architecture with robust security measures.
It leads the field in AI-driven speech-to-text for healthcare, with a 63.7% improvement in Word Error Rate (WER) and a 40.35% improvement in Keyword Error Rate (KER) over the next-best competitor.
Nova-3 Medical Offers Best-in-Class Medical Speech-to-Text
Today, Deepgram is redefining AI-driven medical transcription with the launch of Nova-3 Medical, the most advanced speech-to-text model designed for clinical environments. Built for developers at healthcare tech companies building voice AI products and solutions, Nova-3 Medical empowers teams to create highly accurate, customizable, and secure voice AI applications tailored to the complexities of healthcare workflows.
Building on the success of its predecessor, Nova-2 Medical, Nova-3 Medical is a specialized adaptation of Nova-3. It delivers the same core benefits (ultra-low latency, enterprise-grade security, and compliance) while significantly improving transcription accuracy in challenging environments and offering enhanced self-service customization. The growth of AI-powered transcription in healthcare is accelerating, driven by trends such as the adoption of electronic health records (EHR) and the rise of telemedicine. As a result, developers require robust, flexible tools to build applications that ensure accurate, real-time transcription while meeting regulatory requirements like HIPAA.
Key Advancements with Nova-3 Medical
Designed to excel in real-world medical settings, Nova-3 Medical delivers superior transcription performance while ensuring seamless integration with existing healthcare infrastructure. It offers the following benefits:
Unmatched Accuracy in Clinical Environments
Deepgram’s Nova-3 Medical is engineered specifically for challenging clinical environments where capturing clear audio is not always straightforward. Like its base model, it filters out irrelevant noise and focuses on the critical details that matter in clinical documentation, so that vital information such as medication names, diagnostic terms, and procedure details is not lost or misinterpreted.
Even in environments where noise is inevitable, such as bustling clinics or hospitals filled with active medical equipment, Nova-3 Medical accurately captures everything from pharmaceutical names to the nomenclature of bacteria and diseases. The model maintains this accuracy when audio is captured from far-field devices or when ambient noise interferes, for example when a doctor uses an iPad, laptop, or phone for clinical note-taking, or when the device is left on a table with background chatter and equipment sounds.
Flexible, Self-Service Customization
Nova-3 Medical inherits advanced in-context learning from Nova-3, enabling real-time adaptation to specialized terminology without extensive retraining. With Keyterm Prompting, developers can tailor the model by supplying up to 100 custom terms, ensuring precise recognition of specialty-specific terminology across fields such as cardiology, neurology, and oncology. As new drugs, procedures, or diagnostic terms emerge, the term list can be updated immediately to reflect the latest medical advancements, reducing delays and maintaining accuracy without costly retraining cycles. This immediate customization keeps transcription quality high as medical language evolves.
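As a rough illustration of how a term list might be attached to a request, the sketch below builds a request URL with one query parameter per custom term and enforces the 100-term limit mentioned above. The endpoint URL and the `keyterm` parameter name are assumptions based on Deepgram's public API conventions, and `build_listen_url` is a hypothetical helper, so check the API Documentation before relying on either.

```python
from urllib.parse import urlencode

MAX_KEYTERMS = 100  # limit stated in this article
BASE_URL = "https://api.deepgram.com/v1/listen"  # assumed endpoint


def build_listen_url(keyterms, model="nova-3-medical"):
    """Return a request URL carrying one `keyterm` parameter per custom term.

    Hypothetical helper: the `keyterm` parameter name is an assumption.
    """
    # Drop blanks and duplicates while preserving the caller's order.
    unique_terms = list(dict.fromkeys(t.strip() for t in keyterms if t.strip()))
    if len(unique_terms) > MAX_KEYTERMS:
        raise ValueError(
            f"{len(unique_terms)} keyterms given; the limit is {MAX_KEYTERMS}"
        )
    params = [("model", model)] + [("keyterm", t) for t in unique_terms]
    return f"{BASE_URL}?{urlencode(params)}"


url = build_listen_url(["tretinoin", "atrial fibrillation", "tretinoin"])
print(url)
```

Because the terms travel with each request, a cardiology product and an oncology product could share the same model and simply send different lists.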
Enterprise-Grade Security & Compliance
Deepgram prioritizes the protection of sensitive patient data with a HIPAA-compliant architecture that spans from the AI models to the underlying infrastructure. Robust security measures, including data encryption at rest and in transit, rigorous access controls, and continuous monitoring, ensure that all transcriptions and interactions meet the highest standards. Deepgram also offers versatile deployment options, including managed cloud, on-premises, and Virtual Private Cloud (VPC). These options let healthcare tech companies build voice AI applications that comply with the security policies and regulatory requirements of the healthcare providers they serve, so that sensitive patient data is securely stored and processed.
Optimized for Performance and Scale
Nova-3 Medical inherits Deepgram’s signature advantages, including ultra-low latency and rapid processing speeds, making it well suited for real-time healthcare applications such as telemedicine, digital health platforms, and clinical decision support systems. Pricing starts at $0.0043 per minute of pre-recorded audio, less than half the price of leading cloud providers, so developers can create high-performance applications without being hindered by escalating costs. Healthcare tech companies can reinvest the savings into product improvements and accelerate the adoption of voice AI technology. The scalable architecture efficiently handles increasing transcription volumes while keeping costs manageable, making it the ideal choice for AI-powered medical applications.
Benchmarking Nova-3 Medical: Key Metrics for Healthcare AI
Deepgram’s Nova-3 Medical leads the field in AI-driven speech-to-text for healthcare. We assess its performance across three essential benchmarks—median Word Error Rate (WER), Keyword Error Rate (KER), and Keyword Recall Rate (KRR)—each highlighting a critical aspect of transcription performance in clinical applications.
Our evaluation is based on an extensive and diverse medical audio dataset designed to reflect real-world clinical scenarios. It incorporates key healthcare terminology—such as drug names, conditions, and procedures—sourced from both publicly available datasets and proprietary customer audio.
Outperforming Competitors in Overall Accuracy
WER is a fundamental metric for evaluating speech-to-text models, as it measures the percentage of words incorrectly transcribed. A lower WER indicates better overall transcription accuracy, which is particularly critical in clinical environments where every word matters.
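For readers who want to see the metric concretely, here is a minimal sketch of the standard WER calculation: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the reference transcript. It lowercases and splits on whitespace only; the text normalization used in Deepgram's benchmark is not described in this post, so treat this as an illustration rather than the evaluation code.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,  # deletion: reference word missing
                    curr[j - 1] + 1,  # insertion: extra hypothesis word
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("pain" -> "pains") plus one insertion ("today")
# against a five-word reference.
wer = word_error_rate(
    "the patient denies chest pain",
    "the patient denies chest pains today",
)
print(f"WER: {wer:.0%}")  # WER: 40%
```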
Nova-3 Medical achieves a median WER of 3.44%, which is a 63.7% improvement over the next-best competitor (see Fig. 1). This ensures that healthcare providers can rely on the AI to transcribe medical notes, prescriptions, and other clinical data with high accuracy, reducing the risk of errors that could impact patient safety or lead to costly corrections.


For clinical applications, low WER ensures that providers can efficiently document patient information without having to manually edit the transcriptions for accuracy. Given the fast-paced nature of healthcare settings, every second spent correcting transcription errors detracts from valuable patient care time.
Excelling in Capturing Specialized Medical Terminology
KER measures how often key medical terms are either missed or incorrectly transcribed. Nova-3 Medical boasts a KER of 6.79%, representing a 40.35% reduction in errors compared to the next-best competitor (see Fig. 2). In healthcare, where specialized medical terminology—such as branded drug names, Latin-derived bacteria/chemical names, and more—is frequently used, KER is crucial for ensuring that no vital information is overlooked or incorrectly recorded, as errors could have serious consequences for patient care.


For example, misidentifying a medication like "metformin" as "morphine" may only slightly impact WER, but it can have serious clinical implications. Nova-3 Medical’s low KER ensures that key medical terminology is transcribed with high reliability, making it a trustworthy solution for healthcare tech companies building voice AI applications.
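The gap between the two metrics can be shown with a small sketch. The `keyword_error_rate` function below is a simplified, hypothetical stand-in (single-word terms, bag-of-words matching), since the exact KER methodology is not given in this post. The word-level rate in the example is computed position by position, which only equals WER when the transcripts have the same length and differ by substitutions alone.

```python
from collections import Counter


def keyword_error_rate(reference, hypothesis, keyterms):
    """Share of keyterm occurrences in the reference that the hypothesis lacks.

    Simplified illustration: single-word terms, order ignored.
    """
    ref_counts = Counter(reference.lower().split())
    hyp_counts = Counter(hypothesis.lower().split())
    total = 0
    missed = 0
    for term in {t.lower() for t in keyterms}:
        expected = ref_counts[term]
        found = min(expected, hyp_counts[term])
        total += expected
        missed += expected - found
    if total == 0:
        raise ValueError("no keyterm occurs in the reference transcript")
    return missed / total


reference = "start metformin 500 mg twice daily with meals"
hypothesis = "start morphine 500 mg twice daily with meals"

# Substitution-only case, so a positional comparison gives the word error rate.
ref_words = reference.split()
hyp_words = hypothesis.split()
word_rate = sum(r != h for r, h in zip(ref_words, hyp_words)) / len(ref_words)

ker = keyword_error_rate(reference, hypothesis, ["metformin", "mg"])
print(f"word-level error: {word_rate:.1%}")  # word-level error: 12.5%
print(f"keyword error:    {ker:.1%}")  # keyword error:    50.0%
```

A single wrong word moves the word-level rate by one eighth, while the keyword rate shows that half of the clinically important terms were lost.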
Surpassing Previous-Generation Models
KER helps quantify how many transcription mistakes occur in medical terminology, but KRR adds an extra layer of insight by showing how reliably the AI captures specialized language in the first place. A higher KRR indicates that the AI is not only minimizing errors but also consistently recognizing and preserving critical terminology across transcripts. Nova-3 Medical’s KRR of 93.99% is a 10.6% relative improvement over Nova-2 Medical’s 84.97%, a gain of about 9 percentage points (see Fig. 3). This benchmark demonstrates the model’s enhanced capability to accurately recognize and recall essential medical terminology, making it the most advanced option available.
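The 10.6% figure is a relative gain, not a difference in percentage points. A quick check using only the two KRR values quoted above:

```python
nova3_krr = 93.99  # percent, Nova-3 Medical
nova2_krr = 84.97  # percent, Nova-2 Medical

# Difference in percentage points between the two models.
absolute_gain = nova3_krr - nova2_krr

# The same difference expressed relative to the Nova-2 Medical baseline.
relative_gain = absolute_gain / nova2_krr * 100

print(f"absolute gain: {absolute_gain:.2f} percentage points")
print(f"relative gain: {relative_gain:.1f}%")
```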


For developers at healthcare tech companies, upgrading to Nova-3 Medical provides greater accuracy and reliability in capturing critical clinical details. This improvement reflects Deepgram’s commitment to continuous innovation, refining AI for enhanced precision to help you stay ahead with cutting-edge, high-performance voice AI.
Healthcare AI in Action: A Deepgram and AWS Demo
In this video, Deepgram and AWS showcase an innovative application that transforms healthcare administration by reducing the burden on providers and enhancing patient outcomes. The demo highlights how Deepgram’s cutting-edge speech-to-text capabilities, powered by our specialized medical models, integrate seamlessly with AWS services including Amazon Bedrock. This application captures clinical notes, processes drug dispatching commands, and manages scheduling tasks—all in real time—while easily integrating with existing EHR systems.
Watch as the AI rapidly transcribes detailed patient information with remarkable precision. It efficiently handles a variety of clinical scenarios, ensuring that critical data is recorded accurately and promptly. Additionally, the system enforces validation rules, such as verifying that appointment durations meet required thresholds, which minimizes errors and improves workflow efficiency.
Powered by Deepgram’s advanced speech recognition and AWS’s scalable, secure cloud infrastructure, this application exemplifies how modern technology can streamline healthcare workflows. By reducing administrative overhead, it allows healthcare professionals to spend more time focusing on patient care, ultimately improving outcomes for both providers and patients.
Getting Started
Ready to enhance your medical speech recognition capabilities? Start using Nova-3 Medical today and experience the difference. Explore our API Playground or sign up to test the model firsthand.
To integrate the model, simply use model=nova-3-medical in your API calls. For full guidance, visit our API Documentation.
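As a starting point, the sketch below shows what such a call might look like using only the Python standard library. The endpoint, the `Token` authorization scheme, the JSON body with a `url` field, and the response path are assumptions based on Deepgram's public pre-recorded audio API; `build_request` and `transcribe` are hypothetical helper names, and the audio URL is a placeholder. Confirm the details against the API Documentation.

```python
import json
import urllib.request

# Assumed endpoint; the model is selected with the query parameter named above.
API_URL = "https://api.deepgram.com/v1/listen?model=nova-3-medical"


def build_request(api_key, audio_url):
    """Prepare a POST request asking for a transcript of hosted audio."""
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def transcribe(api_key, audio_url):
    """Send the request and return the first transcript in the response."""
    with urllib.request.urlopen(build_request(api_key, audio_url)) as response:
        payload = json.load(response)
    # Assumed response shape: results -> channels -> alternatives -> transcript.
    return payload["results"]["channels"][0]["alternatives"][0]["transcript"]


# Example usage (requires a valid API key and a reachable audio file):
# print(transcribe("YOUR_API_KEY", "https://example.com/clinic-visit.wav"))
```

Keeping request construction separate from the network call makes it easy to inspect or unit test the request without sending any audio.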
Take the next step in transforming voice-enabled healthcare applications. With Nova-3 Medical, you gain access to cutting-edge transcription technology built to address the unique challenges of healthcare. Contact us today to learn how we can help you seamlessly integrate this powerful model into your system, driving more efficient, secure, and accurate healthcare applications.