Deepgram is proud to announce the release of Nova-3, our most advanced speech-to-text model to date. Key improvements include:

Performance Improvements

  • 54.3% reduction in word error rate (WER) for streaming audio compared to competitors (Nova-3 median WER: 6.84%)

  • 47.4% reduction in WER for batch processing (Nova-3 median WER: 5.26%)

  • Maintains industry-leading inference speed, with latency comparable to Nova-2
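
To make the percentages above concrete, here is a minimal sketch of the arithmetic behind a relative WER reduction. It assumes the quoted reductions are relative (not percentage-point) differences and that the quoted median WER is Nova-3's own; the function names are illustrative and not part of any Deepgram tooling.

```python
def implied_baseline_wer(new_wer_pct: float, relative_reduction_pct: float) -> float:
    """Baseline WER (in %) implied by a new WER and a relative reduction."""
    if not 0 <= relative_reduction_pct < 100:
        raise ValueError("relative reduction must be in [0, 100)")
    return new_wer_pct / (1 - relative_reduction_pct / 100)


def relative_reduction(baseline_wer_pct: float, new_wer_pct: float) -> float:
    """Relative WER reduction in percent: (baseline - new) / baseline."""
    return (baseline_wer_pct - new_wer_pct) / baseline_wer_pct * 100


# Figures taken from the bullets above; the baselines are derived, not published here.
streaming_baseline = implied_baseline_wer(6.84, 54.3)
batch_baseline = implied_baseline_wer(5.26, 47.4)
```

Under those assumptions, feeding either derived baseline back into relative_reduction recovers the stated percentage, which is a quick way to check that a reported reduction and a reported WER are consistent with each other.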

New Features

  • Self-serve customization through Keyterm Prompting

    • Instantly adapt up to 100 domain-specific terms without model retraining

    • Improved recognition of specialized vocabulary and technical terminology

  • Enhanced capabilities for challenging audio conditions:

    • Improved handling of background noise and overlapping speech

    • Better numeric recognition

    • Real-time redaction for up to 50 entities

    • Greater word-level timestamp precision

    • Improved English formatting and paragraph structuring
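
As a rough illustration of Keyterm Prompting, the sketch below builds a query string that carries a list of domain terms and enforces the 100-term limit mentioned above. The `keyterm` parameter name and the repeated-parameter encoding are assumptions not confirmed by these notes, and the helper name is hypothetical; check the Developer Documentation for the exact request format.

```python
from urllib.parse import urlencode

MAX_KEYTERMS = 100  # limit stated in these release notes


def keyterm_query(keyterms: list[str], model: str = "nova-3") -> str:
    """Build a query string with one `keyterm` entry per term (assumed parameter name)."""
    # Drop blank entries and duplicates while preserving the caller's order.
    cleaned = list(dict.fromkeys(t.strip() for t in keyterms if t.strip()))
    if len(cleaned) > MAX_KEYTERMS:
        raise ValueError(f"at most {MAX_KEYTERMS} keyterms are supported, got {len(cleaned)}")
    params = [("model", model)] + [("keyterm", term) for term in cleaned]
    return urlencode(params)


query = keyterm_query(["tretinoin", "account number", "tretinoin"])
```

Validating the list on the client side keeps an oversized term list from reaching the API, where it would presumably be rejected or truncated.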

Availability

Nova-3 English is now available through our API:

  • Use model=nova-3 in your API calls

  • Available for hosted use

  • Supports both pre-recorded and real-time streaming transcription

  • Multilingual support and self-hosted deployments will be available in subsequent releases
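
For orientation, selecting the model in a request might look like the sketch below. Only `model=nova-3` comes from these notes; the host, the `/v1/listen` path, the `wss` scheme for streaming, and the `Token` authorization header are assumptions about the hosted API, and both function names are hypothetical. The request is built but not sent.

```python
import urllib.request
from urllib.parse import urlencode

API_HOST = "api.deepgram.com"  # assumed hosted endpoint
LISTEN_PATH = "/v1/listen"     # assumed transcription path


def listen_url(streaming: bool, model: str = "nova-3") -> str:
    """Return the request URL: WebSocket for streaming, HTTPS for pre-recorded audio."""
    scheme = "wss" if streaming else "https"
    return f"{scheme}://{API_HOST}{LISTEN_PATH}?{urlencode({'model': model})}"


def build_prerecorded_request(
    audio: bytes, api_key: str, content_type: str = "audio/wav"
) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying raw audio bytes."""
    return urllib.request.Request(
        listen_url(streaming=False),
        data=audio,
        method="POST",
        headers={
            "Authorization": f"Token {api_key}",  # assumed auth scheme
            "Content-Type": content_type,
        },
    )


request = build_prerecorded_request(b"\x00\x01", api_key="YOUR_API_KEY")
```

Passing the prepared request to urllib.request.urlopen would submit it; the streaming URL would instead be handed to a WebSocket client of your choice.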

For detailed information about Nova-3, please refer to our Developer Documentation.
