Deepgram is proud to announce the release of Nova-3, our most advanced speech-to-text model to date. Key improvements include:
Performance Improvements
6.84% median word error rate (WER) on streaming audio, a 54.3% reduction compared to competitors
5.26% median WER on batch processing, a 47.4% reduction
Maintains industry-leading inference speed, with latency comparable to Nova-2
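To make the relationship between a median WER and a relative reduction concrete, here is a minimal sketch of the arithmetic. It assumes each percentage is a relative reduction against a single baseline WER, which the announcement does not spell out; the function name is hypothetical, and the implied baselines it computes are illustrations of the formula, not published figures.

```python
def implied_baseline_wer(wer: float, relative_reduction: float) -> float:
    """Return the baseline WER implied by a WER and a relative reduction.

    Assumes reduction = (baseline - wer) / baseline, so
    baseline = wer / (1 - reduction). This reading is an assumption.
    """
    if not 0.0 <= relative_reduction < 1.0:
        raise ValueError("relative_reduction must be in [0, 1)")
    return wer / (1.0 - relative_reduction)


# Figures quoted in the list above, expressed as fractions.
streaming_baseline = implied_baseline_wer(0.0684, 0.543)
batch_baseline = implied_baseline_wer(0.0526, 0.474)

print(f"implied streaming baseline WER: {streaming_baseline:.2%}")
print(f"implied batch baseline WER: {batch_baseline:.2%}")
```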
New Features
Self-serve customization through Keyterm Prompting
Instantly adapt up to 100 domain-specific terms without model retraining
Improved recognition of specialized vocabulary and technical terminology
Enhanced capabilities for challenging audio conditions:
Improved handling of background noise and overlapping speech
Better numeric recognition
Real-time redaction for up to 50 entities
Greater word-level timestamp precision
Improved English formatting and paragraph structuring
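As a rough sketch of how Keyterm Prompting might be used from client code, the helper below builds a query string with one entry per term and enforces the 100-term limit mentioned above. The query parameter name keyterm and the helper name are assumptions not stated in this announcement; confirm the exact parameter in the Developer Documentation.

```python
from urllib.parse import urlencode

MAX_KEYTERMS = 100  # limit stated in the release notes


def build_keyterm_query(keyterms, model="nova-3"):
    """Build a query string with one `keyterm` entry per term.

    The `keyterm` parameter name is an assumption for illustration.
    """
    # Drop empty entries and surrounding whitespace.
    terms = [t.strip() for t in keyterms if t.strip()]
    if len(terms) > MAX_KEYTERMS:
        raise ValueError(f"at most {MAX_KEYTERMS} keyterms are supported")
    params = [("model", model)] + [("keyterm", t) for t in terms]
    return urlencode(params)


query = build_keyterm_query(["tretinoin", "Keyterm Prompting"])
print(query)
```

Repeating the parameter, rather than joining terms into one value, keeps multi-word terms intact after URL encoding.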
Availability
Nova-3 English is now available through our API:
Use model=nova-3 in your API calls
Available for hosted use
Supports both pre-recorded and real-time streaming transcription
Multilingual support and self-hosted deployments will be available in subsequent releases
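The snippet below is a minimal sketch of selecting the model with model=nova-3 on a pre-recorded transcription request. The endpoint URL, the Token authorization header, and the JSON body with a url field are assumptions not stated in this announcement, and the API key and audio URL are placeholders. The code only builds the request; it does not send it.

```python
import json
import urllib.request

# Assumed hosted endpoint for pre-recorded audio; verify in the docs.
DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_nova3_request(api_key, audio_url):
    """Build (but do not send) a POST request that selects Nova-3."""
    url = f"{DEEPGRAM_LISTEN_URL}?model=nova-3"
    body = json.dumps({"url": audio_url}).encode("utf-8")
    headers = {
        "Authorization": f"Token {api_key}",  # assumed auth scheme
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_nova3_request("YOUR_API_KEY", "https://example.com/audio.wav")
print(request.get_method(), request.full_url)
# To send it, pass the request to urllib.request.urlopen(request).
```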
For detailed information about Nova-3, please refer to our Developer Documentation.