Text To Speech
May 14, 2025

Introducing Aura-2 TTS

Deepgram is proud to announce the release of Aura-2, our text-to-speech model purpose-built for real-time enterprise use cases.

Performance

  • Sub-200ms time-to-first-byte (TTFB) latency for real-time conversational interactions

  • 0.111x Real-Time Factor (RTF), synthesizing one second of audio in roughly 111 milliseconds
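
To make the RTF figure concrete, here is a minimal sketch of the arithmetic. It assumes RTF is defined as synthesis time divided by audio duration, which matches the figure above. The function name `synthesis_time_ms` is illustrative only and is not part of any Deepgram SDK.

```python
def synthesis_time_ms(audio_seconds: float, rtf: float = 0.111) -> float:
    """Estimate synthesis time in milliseconds for a clip of the given length.

    Assumes RTF = synthesis time / audio duration, so
    synthesis time = audio duration * RTF.
    """
    return audio_seconds * rtf * 1000.0


if __name__ == "__main__":
    # One second of audio at 0.111x RTF takes about 111 ms to synthesize.
    print(f"{synthesis_time_ms(1.0):.0f} ms")
    # A ten-second reply takes about 1.1 s under the same assumption.
    print(f"{synthesis_time_ms(10.0):.0f} ms")
```

This is a steady-state estimate. It does not include network time or the time-to-first-byte figure, which is measured separately.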

Voice Quality & Features

  • Enterprise-optimized voice catalog with 40+ distinct voices, each designed for specific business contexts

  • Tuned for professional and transactional interactions with appropriate tone, pacing, and emphasis

  • Superior pronunciation accuracy for domain-specific content:

    • Currency and numerals

    • Dates and timestamps in varied formats

    • Email addresses, passwords, and URLs

    • Complex addresses and location references

  • Industry-leading voice clarity, rated higher than competitors' voices in customer service scenarios

Availability

  • Aura-2 is available now via REST and WebSocket APIs

  • Currently available for use through our hosted offering
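
As a rough illustration of the REST path, the sketch below builds a text-to-speech request without sending it. The endpoint URL, the `model` query parameter, the example voice name `aura-2-thalia-en`, and the `Token` authorization scheme are assumptions not stated in this announcement, so confirm each against the Developer Documentation. The helper `build_speak_request` is hypothetical.

```python
import json
import urllib.request

# Assumed endpoint; verify against the Developer Documentation.
SPEAK_URL = "https://api.deepgram.com/v1/speak"


def build_speak_request(text: str, api_key: str,
                        model: str = "aura-2-thalia-en") -> urllib.request.Request:
    """Build (but do not send) a POST request asking for `text` as speech.

    The model name and header format are assumptions used for illustration.
    """
    url = f"{SPEAK_URL}?model={model}"
    body = json.dumps({"text": text}).encode("utf-8")
    headers = {
        "Authorization": f"Token {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    req = build_speak_request("Your order total is $42.50.", "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
    # To send it, pass `req` to urllib.request.urlopen and write the
    # response bytes to an audio file.
```

Building the request separately from sending it keeps the sketch runnable without credentials and makes the assumed parameters easy to swap once checked against the documentation.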

For detailed information about Aura-2, please refer to our Developer Documentation.
