By Martine Katz

Senior Product Marketing Manager

Deepgram has introduced a state-of-the-art monolingual Arabic speech-to-text model on Nova-3, built for how Arabic is actually spoken and used in production. The model covers a broad range of Arabic dialects across the Middle East, the Gulf, and North Africa. In benchmarking on conversational Arabic, Nova-3 Arabic delivers best-in-class accuracy, with word error rates up to roughly 40% lower than competing speech-to-text systems.

This release further establishes Nova-3 as an enterprise-grade speech-to-text model, delivering high accuracy, Keyterm Prompting, and production-grade performance for developers building Arabic-language voice applications globally. Arabic is also the first right-to-left language supported on Nova-3, extending support to additional scripts and writing systems used in real-world production environments.

Designed for Arabic Speech in Real-World Applications

Nova-3 Arabic is purpose-built for production use cases where Arabic speech-to-text must perform reliably at scale, including:

  • Call centers
  • Customer support
  • Voice agents and IVR systems
  • Conversational and speech analytics

The model is optimized for spoken Arabic as it appears in real production systems, across regions, dialects, and deployment environments, with support for:

  • Natural spoken Arabic, as used in everyday customer interactions
  • Dialectal recognition across Arabic-speaking regions
  • Practical readability over textbook-perfect spelling
  • High-volume real-time and batch workflows

While Nova-3 Arabic supports Modern Standard Arabic, it is optimized for conversational spoken Arabic. It returns clean, readable transcripts without diacritics (short-vowel and grammatical markers), so the output is immediately usable in production systems.

Arabic Speech-to-Text Coverage Across 17 Regional Variants

Spoken Arabic varies significantly by region, with meaningful differences in pronunciation, vocabulary, and speech patterns across countries. In real production environments, audio is often dialect-heavy. Nova-3 Arabic supports 17 Arabic language variants across major regional dialect groups: 

Pan-Arab / Modern Standard Arabic (MSA)

  • Generic Arabic (ar)

Gulf Arabic (الخليج)

Spoken across:

  • United Arab Emirates (ar-AE)
  • Saudi Arabia (ar-SA)
  • Qatar (ar-QA)
  • Kuwait (ar-KW)

Levantine Arabic (بلاد الشام)

Spoken across:

  • Syria (ar-SY)
  • Lebanon (ar-LB)
  • Palestine (ar-PS)
  • Jordan (ar-JO)

Egyptian / Nile Arabic (وادي النيل)

Spoken across:

  • Egypt (ar-EG)
  • Sudan (ar-SD)

Maghrebi Arabic (المغرب العربي)

Spoken across:

  • Morocco (ar-MA)
  • Algeria (ar-DZ)
  • Tunisia (ar-TN)

Mesopotamian Arabic (العراق)

Spoken across:

  • Iraq (ar-IQ)

Peripheral Arabic Dialects

Spoken in:

  • Chad (ar-TD)
  • Iran (ar-IR)
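
For applications that route audio by caller locale, the groups above can be kept as a small lookup table. The following is a minimal sketch, not part of any Deepgram SDK: the names `DIALECT_GROUPS`, `CODE_TO_GROUP`, and `resolve_language` are hypothetical, and falling back to the generic `ar` code for an unlisted Arabic region is an assumption about how you might want to handle that case.

```python
# Dialect groups and language codes exactly as listed in this post.
DIALECT_GROUPS = {
    "Pan-Arab / Modern Standard Arabic": ["ar"],
    "Gulf": ["ar-AE", "ar-SA", "ar-QA", "ar-KW"],
    "Levantine": ["ar-SY", "ar-LB", "ar-PS", "ar-JO"],
    "Egyptian / Nile": ["ar-EG", "ar-SD"],
    "Maghrebi": ["ar-MA", "ar-DZ", "ar-TN"],
    "Mesopotamian": ["ar-IQ"],
    "Peripheral": ["ar-TD", "ar-IR"],
}

# Reverse index: language code -> dialect group.
CODE_TO_GROUP = {
    code: group for group, codes in DIALECT_GROUPS.items() for code in codes
}


def resolve_language(code: str) -> str:
    """Return a listed language code for `code` (hypothetical helper).

    Assumption: an unlisted Arabic region (for example "ar-LY") falls back
    to the generic "ar" code; non-Arabic codes are rejected.
    """
    lang, _, region = code.partition("-")
    # Normalise case so "AR-eg" and "ar-EG" are treated alike.
    candidate = f"{lang.lower()}-{region.upper()}" if region else lang.lower()
    if candidate in CODE_TO_GROUP:
        return candidate
    if lang.lower() == "ar":
        return "ar"
    raise ValueError(f"not an Arabic language code: {code!r}")
```

Keeping the table in one place makes it easy to report accuracy or traffic per dialect group later, since every request's language code maps back to exactly one group.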

Keyterm Prompting Across Dialects

Nova-3 Arabic supports Keyterm Prompting across all supported dialects, allowing developers to guide transcription toward domain-specific terminology, jargon, brand names, and keywords. This improves recognition without retraining models or managing custom vocabularies. Key terms are applied dynamically at inference time, making customization fast and flexible.

This capability is especially valuable for:

  • Call centers and customer support systems
  • Voice agents and IVR applications
  • Industry-specific analytics and transcription workflows
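
Because key terms are supplied per request, they can be attached when the request URL is built. The sketch below assumes the `keyterm` query parameter (repeated once per term) as described in Deepgram's Keyterm Prompting documentation; confirm the parameter name and any limits against the current API reference. `build_listen_url` is a hypothetical helper, not an SDK function.

```python
from urllib.parse import urlencode

# Endpoint taken from the curl example in this post.
BASE_URL = "https://api.deepgram.com/v1/listen"


def build_listen_url(language: str, keyterms: list[str]) -> str:
    """Build a Nova-3 request URL with one `keyterm` parameter per term.

    Assumption: `keyterm` may be repeated in the query string.
    """
    params = [("model", "nova-3"), ("language", language)]
    params += [("keyterm", term) for term in keyterms]
    # urlencode percent-encodes Arabic text as UTF-8, so terms can be
    # passed in their native script.
    return f"{BASE_URL}?{urlencode(params)}"
```

For example, `build_listen_url("ar-SA", ["Deepgram", "Nova-3"])` yields a URL ending in `model=nova-3&language=ar-SA&keyterm=Deepgram&keyterm=Nova-3`, which can be dropped into the same request you already send.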

Nova-3 Arabic Speech-to-Text Outperforms Competitors

To evaluate real-world performance, Nova-3 Arabic was benchmarked against other leading speech-to-text systems on conversational Arabic speech across multiple regional dialects.

Across all evaluated dialects, Nova-3 Arabic achieved the lowest word error rates (WER), outperforming every benchmarked competitor.

Key takeaways:

  • Nova-3 Arabic provides best-in-class accuracy on dialect-heavy conversational Arabic.
  • Nova-3 Arabic maintains consistently low WER across regions, including Gulf, Egyptian, Levantine, and North African Arabic, and its lead over competitors is clearest in these dialect-heavy regions.
  • Nova-3 Arabic has the lowest overall WER across dialects, reducing accuracy gaps in global production deployments.

*Arabic has multiple valid written forms for the same spoken words. To ensure a fair comparison, we normalized common spelling variants (such as diacritics, alef forms, ta marbuta, and number formats) so models are compared on what they heard, not on stylistic writing differences.
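
To make the normalization step concrete, here is a minimal sketch of how spelling variants might be collapsed before scoring. It is not Deepgram's actual benchmarking pipeline: the specific rules (stripping diacritics and tatweel, folding alef forms to bare alef, mapping ta marbuta to ha, converting Arabic-Indic digits to ASCII) are common conventions chosen here as assumptions, and `normalize` and `wer` are hypothetical names.

```python
import re

# Arabic diacritics (fathatan through sukun) plus superscript alef.
_DIACRITICS = re.compile("[\u064B-\u0652\u0670]")
# Tatweel is a purely typographic elongation character.
_TATWEEL = "\u0640"

# Alef with madda / hamza above / hamza below -> bare alef;
# ta marbuta -> ha (one common convention, assumed here).
_CHAR_MAP = {
    0x0622: "\u0627",
    0x0623: "\u0627",
    0x0625: "\u0627",
    0x0629: "\u0647",
}
# Arabic-Indic digits -> ASCII digits.
_CHAR_MAP.update({0x0660 + d: str(d) for d in range(10)})


def normalize(text: str) -> str:
    """Collapse stylistic spelling variants so only content differences remain."""
    text = _DIACRITICS.sub("", text)
    text = text.replace(_TATWEEL, "")
    text = text.translate(_CHAR_MAP)
    return " ".join(text.split())  # collapse runs of whitespace


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = normalize(reference).split()
    hyp = normalize(hypothesis).split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1] / len(ref)
```

With this kind of normalization, a hypothesis that differs from the reference only in ta marbuta versus ha, or in the presence of diacritics, scores the same as an exact match, so the remaining errors reflect what the model heard.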

Deployment Modes: Cloud API or Self-Hosted

Nova-3 Arabic is available in two deployment modes, so you can choose the one that fits your use case:

Cloud API (Deepgram-Hosted)

  • Fastest way to get started
  • Same Nova-3 speech-to-text cloud API customers already use
  • Ideal for most production workloads

Self-Hosted (Customer-Operated)

  • Run Nova-3 Arabic STT in your own environment
  • Audio never leaves your infrastructure
  • Designed for strict data residency, privacy, security, or latency requirements

Built for Developers and Enterprises

All supported Arabic variants are available through the same API developers already use today. You can test all supported languages directly in the Deepgram Playground before deploying to production.

Switching to any of the newly supported languages is simple. Update your API request with the appropriate language code:

curl --request POST \
  --header "Authorization: Token YOUR_DEEPGRAM_API_KEY" \
  --header "Content-Type: audio/wav" \
  --data-binary @youraudio.wav \
  "https://api.deepgram.com/v1/listen?model=nova-3&language=ar"

Supported language codes: 

ar, ar-AE, ar-SA, ar-QA, ar-KW, ar-SY, ar-LB, ar-PS, ar-JO, ar-EG, ar-SD, ar-MA, ar-DZ, ar-TN, ar-IQ, ar-TD, ar-IR
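
The same request can be prepared from Python using only the standard library. This is a rough sketch rather than an official client: `build_request` and `extract_transcript` are hypothetical helpers, and the response path `results.channels[0].alternatives[0].transcript` is an assumption based on Deepgram's documented pre-recorded response shape, so check it against the API reference before relying on it.

```python
import urllib.request


def build_request(
    audio_bytes: bytes, api_key: str, language: str = "ar"
) -> urllib.request.Request:
    """Prepare (but do not send) the POST request shown in the curl example."""
    url = f"https://api.deepgram.com/v1/listen?model=nova-3&language={language}"
    return urllib.request.Request(
        url,
        data=audio_bytes,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "audio/wav",
        },
        method="POST",
    )


def extract_transcript(response: dict) -> str:
    """Pull the first transcript out of a parsed JSON response.

    Assumption: the response follows the shape
    results -> channels[0] -> alternatives[0] -> transcript.
    """
    return response["results"]["channels"][0]["alternatives"][0]["transcript"]
```

To send the request, pass the result to `urllib.request.urlopen`, parse the body with `json.load`, and hand the resulting dictionary to `extract_transcript`. The transcript is ordinary Unicode text; right-to-left rendering is handled by whatever displays it.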

Build Arabic Voice Applications with Nova-3

Arabic is a critical language for global voice applications, spanning diverse regions, dialects, and deployment requirements. With the addition of Arabic speech-to-text, Nova-3 continues to serve as a production-grade foundation for global voice AI, built to support how people actually speak.

Sign up free and unlock $200 in credits, enough to power over 750 hours of transcription or 200 hours of speech-to-text across Nova-3’s growing language suite. Explore details on our Models & Languages Overview page and experience Nova-3’s world-class adaptability for yourself.
