Convert Vietnamese speech to text with high accuracy, low latency, and enterprise-grade scalability. Deepgram delivers real-time and batch transcription through a developer-first speech-to-text API.
Trusted by the world's top enterprises and startups
Get real-time Vietnamese speech-to-text in under 300 ms while maintaining high accuracy in noisy, accented, or overlapping conversations.

Speakers: 86 million native speakers (90-97 million total including L2 speakers)
Regions: Vietnam (primary, 86 million speakers), United States (1.5-2.3 million), Cambodia (1 million), Australia (320,000-335,000), France (400,000), Canada (276,000)
Dialects: Northern (Hanoi-based, standard), Central (Hue-based), Southern (Saigon-based)
Writing system: Latin-based Quốc ngữ script with diacritics for tones and vowels
Language family: Austroasiatic, Vietic subgroup
Vietnamese is widely used across Vietnam, the United States, and growing diaspora communities worldwide, making it a key language for call center analytics, customer support AI, healthcare telehealth services, media and gaming localization, legal and immigration services, education platforms, and multilingual voice agents.

Deepgram includes everything required to produce accurate, readable, and secure Vietnamese transcripts out of the box.
Automatically detect and label who is speaking in multi-speaker Vietnamese conversations.
Apply automatic capitalization, paragraphing, and clean transcript structure for Vietnamese text.
Instantly find words or phrases inside long Vietnamese recordings without reprocessing audio.
Segment streaming Vietnamese audio into real-time sentence-level units for voice agents.
Add accurate punctuation and capitalization to Vietnamese transcripts for easy reading.
Automatically remove sensitive data like credit cards, phone numbers, and PII from Vietnamese transcripts.
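
As a rough illustration of how the features above might be switched on for a single request, the sketch below builds a query URL for batch transcription. The endpoint and the parameter names (`language`, `diarize`, `smart_format`, `punctuate`, `redact`, `search`) and values (`vi`, `pci`, `pii`) are assumptions taken from Deepgram's public API documentation, not from this page, so check them against the current reference. The helper `build_listen_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint; confirm against the current Deepgram API reference.
LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_listen_url(language="vi", search_terms=(), redact=(), **flags):
    """Return a transcription URL with the chosen features enabled (hypothetical helper)."""
    params = [("language", language)]
    # Boolean feature switches, e.g. diarize=True, smart_format=True, punctuate=True.
    for name, enabled in sorted(flags.items()):
        params.append((name, "true" if enabled else "false"))
    # Repeatable parameters: one query entry per value.
    params += [("redact", kind) for kind in redact]
    params += [("search", term) for term in search_terms]
    return f"{LISTEN_URL}?{urlencode(params)}"


url = build_listen_url(
    diarize=True,
    smart_format=True,
    punctuate=True,
    redact=("pci", "pii"),
    search_terms=("hóa đơn",),
)
print(url)
```

The audio itself would then be sent to this URL in an authenticated POST request. The sketch only assembles the query string, so it runs without an API key or network access.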

Keyterm prompting for Vietnamese
Boost recognition of brand names, product terms, and domain-specific vocabulary in Vietnamese audio to improve keyword recall and transcript accuracy.
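
A minimal sketch of how a list of key terms could be passed along with a request. It assumes each term is sent as its own `keyterm` query parameter and that a model named `nova-3` supports the feature for Vietnamese. Both are assumptions based on Deepgram's public documentation rather than this page. The helper `keyterm_query` and the example terms are hypothetical.

```python
from urllib.parse import urlencode


def keyterm_query(terms, language="vi", model="nova-3"):
    """Build a query string that passes each term as its own `keyterm` entry."""
    # Drop blanks and case-insensitive duplicates while keeping the caller's order.
    seen, cleaned = set(), []
    for term in terms:
        term = term.strip()
        if term and term.lower() not in seen:
            seen.add(term.lower())
            cleaned.append(term)
    # `model` and `keyterm` are assumed parameter names; verify before relying on them.
    params = [("model", model), ("language", language)]
    params += [("keyterm", term) for term in cleaned]
    return urlencode(params)


query = keyterm_query(["VinFast", "MoMo", " vinfast ", ""])
print(query)  # model=nova-3&language=vi&keyterm=VinFast&keyterm=MoMo
```

Cleaning the list before sending keeps the request short, which matters if the service limits how many terms one request can carry.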

Automatic language detection
Identify when audio is spoken in Vietnamese and transcribe it without pre-selecting a language. This is useful for mixed-language datasets and sources, and for batch transcription pipelines.
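
In a batch pipeline, the detected language would typically be read from the response and used to route each file. The sketch below assumes detection is requested with a `detect_language=true` query parameter and that the JSON response reports `detected_language` and `language_confidence` for each channel. These field names come from Deepgram's public documentation, not from this page, and the sample response is hand-written for illustration.

```python
def detected_language(response, channel=0):
    """Return (language_code, confidence) from a response dict, or (None, None)."""
    try:
        info = response["results"]["channels"][channel]
    except (KeyError, IndexError, TypeError):
        return None, None
    if not isinstance(info, dict):
        return None, None
    return info.get("detected_language"), info.get("language_confidence")


# Hand-written sample in the assumed response shape, not real API output.
sample = {
    "results": {
        "channels": [{"detected_language": "vi", "language_confidence": 0.98}]
    }
}

lang, confidence = detected_language(sample)
# Route Vietnamese audio to its own (hypothetical) queue.
route = "vietnamese-queue" if lang == "vi" else "other-queue"
print(lang, confidence, route)
```

Returning `(None, None)` for a malformed response lets the caller fall back to a default queue instead of failing partway through a batch.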
Start with Vietnamese speech-to-text, then expand to 45+ languages using the same API, models, and tooling.
Start transcribing Vietnamese audio with Deepgram's speech-to-text API. It is fast, accurate, and built for real-time applications.