Deepgram’s award-winning voice AI goes global with Dedicated and EU-hosted deployments 🌍

Article · Announcements · Jul 30, 2025 · 5 min read

Deepgram is expanding globally with two major infrastructure updates: Deepgram Dedicated, a fully managed single-tenant deployment, and an EU-hosted API for in-region inference. Learn more in this blog.
By Hasan Jilani, Director of Product Marketing
Voice AI is rapidly becoming foundational infrastructure across industries, powering real-time agents, compliance-sensitive workflows, and multilingual applications at scale. As global adoption accelerates, so does the demand for flexible deployment models, regional hosting, and production-grade reliability.

To meet that demand, Deepgram is announcing two major infrastructure expansions:

  • The general availability of Deepgram Dedicated, a fully managed, single-tenant runtime

  • The early access launch of our EU-hosted API endpoint, enabling in-region inference for European workloads

These launches reflect a broader shift in how voice AI is being deployed, and they come at a time of growing industry validation. This month, Deepgram Nova-3 was named a 2025 Voice AI Technology Excellence Award winner by TMC’s CUSTOMER magazine, recognizing our leadership in accuracy, real-time multilingual transcription, and self-serve customization.

Together, these milestones reinforce Deepgram’s commitment to providing voice AI infrastructure that supports enterprise-scale performance, compliance, and geographic flexibility.

What It Means to Go Global with Voice AI

Going global starts with supporting the world’s languages. Deepgram already supports over 36 languages for customers worldwide and will continue expanding language coverage throughout 2025.  

But language support is only the beginning.

For engineering teams building production-grade systems, global voice AI also requires solving for infrastructure and compliance demands as workloads expand across regions. As enterprises scale voice workloads globally, we continue to hear two common friction points: the growing complexity of managing infrastructure across regions and tightening data policies, particularly in the EU, that require stricter control over where and how voice data is processed.

These demands include:

  • Ultra-low-latency inference paths. Real-time applications require models to run as close to the end user as possible to minimize round-trip time and meet interaction thresholds.

  • Data residency and legal jurisdiction. Voice data often must be processed and stored within specific geographic boundaries to meet regulatory requirements such as GDPR.

  • Single-tenant isolation for sensitive workloads. Some environments require dedicated infrastructure to enforce data segregation, meet compliance standards, or satisfy internal security policies.

  • Scalable operations without added DevOps burden. Expanding voice workloads across regions should not require a proportional increase in infrastructure engineering.

Deepgram’s platform was designed with these requirements in mind, providing the foundation needed to operationalize voice AI reliably and securely across global environments.

Introducing Deepgram Dedicated: A Managed, Single-Tenant Runtime

Enterprises adopting voice AI at scale often face a difficult tradeoff: maintain control over infrastructure and data by self-hosting, or prioritize ease of use through shared, multi-tenant cloud APIs. Self-hosting offers isolation and regional control, but introduces significant ongoing operational complexity. Managed service providers can help bridge the gap, but they often lack product-level expertise and introduce dependency overhead that slows down feature adoption.

Now generally available, Deepgram Dedicated closes this gap. It is a fully managed, single-tenant deployment of Deepgram’s voice AI platform that offers the control and flexibility of self-hosted infrastructure without the burden of operating it. Over the past six months, it has been deployed with a select group of enterprise customers in early production across a range of use cases, from real-time contact center platforms to globally distributed voice agents.

Teams gain regional isolation, performance control, and compliance alignment while offloading infrastructure management to Deepgram. Deepgram Dedicated currently runs on AWS, with support for additional cloud providers on the roadmap.

Key Highlights:

  • Single-tenant architecture: Each deployment runs on isolated compute, avoiding noisy neighbor effects and supporting strict data segregation.

  • Unified voice AI stack: Run speech-to-text, text-to-speech, and speech-to-speech workloads in a single runtime with consistent API behavior.

  • Multi-cluster design: Separates real-time, pre-recorded, and agent workloads onto specialized clusters to maximize performance, ensure high availability, and enable strict workload isolation.

  • Region-specific infrastructure: Deploy in your preferred cloud region to meet compliance requirements, enable ultra-low latency, and align with internal policies, including support for country-level deployments.

  • SLA-backed performance: Optional SLAs ensure predictable uptime and latency with defined targets monitored and enforced by Deepgram.

In one modeled scenario, a customer supporting 1,000 concurrent real-time streams would spend approximately $467K USD annually if self-hosting. This includes $250K in DevOps headcount and $98K in infrastructure costs.

Running the same workload on Deepgram Dedicated lowers total OPEX by approximately $98K USD per year. It also reduces engineering overhead and improves deployment reliability through platform-managed SLAs and regional isolation, giving teams more time to focus on higher-impact work.
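A back-of-the-envelope version of the modeled scenario above makes the comparison concrete. All dollar figures are the estimates stated in the text; the "other" line is simply the remainder implied by those totals, not a published breakdown.

```python
# Modeled scenario: 1,000 concurrent real-time streams, figures in USD/year.
SELF_HOSTED_TOTAL = 467_000   # approximate total annual self-hosting cost
DEVOPS_HEADCOUNT = 250_000    # DevOps headcount component
INFRASTRUCTURE = 98_000       # infrastructure component

# Remainder implied by the stated totals (e.g. tooling, support) -- an
# inference from the numbers above, not a figure from the article.
OTHER = SELF_HOSTED_TOTAL - DEVOPS_HEADCOUNT - INFRASTRUCTURE

DEDICATED_SAVINGS = 98_000    # stated annual OPEX reduction with Dedicated
DEDICATED_TOTAL = SELF_HOSTED_TOTAL - DEDICATED_SAVINGS

print(f"Implied other costs:   ${OTHER:,}")
print(f"Dedicated OPEX (est.): ${DEDICATED_TOTAL:,}")
```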

EU-Hosted API Endpoint: In-Region Inference for European Voice Workloads

Voice AI adoption is accelerating across Europe, driven by demand for real-time applications in finance, public services, retail, and telecommunications. To date, more than two dozen customers and prospects have expressed interest in EU-based infrastructure, highlighting growing demand for in-region processing that meets local performance expectations and regulatory requirements without compromising model quality or flexibility.

To support this, Deepgram is launching early access to api.eu.deepgram.com, a new EU-hosted speech-to-text API endpoint that delivers in-region inference with full feature parity and consistent performance. The EU endpoint is hosted in AWS EU regions, with additional hosting options under consideration.

Key Highlights:

  • Voice data stays within the EU: All processing occurs inside EU-based AWS regions, ensuring no cross-border data transfer.

  • Latency improvements for EU-based users: Localized inference reduces round-trip time for applications serving users in or near the EU.

  • No code changes required: Existing integrations can migrate by updating the base URL, with no other changes needed.

  • Supports GDPR compliance and auditability: The deployment is fully isolated within the EU legal boundary and aligned with regional data protection standards.
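In practice, the base-URL migration mentioned above can be as small as swapping a host string. The sketch below assumes Deepgram's hosted `/v1/listen` speech-to-text path; confirm exact paths and parameters against the official API docs before migrating.

```python
# Sketch: pointing an existing Deepgram speech-to-text integration at the
# EU-hosted endpoint. Only the base URL changes; the path, query parameters,
# and auth header stay the same.

DEFAULT_HOST = "https://api.deepgram.com"
EU_HOST = "https://api.eu.deepgram.com"  # EU-hosted endpoint (early access)

def listen_url(host: str = DEFAULT_HOST, model: str = "nova-3") -> str:
    """Build the pre-recorded transcription URL for a given regional host."""
    return f"{host}/v1/listen?model={model}"

# Migrating is a one-line change: swap the host, keep everything else.
global_url = listen_url()          # routes to the default hosted API
eu_url = listen_url(host=EU_HOST)  # routes to EU-based AWS regions
print(eu_url)
```

The request itself (auth header, audio payload) is untouched, which is what "no code changes required" amounts to: the integration logic stays identical and only the hostname moves in-region.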

This endpoint is well-suited for European ISVs, compliance-focused enterprises, and global teams looking to reduce latency and streamline deployment in the EU.

Why This Matters: A Global-Ready Voice AI Platform

With these additions, Deepgram now supports a range of deployment options, including multi-tenant hosted APIs, fully managed single-tenant deployments, and customer-operated self-hosted infrastructure. This flexibility allows engineering teams to choose the right model based on their application requirements, compliance obligations, and operational preferences. For some, the hosted API provides a fast path to integration. Others may require the regional data residency of the EU endpoint or the isolation and control of a Dedicated deployment. Teams with existing DevOps capacity may opt for self-hosting to align with internal security policies or infrastructure standards.

What differentiates Deepgram is the ability to deliver true flexibility across deployment models. Teams can build and scale voice AI systems using consistent APIs and model performance, while choosing the infrastructure that fits their environment. Looking ahead, the roadmap includes customer VPC deployments, BYOC support, and expanded region availability across Asia-Pacific, EMEA, and LATAM.

Start Building for Your Environment

If you're building voice applications that require global reach, regulatory alignment, or low-latency performance, now is the time to explore your deployment options. Demand is high, and we're expanding access selectively.

Deepgram now runs where your business runs. No trade-offs. No overhead. Just voice AI on your terms.
