GetVocal scales governed enterprise voice automation with Deepgram

GetVocal AI is a multi-channel conversational-agent platform that helps enterprises automate complex customer conversations without sacrificing control, auditability, or compliance. By pairing deterministic business logic with best-in-class speech infrastructure (powered by Deepgram Speech-to-Text), GetVocal delivers production voice agents that handle real telephony audio with the entity accuracy, concurrency, and governance enterprises require, serving 23 markets and 100+ enterprise teams.

Visit:

GetVocal AI

Business Needs

GetVocal needs production-grade real-time speech-to-text optimized for 8 kHz telephony audio, with strong accuracy on structured entities like names, numbers, dates, and IDs, and the low, predictable latency required for real-time enterprise voice agents.

Solution

Speech-to-text

Key Results

Helped support platform-level outcomes of 70% deflection within three months, 45% more self-service resolutions, and 31% fewer live escalations
Enabled production-quality English voice agents with strong performance on real telephony audio
Supports a business serving 23 markets and 100+ enterprise teams
Gives GetVocal a path to hosted and self-hosted deployments aligned to enterprise residency requirements

The Challenge: Making voice automation trustworthy enough for the enterprise

GetVocal AI is focused on a problem most enterprises still have not solved: how to automate complex customer conversations without sacrificing control, auditability, or compliance.

According to GetVocal, most conversational AI platforms force a false choice. Companies can either use rigid scripted systems that struggle with natural language, or rely on black-box LLM systems that may drift, hallucinate, and become difficult to audit. That is especially problematic in regulated industries and complex operational environments where every interaction must remain traceable and governed.

Voice raises the bar even further. In live phone conversations, every layer of the stack has to cope with real-world audio conditions, including 8 kHz telephony, background noise, accents, and multilingual interactions. If speech recognition misses a name, phone number, booking reference, date, or amount, the entire downstream workflow can break.

For GetVocal, that made speech recognition a foundational requirement rather than a plug-in feature.

“Voice is the oldest interface we have. Enterprises are now rediscovering it as the richest modality for customer trust, but only if every layer of the stack is best-in-class. Deepgram is a core part of how we deliver that for our English-language deployments, and we’re partnering closely on expanding that into the languages our European customers operate in.” - Roy Moussa, Co-Founder & CEO, GetVocal AI

The Solution: A speech layer built for real-time enterprise voice workflows

GetVocal integrated Deepgram’s real-time streaming speech-to-text into its voice pipeline to support production voice agents across customer deployments.

Today, GetVocal primarily uses Deepgram for English-language voice interactions on telephony audio, where it has seen the strongest production performance. The team also uses capabilities such as smart formatting for numbers and dates, vocabulary customization through keyword prompting, and telephony-tuned configuration to improve structured entity capture and conversational flow.
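As a rough illustration of the kind of configuration described above, the sketch below builds a streaming request URL using only Python's standard library. The endpoint and the parameter names (`model`, `encoding`, `sample_rate`, `smart_format`, `keywords`, `interim_results`) follow Deepgram's public streaming API documentation as we understand it. The model name, keyword terms, and boost values are assumptions chosen for illustration, not GetVocal's actual settings, and should be checked against current Deepgram documentation before use.

```python
from urllib.parse import urlencode

# Assumed endpoint for Deepgram live streaming transcription.
DEEPGRAM_LISTEN_URL = "wss://api.deepgram.com/v1/listen"


def build_listen_url(keywords: dict[str, int]) -> str:
    """Build a streaming URL for 8 kHz mu-law telephony audio.

    `keywords` maps a domain term to a boost value (hypothetical values).
    """
    params = [
        ("model", "nova-2-phonecall"),  # assumption: a telephony-tuned model
        ("language", "en"),
        ("encoding", "mulaw"),          # common PSTN codec
        ("sample_rate", "8000"),        # 8 kHz telephony audio
        ("smart_format", "true"),       # format numbers, dates, and similar
        ("interim_results", "true"),    # partial results for real-time use
    ]
    # One `keywords` parameter per term, written as term:boost.
    params += [("keywords", f"{term}:{boost}") for term, boost in keywords.items()]
    return f"{DEEPGRAM_LISTEN_URL}?{urlencode(params)}"


# Hypothetical vocabulary terms for illustration.
url = build_listen_url({"GetVocal": 2, "Glovo": 1})
print(url)
```

The URL would then be opened as a WebSocket with an API key in the `Authorization` header. That step is omitted here so the sketch runs without network access or credentials.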

The company evaluated a wide range of speech providers, including hyperscalers, Whisper-based offerings, and specialized voice vendors. For GetVocal, the selection criteria were straightforward:

high accuracy on structured entities that drive downstream workflows
low and predictable latency in real-time conversations
clean streaming APIs and strong developer experience
scalable concurrency for enterprise call volumes
pricing and deployment options that support enterprise growth

The turning point came in side-by-side testing on real customer telephony recordings.

“In voice AI, the entire customer experience hinges on the first 300 milliseconds. If STT mis-hears a booking number or a customer name, no amount of downstream LLM reasoning can save the interaction. That’s why we’re rigorous about who we put in the front of our pipeline, and why Deepgram earned that spot for English.” - Antonin Bertin, Co-Founder & CTO, GetVocal AI

Technical Implementation

GetVocal uses Deepgram as the speech layer at the front of its voice orchestration stack.

Call intake: A PSTN call comes into the GetVocal platform and audio is streamed to Deepgram Speech-to-Text in real time.
Transcription and entity capture: Deepgram transcribes the call with configuration optimized for telephony performance, including formatting and vocabulary handling for critical entities.
Contextual reasoning: The transcript flows into GetVocal’s ContextGraphOS, where deterministic business logic, policies, workflows, and customer knowledge govern how the AI can respond.
Response generation: An LLM handles natural-language reasoning only within those guardrails, and the response is returned through text-to-speech.
Human oversight: If confidence drops or an edge case appears, GetVocal routes the interaction to its Control Center for human validation or intervention, then hands the conversation back to the AI.
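The routing decision in the steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not GetVocal's implementation: the `Transcript` type, the confidence threshold, the intent names, and the toy keyword classifier are all hypothetical stand-ins, since the source does not describe how ContextGraphOS or the Control Center make these decisions internally.

```python
from dataclasses import dataclass

# Hypothetical threshold; the source does not state how confidence is measured.
CONFIDENCE_THRESHOLD = 0.85


@dataclass
class Transcript:
    text: str
    confidence: float  # assumed to lie in [0, 1]


def allowed_intents() -> set[str]:
    """Stand-in for deterministic policy: the intents the agent may handle."""
    return {"check_booking", "change_date"}


def classify_intent(text: str) -> str:
    """Toy keyword classifier standing in for governed reasoning."""
    lowered = text.lower()
    if "change" in lowered and "date" in lowered:
        return "change_date"
    if "booking" in lowered:
        return "check_booking"
    return "unknown"


def route(transcript: Transcript) -> str:
    """Return 'human' for low-confidence or out-of-policy turns, else 'ai'."""
    if transcript.confidence < CONFIDENCE_THRESHOLD:
        return "human"
    if classify_intent(transcript.text) not in allowed_intents():
        return "human"
    return "ai"


print(route(Transcript("I want to check my booking", 0.95)))  # ai
print(route(Transcript("I want to check my booking", 0.40)))  # human
print(route(Transcript("Tell me a joke", 0.97)))              # human
```

The point of the sketch is the ordering: deterministic checks run first, and the generative path is reached only when both the confidence check and the policy check pass.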

This architecture allows GetVocal to be deterministic where required and generative only where permitted, a design choice that maps closely to the expectations of enterprise and regulated customers.

“Most AI pilots fail because enterprises lack the governance to make AI trustworthy. We pair deterministic business logic with best-in-class AI infrastructure, and speech is the most visible part of that contract with the customer. Deepgram helps us keep that contract.” - Roy Moussa, Co-Founder & CEO, GetVocal AI

Outcomes & Impact: Better voice performance, stronger enterprise readiness

On deployments where Deepgram powers the English voice pipeline, GetVocal says it has seen the combination of conversational flow, reliable entity capture, and concurrency stability required for enterprise-grade voice automation.

That performance contributes to broader platform-level outcomes GetVocal reports across deployments:

Higher automation outcomes: GetVocal reports 70% deflection within three months, 45% more self-service resolutions, and 31% fewer live escalations compared with existing enterprise solutions.
Scaled operational growth: GetVocal now serves 23 markets and 100+ enterprise teams. In one example, Glovo scaled from 1 to 80 AI agents across 5 team functions and 22 countries in under 12 weeks.
Production-grade voice experiences: In English-language telephony use cases, Deepgram helps GetVocal deliver smooth turn-taking, strong recognition of structured entities, and the responsiveness enterprises expect in brand-sensitive interactions.

Customer Examples

Hospitality: Altis Hotels

In hospitality workflows, GetVocal voice agents support booking and guest-service interactions where accuracy on dates, room types, loyalty status, and customer details matters. GetVocal says its deployment at Altis Hotels handles 70% of routine guest questions automatically, delivers guest replies 94% faster, and returns 35–55 hours per week to front-desk teams.

Events: Terrapinn

GetVocal also used voice automation to help Terrapinn re-engage past event attendees through an outbound campaign. The company reports 1,000 calls, 63% engagement, 70%+ real conversations, 27% conversion from engaged calls, and 122 confirmed registrations, significantly above the original conversion target.

Looking Ahead

GetVocal sees the next wave of growth in four areas:

expanding production coverage into French, Spanish, and Portuguese
deepening the connection between voice, LLMs, and governed tooling
evolving its human-AI Control Center model for real-time collaboration
growing into larger deployments across contact centers and regulated industries

These plans align with Deepgram’s broader speech-to-text roadmap, including ongoing work around conversational models like Flux and expanded language support.

For GetVocal, the long-term opportunity is clear: one trusted voice stack that can serve enterprise customers across multiple languages, regions, and compliance environments.

Advice from GetVocal AI

For teams building enterprise voice agents, GetVocal’s perspective is practical:

Start with the entity layer: If the system cannot reliably capture names, numbers, dates, and IDs, the rest of the workflow will not hold.
Benchmark on real audio: Test on real telephony calls with background noise, accents, and concurrency. Demo audio won't surface the production issues that matter.
Keep governance in the architecture: Be deterministic where you need auditability and generative only where flexibility adds value.
Choose a true infrastructure partner: The best speech providers go beyond shipping APIs to engage on roadmap, language quality, and deployment requirements.
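One way to act on the first two points is to score transcripts on entity capture rather than overall word accuracy. The sketch below is a minimal example under stated assumptions: the function names, the normalization rule, and the sample data are all made up for illustration, and the source does not describe GetVocal's actual benchmarking method.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and keep only letters and digits so formatting differences are ignored."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def entity_recall(samples: list[tuple[list[str], str]]) -> float:
    """Fraction of reference entities found in the matching transcript.

    Each sample is (reference_entities, hypothesis_transcript).
    """
    found = total = 0
    for entities, hypothesis in samples:
        hyp = normalize(hypothesis)
        for entity in entities:
            total += 1
            if normalize(entity) in hyp:
                found += 1
    return found / total if total else 0.0


# Made-up examples for illustration only.
samples = [
    (["AB1234", "Maria Silva"], "my booking reference is AB-1234 and my name is maria silva"),
    (["555-0142"], "you can reach me on 555 0124"),  # two digits transposed
]
print(entity_recall(samples))
```

Here two of the three reference entities are recovered, so the score is 2/3. Substring matching on normalized text is deliberately crude and can produce false matches on short entities. A production benchmark would more likely align entities by position or type, and would run on real telephony recordings as the advice above suggests.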

