Telnyx is a global communications and connectivity platform that operates its own carrier-grade network. To make voice AI feel truly real-time for developers and enterprises, Telnyx embedded Deepgram Flux directly into its media plane, treating speech as critical infrastructure rather than an external add-on. By running Flux on Telnyx-managed GPUs at its Points of Presence (PoPs) worldwide, Telnyx delivers ultra-low-latency voice experiences with natural turn-taking that keeps pace with live human conversation.
“We chose Deepgram Flux for its high accuracy and consistent real-time performance, running directly on Telnyx-managed GPUs at the edge so our voice experiences feel instant; latency is constrained by physics, not external APIs.” — James W., Head of Product, Telnyx
Telnyx is a next-generation communications platform that powers secure, scalable voice, messaging, and AI-driven connectivity over a private global IP network.
Telnyx's embedded deployment of Deepgram Flux has enabled sub-second end-to-end latency, natural turn-taking on real-world telephony audio, and regional scale, with call audio that never leaves the Telnyx network.
Most voice platforms still follow a similar pattern: terminate PSTN or SIP calls at a regional PoP, send audio over the public internet to third-party AI APIs, and wait on remote transcription before resuming the call. On paper, many of these APIs support streaming STT, but in practice Telnyx saw unpredictable latency, unstable partial transcripts, and brittle end-of-turn behavior, especially under load.
For Telnyx's customers, latency is the primary constraint. Once round-trip delay climbs beyond a few hundred milliseconds, conversations feel mechanical and "queued," regardless of word-level accuracy. Users notice jitter, delayed barge-in, and inconsistent behavior far more than small transcription errors.
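The "few hundred milliseconds" threshold can be made concrete with a rough per-turn budget. The sketch below uses illustrative stage latencies (assumptions for this example, not Telnyx measurements) to show why off-net transit to a remote STT API eats into the conversational budget before any AI work happens.

```python
# Rough end-to-end latency budget for one caller turn. All numbers are
# illustrative assumptions, not measurements from Telnyx's network.

def total_latency_ms(stages: dict) -> float:
    """Sum per-stage latencies (ms) for a single caller turn."""
    return sum(stages.values())

# Shipping media over the public internet to a remote STT API:
off_net = {
    "pstn_ingress": 20.0,
    "internet_transit_to_stt": 80.0,   # variable; often worse under load
    "stt_processing": 150.0,
    "llm_first_token": 250.0,
    "tts_first_audio": 150.0,
    "internet_transit_back": 80.0,
}

# STT colocated with the media plane (same PoP, private fabric):
on_net = dict(off_net,
              internet_transit_to_stt=5.0,
              internet_transit_back=5.0)

print(total_latency_ms(off_net))  # 730.0
print(total_latency_ms(on_net))   # 580.0
```

The transit savings are only part of the story: colocating also removes the jitter and tail-latency variance of public-internet hops, which is what callers actually perceive as "queued" conversation.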
On top of that, Telnyx needed to support noisy telephony audio, short utterances, numbers, and interruptions at carrier scale, without forcing audio off-net to external providers. The traditional "ship media to remote APIs" model was fundamentally misaligned with Telnyx's role as a carrier.
To solve this, Telnyx reframed speech as a physics problem. Rather than adding more buffering or heuristics around legacy APIs, the company chose to bring the AI as close to the media as possible and to treat speech-to-text as a first-class, real-time network service.
Telnyx selected Deepgram Flux as its primary real-time speech-to-text engine and deployed it on Telnyx-owned GPUs that are physically colocated with its telephony PoPs in key regions around the world. Instead of routing audio across the public internet, Telnyx runs Flux inside its own network perimeter, on the same low-latency fabric that carries call media.
When an inbound PSTN or SIP call lands on a Telnyx PoP, audio never leaves the Telnyx network. Media is streamed directly to Deepgram Flux running in the same region, where it is transcribed in real time. Flux provides both stable partial transcripts and final results, which Telnyx feeds into its call control and agent orchestration layer. That orchestration drives LLM reasoning and text-to-speech (TTS) responses, which are then streamed back to the caller over the same media path.
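The orchestration side of this flow, consuming stable partials and finals and handing completed turns to the LLM, can be sketched as a small accumulator. The event names and fields below ("partial", "final", "end_of_turn") are assumptions for illustration, not the actual Flux wire format.

```python
# Minimal sketch of an orchestration-side consumer for streaming STT events.
# Event types and fields are hypothetical, chosen only to illustrate the flow.

class TurnAssembler:
    """Accumulates streaming transcript events into completed caller turns."""

    def __init__(self):
        self.partial = ""     # latest stable partial for the in-progress turn
        self.finals = []      # finalized segments within the current turn
        self.completed = []   # fully completed turns, ready for LLM reasoning

    def on_event(self, event: dict):
        kind = event["type"]
        if kind == "partial":
            self.partial = event["text"]       # partials are revisions: overwrite
        elif kind == "final":
            self.finals.append(event["text"])  # this segment will not change again
            self.partial = ""
        elif kind == "end_of_turn":
            turn = " ".join(self.finals).strip()  # speaker is done: hand off
            if turn:
                self.completed.append(turn)
            self.finals.clear()

assembler = TurnAssembler()
for e in [{"type": "partial", "text": "I'd like to"},
          {"type": "partial", "text": "I'd like to check my order"},
          {"type": "final", "text": "I'd like to check my order."},
          {"type": "end_of_turn"}]:
    assembler.on_event(e)
print(assembler.completed)  # ["I'd like to check my order."]
```

Keeping partials and finals separate like this is what lets the agent start speculative LLM work on partials while committing only on end-of-turn.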
This architecture keeps telephony, STT, LLM, and TTS tightly synchronized. Because Deepgram Flux is running at the edge, inside the media plane, Telnyx can deliver sub-second end-to-end latency under load while preserving the reliability and observability of its carrier network.
From a product and engineering perspective, Deepgram Flux is now treated like any other core Telnyx service:
On-net, real-time streaming — Inbound PSTN and SIP traffic terminates at a Telnyx PoP, and audio is streamed to Flux on Telnyx-managed GPUs in the same region. Flux is configured for telephony-grade audio, including 16 kHz input and common telephony codecs.
Turn-taking and orchestration — Flux's partial and final transcripts feed Telnyx's call control and agent orchestration layer. The system relies on accurate end-of-turn detection to know when a speaker is truly finished, enabling natural barge-in and rapid back-and-forth exchanges even with short utterances, overlaps, and background noise.
LLM and TTS in the same loop — Once a turn is complete, the orchestration layer calls into LLM-based reasoning and then into TTS, streaming synthesized responses back over the same media path. Because everything runs inside the Telnyx network, timing across STT, LLM, and TTS is tightly controlled.
Regional isolation and scale — Telnyx operates this pattern in multiple regions, aligning speech workloads with its global PoPs across North America, Europe, the Middle East, and Asia-Pacific. Each region can be scaled and monitored independently, matching the company's existing approach to core network services.
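On the audio side, telephony media commonly arrives as 8 kHz narrowband, while the STT input above is 16 kHz. A production media plane would use a proper polyphase resampler; the sketch below shows the idea with simple linear-interpolation upsampling on PCM samples, purely as an illustration.

```python
# Illustrative 8 kHz -> 16 kHz upsampling by linear interpolation.
# Real deployments would use a proper resampling filter; this sketch only
# demonstrates the rate conversion step on raw PCM sample values.

def upsample_8k_to_16k(samples: list) -> list:
    """Double the sample rate by inserting linear midpoints between samples."""
    if not samples:
        return []
    out = []
    for a, b in zip(samples, samples[1:]):
        out.append(a)
        out.append((a + b) // 2)   # midpoint between neighboring samples
    out.append(samples[-1])
    out.append(samples[-1])        # repeat final sample to keep exactly 2x length
    return out

frame_8k = [0, 100, -100, 50]            # 4 PCM samples at 8 kHz
frame_16k = upsample_8k_to_16k(frame_8k)
print(len(frame_16k))   # 8
print(frame_16k)        # [0, 50, 100, 0, -100, -25, 50, 50]
```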
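The barge-in behavior described in the turn-taking bullet above can be sketched as a small state machine: if the caller starts speaking while the agent's TTS is still playing, cancel playback and treat the speech as a new caller turn. The state names and cancel hook below are hypothetical, not Telnyx's actual call control API.

```python
# Hypothetical barge-in controller. The cancel_tts callback stands in for
# whatever hook the media layer exposes to stop in-flight TTS playback.

class BargeInController:
    def __init__(self, cancel_tts):
        self.cancel_tts = cancel_tts   # callback into the media layer
        self.agent_speaking = False

    def on_tts_start(self):
        self.agent_speaking = True

    def on_tts_done(self):
        self.agent_speaking = False

    def on_caller_speech(self) -> bool:
        """Called on the first STT partial of a caller utterance.
        Returns True if we barged in (in-flight TTS was cancelled)."""
        if self.agent_speaking:
            self.cancel_tts()
            self.agent_speaking = False
            return True
        return False

cancelled = []
ctrl = BargeInController(cancel_tts=lambda: cancelled.append(True))
ctrl.on_tts_start()
print(ctrl.on_caller_speech())  # True -> playback cancelled
print(cancelled)                # [True]
```

Because STT, TTS, and call control all run inside the same network, this cancel path stays fast enough that barge-in feels instantaneous rather than talking over a lagging agent.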
This approach allowed Telnyx to meet strict requirements around sub-second latency, high concurrency without rate shaping, real-world accuracy on telephony audio, and deployment control on Telnyx infrastructure via Deepgram's self-hosted solution.
Since embedding Deepgram Flux in its media plane, Telnyx has seen a step-change in the quality and reliability of voice AI experiences.
For callers and end-users, conversations feel immediate: responses begin in under a second, barge-in works naturally, and turn-taking keeps pace even with short utterances, overlaps, and background noise.
For internal teams, speech now behaves like any other core network service: each region scales and is monitored independently, and capacity grows without rate shaping or off-net dependencies.
Because Deepgram Flux runs on Telnyx-owned GPUs inside Telnyx PoPs, audio never leaves the Telnyx network perimeter for transcription.
Looking ahead, Telnyx is continuing to expand this architecture.
For Telnyx, the takeaway is clear: real-time voice AI only works when it is architected like the rest of the network. By colocating Deepgram Flux with its media plane, Telnyx has turned speech from a distant API call into core infrastructure that powers the next generation of global, low-latency voice experiences.
Learn more about Deepgram's voice AI platform and Telnyx's voice AI agents.