Deepgram's Voice Agent API
A single, unified conversational AI API for building real-time, enterprise-ready, and cost-effective voice AI agents. It combines the simplicity developers want with the orchestration control enterprises need. No stitching together speech-to-text (STT), text-to-speech (TTS), and LLM orchestration. No black-box limitations. Priced at $4.50/hr.

Voice Agent API features
Powered by the industry’s fastest, most performant speech recognition and voice synthesis models, our voice agent stack is built for low latency at scale.
Unified Voice Agent API
One API that combines speech-to-text, LLM orchestration, and text-to-speech in real time. Simplifies development by eliminating the need to stitch together multiple services.
Conversational control
Built-in barge-in detection, turn-taking prediction, function calling, and mid-session control keep conversations flowing naturally, without awkward pauses or the agent talking over the caller.
Full model ownership
Deepgram controls the full voice stack across STT, TTS, and runtime orchestration for optimized latency, model tuning, and tightly synchronized speech-to-speech flow.
Deployment flexibility
Deploy as a fully managed service, on dedicated single-tenant infrastructure, in your VPC, or self-hosted. Supports HIPAA, GDPR, regional data residency, and isolated runtimes for enterprise compliance.
BYO LLM & TTS
Easily integrate your own LLM or TTS provider while retaining Deepgram’s orchestration, streaming pipeline, and real-time responsiveness.
Scalable cost optimization
Flat-rate pricing at $4.50/hr with Deepgram’s full stack, plus built-in rate reductions when you bring your own model (BYOM). Optimized compute efficiency lowers total cost of ownership (TCO) for large-scale deployments.
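To make the flat rate concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes that cost scales linearly with connected time, with no minimums, tiers, or per-call rounding. That billing model is an assumption for illustration and is not stated above. The function name estimate_cost is hypothetical, and BYOM reductions are left out because no reduced rate is quoted.

```python
from decimal import Decimal

# Flat full-stack rate quoted above, in USD per hour.
HOURLY_RATE_USD = Decimal("4.50")


def estimate_cost(connected_minutes):
    """Estimate cost in USD for a number of connected minutes.

    Assumes simple linear billing at the flat hourly rate
    (an illustrative assumption, not a documented billing rule).
    """
    minutes = Decimal(str(connected_minutes))
    cost = minutes * HOURLY_RATE_USD / Decimal(60)
    # Round to whole cents for display.
    return cost.quantize(Decimal("0.01"))


if __name__ == "__main__":
    # Per-minute equivalent of the hourly rate: 4.50 / 60 = 0.075
    print("Per minute:", HOURLY_RATE_USD / Decimal(60))
    # A deployment handling 10,000 agent minutes in a month.
    print("10,000 minutes:", estimate_cost(10_000))
```

Under these assumptions, $4.50/hr works out to $0.075 per minute, so 10,000 agent minutes would come to $750.00.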