Voice AI builders have a new default development environment, and it isn't a browser tab. It's an AI coding tool. Claude Code, Cursor, Windsurf, Codex, Aider. The agent reads your repo, writes the integration, runs the tests, and ships the PR. The bottleneck is no longer typing speed. It's how well your AI coding tool understands the APIs you're trying to use.
Most voice AI builders hit the same wall. The agent gets to the speech layer and stalls. It guesses at endpoint shapes. It hallucinates parameters. It writes against a curl example from two model versions ago. You end up pasting docs into the prompt, copy-pasting from the dashboard, or writing scaffolding by hand and letting the agent fill in the rest. Every voice agent integration burns more developer time and more agent tokens than it should, at the layer of the stack that ought to be the easiest.
We fixed that.
Three Tools Shipped Together
In April we shipped three pieces of agentic engineering tooling that work together as one platform layer for voice AI builders.
The dg CLI. A terminal interface for Deepgram with 25+ commands. Transcribe a file, a URL, a microphone, or a piped audio stream. Generate speech with Aura. Run text intelligence on a transcript. Manage projects, keys, members, and usage. Auto-detects Claude Code, Aider, and Codex and switches to JSON output and stderr-routed status without flags. UNIX-friendly by design, with structured stdout, proper exit codes, and pipe support. MIT license, Python 3.10+. Install at cli.deepgram.com.
The MCP server. A built-in Model Context Protocol proxy that connects your AI coding tool to Deepgram's API. Start it with dg mcp. One tool surface, with the agent able to transcribe audio, generate speech, list models, manage projects, and more. Auth handled locally via dg login. Plugs into Claude Code, Cursor, Windsurf, or any MCP-aware tool.
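For MCP-aware tools that read a JSON config, wiring in the server typically looks like the sketch below. The only Deepgram-specific part is the dg mcp command from above; the file location, and whether your tool uses this exact mcpServers key, vary by editor, so treat the surrounding structure as the common MCP client convention rather than official Deepgram documentation.

```json
{
  "mcpServers": {
    "deepgram": {
      "command": "dg",
      "args": ["mcp"]
    }
  }
}
```

Because auth is handled locally via dg login, no API key needs to appear in this config.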
The deepgram/skills repo. Agent skills are markdown instruction folders that your AI coding tool loads on demand. Six product-level skills cover API reference (api), docs navigation (docs), runnable starter apps (starters), feature-specific recipes (recipes), third-party integrations (examples), and MCP setup (setup-mcp). Per-language SDK skills layer on top for Python, JavaScript/TypeScript, Java, Go, Rust, Swift, Kotlin, .NET, and browser TS. The CLI handles the core install for you, and a single command brings in the rest (more below).
What You Can Build With Them
A voice agent prototype before lunch. Pull a starter app with a skill. Wire it to your LLM. Pipe a test audio file through the CLI to confirm transcription. Generate speech for the agent's response. Iterate without leaving the terminal or your AI coding tool's context window.
An integration that doesn't drift. Skills update with the product. Your AI coding tool reads the current API surface, not a stale model-trained guess. The recipes it pulls are real recipes, not hallucinations.
A multi-language stack on one platform. The product-contract skills tell the agent what Deepgram does. The SDK skills tell it how to call Deepgram in your language. Same platform, same primitives, different language idioms.
A faster eval-to-prototype loop. Transcribe a sample, generate speech, inspect a project, all from the same shell. The CLI is also useful when you want to test a hypothesis quickly without writing app code.
How It Works
Install the CLI:
curl -fsSL deepgram.com/install.sh | sh
Log in. The CLI detects which AI coding tools you have installed (Claude Code, Codex, Gemini CLI, Cursor, Cline) and offers to install the four core Deepgram skills (api, docs, starters, setup-mcp) into each.
dg login

For the full skill set, including recipes and integration examples, use the universal installer:
npx skills add deepgram/skills
Or for Claude Code natively, register the plugin marketplace:
/plugin marketplace add deepgram/skills
/plugin install deepgram@deepgram-agent-skills

If you skip the dg login prompt or add a new AI coding tool later, run dg skills install to set up the core skills on demand.
Transcribe a file:
dg listen call.mp3 | jq '.results.channels[0].alternatives[0].transcript'
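The jq filter above assumes Deepgram's pre-recorded transcription response shape. If you are consuming the CLI's JSON output from application code instead of a pipe, the same extraction is a few lines of Python. A minimal sketch: the nesting mirrors the jq path above (.results.channels[0].alternatives[0].transcript); the sample payload and the confidence field are illustrative, not the full schema.

```python
import json

# Stand-in for the JSON that `dg listen` writes to stdout.
# Only the fields used by the jq path above are guaranteed here.
sample_response = json.dumps({
    "results": {
        "channels": [
            {
                "alternatives": [
                    {"transcript": "hello from deepgram", "confidence": 0.98}
                ]
            }
        ]
    }
})

def first_transcript(raw: str) -> str:
    """Return channel 0's top-alternative transcript, as the jq filter does."""
    payload = json.loads(raw)
    return payload["results"]["channels"][0]["alternatives"][0]["transcript"]

print(first_transcript(sample_response))  # hello from deepgram
```

In a real pipeline you would read the CLI's stdout (for example via subprocess) instead of the hard-coded sample.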
Generate speech to your speaker:
dg speak "Hello from Deepgram" | ffplay -nodisp -autoexit -
Start the MCP server for your AI coding tool:
dg mcp

Your AI coding tool now has structured Deepgram knowledge loaded as skills, can pull the right starter for a given use case, and can call the API directly via MCP when you want it to.
Get Started
The story for voice AI builders is no longer "Deepgram is an API you integrate." It's "Deepgram is a platform your AI coding tools already understand." That's what changes when the bottleneck moves from typing to context. We're going to keep building toward this. Skills are easy to extend, MCP capabilities are growing, and the CLI is going to get better the more we hear from builders. Tell us what you want next.