By Dan Mishler
Today we're releasing deepclaw, an open-source integration that lets you call your OpenClaw AI assistant over the phone using Deepgram's Voice Agent API.
Why we built this
As OpenClaw took the world by storm, we knew it could benefit from the most natural interface: voice. Our Voice Agent API combines Flux speech-to-text, Aura-2 text-to-speech, and intelligent turn-taking into a single, streamlined solution.
What makes Deepgram different
| Capability | ElevenLabs | Deepgram |
|---|---|---|
| Turn detection | VAD-based | Semantic and acoustic (Flux) |
| TTS latency | ~200ms | 113ms TTFA (per Coval) |
| TTS pricing | $0.050/1K chars | $0.030/1K chars |
| Self-hostable | No | Yes |
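To put the pricing row in concrete terms, here is a small arithmetic sketch. The per-1K-character prices come from the table; the one-million-character workload is a hypothetical figure chosen only for illustration.

```python
# Per-1K-character TTS prices (USD), taken from the comparison table.
PRICE_PER_1K_CHARS = {"ElevenLabs": 0.050, "Deepgram": 0.030}


def tts_cost(chars: int, price_per_1k: float) -> float:
    """Return the TTS cost in USD for synthesizing `chars` characters."""
    return chars / 1000 * price_per_1k


# Hypothetical workload: one million characters of synthesized speech.
workload_chars = 1_000_000
for provider, price in PRICE_PER_1K_CHARS.items():
    print(f"{provider}: ${tts_cost(workload_chars, price):.2f}")
```

At these list prices the hypothetical workload comes to $50.00 versus $30.00.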
The key differentiator is Flux's native turn detection. Instead of waiting for a stretch of silence, as VAD-based systems do, Flux uses semantic and acoustic cues to tell when you're done talking. This means fewer awkward interruptions and faster responses. Ramble to your OpenClaw; it won't interrupt!
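To see why a pure silence timeout can cut a speaker off, consider the toy sketch below. It is not Flux's algorithm or any vendor's implementation; the function name, the frame representation, and the three-frame threshold are all assumptions made for illustration. It only models the "wait for silence" rule, which ends the turn as soon as a pause is long enough, whether or not the speaker has finished the thought.

```python
def silence_endpoint(frames, silence_frames_needed=3):
    """Return the index of the frame where a silence-timeout rule would end
    the turn, or None if the rule never fires.

    frames: list of booleans, True meaning speech was detected in that frame.
    """
    silent_run = 0
    heard_speech = False
    for i, is_speech in enumerate(frames):
        if is_speech:
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            # Count consecutive silent frames only after speech has started.
            silent_run += 1
            if silent_run >= silence_frames_needed:
                return i
    return None


# Toy utterance: speech, a three-frame thinking pause, then more speech.
S, P = True, False  # S = speech frame, P = pause frame
utterance = [S, S, S, P, P, P, S, S, S, S, P, P, P, P]

# The rule fires inside the mid-sentence pause, before the speaker resumes.
print(silence_endpoint(utterance))
# A longer timeout survives the pause but responds later at the true end.
print(silence_endpoint(utterance, silence_frames_needed=4))
```

The trade-off is visible in the two calls: a short timeout interrupts the pause, while a longer one avoids the interruption at the cost of a slower response. Turn detection that looks at what was said, rather than only at how long the silence lasted, is meant to avoid having to choose.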
Dead simple setup
We built deepclaw so your OpenClaw can set it up for you. Drop in one skill file, tell your OpenClaw "I want to call you on the phone," and it walks you through everything—Deepgram account, Twilio number, configuration, all of it.
No code to write. No complex integration. Just conversation.
Open source
deepclaw is fully open source. The entire voice agent server is ~400 lines of Python. Fork it, modify it, self-host it.
What's next
We're exploring ways to make setup even faster and easier, so that non-technical users can get the most out of it. We also noticed higher-than-desired latency from OpenClaw itself, so we're working on bringing that latency down as far as possible to make calls feel like a natural conversation.
Get started
- Copy the skill to ~/.openclaw/skills/deepclaw-voice/
- Tell your OpenClaw: "I want to call you on the phone"
- Follow the prompts
- Call your new number
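The first step above can also be scripted. The following is a minimal sketch, assuming the skill is a single file you have already downloaded; the helper name `install_skill` and its parameters are hypothetical and not part of deepclaw. Only the target directory comes from the steps above.

```python
import shutil
from pathlib import Path

# Default skills directory, from the setup steps: ~/.openclaw/skills/
DEFAULT_SKILLS_ROOT = Path.home() / ".openclaw" / "skills"


def install_skill(skill_file: Path, skills_root: Path = DEFAULT_SKILLS_ROOT) -> Path:
    """Copy a skill file into <skills_root>/deepclaw-voice/ and return its new path.

    Hypothetical helper for illustration; the skill's file name is whatever
    the caller passes in and is not assumed here.
    """
    target_dir = skills_root / "deepclaw-voice"
    # Create the directory (and any missing parents) if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    # copy2 preserves file metadata and returns the destination path.
    return Path(shutil.copy2(skill_file, target_dir))
```

Calling `install_skill(Path("path/to/your/skill-file"))` places the file where OpenClaw looks for the deepclaw-voice skill; the remaining steps happen in conversation with your OpenClaw.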
Voice AI should be fast, affordable, and natural. That's what we're building.


