Real conversations aren't static. A support call moves from identity verification to troubleshooting to scheduling a follow-up. A healthcare call shifts from intake questions to medication names to billing. Each phase has different intents, different critical phrases, and different tolerance for how quickly the system should detect a turn ending.
Today, most teams configure their ASR once at connection time and live with it for the entire call. They load every keyterm they might need upfront, diluting biasing effectiveness across the board, or they keep the list minimal and accept lower accuracy on critical phrases. When the conversation shifts enough that the configuration truly doesn't fit, the options are disconnecting and reconnecting mid-call or managing multiple concurrent streams and swapping between them. That's real engineering complexity for what should be a simple problem: the call changed, so the config should change too.
Now it can.
Introducing the On-the-Fly Configure Message
When we launched Flux, it gave developers conversational speech recognition with a built-in understanding of complex conversational dynamics. Developers have loved it and readily adopted it. But they kept asking for more: greater contextual awareness, and the ability to shift context throughout a conversation, not just at connection time.
So we built it. Flux now supports on-the-fly configuration through a new Configure message in the v2 /listen WebSocket API. Update keyterms and end-of-turn thresholds mid-stream, without disconnecting or reconnecting.
What you can configure:
- keyterms: update the list of words and phrases that get extra attention
- eot_threshold: adjust how aggressively Flux detects end of turn
- eager_eot_threshold: control early turn detection sensitivity
- eot_timeout_ms: set the timeout window for turn completion
Updates apply immediately to all subsequent audio and persist until the stream ends or you send another Configure message.
Why This Matters for Voice Agents
With on-the-fly configuration, you can:
Dynamically bias toward task-critical phrases. Expecting a particular name? Add it to keyterms right before you ask. Moving from scheduling to pharmacy? Swap in medication names and medical terminology. Handling a product inquiry? Load the specific product names and feature terminology relevant to that conversation. You're no longer stuck choosing between a generic keyterm list that's "good enough" for the whole call and loading irrelevant terms upfront.
Loosen turn detection during auth flows. When you're collecting a password or OTP, you don't want Flux cutting off the user mid-utterance. Loosen EOT thresholds for that segment so the system waits longer, then tighten them again once you return to open conversation.
Keep it all on one connection. No reconnecting, no managing multiple concurrent streams. One connection, dynamic behavior.
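One way to organize this in an application is a small table of per-phase presets that the orchestrator turns into Configure messages as the call moves along. The sketch below is illustrative only: the phase names, the keyterms, and every numeric value are hypothetical placeholders, and it assumes that a higher eot_threshold makes Flux wait longer before ending a turn, which you should confirm against the docs.

```python
import json

# Hypothetical per-phase presets. The phase names, terms, and numbers are
# placeholders to tune for your own call flows, not recommended values.
PHASE_PRESETS = {
    "identity": {
        # Assumption: a higher eot_threshold and a longer timeout mean
        # Flux waits longer, which suits passwords and OTPs.
        "keyterms": ["John Smith"],
        "thresholds": {"eot_threshold": 0.9, "eot_timeout_ms": 8000},
    },
    "pharmacy": {
        "keyterms": ["metformin", "lisinopril", "atorvastatin"],
        "thresholds": {"eot_threshold": 0.7, "eot_timeout_ms": 5000},
    },
    "open_conversation": {
        "keyterms": [],  # an empty list clears the previous phase's terms
        "thresholds": {"eot_threshold": 0.7, "eot_timeout_ms": 5000},
    },
}


def configure_for_phase(phase: str) -> str:
    """Return the Configure message (as JSON text) for a named call phase."""
    preset = PHASE_PRESETS[phase]
    return json.dumps({"type": "Configure", "config": preset})


# The orchestrator would send this text frame on the existing connection
# when the call enters the pharmacy phase.
pharmacy_message = configure_for_phase("pharmacy")
```

Keeping the presets as data rather than scattering them through the call logic makes it easier to review and adjust what each phase biases toward.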
How It Works
Send a Configure message on your existing WebSocket connection:
{
  "type": "Configure",
  "config": {
    "thresholds": {
      "eot_threshold": 0.8,
      "eot_timeout_ms": 5000
    },
    "keyterms": ["John Smith", "Memorial Hospital"]
  }
}
That’s it. No ceremony.
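If you build these messages in code, a small helper that includes only the fields you want to change keeps each update minimal. This is a sketch rather than an official client: build_configure is a hypothetical name, and placing eager_eot_threshold inside "thresholds" is an assumption based on where the other two threshold fields appear in the example above.

```python
import json
from typing import Optional, Sequence


def build_configure(
    keyterms: Optional[Sequence[str]] = None,
    eot_threshold: Optional[float] = None,
    eager_eot_threshold: Optional[float] = None,
    eot_timeout_ms: Optional[int] = None,
) -> str:
    """Serialize a Configure message containing only the fields passed in."""
    thresholds = {
        name: value
        for name, value in (
            ("eot_threshold", eot_threshold),
            # Assumption: this field sits alongside the other thresholds.
            ("eager_eot_threshold", eager_eot_threshold),
            ("eot_timeout_ms", eot_timeout_ms),
        )
        if value is not None
    }
    config: dict = {}
    if thresholds:
        config["thresholds"] = thresholds
    # Test for None rather than truthiness: an empty list is meaningful
    # because it clears the current keyterms.
    if keyterms is not None:
        config["keyterms"] = list(keyterms)
    return json.dumps({"type": "Configure", "config": config})


# Reproduces the example message shown above.
message = build_configure(
    keyterms=["John Smith", "Memorial Hospital"],
    eot_threshold=0.8,
    eot_timeout_ms=5000,
)
```

The resulting string is what you would send as a text frame on the open /listen connection, using whichever WebSocket client your application already has.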
Flux responds with ConfigureSuccess or ConfigureFailure, so your application (or LLM orchestrator) can react programmatically:
{
  "type": "ConfigureSuccess",
  "config": {
    "thresholds": {
      "eot_threshold": 0.8,
      "eot_timeout_ms": 5000
    },
    "keyterms": ["John Smith", "Memorial Hospital"]
  }
}
Key details:
- Updates are immediate: the next audio batch uses the new configuration
- keyterms updates replace the current list (send [] to clear, omit the field to keep existing)
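To make the replace, clear, and keep semantics concrete, here is a minimal sketch of a client-side tracker that mirrors which keyterms should be active after each acknowledged update. KeytermTracker is a hypothetical helper, not part of any SDK. It reads only the "type" field of responses, and it assumes that responses arrive in the order the Configure messages were sent and that a ConfigureFailure leaves the previous configuration in effect.

```python
import json
from typing import List, Optional


class KeytermTracker:
    """Client-side mirror of the keyterm list the stream should hold."""

    def __init__(self) -> None:
        self.active: List[str] = []
        self.pending: Optional[List[str]] = None

    def on_sent(self, configure_message: str) -> None:
        config = json.loads(configure_message).get("config", {})
        # A missing "keyterms" field keeps the existing list; [] clears it.
        self.pending = list(config["keyterms"]) if "keyterms" in config else None

    def on_received(self, raw: str) -> Optional[bool]:
        """True on ConfigureSuccess, False on ConfigureFailure, else None."""
        kind = json.loads(raw).get("type")
        if kind == "ConfigureSuccess":
            if self.pending is not None:
                self.active = self.pending  # replace, never merge
            self.pending = None
            return True
        if kind == "ConfigureFailure":
            # Assumption: a failed update leaves the old list in effect.
            self.pending = None
            return False
        return None  # some other message type; not our concern here


tracker = KeytermTracker()
tracker.on_sent('{"type": "Configure", "config": {"keyterms": ["John Smith"]}}')
tracker.on_received('{"type": "ConfigureSuccess"}')
# A thresholds-only update omits keyterms, so the list is untouched.
tracker.on_sent('{"type": "Configure", "config": {"thresholds": {"eot_threshold": 0.8}}}')
tracker.on_received('{"type": "ConfigureSuccess"}')
print(tracker.active)  # ['John Smith']
```

Waiting for the acknowledgement before updating the local view means your orchestrator never assumes a term is being biased when the update was rejected.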
What This Isn’t
To set expectations clearly:
- Not a model change. On-the-fly configuration adjusts parameters on your current Flux stream. It doesn’t swap models, languages, or deployments.
- Not a session reset. Configuration changes don’t clear internal state or redefine turns. They tune how Flux processes subsequent audio.
- Not a guarantee. Dynamic keyterms and EOT tuning improve control and can reduce errors for task-critical phrases, but inherently difficult entities (unusual names, accented speech) may still require additional validation in your application logic.
Get Started
On-the-fly configuration is available now in the Flux v2 WebSocket API.
- Get started with the Flux quickstart →
- Read the Configure message docs →
- Try Flux in the Deepgram playground →
Your ASR configuration can now shift with the conversation. No more choosing between loading every keyterm upfront or accepting lower accuracy. No more static configuration that's "good enough" for the whole call. One connection that adapts as the call unfolds.
On-the-fly configuration is one step in a larger direction. We're moving Flux beyond getting the words right to reading the room: understanding who's speaking and what matters, maintaining context across turns, being right on the moments that actually drive outcomes, and surfacing richer signals that help your systems make better decisions. This is where we're headed, and we're just getting started.

