Flux On-the-Fly Configuration: Shift Context as Conversations Evolve

Real conversations aren't static. A support call moves from identity verification to troubleshooting to scheduling a follow-up. A healthcare call shifts from intake questions to medication names to billing. Each phase has different intents, different critical phrases, and different tolerance for how quickly the system should detect a turn ending.

Today, most teams configure their ASR once at connection time and live with it for the entire call. They load every keyterm they might need upfront, diluting biasing effectiveness across the board, or they keep the list minimal and accept lower accuracy on critical phrases. When the conversation shifts enough that the configuration truly doesn't fit, the options are disconnecting and reconnecting mid-call or managing multiple concurrent streams and swapping between them. That's real engineering complexity for what should be a simple problem: the call changed, so the config should change too.

Now it can.

Introducing the On-the-Fly Configure Message

When we launched Flux, it gave developers conversational speech recognition with a built-in understanding of complex conversational dynamics. Developers have loved it and readily adopted it. But they kept asking for more: greater contextual awareness, and the ability to shift context throughout a conversation, not just at connection time.

So we built it. Flux now supports on-the-fly configuration through a new Configure message in the v2 /listen WebSocket API. Update keyterms and end-of-turn thresholds mid-stream, without disconnecting or reconnecting.

What you can configure:

keyterms: update the list of words and phrases that get extra attention
eot_threshold: adjust how aggressively Flux detects end of turn
eager_eot_threshold: control early turn detection sensitivity
eot_timeout_ms: set the timeout window for turn completion

Updates apply immediately to all subsequent audio and persist until the stream ends or you send another Configure message.

Why This Matters for Voice Agents

With on-the-fly configuration, you can:

Dynamically bias toward task-critical phrases. Expecting a particular name? Add it to keyterms right before you ask for it. Moving from scheduling to pharmacy? Swap in medication names and medical terminology. Handling a product inquiry? Load the specific product names and feature terminology relevant to that conversation. You're no longer stuck choosing between a generic keyterm list that's "good enough" for the whole call and loading irrelevant terms upfront.

Loosen turn detection during auth flows. When you're collecting a password or OTP, you don't want Flux cutting off the user mid-utterance. Loosen EOT thresholds for that segment so the system waits longer, then tighten them again once you return to open conversation.

Keep it all on one connection. No reconnecting, no managing multiple concurrent streams. One connection, dynamic behavior.
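One way to organize this is a small table of per-phase settings that your orchestrator looks up whenever the conversation moves on. The sketch below is illustrative only: the phase names, keyterms, and numbers are made up, and it assumes that a higher eot_threshold and a longer eot_timeout_ms make Flux wait longer before ending a turn, which you should confirm against the API reference. The message shape follows the Configure example in this post.

```python
import json

# Hypothetical per-phase settings. Phase names, terms, and numbers are
# illustrative placeholders, not recommended values.
PHASES = {
    "auth": {
        "keyterms": [],
        # Assumed to be "looser": wait longer before ending the turn.
        "thresholds": {"eot_threshold": 0.9, "eot_timeout_ms": 8000},
    },
    "pharmacy": {
        "keyterms": ["metformin", "lisinopril"],
        "thresholds": {"eot_threshold": 0.7, "eot_timeout_ms": 5000},
    },
}


def configure_for_phase(phase: str) -> str:
    """Return the JSON text of a Configure message for the given phase."""
    settings = PHASES[phase]
    message = {
        "type": "Configure",
        "config": {
            "thresholds": dict(settings["thresholds"]),
            "keyterms": list(settings["keyterms"]),
        },
    }
    return json.dumps(message)
```

When your dialog logic moves from authentication to the pharmacy step, it would send configure_for_phase("pharmacy") over the same WebSocket connection.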

How It Works

Send a Configure message on your existing WebSocket connection:

{
  "type": "Configure",
  "config": {
    "thresholds": {
      "eot_threshold": 0.8,
      "eot_timeout_ms": 5000
    },
    "keyterms": ["John Smith", "Memorial Hospital"]
  }
}

That’s it. No ceremony.
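In application code, a small helper can build this message and hand it to whatever WebSocket client you already use. The following is a minimal sketch, not an official client: build_configure and send_configure are hypothetical names, the send argument stands in for your client's send method, and including only some threshold fields in a message is an assumption rather than documented behavior.

```python
import json
from typing import Callable, Optional, Sequence


def build_configure(
    keyterms: Optional[Sequence[str]] = None,
    eot_threshold: Optional[float] = None,
    eot_timeout_ms: Optional[int] = None,
) -> dict:
    """Build a Configure message containing only the fields that were given."""
    config: dict = {}
    thresholds: dict = {}
    if eot_threshold is not None:
        thresholds["eot_threshold"] = eot_threshold
    if eot_timeout_ms is not None:
        thresholds["eot_timeout_ms"] = eot_timeout_ms
    if thresholds:
        # Assumption: a partial thresholds object is accepted by the API.
        config["thresholds"] = thresholds
    if keyterms is not None:
        # An empty list is meaningful (it clears keyterms), so compare to None.
        config["keyterms"] = list(keyterms)
    return {"type": "Configure", "config": config}


def send_configure(send: Callable[[str], None], **fields) -> str:
    """Serialize a Configure message and pass it to the caller's send function."""
    payload = json.dumps(build_configure(**fields))
    send(payload)
    return payload
```

With a connected client object named ws (hypothetical), the call would look like send_configure(ws.send, keyterms=["John Smith"], eot_threshold=0.8). Keeping the transport behind a plain callable makes the helper easy to test without a live connection.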

Flux responds with ConfigureSuccess or ConfigureFailure, so your application (or LLM orchestrator) can react programmatically:

{
  "type": "ConfigureSuccess",
  "config": {
    "thresholds": {
      "eot_threshold": 0.8,
      "eot_timeout_ms": 5000
    },
    "keyterms": ["John Smith", "Memorial Hospital"]
  }
}
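Reacting to these replies can be as simple as checking the type field of each incoming message. Here is a rough sketch: handle_configure_reply and its two callbacks are hypothetical names, and because the fields of ConfigureFailure are not shown in this post, the whole failure message is passed through rather than guessing at its contents.

```python
import json
from typing import Callable


def handle_configure_reply(
    raw: str,
    on_applied: Callable[[dict], None],
    on_rejected: Callable[[dict], None],
) -> bool:
    """Dispatch a server message; return True if it was a Configure reply."""
    message = json.loads(raw)
    kind = message.get("type")
    if kind == "ConfigureSuccess":
        # The success message echoes the applied config, as in the example.
        on_applied(message.get("config", {}))
        return True
    if kind == "ConfigureFailure":
        # Failure fields are not documented here, so hand over the whole message.
        on_rejected(message)
        return True
    return False  # some other message type; leave it to other handlers
```

An orchestrator might use on_applied to record the active configuration and on_rejected to log the failure and keep running with the previous settings.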

Key details:

Updates are immediate: the next audio batch uses the new configuration
keyterms updates replace the current list (send [] to clear, omit the field to keep existing)
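The replace semantics are worth spelling out, because they differ from a merge. The function below is a client-side model of the rule just described, useful if your application wants to track which keyterms are currently active; it is an illustration of the stated behavior, not Flux's implementation, and apply_keyterms_update is a hypothetical name.

```python
from typing import List


def apply_keyterms_update(current: List[str], config: dict) -> List[str]:
    """Return the keyterm list in effect after a Configure with this config."""
    if "keyterms" not in config:
        return list(current)         # field omitted: keep the existing list
    return list(config["keyterms"])  # field present: replace it ([] clears)


active = apply_keyterms_update([], {"keyterms": ["John Smith"]})
# A thresholds-only update leaves the keyterms untouched.
active = apply_keyterms_update(active, {"thresholds": {"eot_threshold": 0.8}})
# A new list replaces the old one; it is not appended to it.
active = apply_keyterms_update(active, {"keyterms": ["Memorial Hospital"]})
```

If you want to add a term while keeping the existing ones, send the full combined list.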

What This Isn’t

To set expectations clearly:

Not a model change. On-the-fly configuration adjusts parameters on your current Flux stream. It doesn’t swap models, languages, or deployments.
Not a session reset. Configuration changes don’t clear internal state or redefine turns. They tune how Flux processes subsequent audio.
Not a guarantee. Dynamic keyterms and EOT tuning improve control and can reduce errors for task-critical phrases, but inherently difficult entities (unusual names, accented speech) may still require additional validation in your application logic.

Get Started

On-the-fly configuration is available now in the Flux v2 WebSocket API.

Your ASR configuration can now shift with the conversation. No more choosing between loading every keyterm upfront or accepting lower accuracy. No more static configuration that's "good enough" for the whole call. One connection that adapts as the call unfolds.

On-the-fly configuration is one step in a larger direction. We're moving Flux beyond getting the words right to reading the room: understanding who's speaking and what matters, maintaining context across turns, being right on the moments that actually drive outcomes, and surfacing richer signals that help your systems make better decisions. This is where we're headed, and we're just getting started.
