
Sparking the Future of Conversation Design - Braden Ream, CEO, Voiceflow - Project Voice X

By Claudia Ring
Published Dec 9, 2021
Updated Jun 13, 2024

This is the transcript for the session “Sparking the Future of Conversation Design,” presented by Braden Ream, CEO of Voiceflow, on day one of Project Voice X.

The transcript below has been modified by the Deepgram team for readability as a blog post, but the original Deepgram ASR-generated transcript was 94% accurate.  Features like diarization, custom vocabulary (keyword boosting), redaction, punctuation, profanity filtering and numeral formatting are all available through Deepgram’s API. 
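Those formatting features are toggled as query parameters on Deepgram’s pre-recorded transcription endpoint. As a rough sketch (the helper function and default feature set below are our own; actually sending the request would need an `Authorization: Token <API_KEY>` header and an audio payload), the request URL might be built like this:

```python
from urllib.parse import urlencode

# Hypothetical helper: builds a Deepgram /v1/listen request URL with the
# readability features mentioned above switched on.
def build_listen_url(base="https://api.deepgram.com/v1/listen", **overrides):
    params = {
        "punctuate": "true",         # punctuation
        "diarize": "true",           # speaker diarization
        "numerals": "true",          # numeral formatting
        "profanity_filter": "true",  # profanity filtering
        "keywords": "Voiceflow",     # custom vocabulary (keyword boosting)
    }
    params.update(overrides)         # e.g. redact="pci" for redaction
    return f"{base}?{urlencode(params)}"

print(build_listen_url(redact="pci"))
```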

[Braden Ream:] Alrighty. Hi, everyone. Thanks for having me here. So I’ve given a bunch of talks on conversation design in the past, and the one I most often give is how to structure a conversation design team, so how to build one internally, build out the best practices. I thought what may be kinda fun is… you know, we learn a lot about conversation design through the practice of actually building Voiceflow. And so I thought it could be interesting to talk about the challenges of building a conversation design tool, which will, hopefully, in effect, teach you a little bit about conversation design itself. So there’s gonna be a lot of product shots, talk about a lot of specific customer problems, you know, without naming any names, and kinda run you through the process of actually building a tool. I’ll try to keep it fairly short and to the point. And, you know, if you wanna chat a little bit more about Voiceflow, we’ve got a social later today as well as a booth at the back. Love chatting product, but I’ll try to keep it fairly succinct ’cause I know we’re going into lunch here.

So this is thoughts on building a conversation design tool. First time running this, so bear with me. Little bit a… yep. We already went through the goal here. Got ahead of myself. Little bit about Voiceflow: we’re used by over eighty thousand teams now. Lots of great customers out there, actually, lots of great customers here as well. We’ve raised about twenty-five million dollars from some great investors, including Amazon and Google, who are present, as well as some other awesome design tool folks, including Figma and InVision and others. Cool. So why does conversation design matter? A lot of folks might actually not be familiar with it. You know, it’s really come to prominence over the past two years.

And so the challenge for a lot of teams today creating conversational AI is the tools, frankly, aren’t there on the collaborative side. In fact, the most common tool stack we see is going to be Word docs, Excel sheets, and Visio flowcharts. Now there’s lots of flowcharting software out there, and I’ve certainly heard people say, well, we don’t use Visio, we use Miro. All in the same bucket. It is typically… next slide here. You’re gonna see spreadsheets are used to manage the NLU design. You’re gonna see flowcharts manage the state machine, the flow of the conversation. Word docs are often doing the script of the conversation, so a very lightweight, almost like a wireframe, just to give people the feel. And the three of these are used in conjunction to create a conversation design. Now the problem with this is… well, two things. One, you don’t have a single source of truth. When you have one conversation designer, you can usually get by with spreadsheets and flowcharts. It’s when you have two or more, or ten, or, you know, a hundred in some of these larger companies now, where they’re getting to these very large teams. And when they have to communicate conversations across different teams and organizations, it might take a week just to understand the design. In fact, when I work with some customers, I’ll go into a design, and we’ll be chatting about it, and it might take me an hour or two just to understand the design before I can even comment on it. And so it gets really, really messy as it scales, with multiple sources of truth. And then further, you’re not able to prototype.
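To make that three-tool split concrete, here’s a toy sketch, with all intent, state, and phrase names invented: the spreadsheet holds the NLU rows, the flowchart holds the state machine, and the Word doc holds the script, and a design only works when all three stay in sync.

```python
nlu_table = {  # spreadsheet: intent -> sample utterances
    "check_balance": ["what's my balance", "how much do I have"],
    "transfer": ["send money", "transfer funds"],
}

state_machine = {  # flowchart: state -> {intent: next state}
    "greeting": {"check_balance": "balance", "transfer": "transfer_amount"},
    "balance": {"transfer": "transfer_amount"},
}

script = {  # Word doc: state -> what the assistant says
    "greeting": "Hi! How can I help?",
    "balance": "Your balance is $120.",
    "transfer_amount": "How much would you like to send?",
}

def step(state, intent):
    """Follow the flowchart to the next state, then read its script line."""
    nxt = state_machine.get(state, {}).get(intent, state)
    return nxt, script[nxt]

print(step("greeting", "check_balance"))  # ('balance', 'Your balance is $120.')
```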

This is a big challenge for a lot of companies: if they’re coming from Visio flowcharts or, again, Miro, whatever your flowcharting tool of choice is, you’re not able to actually turn that into a usable prototype to run user testing. So a lot of companies will actually go straight from Word docs, Excel sheets, flowcharts, whatever it might be, to a Jira ticket and then out to production. That’s a super common workflow we see all the time. Sometimes WoZ testing, which is Wizard of Oz testing, is implemented, and that’s essentially acting as though you are the assistant. So we hear of companies where, you know, they’ll have a human on one end of the line. They’ll call the other end if they’re trying to simulate a call center experience, and they’ll act as though they are the assistant. So that’s really the low-fidelity testing we have available with the existing tool stack. So this is where conversation design tools like Voiceflow come into play: a streamlined way to go from design, to prototype, to user testing, and then ultimately handing off to development, trying to give them artifacts that are more battle-tested because you’ve actually done the prototyping and the user testing before handing it off to development, as well as handing them an artifact that is ultimately more readable because it’s all in one place, without having to dig through a spreadsheet, a flowchart, etcetera. Some nice quotes, but I’m gonna skip through these here and say conversation design tools ultimately give you a single source of truth. And so this is the intro to conversation design if you weren’t familiar with it, and sort of what it is and what conversation design tools are.

I wanna spend a little bit of time talking about the role of conversation design. So I think it’s fairly misunderstood what a conversation designer actually does on a daily basis. It’s essentially a mix of a few different roles. So it’s going to be traditional UX. Most conversation designers actually come from a traditional UX background; it’s probably the most common one we see. We also see copywriting as a fairly common background, and then you have the NLU model. When you actually look at these in terms of structure, you really have conversation design sitting on top of the NLU model. It sort of stands on its shoulders there, and these three practices together create what we call the conversation designer. And then you’re now starting to see, as these teams are getting larger, they’re getting more sophisticated. The role of the conversation designer is actually starting to split within companies. You’re starting to see AI trainers become a more common role, starting to see copywriters with, you know, a very specific copywriter title, and the conversation designer is really sort of the flow architect. And so you’re starting to see increased specialization on the UX side of these conversational AI teams. Cool.

So when we look at, you know, a conversation design… so here’s just a sample I pulled in from Voiceflow. You might have four different roles all working within this one design. You’re gonna have the copywriter, who is focused primarily on the responses that the assistant is going to give. They’re going to be responsible for the persona often as well. From there, you’re going to have the NLU designer. They’re working with the different intents to make sure the utterances aren’t conflicting, the model’s actually working, and it’s gonna provide the best design experience. From there, the UX designer’s often responsible for the entire flow. What is the flow of the conversation? What intents are we handling, and how do they structure into each other? So that’s really sort of the role. They’re almost like the flow architect. And then lastly is the UI designer. A lot of experiences are becoming multimodal. And I think, as the previous presentation went over, voice is not the be-all, end-all interface; it’s one of many. It’s an incredibly good input, but it’s not very good at output. You know? If you’re choosing a Netflix movie, you certainly would not want to be read the thirty different options before you make a choice. It’s gonna be crazy cognitive overload. But we’ve all sat there trying to use the keyboard on the remote to try to pick a Netflix movie, and if you know what you want, it’s an amazing input interface. So you’re starting to see the rise of multimodal, as well as these visual designers being added to conversation design teams. So one, you know, thing that I think a lot of folks misunderstand about Voiceflow or other conversation design tools is that we’re not content experts.

You know, a lot of folks who come from linguistics backgrounds, they might understand… you know, Grice’s maxims are thrown around quite a bit, coming from sociolinguistics. These are sort of the folks who are the best at creating the ideal responses. That’s not our skill set at Voiceflow. We actually view ourselves as giving you the tools to create the best responses. But as far as we’re concerned, we’re thinking about structure. We’re thinking about collaboration and the workflow. That’s really where we spend most of our time thinking, more than on what is actually said to the user. You know, we’re average Joes when it comes to the actual content. And so what do I mean by content? Well, that’s going to be what’s actually said. We focus on the content structure. We focus on how designers can visualize the structure of the conversation, how they can actually put this together in a readable way, and how they can ultimately collaborate. That’s really where we spend most of our time thinking. So some examples: on the far top left there, you’ll see, like, a normal response. What the designer actually puts in there… I’m not a very good copywriter myself. You know, that’s not our forte. But what we’re thinking about is, okay, how can we add response variance? How can we do localization, text markup, SSML speech markup? How can we allow for all these different structures to be applied to the content, more so than what is actually said inside the content itself?
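That structure-versus-content distinction can be sketched in a few lines. Below is a toy response slot, with invented field names and phrases, carrying locale variants and an SSML rendition; the structure is the tool’s concern, while the actual wording inside it is the copywriter’s.

```python
import random

# One response slot: the *structure* holds variants, locales, and markup.
response = {
    "variants": {
        "en-US": [
            "Sure, one moment.",
            "Of course, just a second.",
        ],
        "fr-FR": ["Bien sûr, un instant."],
    },
    # Optional speech markup for voice channels.
    "ssml": '<speak>Sure, <break time="300ms"/> one moment.</speak>',
}

def render(resp, locale="en-US", rng=random):
    """Pick a variant for the locale, falling back to en-US."""
    variants = resp["variants"].get(locale) or resp["variants"]["en-US"]
    return rng.choice(variants)

print(render(response, "fr-FR"))  # "Bien sûr, un instant."
```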

So if you work at Voiceflow, probably the number one thing you hear preached constantly is conversation design documentation. It is the number one thing we think about. And the reason for that is, ultimately, the art of conversation design is the art of conversation documentation. You know what you wanna say as a conversation designer, and if you’re a team of one, you’re able to put this into any kind of artifact you’d like, and how it’s actually presented is not that important. When you start to work in larger teams, it’s all about how you present it in a readable format so that other people can actually use it. So conversation design is the art of conversation documentation. And in order for a conversation design to be useful, it must be easily readable, and I think we’ve all seen flowcharts that look like this. This is… you know, oh, it’s a little blurry on the screen there, but you can kinda get the point. It is a crazy mess of nodes, and this is how a lot of internal design artifacts look at large enterprises as well as at smaller teams. Frankly, there’s not much thought put into the actual design structure, and often what you’ll see is folks come in, and they’re unable to read it, and they will often opt to completely redo it themselves. You just create a ton of redundancy.

And so the best practice here is often to start out and create a prototype. It might look really messy, and then you spend a lot of time actually refactoring the design to be highly readable so that other people can collaborate on it with you. So I wanted to spend some time going through, essentially, customer stories that are all along this theme of documentation, so some things that we’ve had to think about as a conversation design tool that you might not think about as a conversation designer. This was a big one. So when we were working with a large automaker, they were switching over their documentation base from Visio to Voiceflow, and they had a lot of seasoned VUI designers, voice user interface designers, sort of a branch within conversation design. They’d say, you know, they’ve been working in Visio for twenty years now, and they’ve always done top to bottom, and therefore Voiceflow is incompatible and will never work. This is something we hadn’t thought about. Voiceflow was always a left-to-right structure. That’s just… you know? In the early days of the company, it was just something we did. It was left to right. That was it. We didn’t really think much about it.

As the company evolved, and now we work with customers that use all different orientations, you actually need to be able to mold the tool to mesh with people’s existing documentation systems, so that when they’re going from reading a Visio chart to a Voiceflow chart, it’s going to look comparable, and they’re gonna be able to understand the lay of the land instantly. So the problem was only being able to go left to right. The solution was fairly simple: being able to go vertical as well. Another problem we had was that conversation designs aren’t just about the functional nodes on the canvas. It’s not just about the lines. It’s also about the surrounding context. There are sticky notes. There are little notes that designers jot down. There are comments. A lot of this wasn’t present in Voiceflow. We just had the functional nodes. We thought, well, if you can build it, that’s good enough. There’s no need to add any sort of additional context. And what we found is design teams were going from Visio to Voiceflow to prototype, and then back to Visio to document. They would document their findings from Voiceflow’s prototypes, and Voiceflow was just used as a prototyping system. So we asked why that was, and it was because they couldn’t add sticky notes. It was the simplest thing, but we just hadn’t thought about it. As a conversation design tool, we learned that the surrounding context is as important as the content itself. And so we added sticky notes, labels, images, all these things that allow your canvas to be high fidelity and flush with context. Response variations take too much space. You don’t wanna hear the same thing from your system over and over and over again, and so response variations are a big thing in conversation design: having five, six, you know, seven responses or more to say the same thing, just to keep the experience fresh. And so in Voiceflow, we had this feature. We thought this was great, and we allowed you to stack them all like this. Well, some companies might have a hundred, and they might have, you know, two dozen. And it started to completely clutter the canvas, and so companies were again going back to spreadsheets to put in all the response variations and then linking to Voiceflow and saying, you know, this is what the variants should be.
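The “keep the experience fresh” behavior behind those variants is simple to sketch. Here’s one way it could work (class and variant strings are invented for illustration, not Voiceflow’s implementation): rotate through variants while avoiding immediate repeats.

```python
import random
from collections import deque

class VariantPool:
    """Pick response variants at random, never the same one back to back."""

    def __init__(self, variants, rng=None):
        self.rng = rng or random.Random()
        self.variants = list(variants)
        # Remember the last n-1 picks so at least one candidate remains.
        self.recent = deque(maxlen=max(0, len(self.variants) - 1))

    def pick(self):
        candidates = [v for v in self.variants if v not in self.recent]
        choice = self.rng.choice(candidates)
        self.recent.append(choice)
        return choice

pool = VariantPool(["One sec.", "Just a moment.", "Hang tight."])
picks = [pool.pick() for _ in range(6)]
assert all(a != b for a, b in zip(picks, picks[1:]))  # no back-to-back repeats
```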

And so we saw this, and we added the ability to choose your different visibilities, ’cause, again, it’s all about being able to, essentially, layer your visibility. Are you gonna share everything, or are you just gonna share what’s needed to be seen by that particular individual? We often see this a lot with even executive stakeholders. When they come and look at a conversation design, the designer will change the visibility to be just the bare essentials of the conversation. Anything more, and the executives, you know, go asking about, you know, variation fifteen, what’s going on there? It’s way too much. You just wanna share the bare minimum to understand the conversation flow. But then when you’re sharing with maybe a developer, you wanna be able to show all those different variations to really give the scope. So selective visibility has become incredibly important. Customers were adding a ton of work to designs just to make the prototype work. This is a really curious one. So we were chatting with a bunch of customers, and they were adding a ton of work to the designs only to make the prototype work, and the best example of this was personas. They wanted to test out: what happens if we have a user who’s logged in versus logged out? So what they would do is they would actually add, inside the design, a whole bunch of logic and variables and things to actually create these different personas that would then be used inside the prototype. Now the problem here is, by doing this, developers, when they saw the design, weren’t sure: is this part of the design, or is this just there to make the prototype work?

And so this has become a super common problem, especially with the enterprise, where they’re testing, you know, two personas, five personas when they’re doing user testing, and so we added the ability to test different personas. And so you start seeing a theme here: it’s very customer led, in that conversation design is such a new field. The UX side has been around on the IVR side for about twenty years, but in terms of adopting modern UX practices for conversation design, that’s only really come in the past two years. And so we’re essentially building a new category of tooling, and we’re learning all this stuff as we go. A few more here. Customers were creating multiple projects for the same assistant. Designs are getting very, very large. If you run a contact center today, you’re going to have massive flows, and so we saw our customers creating tons and tons of projects just to handle different variations, so we added folders. I could go on about this. Yeah.
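The persona idea above can be sketched as data the prototype supplies from outside the design, rather than login logic wired into the design itself. All persona fields and phrases below are invented for illustration.

```python
# Personas live outside the design; the prototype picks one to run with.
personas = {
    "logged_in": {"authenticated": True, "name": "Sam", "plan": "premium"},
    "logged_out": {"authenticated": False, "name": None, "plan": None},
}

def greeting(ctx):
    """The design only *reads* the context; it never fakes a login."""
    if ctx["authenticated"]:
        return f"Welcome back, {ctx['name']}!"
    return "Hi there! Would you like to sign in?"

for label, ctx in personas.items():
    print(label, "->", greeting(ctx))
```

Because the login state arrives as context, a developer reading the design sees only real conversation logic, which is the ambiguity the talk describes.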

Last one here. Different stakeholders need different information. What we found is that the same node on a canvas might be used differently by two different people, even on the same team. So the developer is gonna look at this and go, oh, great, there’s the JavaScript logic I needed. I can now reference that easily in my actual code. But the designer looks at that and often has no idea what’s going on. And so we added the ability to do role-based views. You have a developer view or a designer view; it’s really up to you. And so this is sort of the evolving state. As these teams get larger and larger and different roles are being added, you’re gonna see these tools essentially become multifaceted in how they can actually display information depending on the role. Since a conversation is such an abstract concept, it is going to be different to every stakeholder that views it. And so you need to be able to chunk and break down that layered visibility. It’s very similar to a design tool, where, you know, you might see the color red, and the developer needs to see the hex code. And so you’re gonna have that ability to break down a conversation by all of its different elements as you go through. Cool. I think we can go through this. Yeah. Some problems, I think, the industry has yet to solve: you know, intent management. We call it, essentially, an IMS internally.

Some customers have thousands of intents, tens of thousands, you know, at some of these larger companies. It’s almost like taking what you have with a content management system and applying that to the intent space, especially as you deal with different locales, different NLPs and NLUs. Having a robust system there, the industry has yet to solve that, but there are some exciting start-ups popping up. Content management. Content management, as it relates specifically to conversational assistants: there’s a couple of start-ups that have popped up that are exciting, but this is something that’s going to be increasingly important as these assistants are deployed, not just in one locale, but in multiple locales, getting to the point where it’s not just countries but even down to the city level. Maybe you have a restaurant in one city that has a very different, you know, phrase for good morning than another. We wanna have conversational assistants meet customers where they are and actually be able to get down to that local level. So content management’s gonna be incredibly important.
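One plausible shape for that city-level content management is a locale fallback chain, walking from the most specific locale tag up to the language default. The tags, keys, and phrases below are invented to illustrate the idea.

```python
content = {
    "greeting": {
        "en": "Good morning!",
        "en-US": "Good morning!",
        "en-US-neworleans": "Mornin', y'all!",  # city-level override
    }
}

def localized(key, locale):
    """Walk from the most specific locale tag up to the language default."""
    parts = locale.split("-")
    while parts:
        tag = "-".join(parts)
        if tag in content[key]:
            return content[key][tag]
        parts.pop()
    raise KeyError(f"no content for {key!r} in any fallback of {locale!r}")

print(localized("greeting", "en-US-neworleans"))  # "Mornin', y'all!"
print(localized("greeting", "en-GB"))             # falls back to "Good morning!"
```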

Then lastly, persona management. You know, certainly brands want to have one assistant that is going to be able to interact with customers across all channels, and so, having a way to manage that persona in terms of its vocabulary and how it actually interacts with customers, I think you’re gonna start to see persona management software pop up as well. So that’s some exciting stuff there, but lots more in the industry to solve on the collaborative side. And that is it on my end. I wanted to essentially have a bunch of different product shots to kinda run through some different solutions from the conversation design tool side; hopefully, you learned a little bit about conversation design through that process. We’d love to chat with a bunch of folks. If you wanna chat about conversation design practices, the talk I’ve given previously on how to actually structure a team, happy to do that as well. We’ve got a booth at the back and a social today at four thirty, and there will be free drinks. And it’s a beautiful venue right on the beach, so feel free to come by, say hi, and we’ll give you a ticket for that. And with that, thanks so much.

If you have any feedback about this post, or anything else around Deepgram, we'd love to hear from you. Please let us know in our GitHub discussions.
