
AI Minds #065 | Eliott Hoffenberg, Co-Founder & CEO at Vocca

Eliott Hoffenberg, Co-Founder & CEO at Vocca. Vocca helps medical clinics run 24/7 by automating their phone reception with AI.
It handles scheduling, triage, follow-ups, and patient prep through smart inbound and outbound calls. Thousands of providers use Vocca to reduce costs, save time, and deliver instant service—without the wait.
Listen to the episode on Spotify, Apple Podcasts, Podcast Addict, or Castbox. You can also watch this episode on YouTube.
In this episode of the AI Minds Podcast, Eliott Hoffenberg, Co-Founder and CEO at Vocca, talks about his journey from business school to leading a conversational AI startup.
Eliott shares how his family background and time in San Francisco shaped his focus on healthcare tech, driving the creation of Vocca—an AI voice agent that automates clinic communications with a human touch.
He dives into the challenges of conversational design, the role of psychological nudging, and how AI voice agents can improve patient access while easing staff workloads.
The discussion highlights the framework for building agentic companies and explores Vocca’s impact on transforming patient interaction in healthcare.
Listeners get a front-row seat to the future of AI-powered healthcare communication from one of the industry’s innovative leaders.
Show Notes:
00:00 Tech Family Background and Career Frustration
03:31 Passion for Lifelong Healthcare Innovation
06:43 "Critical Healthcare Needs Human Touch"
10:58 Challenges in Conversational Design
13:39 Optimizing AI Voice Interactions
19:25 AI's Human-Like Interactions
20:44 "AI-Driven Healthcare Communication"
25:35 User Knowledge in Clinic Scheduling
29:46 Handling Overlapping Conversations Efficiently
31:01 Voice AI Interruptions Simplified
More Quotes from Eliott:
Demetrios:
Welcome back to the AI Minds Podcast, a podcast where we explore the companies of tomorrow being built AI-first. I am your host Demetrios, and this episode, like every other episode, is brought to you by Deepgram, the number one speech-to-text and text-to-speech API on the Internet today, trusted by the world's top conversational AI leaders, startups and enterprises like Spotify, Twilio, NASA, Citibank. In this episode I have the pleasure of sitting down with the co-founder and CEO of Vocca. Eliott, how you doing man?
Eliott Hoffenberg:
All good. It's a free day today in Paris. It's the day of labor.
Demetrios:
Oh and you are working hard. That is the life of a startup.
Eliott Hoffenberg:
Doing a podcast with you is not working. So it's fine.
Demetrios:
Just wait till you see my questions. Now I gotta go into a little bit of your story and then I want to talk at length with you about conversational design, what you see the future of AI in healthcare looking like and a few other topics as we meander about. I'm going to come clean right now that you told me something on a call probably two months ago and ever since I have been repeating what you said when it comes to conversational design and thinking very deeply and very intentionally about the phrases that voice agents say. But before we get into all of that and we can rehash that conversation hopefully for the better, I want to hear about how you started in tech.
Eliott Hoffenberg:
So my family is in tech already, so I grew up with a dad who started his company in tech when he was probably 25 or something. So we always had AI in mind at my house; we were speaking about it at so many dinners 10 years ago. So I always had this in mind, though I didn't know it was going to have such an impact a few years later. But we were speaking about this all the time. Then I went to study at a business school in London, at the LSE, which was a very corporate finance bro school. I was very frustrated about being in that space. So you don't strike me as the type work at. I started an incubator there, so raised some money, started working with some startups, and again I got a bit frustrated about working with startups, not doing a startup.
Eliott Hoffenberg:
So when I graduated I went directly to SF, spent some time there, learned about all the great ideas and how to make them great companies and change the world that way. That's what inspired me to work on Vocca.
Demetrios:
So you got the bug. You said, entrepreneur is a French word, I am going to start a company and move back to Paris. And where did the inspiration for Vocca come from? You mentioned being in SF, but that doesn't necessarily lead you into the healthcare space nor the voice agent space. So why did you land in that?
Eliott Hoffenberg:
I have a brother who started his company, again, very young and who has worked on it for almost all his life. And I wanted to find a topic where you can dedicate your life and it still makes sense. There aren't that many industries that make sense, that will never bore you, and healthcare is one of them. Then when I was in SF I was looking a lot at what was happening there. Everyone was speaking about AGI all the time, every single week had a new tech being released, and it was the time when the first voice orchestrators were being released. So everyone was demoing at hackathons the first things you could build with the likes of Vapi, Retell, Bland, all these tools where you could put in a prompt and start speaking with an AI that sounded 100 times better than anything that could exist with IVRs only a few months back. So it made sense that this was a space that was going to change the world in the coming few years, especially as most of the conversations you're having, most of the actual content that exists in the world, is more spoken than written.
Eliott Hoffenberg:
So you just have this immense potential in building conversational, multivoice conversations, and a lot of use cases that make sense. A lot of the people who were interested were people in healthcare, because this is where you have the most repeatable problems, and healthcare is the industry that is the most run by voice. You don't want to have a text chatbot or an intercom on your clinic. This is not the way you want to work. People always tend to call their clinic; they want to create that link. And it just makes sense: not only could you make clinics much more efficient, because this is a pain that they all have, none of them manage to hire enough, but at the same time there is a scope that is immense in building conversations that work in healthcare.
Demetrios:
It's funny that you say the culture isn't there for folks to talk through a chatbot because, a hundred percent, if I ever want to talk to my clinic, I call, and I didn't realize that until you said it. But it is not something that I want to go and get a chatbot for. I want to talk with somebody. And even if that's just a quick five-minute question and I potentially could find it on the website, I still call. Do you have any idea why that is?
Eliott Hoffenberg:
It's not clear even. I think the part that was even more surprising is when you think about comparable companies. All those companies, you should be able to find their emails somewhere on their website. And for healthcare, many of those don't display their email. You don't really know when you're going to get an answer when you send an email. And the main channel is just voice, for some reason. I think people are just used to having receptionists at the front desk when they go in a clinic. So they know that because there's someone there, their call might be picked up. While if you buy on a Shopify store, there's no one sitting at the front desk of your Shopify store.
Eliott Hoffenberg:
I also think that the more subtle and critical the conversation is, the less you want to use a chatbot. So if it's something about your health that you don't fully understand, your medication, or something that you don't do that often, like booking a radiology appointment, you don't know exactly what you want or what you need. So you're also looking for advice. Chatbots are used only for support, and most of the time it's for things that aren't critical, that aren't so important. And the last thing is that, either way, healthcare has to be face to face, or at least most of it: you go to the doctor. So if you aren't happy about your healthcare experience, you're probably going to manifest it and tell your provider, who is then going to complain to the institution or the facility. While if you buy on a Shopify store, there's no one that you're actually going to shout at about the issue that you had with your support or the questions that you had that weren't answered.
Demetrios:
Now's probably a good moment for you to explain a little bit more about what you've built.
Eliott Hoffenberg:
So we build a product that helps any clinic automate all their conversations with AI. A lot of those clinics are overwhelmed by calls. They're getting inbound calls, and outbound calls that they have to do, for of course patient scheduling, patient intake, a lot of just support questions, explaining medical protocols. And they also spend a lot of their time just calling the patients, either preparing them for certain types of appointments, like explaining and answering all their questions, or following up with them to make sure that they're adhering well to the whole plan and their patient journey. So all of this is what is supposed to be done by a receptionist. And a lot of the time she's, or he's, just not able to do the work. And this is where we come in and assist with conversational AI that works, and works with the receptionist.
Eliott Hoffenberg:
So they still have like a human in the loop for any crucial topic.
Demetrios:
Now, do you have data on the majority of the calls? Because I know that nine times out of ten when I call a receptionist, it's probably some simple question, and it's like, I know I'm supposed to fast, but am I supposed to stop eating at 8pm or is it 8am? I can't remember.
Eliott Hoffenberg:
It depends a lot on the type of clinic that you have. So in the radiology practice, probably 20% of the questions are about the protocols that you have to do before coming to the clinic. So do I have to fast? I have a pacemaker. What should I do? Or a lot of those just questions that aren't necessarily out there or that you just want to check with the exact way that your clinic works. Because everyone has their own specialty and their own way of working. Every provider. Then of course, there's probably anywhere between, depending on the clinic, between 40% to 60% that are for patient scheduling or anything that is like cancellation confirmation.
Eliott Hoffenberg:
A lot of people, surprisingly, actually probably 10% call just because they forgot the time of their appointment. And even if it's like somewhere in their email or whatever, they just still call to make sure that they noted the thing. That's amazing. And then that's me too.
Demetrios:
I've done that.
Eliott Hoffenberg:
Depending on the size of the clinic, it varies a lot as well, and the nature. So for primary care, there's a lot of prescription refills. There are a lot of questions where you just don't know if you should go to see your provider for X or Y reason, because you're bleeding or you're having some stomachache and you just don't know if it's a good time to come. So there's a lot of triage that comes there, knowing when it is necessary to see your provider and when it's not, and deciding this is a lot of the work that the receptionists do as well, working as a counselor.
Demetrios:
So let's talk for a minute about the conversational design, because that is my favorite, especially since there are certain things that are notoriously hard when it comes to this type of agent, and I think one of them is complex names like my name is Demetrios. It's kind of hard for these agents to understand me, especially when I say my full name, which is Demetrios Pattison Brinkman. Then it's like, what? I didn't get that. John. I have a friend who is also building a voice agent in a different space. And his name is Ben. And he didn't even realize how complex it was with the names and being able to spell them until he put his voice agent in production and got people using it. So what are some of these gotchas that you found, and how have you attacked them in different ways and made it more reliable?
Eliott Hoffenberg:
It's the hardest part about AI: you're getting a demo that makes sense so easily, but the last 5 to 10% are horrible to get done and take to perfection. Names are of course one of the toughest parts. And now try to imagine email: gathering an email is such a tough part because the way that you dictate it is very different depending on the person, and there are a lot of different characters.
Demetrios:
Emails with underscore or dash or whatever it may be that it. And then the letters. And you don't know if it's a letter or If it's a number. All of that.
Eliott Hoffenberg:
So one of the issues we have is: imagine that your email is the "demetrios one", and you say it that way. How do we make sure whether "one" is O, N, E or just a number? And these are the tiny details that you can spend hours, days or weeks trying to fix or just to optimize. And so on this part, there's a lot of selection in picking the right models at the right node. And the way that we think about conversational AI is that sometimes the less AI, or the less generation, you put in, the better the conversation becomes. Because the main thing you're trying to optimize for, especially in a space like ours, which is healthcare, which is mission critical, is that you really don't want to have any hallucination of any kind.
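The "one" versus "1" ambiguity described here can be made concrete with a small sketch. This is not Vocca's implementation; the token table and function names are hypothetical, and it assumes the speech recognizer already returns the dictated email as space-separated words. The idea is simply to enumerate every written form the words could stand for, and to ask the caller for confirmation whenever more than one candidate survives.

```python
from itertools import product

# Hypothetical table: spoken tokens and the written forms they may stand for.
SPOKEN_FORMS = {
    "one": ["one", "1"],
    "two": ["two", "2"],
    "underscore": ["_"],
    "dash": ["-"],
    "dot": ["."],
    "at": ["@"],
}


def email_candidates(spoken: str) -> list[str]:
    """Return every written email the dictated words could stand for."""
    options = [SPOKEN_FORMS.get(tok, [tok]) for tok in spoken.lower().split()]
    return ["".join(combo) for combo in product(*options)]


def needs_confirmation(spoken: str) -> bool:
    """True when the dictation is ambiguous and the agent should read it back."""
    return len(email_candidates(spoken)) > 1
```

With this table, `email_candidates("demetrios one at example dot com")` yields both `demetriosone@example.com` and `demetrios1@example.com`, so the agent would read the address back rather than guess.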
Eliott Hoffenberg:
You need to have some deterministic parts, while keeping the experience of having a great conversation where you feel that you're being heard and you're being understood. And this is really what's clear. So what we're doing is a lot of mapping. And I think all the people who are succeeding right now at getting agents into production in AI and voice are doing it with a mixture: creating a spine that you build with some flows, with some trees, and with a lot of different agents that you make work together. So one agent's only job will be to collect the name and make sure that it has the right one. And at each step of the conversation you might use different models. So you might choose models that are slower but with higher accuracy and that are specialized for the node at that moment, say for name or email recognition, and a different model for speech recognition when you're picking up other words that are just conversational, one that just picks them up fast, if you want it to work.
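One way to picture "different models at different nodes" is a lookup from conversation node to recognition profile. The sketch below is illustrative only: the node names, model identifiers and latency budgets are made up, and no real speech API is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SttProfile:
    model: str           # hypothetical model identifier, not a real product name
    max_latency_ms: int  # latency budget accepted at this step


# Assumed profiles: a fast general model and a slower, entity-focused one.
FAST = SttProfile(model="fast-general", max_latency_ms=300)
ACCURATE = SttProfile(model="accurate-entities", max_latency_ms=1500)

# Only the hard nodes (names, emails) pay the extra latency.
NODE_PROFILES = {
    "collect_name": ACCURATE,
    "collect_email": ACCURATE,
}


def profile_for(node: str) -> SttProfile:
    """Pick the recognition profile for a conversation node, defaulting to fast."""
    return NODE_PROFILES.get(node, FAST)
```

Keeping the mapping as plain data means the deterministic "spine" of the conversation decides which model runs where, instead of leaving that choice to a generated response.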
Eliott Hoffenberg:
And then there are a lot of small artifacts and just engineering stuff that you do so that it feels like there's no waiting time. And you make sure that everything is well covered by putting in filler words and sentences that will be spoken while you're actually processing something that takes more time. But you have to have this thing where you're nudging someone to speak in the way you want them to speak. So you nudge them into the format that you actually want, specialize a model for that format, take the time, and create the artifact so that the person does not even feel that it's taking longer for that harder task.
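The filler-sentence idea can be sketched with `asyncio`: start the slow work, and only speak a filler if the work has not finished within a short threshold. The function name, the filler text and the threshold are assumptions for illustration; `say` stands in for whatever text-to-speech call a real system would use.

```python
import asyncio


async def respond_with_filler(work, say, filler="Just give me a second.",
                              threshold_s=0.05):
    """Run `work`; if it is still running after `threshold_s`, speak a filler first."""
    task = asyncio.create_task(work)
    done, _pending = await asyncio.wait({task}, timeout=threshold_s)
    if not done:
        # The slow path: cover the silence while the task keeps running.
        await say(filler)
    result = await task
    await say(result)
    return result
```

A fast lookup answers directly, while a slow one is preceded by the filler, so the caller never hears dead air on the harder steps.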
Demetrios:
What does that look like in practice?
Eliott Hoffenberg:
In practice as a patient or in practice in value.
Demetrios:
What would I hear? How would I be nudged?
Eliott Hoffenberg:
Take giving a date: we would ask you to give your date of birth. But there are many ways of giving your date of birth, and the number of people who forget to give the year of their date of birth is surprising when you think that it is so simple. So you have to say it in a sentence like, give me the date in that format, please, just so you make sure that you're getting it in that format. And we tend to not give it at first, just: give me your birth date.
Eliott Hoffenberg:
If we get it straight up, we don't even ask, but if there's a doubt, we create a sentence like: we're not sure we got it in the right format, do you mind giving it to me in the following format, or just in that way. So you really need to nudge the person into speaking the way they have to speak. And you have to do it in a way that still feels very natural.
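The "ask openly first, give the format only as a fallback" pattern can be sketched as follows. This is a simplified illustration, not the production logic: it assumes the transcript has already been normalized to digits, that the clinic uses day-first dates, and the prompt wording is invented.

```python
import re
from datetime import date
from typing import Optional, Tuple

OPEN_PROMPT = "What is your date of birth?"
# Hypothetical fallback wording; only used when the first answer is doubtful.
FORMAT_PROMPT = ("I'm not sure I got that in the right format. "
                 "Could you give it as day, month and four-digit year?")


def parse_birth_date(text: str) -> Optional[date]:
    """Accept only a complete day/month/four-digit-year date; otherwise None."""
    m = re.fullmatch(r"\s*(\d{1,2})[/\-. ](\d{1,2})[/\-. ](\d{4})\s*", text)
    if not m:
        return None
    day, month, year = map(int, m.groups())
    try:
        return date(year, month, day)
    except ValueError:  # e.g. 31 February
        return None


def birth_date_turn(heard: str) -> Tuple[Optional[date], Optional[str]]:
    """Return (date, None) when the answer is usable, else (None, fallback prompt)."""
    parsed = parse_birth_date(heard)
    return (parsed, None) if parsed else (None, FORMAT_PROMPT)
```

A missing year or a two-digit year falls through to the format prompt, which matches the cases mentioned in the conversation.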
Demetrios:
I like that you do it as a fallback; you don't lead with that. Because there's a potential that I am going to say the date and the year. If somebody asked me my birth date, I would say it, but then I would probably say the last two numbers of the year, not the full year. And that could potentially confuse the hell out of a model. One other thing that I think about is, you mentioned how you don't want folks to feel like it's going unnecessarily slow. There are times when I can imagine you are using a bigger model and it is taking longer. And what happens with calls and voice agents that is distinctly different from when you're interacting with a computer agent, or API agents, or whatever we want to call them, is that you don't get to see it thinking.
Eliott Hoffenberg:
Yes.
Demetrios:
And so how do you approach conversational design? You're not telling the patient, I imagine, okay, I am the model. I'm going to grab this tool and now I'm going to start looking at your records. Or are you giving it updates periodically?
Eliott Hoffenberg:
That's what we do. So: just give me a second, I'm going to be looking at the availability, if you don't mind waiting for a few seconds. But models are still quite fast now, and even the largest one will take at most a couple of seconds. So you just give the explanation and show that the person is thinking. That's a great way of working and communicating with the patient. The other thing that works quite well, if ever you need to have some pause, is having some background noise. Because if there's some background noise, you know that there's still something happening.
Eliott Hoffenberg:
If you just have a complete blank, some patients could freak out and think that maybe the problem is that we got disconnected.
Demetrios:
That's a nice little hack. Wait, background noise as in background music or just like people shuffling around in the background or birds playing, chirping.
Eliott Hoffenberg:
People shuffling around. Like you feel like you're calling in the clinic so you have like the.
Demetrios:
Oh, my God.
Eliott Hoffenberg:
Mild noise that is happening. That would feel very natural, that you're in a clinic.
Demetrios:
That is incredible. That is such a nice little hack. And it's so funny how all of this stuff that you're talking about is really trying to help the end user understand what they are dealing with and make it more natural. It's like you're coming to them instead of asking them to come to you. And I can imagine that's why the experience feels magical.
Eliott Hoffenberg:
And as humans we do this all the time. And even now, a lot of the end users who are calling do not even realize it's an AI yet. There are a lot of things that are made very deterministic. But we as humans, when we work on a very specific task, such as scheduling, want it to be somewhat deterministic, so you're expecting something. And as a human who knows your job very well, you're also going to communicate in the way that you prefer, because you already know and predict how the other person is going to answer. And the great thing is, with AI you can do this with maximal data. And once it's learned and once it's made, it's always going to stay at the same quality. And anytime there's something that goes a bit off to the side or that isn't predicted, you can always take a look and see how you can evolve, to prepare for having, after a few years, the most complete reception that you could ever dream of.
Demetrios:
Let's talk about that for a second. Where do you see this going? If you execute properly, what is the next few years look like?
Eliott Hoffenberg:
The way we see healthcare is that it has two main components: diagnosis and communication. And anything that goes around diagnosis is just people telling you and reassuring you and explaining things to you and doing stuff for you. And much of this can be done with conversational AI, without you even having to go to the clinic to get any knowledge that you didn't get, and without having to speak with the doctor about something where that isn't required. So our goal, and the way we see this company, is that you manage to get all the information that you should get on the clinic, understanding how it works, understanding how you should behave, and do all the tasks that are quite standard. That will help the patient get all the information he wants, get all the care he needs, and make sure that you still do everything that is needed to have the best patient journey and experience. So this is really how we see it: 24/7 agents, available with all the knowledge, and fully available for all the use cases that can be done in the medical space.
Demetrios:
I love the vision. Now you did tell me before we hit record that you have a bit of a framework. What is this framework for creating agentic companies?
Eliott Hoffenberg:
I guess there's a lot of misconception in the way people think about agents. Even our customers, for a lot of them the first question they have is, what's your latency?
Demetrios:
Or whatever.
Eliott Hoffenberg:
And this is really not what will differentiate a great agentic company from an average one. You can build a great demo in seconds, but to build a great product, it's a lot of the details, which you build in four main layers that we think you need to have in mind. The first one is the tooling layer. So you need to give all the integrations that you can to your agents, and the depth of these integrations and how you think about all the edge cases is really what matters the most. You can't have a receptionist who doesn't know how to use your EHR or any software that you use.
Eliott Hoffenberg:
The second part that is truly important is the alignment layer: how do you create a product that enables collaboration with the people who are still going to use it? So in our case, the receptionist will still be pinged whenever there is an emergency or any task that requires human assistance. And there are many, like some billing changes that can be required. So you need to have this alignment and collaboration layer where you can design the agents the way you want, in a way that is very industry specific. So you don't have to create a prompt; you just click on a button depending on the configuration you want. The third part is the industry layer: having all the knowledge and all the differences and intricacies of each industry. In our case, the medical space, that means having the keywords boosted for the main vocabulary that exists in that space.
Eliott Hoffenberg:
Having all the knowledge bases that make sense for each vertical. And the fourth one is the observability layer, where you have a really good monitoring play, really good post-ops that help you not only track that you're making no mistakes in a space where you can make no mistakes, but also catch all the things that aren't perfect. You need to have a way of seeing only those, because you're doing so much volume, like we're doing millions of calls. I need to know where the places are where you can improve and where there could be any bug.
Demetrios:
I love this framework.
Eliott Hoffenberg:
That's kind of the framework that we have. So to sum it up, it's the tooling, alignment, industry and observability.
Demetrios:
I already told you that I'm going to steal this and parrot it. But I will make sure to give you credit because this framework is very special in the way that you are thinking not only about the agent itself, but this is the first time I've heard someone talk about how the humans interact with the agent and making sure that whatever your software is, the agent is almost abstracted away so that the human who is the end user of this software, not the end user of the agent. Because you almost have like many different stakeholders.
Eliott Hoffenberg:
And I think this is something that people do not realize. How important this is is that our end user is like an admin person at a clinic. And they know everything of how the receptionists work and how this provider can only have appointments in the morning, but like on Tuesday he only wants to have surgeries so he needs to pick it up. And that for this kind of surgeries you shouldn't be able to take an appointment a few days before. So you have to like block it for three days. And a lot of this has to be translated in the prompt. But if you ask them to touch a prompt, be sure that your agent will never work.
Eliott Hoffenberg:
So you need to find a way of aligning what they want and their way of working with the way that the agent understands the world, and make sure that it's going to behave that way. So it's adding some deterministic layer but also prompting some of the parts, and all of this has to be abstracted. And we're really an abstraction company. And I think a lot of the agent companies are abstraction companies.
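The scheduling constraints mentioned above (mornings only, Tuesdays reserved for surgeries, no surgery bookings within three days) are the kind of thing a deterministic layer could hold as configuration rather than prompt text. The sketch below is one hypothetical shape for that; the field names and appointment kinds are assumptions, and the rules mirror only the example given in the conversation.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta


@dataclass(frozen=True)
class ProviderRules:
    """Hypothetical settings an admin could toggle in a config panel."""
    morning_only: bool = True
    surgery_only_weekday: int = 1   # Monday is 0, so 1 means Tuesday
    surgery_min_lead_days: int = 3  # surgeries cannot be booked closer than this


def slot_allowed(rules: ProviderRules, kind: str, slot: datetime, today: date) -> bool:
    """Deterministically check one candidate slot against the provider's rules."""
    if rules.morning_only and slot.hour >= 12:
        return False
    if slot.weekday() == rules.surgery_only_weekday and kind != "surgery":
        return False
    lead = slot.date() - today
    if kind == "surgery" and lead < timedelta(days=rules.surgery_min_lead_days):
        return False
    return True
```

Because the check is plain code, the admin never touches a prompt, and the agent can only offer slots that pass it.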
Demetrios:
Wow. Thinking about that and how you help the what I guess traditionally has been referred to as the subject matter expert, that user of your product, not the end user who's interacting with the agent, the user who is almost building the agent, but in this abstracted way. You want to make it very easy for them in the GUI to create, to click around. And what is happening on the back end is you are creating those prompts that you know are hyper optimized and will get things right.
Eliott Hoffenberg:
My co-founder, Hugo, he's obsessed with Apple, and he was speaking to me about it today, as we're working on some of the config panels. And the great thing about Apple is that, I don't have the exact number in mind, but if you combine it all together, there are 20,000 different ways of setting up your iPhone. But 60% of the users never change one setting. And this is the type of thinking that we need to have, and that we're having all the time: how do we build the ideal setup so no one will ever have to touch anything? But if they want, every single thing is detailed. That's kind of our goal.
Demetrios:
Wow, impressive. We spoke previously about architecting conversations in different ways. You mentioned all these almost psychological ways to help nudge the user. One of the tricks, I think I remember you saying was that you sometimes make the agent sound less capable so that people will speak more clearly and slow down in their speech pattern. Can you go into that?
Eliott Hoffenberg:
So there are a few things that are kind of counterintuitive, that technology has enabled, but if you want it to work well, you have to disable them. So one of the main things that was shocking to us is that we disabled interruption. A lot of people thought that interruption was one of the great things that would make conversational design better: if the end user speaks, the AI will stop. But it would create a lot of problems, because the agent didn't really know whether it had to repeat the whole thing it was saying from the beginning, whether it didn't have to repeat it because the end user had basically said that he got it, or whether it had to continue from the moment that it stopped. So it's a lot of choices that would create a lot of problems, especially when we give medical instructions.
Eliott Hoffenberg:
You want to make sure that everything is being said. And the truth is I think us humans, even if we quite often speak on each other.
Demetrios:
We speak over each other a lot.
Eliott Hoffenberg:
We speak over each other. So if I'm in the middle of my sentence and you say a few words, you say okay, that's right, I will still continue my sentence. And so we stopped the interruption, which isn't so intuitive, or we really restrained it. Also, the agent will limit the amount of information that it collects when it is speaking. When the agent is saying stuff and you're speaking on top of it, we realized that a lot of patients, or end users in our case, would be speaking to the person next to them about something. You're taking an appointment for your daughter, and I'm giving you the appointment date, and you start speaking with her, like, 9pm works for you? Or, are you free at 10am? or whatever. So we had to add a lot of logic so that you still take the information but, in almost 90% of the cases, disregard whatever is said while you're speaking. So there are a lot of little tricks like this that you create so that you can actually have a great experience that is reliable.
Eliott Hoffenberg:
And we made dozens of things like this that really makes a difference one by one.
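The overlap handling described here can be sketched as a small policy function. Everything in it is illustrative: the acknowledgement list, the action names and the `allow_interrupt` flag are assumptions, and a real system would work on audio events rather than finished strings.

```python
from dataclasses import dataclass

# Hypothetical short acknowledgements that should never stop the agent.
ACKNOWLEDGEMENTS = {"ok", "okay", "yes", "yeah", "right", "that's right"}


@dataclass
class AgentTurn:
    text: str
    allow_interrupt: bool = False  # off by default; enabled only for long speeches


def handle_overlap(turn: AgentTurn, caller_text: str) -> str:
    """Decide what to do with caller speech heard while the agent is talking.

    Returns "ignore", "keep_for_context" or "stop_and_listen".
    """
    heard = caller_text.strip().lower().rstrip(".!?")
    if heard in ACKNOWLEDGEMENTS:
        return "ignore"            # backchannel: keep talking
    if turn.allow_interrupt:
        return "stop_and_listen"   # rare case: a long utterance that may be cut
    return "keep_for_context"      # log it, finish the sentence, do not act on it
```

Side conversations such as a parent checking a time with their child land in `keep_for_context`, so the agent finishes its sentence instead of restarting or acting on words that were not addressed to it.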
Demetrios:
These little hacks that add up. And I love how you call out. It's unintuitive because I never would have thought that the getting rid of the interrupt would make a better experience. The interrupt is, like you said, one of the best parts about voice AI is you can interrupt it or you can do these, like, these things that you could never do before. And really then you come to find out it only adds more complexity and more headache. And so considering you're not going on these monologues for 20 minutes, I imagine. Exactly. It's okay.
Eliott Hoffenberg:
Short sentences, always.
Demetrios:
It doesn't matter if the interrupt is turned off because it's only going to be a few seconds before you stop talking.
Eliott Hoffenberg:
So there's always some arbitrage. We put the interrupt only in some places where you have to give a very long speech, but it's very rare, so we almost never put it.
Demetrios:
All right, well, I appreciate this conversation. As you know, I'm a huge fan and I believe you guys are hiring. What roles do you have open right now?
Eliott Hoffenberg:
At Vocca, we're looking for software engineers who like small tricks. So this is really the type of people we like: hackers in their mindset, who want to do product end to end and not only play with the technology. So we're hiring quite a lot. So feel free to reach out directly on my email. It's Eliott, eliott@voccaai. Anyone who'd be interested, I would love to have a chat.
Hosted by

Demetrios Brinkmann
Host, AI Minds
Demetrios founded the largest community dealing with productionizing AI and ML models.
In April 2020, he fell into leading the MLOps community (more than 75k ML practitioners come together to learn and share experiences), which aims to bring clarity around the operational side of Machine Learning and AI. Since diving into the ML/AI world, he has become fascinated by Voice AI agents and is exploring the technical challenges that come with creating them.