About this Episode
"There’s so much value locked in conversations, but most of it just disappears the moment the call ends. We’re trying to fix that." — Andy Paul
Andy Paul, Founder at Voyce AI, is on a mission to turn spoken conversations into durable, searchable knowledge for teams. Voyce AI uses Deepgram Voice AI to capture, structure, and surface insights from calls and meetings so people can focus on talking, not typing.
Listen to the episode on Spotify, Apple Podcasts, Podcast Addict, or Castbox. You can also watch this episode on YouTube.
In this episode of the AI Minds Podcast, Demetrios sits down with Andy to unpack how Voyce AI was born from years of seeing important information lost in unrecorded or unstructured conversations. Andy walks through his journey across sales and startup roles, and how repeatedly watching teams struggle to recall “who said what, when” crystallized the problem he wanted to solve.
He explains how Voyce AI captures calls, transcribes them with high accuracy, and layers structure on top - so users can search, summarize, and repurpose insight-rich moments without digging through raw audio. The conversation also explores why voice is such a natural interface for work, how to design AI that quietly supports rather than distracts, and what it takes to find genuine user pull in a crowded AI tools market.
Andy and Demetrios dig into real-world workflows where voice AI shines, from customer calls and discovery interviews to internal syncs and content creation pipelines. They also look ahead at where the voice AI ecosystem is heading, why reliability and data quality matter more than hype, and how infrastructure improvements are making more ambitious products possible.
Listeners will learn about:
- Why so much critical context from sales calls, customer interviews, and internal meetings gets lost, and what it takes to capture it without extra work.
- How Voyce AI turns raw conversations into structured, searchable knowledge that teams can actually reuse.
- The product and UX decisions behind building voice-native workflows that support people instead of distracting them.
- Real-world use cases where voice AI has a clear ROI, from customer calls to content creation pipelines.
- How Andy thinks about accuracy, structure, and trust as the foundations of any serious voice AI product.
Show Notes:
00:00 Startup Journey: Tech Roots & Early Projects
06:40 Y Combinator & Team Relocation
14:15 Early Focus: Energy Flexibility Optimization
21:05 Discovering the Communication Bottleneck
28:03 Building AI Voice Agents for Real-World Operations
34:55 80/20 of Field Service Calls: Bookings & Scheduling
41:13 Network Effects and Community Building
45:01 Why Vertical Focus Wins—For Now
52:00 Onboarding, Baseline Data, & Success Metrics
More Quotes from Andy:
"If AI adds more work instead of removing it, people will stop using it - no matter how impressive the demo looks." — Andy Paul
"Voice is the most natural interface we have. The challenge is turning what’s said into something teams can search, share, and act on.." — Andy Paul
"We don’t want to replace human judgment; we want to give people a perfect memory of what actually happened in the conversation." — Andy Paul
Transcript
Demetrios Brinkmann [00:00:07]: Welcome back, everyone. This is the AI Minds podcast. I am your host, Demetrios, and we are exploring the companies of tomorrow being built AI-first. This episode, like every episode, is brought to you by Deepgram, the number one text-to-speech, speech-to-text, and voice API on the internet today—trusted by the world's top enterprises, conversational AI leaders, and startups like Spotify, Twilio, NASA, and Citibank.
We're joined today by Andy, the CEO of Voyce AI. How you doing, dude?
Andy Paul [00:00:48]: I'm doing great, thanks. Thanks for inviting me, Demetrios.
Demetrios Brinkmann [00:00:52]: You’ve had a bit of history in the consulting realm. What were you doing in data analytics that brought you to San Francisco?
Andy Paul [00:01:02]: Yeah, so I'm from the UK, as you can probably tell from the accent. I’ve been in the tech space for quite a while. I studied a Master’s in Computer Science at university and was super interested in data analytics even back then—actually from school days before university.
Fun fact: I built an Uber-like app when I was about 17 or 18, in Delphi, a really old-school language. It obviously wasn’t Uber, and it definitely doesn’t look anything like what Uber is now, but it was a pretty fun project.
Anyway, data analytics. I’ve been in that space for most of my career in consulting. I moved to San Francisco in 2015 after having worked in London for about four years in an analytics role. The work was in people analytics—how people data can be used beyond just pay benchmarking or headcount by location and grade.
It was quite a new space at that time because using machine learning and data science in HR was still novel. Around 2015 it was really picking up—you had the likes of Andrew Ng at Stanford talking a lot about data science. With my technical background, I was trying to build statistical models, mostly regression models, to predict the likelihood that somebody might leave an organization.
That’s super valuable for a company—if you can figure out, “Is Demetrios going to leave, and when?” and get ahead of it, you can potentially retain that person and have a proper conversation. “Hey, just checking in—are you doing okay?”
Demetrios Brinkmann [00:03:22]: So it’s like churn prediction, but churn on the company side.
Andy Paul [00:03:27]: Absolutely—employee churn.
Demetrios Brinkmann [00:03:31]: And what, were you just monitoring what they were saying on LinkedIn?
Andy Paul [00:03:36]: No, this was more about data from internal systems and different inputs: tenure, for example. If you’ve been at the company for a certain number of years, that might be a good predictor that you’re thinking of leaving soon. It could also be employee surveys where people respond in certain ways and indicate they’re not very satisfied with their work.
Demetrios Brinkmann [00:03:57]: Last time they got a raise, maybe?
Andy Paul [00:03:59]: Exactly—time since last raise, how long they’ve been in the role, and so on. These are indicators that someone might want to move on. The question is: how do you retain them if you want to keep them?
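The attrition model Andy describes can be sketched as a small logistic scorer over indicators like tenure, time since the last raise, and survey satisfaction. The feature names, weights, and bias below are hypothetical illustrations, not his actual model:

```python
import math

# Hypothetical weights -- illustrative only, not the real production model.
WEIGHTS = {
    "tenure_years": 0.15,         # longer tenure nudges leave risk up here
    "months_since_raise": 0.08,   # a long gap since the last raise raises risk
    "survey_score": -0.6,         # higher satisfaction (1-5) lowers risk
}
BIAS = -1.0

def leave_probability(employee: dict) -> float:
    """Logistic model: P(leave) = sigmoid(w . x + b)."""
    z = BIAS + sum(WEIGHTS[k] * employee[k] for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

# Two made-up employees: one with several risk indicators, one without.
at_risk = leave_probability(
    {"tenure_years": 4, "months_since_raise": 24, "survey_score": 2}
)
happy = leave_probability(
    {"tenure_years": 1, "months_since_raise": 3, "survey_score": 5}
)
```

In practice the weights would be fit by regression on historical leaver data rather than hand-set, but the scoring step is the same.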
Demetrios Brinkmann [00:04:12]: So that was your time in San Francisco. You went back across the pond, went home, and decided to do what?
Andy Paul [00:04:20]: I moved back with the same company and continued working in consulting, but now for UK clients, for a couple of years. Then I think I just picked up the entrepreneurial startup spirit from San Francisco.
During COVID, I set up an online business. I think everyone was questioning their lives around that time. I decided to follow an entrepreneurial path on the side of my job. It was super cool because I made all the mistakes and learned a lot about how to do business—how to sell, trade, understand buyer behavior online, and things like that.
When I came back to the UK, I still had that entrepreneurial bug. The business ran for about 18 months after COVID and then died, to be frank. It was super interesting and I learned a lot. I knew it was going to die, but I wanted to do something closer to home—closer to my core skill set.
So in 2022, I set up an AI and automation company. You had tools like Zapier, Make.com, Power Automate—these automation tools doing really cool stuff. I thought, “What would it look like if I sold these to small businesses?”
AI was still new. I had a first use case around machine learning and OCR to read purchase orders. A company asked if we could train a model to extract data from purchase orders because they kept processing invoices manually. We built a combination of a machine learning model and an automation workflow to generate invoices automatically in something like QuickBooks. It was pretty cool and quite novel at the time.
Demetrios Brinkmann [00:06:28]: Super valuable.
Andy Paul [00:06:29]: Super valuable, and this was pre-ChatGPT, pre-OpenAI as we know it now. LLMs weren’t widely available yet, but it was really cool to be part of that early wave.
Demetrios Brinkmann [00:06:43]: And then what happened?
Andy Paul [00:06:44]: Then AI really took off. ChatGPT came along—OpenAI launched its GPT models. At first, tools like Jasper were using them to help companies write blogs. That was already cool. But once they opened the platform, it became super viral. They reached a million users in just a handful of days—the fastest-adopted software in history.
For me, the timing was perfect. Companies I worked with in the UK started asking, “Andy, what’s AI? What is this thing? How can we use it?”
Demetrios Brinkmann [00:07:33]: And you were the guy.
Andy Paul [00:07:35]: I kind of became that guy. It fell into my lap and I timed it well—but there was definitely chance and luck involved too, if I’m honest.
So I went around figuring out what large language models were for myself, built random stuff, and became a consultant. I started speaking at events: “Here are the use cases. Here’s how you can use AI in retail, in legal, in different industries.” It was fun to both speak about it and build things for people.
Fast forward to last summer, and I started to see voice AI really take off. First you had large language models, then more voice-based applications. You had tools where you could send text and they’d read it out with realistic voices.
We all knew Siri and Alexa, which—let’s face it—are okay but not amazing. Then you had voice models that could speak much more naturally. That whole category of voice + AI was super fascinating. I started seeing YouTube videos about infrastructure companies like VAPI and LiveKit really taking off.
So I decided to build a voice agent for myself. I showed it to my customers and at my talks, and people were blown away. By November, I had my first MVP of a voice agent for customer support. That was the start of the product that became Voyce AI.
Demetrios Brinkmann [00:09:31]: Okay, so tell me more about what Voyce AI is now. What did that morph into?
Andy Paul [00:09:36]: Voyce AI—V O Y C E—is a voice agent for customer support functions in businesses and organizations. We help unlock 24/7 service instead of just in-office hours. The agent can answer calls and handle lower-tier, tier-one type inquiries.
For example, we sell into government. There’s a local authority business support hotline for local businesses in a certain region of the UK. Businesses call to ask things like: “Can I get a grant?”, “I need mentoring support,” “I need help with sales and marketing.” These are very common questions.
They have staff manning those phones from 9 to 5 or 5:30pm, but after that the phone lines are closed and no one can help. That’s where the voice agent steps in. It can be preloaded with FAQs and also send resources via SMS. If a caller says, “I want information about grants,” the agent replies, “No problem, can I send you a text to this number?” Then it sends a link to the relevant website with resources.
It also offers to connect them to a human later. It can say, “If you’d like to speak to someone, I can take your details and have someone call you back during office hours.” It captures details like name, phone number, email, and so on.
That information is captured as a lead and automatically sent into the office so the team can follow up. It’s also a good qualification mechanism because not all businesses qualify, so there are qualification questions too. You end up with a very hands-off but effective service layer for organizations.
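The after-hours flow Andy walks through can be sketched as: match the caller's question to an FAQ topic, offer to text a link, otherwise capture a callback lead. The topic keywords, URLs, and lead fields below are illustrative assumptions, not Voyce AI's actual implementation:

```python
from dataclasses import dataclass

# Hypothetical FAQ topics and resource links for a local-authority hotline.
FAQ_LINKS = {
    "grant": "https://example.gov/business-grants",
    "mentoring": "https://example.gov/mentoring",
    "marketing": "https://example.gov/sales-marketing",
}

@dataclass
class Lead:
    """Callback details captured when no FAQ answer fits."""
    name: str
    phone: str
    topic: str

def route_inquiry(utterance: str) -> tuple:
    """Return (spoken reply, SMS link or None) for a caller utterance."""
    text = utterance.lower()
    for topic, url in FAQ_LINKS.items():
        if topic in text:
            return (f"No problem, can I text you a link about {topic}?", url)
    return ("I can take your details and have someone call you back "
            "during office hours.", None)

reply, link = route_inquiry("I want information about grants")
lead = Lead(name="Demetrios", phone="+44 20 7946 0000", topic="grant")
```

A production agent would use the language model itself for intent matching rather than keyword lookup, but the branch structure, answer, text a link, or capture a lead, is the same.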
Demetrios Brinkmann [00:12:02]: What I find fascinating about all of this is human nature and how at ease we are when we talk to something, as opposed to just searching.
You could get the same information by searching the internet and finding the FAQ, but a lot of us like to call and talk to someone, explain what we’re going through, and then—even if we just get a text with the FAQ—we still feel heard and understood.
Andy Paul [00:12:43]: People ask things in different ways. If you type into Google Search or even ChatGPT, you may not always get the same answer you would expect. But the power of AI and large language models is that you can ask in many different ways and there’s a good chance it will still surface the right resource.
Previously, on old-school phone systems, you’d “press 1 for this, press 2 for that.” It’s rigid and the experience is very friction-heavy.
Demetrios Brinkmann [00:13:23]: That’s one way of describing it.
Andy Paul [00:13:25]: Yeah. With a voice agent that can actually speak to you and sound almost human—let’s be honest, it’s not perfect yet, but it’s close—it’s a much better experience.
Demetrios Brinkmann [00:13:36]: And it can ask qualifying questions and clarify what you’re looking for. If you phrase something in a way that wouldn’t match the “right” FAQ on the first attempt, it can ask a follow-up question to get closer to what you need.
Andy Paul [00:14:01]: Absolutely. The idea of reasoning in models is super fascinating. Voice models are starting to get there. If you use the right model, it might take a bit more processing time, but it can say things like, “Hang on a sec, let me look that up,” or “Let me think about that.” It pauses, thinks, and gives you a more thoughtful response, instead of something instantaneous but low quality.
The market is moving in that direction: tier-one questions will be handled, and then tier-two, where more context is required—like referencing a booking system to look up available appointments—is the next step up. That’s where a lot of this is heading.
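The "say something while the model thinks" pattern Andy mentions can be sketched with a timeout: if the answer isn't ready within a short window, play a filler phrase so the caller knows the agent is still there. `speak`, `slow_model`, and the timings are illustrative stand-ins for a real telephony and TTS pipeline:

```python
import asyncio

async def speak(text: str) -> None:
    print(f"AGENT: {text}")  # stand-in for text-to-speech playback

async def slow_model(question: str) -> str:
    await asyncio.sleep(1.0)  # simulate a reasoning model taking its time
    return "Your nearest grant office is open Monday to Friday."

async def answer_with_filler(question: str, filler_after: float = 0.3) -> str:
    task = asyncio.create_task(slow_model(question))
    try:
        # If the model responds quickly, skip the filler entirely.
        # shield() keeps the timeout from cancelling the underlying task.
        return await asyncio.wait_for(asyncio.shield(task), filler_after)
    except asyncio.TimeoutError:
        await speak("Hang on a sec, let me look that up.")
        return await task

reply = asyncio.run(answer_with_filler("Where is the grant office?"))
```

The key design choice is that the filler only fires when latency crosses a threshold, so fast answers stay snappy and slow ones still feel attended to.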
Demetrios Brinkmann [00:14:48]: And when you think about it, we could go to a website and book something—say a hair appointment—but 99 times out of 100 I’d rather call the salon.
On the website, I might forget my password or not be registered yet, so I have to sign up. Then I’m clicking around, checking dates, jumping back to my calendar, getting distracted—and the booking just doesn’t get done.
When I call, I’m locked in and actually complete the task.
Andy Paul [00:15:33]: Exactly. That was one of our first prototypes. We integrated with a big hairdressing booking platform. You’d call and say, “I want to book at this time, with this barber, for this haircut,” and it would handle the whole booking process.
It’s phenomenal because you can automate that experience. Barbers don’t like walking away from a client mid-haircut to answer the phone. It’s not great for the customer in the chair, and they don’t really want to answer the phone anyway. But what if AI could do it?
So we’re building across sectors like hair salons, clinics, and similar service businesses.
Demetrios Brinkmann [00:16:21]: Have you discovered any nice conversational design tricks?
You mentioned earlier that when a reasoning model is thinking, you’ll have it say something like, “Let me look that up” instead of staying silent. It’s high stakes: if you make the wrong move, the customer hangs up. If nothing happens for 30 seconds, the customer wonders if they were disconnected.
We have no visual cues on a call, so the only feedback is voice and sound. Some people use background noise so callers know something’s still happening in the background while the model is working. Have you discovered anything there?
Andy Paul [00:17:28]: Yeah, absolutely. We’ve been on a journey. We’ve tested things, put them in production, and they didn’t work perfectly the first time. We can’t paint a perfect picture, and anyone who does is frankly lying.
There’s a lot of testing and a lot of evals being monitored constantly. People ask things in different ways, they have different accents, English may not be their first language—there are many variables.
One big learning: tell callers up front that they’re speaking to an AI assistant on behalf of the business or organization.
Demetrios Brinkmann [00:18:11]: Oh, really?
Andy Paul [00:18:12]: Yes. The moment I know it’s an AI, I don’t hold it to the same standard as a human. That alone resets expectations. People think, “Okay, this might not be perfect, but let’s give it a chance.” That’s been huge for us.
We also use background sound—like an office background—so it feels like a call center. Then there’s how the agent speaks: some voices are very robotic, but we care about natural pacing—when to speak fast or slow, where to leave gaps.
Start/stop speaking configuration is a constant area of optimization, depending on how fast you want to respond. Handling interruptions, turn-taking—there are many nuances.
But on the front line, simply telling people “I’m an AI assistant” lowers the expectation immediately. That’s been a small but super impactful learning.
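The turn-taking knobs Andy describes tuning, endpointing, interruption handling, background sound, filler phrases, might look something like the config below. The parameter names and values are hypothetical, not any real SDK's API; they just show the trade-off between responding fast and cutting callers off:

```python
# Illustrative agent settings -- every key here is a made-up example.
AGENT_CONFIG = {
    "greeting": "Hi, I'm an AI assistant calling on behalf of the business.",
    "background_audio": "office_ambience.wav",    # call-centre feel
    "endpointing_ms": 600,          # silence before a caller turn counts as done
    "interrupt_threshold_ms": 150,  # how quickly a caller can barge in
    "filler_phrases": ["Let me look that up.", "One moment."],
}

def turn_is_over(silence_ms: int, config: dict) -> bool:
    """The agent only starts speaking once the caller has paused long enough."""
    return silence_ms >= config["endpointing_ms"]
```

Lowering `endpointing_ms` makes the agent feel snappier but risks talking over callers who pause mid-sentence, which is why Andy calls this a constant area of optimization.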