Podcast·Feb 21, 2025

AI Minds #055 | Jonathan Hunt, Co-Founder & CTO at Elefant AI

Demetrios Brinkmann
Episode Description
In this episode, Jonathan Hunt explores AI agents in 3D worlds, their impact on gaming, real-world applications, and building a foundation model for behavior.

About this episode

Jonathan Hunt is Co-Founder & CTO at Elefant AI, an intelligent companion for Minecraft players. Elefant AI automates tasks like mining, crafting, and building, so you can focus on creativity and strategy. With customizable AI-powered bots, you can enhance your gameplay and explore new possibilities. Transform the way you play Minecraft with Elefant AI.

Listen to the episode on Spotify, Apple Podcasts, Podcast Addict, or Castbox. You can also watch this episode on YouTube.

In this episode of the AI Minds Podcast, Jonathan Hunt, Co-Founder & CTO at Elefant AI, shares his fascinating journey from neuroscience to pioneering AI in virtual 3D spaces.

Jonathan discusses his transition from academia to industry, where his expertise in neural networks led him to work on early AI applications in robotics. He reflects on his time at Google DeepMind, tackling deep reinforcement learning and its real-world challenges.

Now at Elefant AI, he and his team are pushing the boundaries of AI-driven behavior generation in gaming environments. By leveraging Minecraft as a testing ground, Elefant AI is developing intelligent agents capable of navigating complex 3D worlds and interacting dynamically with human players.

Throughout the conversation, Jonathan offers insights into AI’s role in gaming, the challenges of training AI for open-ended virtual environments, and the importance of community engagement in advancing AI technology.

Fun Fact: Despite beginning his career in neuroscience, Jonathan realized during his PhD that his true passion lay in machine learning rather than in neuroscience itself.

Show Notes:

00:00 Spiking Neural Networks Prelude

05:03 Deep Reinforcement Learning Challenges

09:08 Reinforcement Learning's Evolving Success

10:36 3D Behavior Model Development Initiative

13:25 Minecraft-LLM Integration Explored

16:56 FIFA Gaming with Friends Challenges

21:15 "Teammate AI vs. Winning AI"


Transcript:

Demetrios:

Welcome to the AI Minds podcast. We are back for another episode. This is a podcast where we explore the companies of tomorrow, built AI-first. I'm your host, Demetrios, and this episode is brought to you by Deepgram, the number one voice API on the Internet today, trusted by the world's top enterprises, conversational AI leaders and startups, some of which you may have heard of, like Spotify, Twilio, NASA and Citibank. And today I have the pleasure of getting to speak with the co-founder and CTO of Elefant AI.

Demetrios:

JJ, how you doing, man?

Jonathan Hunt:

Oh, hey, thanks for having me.

Demetrios:

So let's go through your colorful past before we really dive into what Elefant AI is doing and some of these difficult challenges that you're working on. Because it's fun that you were born in New Zealand, you traveled around a bunch, and you eventually landed in the US working at a job doing robotics. Because you fell in love with machine learning in, was it the PhD phase of your life?

Jonathan Hunt:

Yeah, that's right. I did a PhD in neuroscience, and the main thing I learned was that I actually ended up using a lot of machine learning in that role to analyze data. And I realized I love machine learning and I would never be a very good neuroscientist.

Demetrios:

So just go straight for the machine learning. And how did you end up at a robotics company?

Jonathan Hunt:

So they were actually initially doing spiking neural networks, so they were one of the few tech and machine learning companies that was interested in my neuroscience background. That's how I went there. And then over time, they sort of pivoted away from spiking to focus more on applied robotics.

Demetrios:

I don't even know what spiking neural networks are, and I feel like I've heard it all. So what are they exactly?

Jonathan Hunt:

So first of all, I'm pretty old. This is from before deep learning took over the world, when machine learning courses still talked about SVMs and, I don't know if you remember, ferns and things. You probably know that in a deep neural network, if you have a fully connected layer, you have a large matrix of numbers, and every forward pass you've got to multiply every number. The idea of spiking neural networks, which is inspired by the brain, is that instead of constantly outputting a value, neurons can choose when to spike, and only then do they need to communicate with each other.

Jonathan Hunt:

So they send an event, like "I just spiked," similar to the neurons in your brain. They don't constantly transmit a floating point value; they send specific events. And so there was, and still is to some extent, interest in trying to build both hardware and algorithms that can mimic the brain in that way.

Jonathan Hunt:

Back then, this was, oh my, 2014? No, 2012 was when I joined. It was quite a trendy thing and there were several companies working on it; IBM was building a chip. I would say it still exists, but it's less prominent now. Deep learning kind of ate their lunch.
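
To make the contrast a bit more concrete, here is a minimal, illustrative sketch (not from the episode) of the difference Jonathan describes: a dense fully connected layer multiplies its whole weight matrix on every forward pass, while an event-driven spiking layer only does work, and only communicates, when neurons actually fire. The layer sizes, threshold, and leak factor below are arbitrary toy values.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 512, 256
W = rng.normal(scale=0.1, size=(n_out, n_in))

# Dense layer: every forward pass touches the full weight matrix,
# no matter how little is happening at the inputs.
def dense_forward(x):
    return np.maximum(W @ x, 0.0)  # ReLU activation

# Leaky integrate-and-fire style update: inputs arrive as discrete spike
# events, and a neuron only emits an event when its potential crosses a
# threshold. Compute and communication scale with the number of spikes,
# not the number of weights.
def spiking_step(spike_indices, potential, threshold=1.0, leak=0.9):
    potential = potential * leak                  # passive decay
    if len(spike_indices) > 0:
        potential = potential + W[:, spike_indices].sum(axis=1)
    fired = np.where(potential >= threshold)[0]   # neurons that spike now
    potential[fired] = 0.0                        # reset after firing
    return fired, potential

# Toy usage: a sparse burst of 5 input spikes out of 512 possible inputs.
potential = np.zeros(n_out)
events = rng.choice(n_in, size=5, replace=False)
fired, potential = spiking_step(events, potential)
print("output spikes this step:", fired)
```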

Demetrios:

So you saw the writing on the wall with deep learning and you went to the mecca of deep learning and joined DeepMind right when it became Google DeepMind.

Jonathan Hunt:

So actually, I have an old email. Because, as you probably know, two of the founders of DeepMind also came out of neuroscience backgrounds. And so I actually have an email from when they very first started, saying, two of Peter Dayan's postdocs are starting a company, do you want to talk to them? And I was like, I've just accepted this other offer. So I never did until much later. I joined there because they were a really interesting place at that time.

Jonathan Hunt:

Pretty small. I was around the 70th employee. They were the kind of capital of deep reinforcement learning and doing a lot of interesting stuff, so I was pretty attracted to go there.

Demetrios:

And what did you work on when you were there?

Jonathan Hunt:

So mostly like deep reinforcement learning. And I guess I was particularly focused on, well, the technical term would be like continuous action spaces, but I guess like basically kind of using deep reinforcement learning in applications such as robotics.

Demetrios:

Okay, cool. So you had that robotics thread, and then what kind of puzzles me is your next move, because you decided to go on to work on the recommender system at Twitter, which feels like a bit of a pivot.

Jonathan Hunt:

So actually there is a bit of a story there. First of all, I had a really good time at DeepMind and it was just transformational. I got to learn from some amazing people there, like Nicolas Heess and David Silver, who really built the foundations of the field of deep reinforcement learning, and I learned a lot. But I did become a little bit frustrated towards the end, and it remained true until relatively recently, that deep reinforcement learning had been a super successful academic field. DeepMind, for example, and OpenAI had amazing demos of agents learning to play complicated games, and lots and lots of really cool papers, but very few products or applications. Nobody was using deep reinforcement learning.

Jonathan Hunt:

And I became interested in why, because that wasn't true of deep learning in general. Deep learning is used in all sorts of places, but deep reinforcement learning, five years ago now, it's probably fair to say, really was not used in almost any product, despite being a hugely important academic field and research area. I became interested in why that was, and that's part of what led me to want to look into applications of deep reinforcement learning. And then around that time, Twitter became interested in using deep reinforcement learning to improve their recommendation systems. So that's the connection, I would say.

Demetrios:

Oh, interesting. They were using, they were trying.

Jonathan Hunt:

It turned out it was very hard to make it work well. So I would say I joined partly to work on that, but I ended up working more generally on recommender systems. We did some work with that, and I have some papers from that time, but I would say none of them were a huge success.

Demetrios:

So it's funny that you mention there weren't many tangible applications of this technology, because when I started in ML, there were a lot of people who would talk about how at the universities you got to learn all these cool techniques, you got to see the hardest problems on the cleanest data, everything was really pristine, and there was an answer you were working towards. And then you get out into industry and it's just a shit show. There's no clean data anywhere, if you can even get access to the data you're looking for and not spend six months running around in circles trying to find it. And then there's no real answer. You don't know if what you've created is actually the best you could do, or whether it's actually adding the business value you're looking to add. So you're trying to make those connections.

Demetrios:

And the idea that you're proposing just reminds me of this disconnect of, I want to work on this really super advanced thing, and then you go into industry and you're like, maybe I'll just do a random forest. Even a random forest might be a little much for this one.

Jonathan Hunt:

No, that's true. I think one of the things is that deep reinforcement learning, for a long time, was really successful in games. And games have this property that they're very cheap to run; you can play a game many times. There's also no cost to making mistakes, you just start the game again. So you can use a very large amount of experience. That's why if you look at some of these games, like OpenAI's Dota and some of the DeepMind breakthroughs, they are amazing, I'm not knocking them, but they used massive amounts of experience. And it just turned out, I think, that in many applications mistakes are costly. A specific example is self-driving cars. People were interested in it, and it clearly is going to be important to use reinforcement learning in self-driving cars.

Jonathan Hunt:

But if you have a real car, you can't collect experience of driving like a maniac on the footpath, or sidewalk for the Americans, that's just not happening. Whereas if you're in a game, that's fine: drive on the footpath, play Grand Theft Auto, it's all good. And I think some of those differences between games and many applications we care about are what made it harder. I would say this has also changed since I joined Twitter. Obviously the big and extremely public success of reinforcement learning now has been its use in the later stages of large language models, refining them with RLHF and then, more recently, just making them smarter with reinforcement learning. So just to be clear, I wouldn't say it's true anymore that reinforcement learning hasn't had big application success. I would say it was true five years ago.

Demetrios:

No, it's almost like, as a friend put it when he was at NeurIPS this year, everything was RL, and he was like, what's old is new. And we're all going back to, wait a minute, I feel like we've been here before.

Jonathan Hunt:

Right, right. No, that's true.

Demetrios:

So I could geek out about this with you for a long time, but it feels like there is a natural thread we can pull on when it comes to simulation and what you're doing at Elefant AI. Can you talk to me about the inspiration behind why you wanted to go and create your own thing, and why you chose this space?

Jonathan Hunt:

I mean, like everyone in tech, I had thought about starting my own company with various friends many times over the years. I think this time we just felt there was a really open problem that, despite all the massive amounts of investment and effort going into large language models and vision language models and so on, was not really being focused on. And that is making models that can actually generate behavior in 3D spaces. I would also say there are a few more people working on this now; this was prior to, say, Physical Intelligence and several other companies. But we really felt there was this opening: LLMs can do amazing things, and VLMs can even describe things in a picture, but none of them are really giving you what a human can do in any 3D world. You can give most humans a 3D game they've never seen before and some language instructions, like wander around and find the wizard, and they'll basically, more or less, be able to carry that out without further training.

Jonathan Hunt:

Not perfectly, not necessarily as well as someone who's played that game a lot, but they'd certainly make a decent first stab at it. And so, in a sense, what we're trying to build is models that can generate behavior in any 3D world, conditioned on language. We became excited that that was an underexplored area, and part of the thesis of our company is that gaming provides a great avenue to building that foundation model. Because with games, you can have real users and you can have them be very demanding of your model, but you can also roll things out really quickly; we shipped the first version of our public agent in a couple of months. And going back to why RL was academically successful, gaming is also much more forgiving than, say, robotics or self-driving cars of making mistakes early on. So we became excited both that this was an underexplored area and that gaming was a great way to make progress in it.

Demetrios:

Now I've seen demos of folks using LLMs to create agents within these gaming atmospheres. And I particularly think about Minecraft. I know there's been a lot of demos of LLMs in Minecraft and the way that I understand it is they gave LLMs tools and the tools were inside of Minecraft. How is what you're trying to do different and why is it something that you want to go down the path of instead of trying to just make the LLMs have better tools or work better in this environment?

Jonathan Hunt:

So Mindcraft is a really cool project, and the name is phonetically very similar to Minecraft, the game. It's really fun, and more recently people have been using it to get different LLMs to build things for them, which gives you a kind of visual depiction of what these models can do. So those are really cool. But I would say one of the things is that most of the interface between the LLM and the game is at the level of language. There's also a great paper from NVIDIA called Voyager, which is conceptually somewhat similar. There, the LLM outputs either JavaScript code or calls to JavaScript functions to do things. So to find resources, you can run a function in JavaScript that queries what resources are around, and you get a text-based response.

Jonathan Hunt:

And so although that's really fun, it's quite specific to Minecraft.
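
As a rough illustration of the language-level interface Jonathan is describing, here is a hypothetical sketch in Python (Voyager itself has the LLM emit JavaScript against a Minecraft bot API). The function names, the stand-in LLM, and the text responses are all invented for illustration and are not a real Minecraft or Voyager API.

```python
# Hypothetical sketch of an LLM <-> game loop that works purely at the level
# of text and a small set of predefined API calls.

def query_resources() -> str:
    """Pretend game API: returns a text description of nearby resources."""
    return "oak_log x12 at (101, 64, -23); iron_ore x3 at (97, 58, -30)"

def mine_block(name: str) -> str:
    """Pretend game API: mines the named block and reports the result."""
    return f"mined 1 {name}"

TOOLS = {"query_resources": query_resources, "mine_block": mine_block}

def fake_llm(goal: str, observation: str) -> str:
    """Stand-in for a real LLM call: text in, a textual API call out."""
    if "iron" in goal and "iron" in observation:
        return "mine_block('iron_ore')"
    return "query_resources()"

goal = "collect some iron"
observation = ""
for _ in range(3):
    action = fake_llm(goal, observation)   # the LLM emits a textual API call
    fn_name = action.split("(")[0]
    arg = action[action.find("(") + 1 : action.rfind(")")].strip("'\"")
    observation = TOOLS[fn_name](arg) if arg else TOOLS[fn_name]()
    print(f"action: {action:26} -> {observation}")
```

The point of the sketch is that nothing here ever looks at pixels; the agent only works because someone hand-built text functions for this one game, which is exactly why the approach doesn't transfer to games without such APIs.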

Jonathan Hunt:

So Minecraft, because it's been around for a long time, is quite an open game. People have built these APIs so you can do things like that, but most games don't have that.

Jonathan Hunt:

With most games, it's more that you have a rendering engine and you see the world. And one of the motivations is that, although our first agent is in Minecraft, we're trying to build underlying technology that will work across many different games without a lot of human handholding or engineering each time. And although Minecraft is a really cool product and really fun, if you wanted to get that kind of agent to play, I don't know, some first-person shooter game, it wouldn't work at all.

Jonathan Hunt:

Because it's using the Minecraft API, right? Whereas a human who has played Minecraft could get dropped into a first-person shooter, and they might not be very good at shooting, because you don't have that so much in Minecraft, but they'd be able to navigate around.

Jonathan Hunt:

It's conceptually similar in the way the world works. And so that's more the focus of what we're building.

Demetrios:

Okay, and now what is the benefit of having something like this? Because one idea in my mind is that, as a lonely gamer, I'm never lonely because I have a friend to play with all the time. And it's much better than an NPC.

Jonathan Hunt:

So that's definitely one application. And we're actually probably going to rebrand our consumer app to be called Player 2, because maybe that makes the use case clearer. So I think that is one use case, and I think we're still discovering others. It could also be used for game testing.

Jonathan Hunt:

Also, some people actually just quite like to watch them, I guess, similar to how people watch streamers.

Jonathan Hunt:

I guess our company is kind of split into two pieces in a way. On the one hand, we're trying to build this underlying foundation model, and on the other hand, we're trying to experiment and see how consumers want to use it. One application we have definitely seen is, as you said, I want a friend to play with and my friends are busy or something. But for other applications, some people just really want to watch it; I guess it's like your own personal live streamer or something. And some people want to play with their friends and the agent, just to add to the crew. So I think we're still exploring different ways that people might want to use it.

Demetrios:

Man, I'll tell you what, one of my favorite things to do is play a game of FIFA with friends, and having the option to do that whenever I want would be great, because it's not really the same playing FIFA against the computer or against the NPCs. But if you have something that is a bit more human-like in nature, and it will kick the ball out of bounds every once in a while, and it will give you those passes when you're open, that is so cool. And the other part, I could see the streaming aspect, because sometimes we learn from the AI that plays chess and we're like, why did it do that move? But one thing that I'm thinking about is, what are some of the hard challenges that you're hitting as you're trying to create this foundation model that lives in a 3D space, as opposed to text or vision or whatever the medium may be?

Jonathan Hunt:

I think it's just that the recipe is not already there. There are lots of public recipes for how to train LLMs, and although there are a lot of interesting datasets out there, the data for this is not there. So a lot of our work is looking at interesting and clever ways we can collect data. One thing we don't have yet, but want to make available, and that users have already asked for, is: can I teach the agent something? So that's one thing we're looking into; if our agent is doing something dumb in one of your games, maybe you could show us the answer.

Jonathan Hunt:

And that would be really valuable training data. So I would just say, in general, it's as I said, more underexplored. But most of modern machine learning is not just thinking about the model or the training; it's thinking about where the data is going to come from, and how we can build things into our product that will help us get the kind of data that makes our model better. And that's again where we want to work with our end users. We do see that a lot of end users are actually quite keen to coach or to help, so that's an area we're working on.

Demetrios:

You've got decentralized reinforcement learning right there, which is awesome.

Jonathan Hunt:

Exactly.

Demetrios:

Some people pay good money for that.

Jonathan Hunt:

Yeah.

Demetrios:

You've got a fan base that's interested in doing it, which is really cool. I wonder about the reward system. What I can imagine is that if the game is being played and you either pass the level or, in FIFA, you score a goal, that is a really clear way to give a reward. So do you find that from that angle it's easier to train these models?

Jonathan Hunt:

So that's actually quite interesting. Maybe I'll answer that a little bit indirectly, if you don't mind, because one of the things that's quite interesting to me, and people are coming around to this point of view, I've had some conversations with people at other companies, is that we are not just trying to make an AI agent who can beat a game. Because in some ways that's easier.

Jonathan Hunt:

We want to make, as you said, an AI agent who understands language and participates in the game according to how you discuss it. And that might involve, you can imagine, every now and then, if you're being silly with your friends, saying let's deliberately try and get killed or something.

Jonathan Hunt:

That's not about winning, and so it's actually a much broader set of tasks, and it also means we really need to understand what the user is saying. For that, I would say the reward signal is harder, because in Minecraft, again, we could just focus on winning, make the agent able to win the game. But that might be fun to watch once. And particularly a game like Minecraft is very open-ended, so there are really many ways you can play. You don't even have to worry about winning, you can win in different ways, or you can just spend your time building.

Jonathan Hunt:

And so we're actually trying to build an agent that understands what you're asking and participates with you, which I think is a broader and harder problem than just racing towards the end and winning.
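
A toy way to see the distinction Jonathan is drawing: a "winning" agent optimizes a fixed scalar reward baked into the game, while the companion agent has to condition its behavior on whatever the player asked for, so the instruction becomes part of the policy's input. The sketch below is purely illustrative; the observation fields, action names, and rules are made up and are not Elefant AI's actual model.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    frame: bytes   # rendered pixels from the game
    chat: str      # recent chat messages from the player

# "Winning" agent: the reward is a fixed scalar defined by the game,
# and the policy ignores anything the player says.
def win_reward(event: str) -> float:
    return 1.0 if event == "game_won" else 0.0

def win_policy(obs: Observation) -> str:
    # pi(action | observation): always pushes toward the fixed goal.
    return "rush_to_objective"

# Language-conditioned companion: the instruction is part of the input,
# so "build with me" and "let's deliberately get killed" produce
# different behavior, and success means the request was followed.
def companion_policy(obs: Observation, instruction: str) -> str:
    # pi(action | observation, instruction)
    if "build" in instruction:
        return "place_blocks_near_player"
    if "get killed" in instruction:
        return "jump_off_a_cliff"
    return "follow_player"

print(win_policy(Observation(b"", "")))
print(companion_policy(Observation(b"", ""), "let's build a house"))
```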

Demetrios:

Yeah, I see what you're saying now, because I was thinking that if you're just trying to build something that can win, it feels like that's not that hard of a problem. But now you're trying to build something that can be the teammate, that will be goofy, or it will mess up, or it will understand the plan. If you're playing whatever, Rainbow Six or Halo, and you're planning with friends, you go over there to the left, it knows to go over there, and it's not just going to say, no, I've got the better route calculated and I'm just going to beat the game right now.

Jonathan Hunt:

And particularly, I would say, it hasn't always been true, but some of beating the game can just be faster reaction times. That's where, again, we don't want to just beat a game. We want to be a companion, understand the game and what people are asking of us, and play in a human-like style. So beating the game is one thing; they might say, show me how fast you can speedrun this game.

Jonathan Hunt:

But there are many other tasks that people will ask of us, and I think that is a broader and harder problem.

Demetrios:

Well, I know you guys are hiring, so for anybody who thinks these problems are fun and interesting and wants to go work with you, I highly encourage it, because this sounds super cool. And it's a distributed team, so you're remote. And, yeah, thanks for coming on here, JJ.

Jonathan Hunt:

No worries.

Jonathan Hunt:

Thanks for your time.