The State of AI with Rowan Cheung

Demis Hassabis on Gemini 2.0, Project Astra, Agents, Gaming with Elon, and more

Rowan Cheung

Google just launched Gemini 2.0, Project Astra, and AI agents for gaming & web browsing. Rowan Cheung (@rowancheung) sat down for an exclusive interview with Demis Hassabis, CEO of Google DeepMind (@demishassabis), to discuss the launch, why it matters, and Google’s entry into the AI agent era.

__

Join our daily AI newsletter: https://www.therundown.ai/subscribe
Learn AI hands-on with our AI University: https://rundown.ai/ai-university/

Rowan Cheung: Demis, so great to have you. Great to see you again. So Google DeepMind is announcing Gemini 2.0 alongside several agents, including Project Astra, an agent that can take action on your behalf on the web, a coding agent, and even an agent that can navigate video games alongside you. My question is: what exactly is being announced, and what is fully launching now and available for everyday users to use right now?

Demis Hassabis: Yeah, we're launching the beginning of the Gemini 2.0 era. We're starting with the Flash-size model, which is our workhorse model and the most popular model we have in the 1.5 series, and it's going to be the beginning of us rolling out our 2.0 models. The 2.0 models have a lot of new capabilities, and also incredible performance for their size, cost, and latency. For example, 2.0 Flash, which we're very proud of, is about as powerful as 1.5 Pro, so you're getting an entire extra tier of performance for the same size and speed. So we're really excited about what developers are going to do with it. It's going to be accessible straight away in AI Studio, and we'll also be upgrading the Gemini app with it under the hood.

Rowan Cheung: Excellent, this is very exciting. How do the benchmarks of Flash hold up against the big version of Gemini 1.5 Pro?

Demis Hassabis: Yeah, the benchmarks are looking extremely good for the 2.0 series. Flash, for example, is pretty much across the board roughly equivalent to the much bigger 1.5 Pro model. So we're very excited about what that's going to unlock downstream in terms of products, and of course our own use cases internally at Google, serving billions of users with these models, things like AI Overviews, and doing that at scale.
Rowan Cheung: So alongside Gemini 2.0, Project Astra is coming out, and I'm very, very excited about Project Astra. We've seen the chatbots, we've seen voice assistants, but Project Astra is a massive leap, because it's an intelligence that can actually see, hear, and understand your world in real time and be there with you all the time. It sounds like we're getting remarkably close to the AI companion from the movie Her. How close is Project Astra to that vision?

Demis Hassabis: Yeah, Project Astra, which we teased at I/O earlier this year, is our vision for a universal digital assistant. That's what we call it. And actually, that's what the 2.0 series of Gemini models is going to underpin as we move into the agent-based era, so to speak. A big part of next year is going to be about agent-based systems that can actually carry out things in the world for you: complete tasks, reason, plan, and act. And of course, we've got a rich heritage of doing that with our games work at DeepMind, most famously, I guess, AlphaGo. I think those kinds of ideas and systems are going to come back to the fore, along with the big foundation models getting increasingly sophisticated and more and more multimodal, almost becoming world models. That's the direction we're going in with Gemini. And the prototypes we're going to be showing along with the 2.0 release, things like Project Astra, show what a universal assistant out in the real world can do for you: understand the world around you, in form factors like your phone but also in glasses. I think that's going to be a really critical part of a digital assistant being useful to you in your everyday life.
We're also going to show agent-based behaviors in Project Mariner, a new project which I think is the future of computer control: having these systems understand what's on your browser and the interface, a kind of new UI for using the web. And then there are fun things like you mentioned, Gemini for games, where you can just share your Android screen and it's like a game companion. It can be a tutor and advise you on strategies for any of the games you're playing.

Rowan Cheung: Wow, there's a lot to digest there. Let's start with the demo. I've obviously seen the demos from I/O where Astra remembers where you left your glasses and interprets drawings in real time, but it seems like it's getting so much better. Could Astra be the foundation for an AI with an infinite context window that just remembers everything you've ever seen?

Demis Hassabis: Yeah, I mean, that's the dream. The long-term goal is to have a system that really understands you, understands your preferences and what you're trying to do, and is just a really helpful companion or assistant in that, across devices and across domains. That's what we mean by universal. So maybe you're using it on your computer, and then you go out into the world and put your glasses on or use your phone, and it remembers what you're trying to do from session to session. And I think that's how it'll be a really useful assistant: it's got to be personalized and understand what you're trying to achieve and what sorts of things you like, just like an amazing human assistant would. I think that's the next era. At the moment, these are separate prototype projects, and they're all going to be variously available in trusted tester programs or in beta form.
And while we're exploring the use cases, we're also continuing to develop the underlying technology, obviously built on top of Gemini 2.0. But we're very excited that we have beta and trusted tester programs for all of these different projects, so people can start getting their hands on them, and we can start getting feedback about what people like, what they use it for, and what other technologies are required on top of Gemini to make those use cases seamless.

Rowan Cheung: For this universal agent that's there with you on the web and in real life, obviously we need big context windows and memory. Are there any advancements on that front, and what is the current context window of Astra?

Demis Hassabis: Yeah. So on context windows, as you know, with the previous Gemini 1.5 era we had the long-context innovation, up to 2 million tokens. Actually, internally we tested this up to 10 million. So at this point we can effectively make it arbitrarily big, but it comes at a cost of speed: the larger you make the memory, the more costly it is to search that memory and find things in it, and that increases the latency and also the overall cost of running the model. We have some new innovations we're working on that will increasingly reduce the cost of larger and larger contexts, so that eventually, I think over a pretty short amount of time, we'll have effectively infinite context. And then the interesting thing is getting some inspiration from human memory, where you don't just remember everything; that's a bit inefficient, right? You want to remember all the important and salient things: the things that are going to be important for planning and imagining the future, or that are sentimental in some way, or important in some other way.
So we're experimenting with ways to borrow from human memory and the way the hippocampus works: how does one decide what is a salient or useful thing to remember? That will immediately increase the efficiency of these long contexts as well.

Rowan Cheung: What do you think the ideal form factor is for Project Astra? I know you mentioned your phone, possibly glasses. Or is there something else entirely?

Demis Hassabis: Yeah, well, actually, this is a fun exploration we're doing. On your phone it's most performant, because the phone is the most powerful edge device. But it becomes clear, actually, when you start seeing testers using it, that there are certain tasks you might be used to doing where your hands are occupied. For example, when you're cooking and you want to ask Astra about ingredients, or for advice about what to do next in the recipe, or for an opinion about one of the ingredients, your hands are usually occupied while you're cooking. So it's natural that you would want a different, wearable device that you don't have to hold. I think it's quite clear that some use cases are going to require devices like glasses. And actually, the more we've used it internally in dogfooding, the more we've realized that, excitingly, this could be a killer use case for a glasses-type device, which I think in the past has been looking for a really compelling everyday use case. I think this is actually it. But then why stop there? One could imagine other kinds of new devices, brand new ones, that would be optimal for this type of application. And we're thinking about those kinds of form factors too.

Rowan Cheung: That's super fascinating. There's something really interesting happening right now with AI companions, with people forming deep connections with AI.
And with Astra, you're creating something that's actually present in people's lives, seeing their world, understanding their emotions. Do you worry about people becoming too dependent on these relationships? How do you see that playing out?

Demis Hassabis: Yeah, it's a very interesting question, and I think not enough is known about this domain; we're still very early. On the plus side, I could imagine these companions becoming really fun to interact with. The gaming one is the most obvious, and you'll see that in the demos we show you. It's really fun, like hanging out with a friend while playing a game, the way you might do with an online friend. I think there are also obviously important use cases for people who are lonely, or for the elderly, where I think it could be very good. But obviously there are important ethical issues around attachment to these systems and the social implications of that. I think we're going to need social scientists and ethicists and others to weigh in on that, not just technologists. And I think we need to see a little bit how it plays out, because it's just so early in the agent-based era. I can see lots of positive use cases, but we also need to be aware of the social implications as these systems become increasingly sophisticated and powerful.

Rowan Cheung: Last quick question, on the gaming front. You had a recent Twitter exchange with Elon about possibly doing an AI game together. I don't know if you're allowed to comment, but are these conversations continuing beyond Twitter at all?

Demis Hassabis: Yeah, we have a lot of fun chatting, and always have done. You know, Elon was an original investor in DeepMind back in the day, more than a decade ago now, and we have a lot of fun conversations. And we both love gaming; I think that's well documented.
Obviously, I used to write and make games as well as play them, and Elon does as well. So it was a fun exchange; we'll see where it goes. I think there's huge potential in AI components for games, though. I wasn't just meaning generating a whole game with AI; I think we're still quite a long way away from that, even though our Genie 2 project is really cool. I'm thinking more about AI characters, or auto-balancing games, or new types of games with learning characters or learning agents in them, which could be very compelling. When I was writing computer games early in my career, I would have dreamed of having the kind of AI we have today. I was writing AI for games back then, in the 90s, and it was very basic. It would have been amazing, given the kinds of technologies we have today, to think about what kind of games we could create, and I'm thinking a lot about that. Maybe there'll be a fun collaboration down the road.

Rowan Cheung: All right, well, that's it. Thanks so much for the interview, and good luck with the launch.

Demis Hassabis: Great, thank you. Thanks for talking to us.

Rowan Cheung: Thank you so much.