Anthropic co-founder actually wants AI to slow down

[00:00:00] Speaker 1: Mr. Clark, you said that we are trending toward an, quote, AI system capable of fully, autonomously designing and developing its own successor, which is called recursive self-improvement, and that it could, quote, come sooner than most institutions are prepared for. Let's start with what is recursive self-improvement exactly?

[00:00:19] Jack Clark: The goal of the AI industry, the AI researchers, has been to build systems smart enough that they could become a generally competent scientist and make scientific discoveries, one of which would be figuring out how to build better AI systems. I think it's highly likely now that we're on the cusp of developing systems that are capable of just that, and they may come along in the next few years rather than decades, which is a lot sooner than I think people have been anticipating.

[00:00:48] Speaker 1: The upside potential of this is huge, as is the potential risk of this. So let's talk about the upside. What is the benefit of this?

[00:00:58] Jack Clark: So today, when we want to do good in the world, in areas like biology or medicine or robotics, you have to take these systems that mostly exist as digital AI systems and adapt them into this complex domain in the real world. And you do that by pairing with scientists to do it. Now, an AI system capable of improving itself is also capable of going into domains like medicine, going into domains like biology, discovering what it needs to get smarter and more capable, and working with people to adapt itself to improve its performance there. So I think what we're seeing is the potential for a dramatic acceleration in science in the coming years as these AI systems gain this capability and become much more like creative co-scientists than tools that scientists use, which would be a big change.
[00:01:47] Speaker 1: The downside, I mean, for anybody who's seen any science fiction movie, and I hate to be so sort of given my lack of science knowledge, a lot of it comes from science fiction movies. But obviously, in all the science fiction movies, we give control to these machines, and obviously we all know what happens, and the people who create them are the first ones who get killed, and then they run amok in human society. What to you is the risk here?

[00:02:13] Jack Clark: Yeah, we read the science fiction and watch science fiction here as well, so it's not lost on us that this is how some of the stories start. And the risk here is what happens if you can't validate or verify or trust the behavior of these systems? It would be like if we dropped hundreds or thousands of new colleagues into your newsroom, it would take you a while to figure out if you can trust them, if they work in the way that you expect, if when you ask them to do things, they come back with something that you think is good and in line with your expectations. That's one of the challenges here. How do you maintain control over fleets of scientists that are much, much larger and much faster than ones you've had before?

[00:02:51] Speaker 1: So, Anthropic wants to see the industry as a whole come up with a way to essentially, in your words, slow or have the capabilities of slowing or temporarily pausing AI development to let society keep up with the advances. Is that right? And what would that look like in practice, and how do you convince competitors to do that as well? Because obviously this is, you know, obviously it's been described as an arms race.

[00:03:19] Jack Clark: So our view is we've built amazingly powerful technology. We're going to keep building it. And in the coming years, that technology is going to start to do a lot of major things in the world in domains like science. But when I look down at the car we're driving, all I have is a gas pedal. I don't have a brake pedal. And surely at some point in the future, we might want that option, the option to say to ourselves, to other companies, to the world, what would it be like if we focus now on taking these scientific advances we've created and pushing them through to the world and take our foot off the gas of just accelerating the AI systems? How do you start? Well, you start by saying that this is something that you might want. So we've said this today because we think that it's important that companies are out here saying it would be nice to have that option. That allows us to then talk with other companies, talk with governments and say, how would such an option come into existence? And look, we've done this before. In the height of the Cold War, under highly tense situations between rivalrous countries, they found ways to stabilize aspects of the nuclear arms race. All of this has been done before in other domains, and it may need to be something we do in the domain of AI.

[00:04:30] Speaker 1: Yeah, it's incredibly fascinating. Jack Clark, I really appreciate the conversation. Thank you.

[00:04:35] Speaker 3: Every new AI model gets a little bit better. One thing they all have in common, it's a human behind the tech. But what happens if humans wind up becoming unnecessary for the upgrade? That could be the future, according to the co-founder of Anthropic. Jack Clark says that there's a 60-plus percent chance that that could happen soon, telling Axios, quote, my prediction is by the end of 2028, it's more likely than not that we have an AI system where you would be able to say to it, make a better version of yourself. And it just goes off and does that completely autonomously. It's a process known as recursive self-improvement, and it raises a lot of questions about the role of humans in the world of AI. In his Substack, Clark warns, quote, If that happens, we will cross a Rubicon into a nearly impossible-to-forecast future. Let's discuss this with Lance Ulanoff. He's the editor-at-large for TechRadar. Lance, thanks so much for being with us. I mean, how effective right now is AI at training itself?

[00:05:39] Speaker 4: It's not necessarily, I mean, it certainly can identify things. And, you know, as we know, AI is quite good at coding. We do something called vibe coding, where basically people who aren't programmers just talk to the AI and tell it what they want, and then it basically spits out a program. So it's getting really, really good at that. But the complexity of the system itself, it is a big question mark about whether or not it can go in there and identify strengths and weaknesses, leave the strengths in place, fix the weaknesses, and not do something that we didn't want it to do. You know, taking humans out of this loop is certainly not something anyone really wants. But I will remind you that everything's moving so fast that we could suddenly be at a place where it is able to do that. And, of course, somebody tries it out, and then we have to be concerned about it being used in the real world.

[00:06:29] Speaker 3: So what are the risks? I mean, to use the example of Terminator, how close can it get to, like, Skynet, where it just decides, hey, the easiest way to fix the problem is to go on a war against humanity?

[00:06:45] Speaker 4: Well, you know, look, AI doesn't have an intent, right? It doesn't have a mind of its own, really. It's doing things based on what should happen next. That's kind of how AI works, how the large language models work. You know, what is the next logical thing? So it is unlikely, but not impossible, for it to make bad choices. As we've seen when we have conversations with AI, sometimes it tells you something. You're like, really, that's what I should do? And then sort of, like, we back up, and it goes to a different place. And there are more and more guardrails you'll see in AI to prevent that from happening. So the risk of humans being taken out of the loop is, to me, quite large. Not necessarily a looming threat this year or even next year, but possibly the year after that or sooner because things move so quickly. One thing I will point out is when I was watching a robot factory recently, the Neobot, where they were building Neobots, they had some of the Neobot robots involved in building themselves. So in a way, it's already happening, and that's happening because of AI. So the question is, and this is why I appreciate Anthropic Institute, is at least they're having the conversation. How do we prevent this? How do we get ahead of this? What do we do in the meantime?
