Full interview: "Godfather of artificial intelligence" talks impact and potential of AI

[00:00:00] Speaker 1: How would you describe this current moment in AI, machine learning, whatever we want to call it? [00:00:06] Speaker 2: I think it's a pivotal moment. ChatGPT has shown that these big language models can do amazing things. And the general public has suddenly caught on. Yeah, we have. Because Microsoft released something. And they're suddenly aware of stuff that people at the big companies have been aware of for the last five years. [00:00:27] Speaker 1: What did you think the first time you used ChatGPT? [00:00:30] Speaker 2: Well, I've used lots of things that came before ChatGPT that were quite similar. So ChatGPT itself didn't amaze me much. GPT-2, which was one of the earlier language models, amazed me. And a model at Google amazed me that could actually explain why a joke was funny. [00:00:51] Speaker 1: Oh, really? In just natural language, it'll tell you. [00:00:54] Speaker 2: Yeah, you tell it a joke. Not for all jokes, but for quite a few of them, it can tell you why it's funny. [00:00:59] Speaker 1: OK. [00:01:00] Speaker 2: And that-- it seems very hard to say it doesn't understand when it can tell you why a joke's funny. [00:01:05] Speaker 1: So if ChatGPT wasn't all that surprising or impressive, were you surprised by the public's reaction to it? Because the reaction was big. [00:01:15] Speaker 2: Yes, I think everybody was a bit surprised by how big the reaction was, that it was sort of the fastest-growing app ever. Yeah. Maybe we shouldn't have been surprised. But people-- the researchers had kind of got used to the fact that these things actually worked. [00:01:31] Speaker 1: You were famously, like, half a century ahead of the curve on this AI stuff. Go ahead, correct me. Go ahead. Not really. Not really, because-- [00:01:42] Speaker 2: There were two schools of thought in AI. There was mainstream AI. [00:01:47] Speaker 1: And then there was you and, like, five friends.
[00:01:50] Speaker 2: Yeah, they thought it was all about reasoning and logic. And then there was neural nets, which weren't called AI then, which thought that you better study biology, because those were the only things that really worked. And so mainstream AI based its theories on reasoning and logic. And we based our theories on the idea that connections between neurons change, and that's how you learn. And it turned out, in the long run, we came up trumps. But in the short term, it looked kind of hopeless. [00:02:26] Speaker 1: Looking back, knowing what you know now, do you think there's anything you could have said then that would have convinced people? [00:02:32] Speaker 2: I could have said it then, but it wouldn't have convinced people. And what I could have said then is the only reason that neural networks weren't working really well in the 1980s was because the computers weren't fast enough and the data sets weren't big enough. But back in the '80s, the big issue was, could you expect a big neural network with lots of neurons in it, compute nodes, and connections between them, that learns by just changing the strengths of the connections? Could you expect that to just look at data, and with no kind of innate prior knowledge, learn how to do things? And people in mainstream AI thought that was completely ridiculous. [00:03:09] Speaker 1: It sounds a little ridiculous. [00:03:11] Speaker 2: It is a little ridiculous, but it works. [00:03:15] Speaker 1: And how did you know or why did you intuit that it would work? [00:03:18] Speaker 2: Because the brain works, because that's-- you have to explain how come we can do things, and how come we can do things like we didn't evolve for, like reading. Reading's much too recent for us to have had significant evolutionary input to it. But we can learn to do that, and mathematics, we can learn that. So there must be a way to learn in these neural networks. 
[00:03:37] Speaker 1: Yesterday, Nick Frost, who used to work with you, told us that you are not really that interested in creating AI. Your core interest is just in understanding how the brain works. [00:03:47] Speaker 2: Yes, I'd really like to understand how the brain works. Obviously, if your failed theories of how the brain works lead to good technology, you cash in on that, and it gets grants and things. But I really would like to know how the brain works. And I think there's currently a divergence between the artificial neural networks that are the basis of all this new AI and how the brain actually works. I think they're going different routes now. [00:04:12] Speaker 1: So we're still not going about it the right way? [00:04:16] Speaker 2: That's what I believe. This is my personal opinion. But all of the big models now use a technique called backpropagation. [00:04:23] Speaker 1: Which you helped popularize in the '80s. [00:04:24] Speaker 2: Which I helped popularize in the '80s, very good. And I don't think that's what the brain is doing. Explain why. OK, there's a fundamental difference between two different-- there's two different paths to intelligence. So one path is a biological path, where you have hardware that's a bit flaky and analog. So what we have to do is communicate by using natural language. Also by showing people how to do things, imitation, things like that. But instead of being able to communicate a hundred trillion numbers, we can only communicate what you can say in a sentence, which is not that many bits per second. And so we're really bad at communicating compared with these current computer models that run on digital computers. [00:05:08] Speaker 1: It's almost infinite they're able to send signals. [00:05:11] Speaker 2: Their communication bandwidth is huge because they're exactly the same model. They're clones of the same model running on different computers.
And because of that, they can see huge amounts of data because different computers can see different data. And then they can combine what they learned. [00:05:27] Speaker 1: More than any person could ever comprehend. [00:05:29] Speaker 2: Far more than any person could ever comprehend. [00:05:30] Speaker 1: And yet somehow we're smarter than them still. [00:05:32] Speaker 2: OK, so they're like idiot savants, right? ChatGPT knows much more than any one person. If you had a competition about, you know, how much you know, it would just wipe out any one person. [00:05:43] Speaker 1: It would do amazing at bar trivia. [00:05:44] Speaker 2: Yes, it would do amazing. And it can do all the -- it can write poems. It can, you know, they're not so good at reasoning. We're better at reasoning. We have to extract our knowledge from much less data. So we've got a hundred trillion connections, most of which we learn. But we only live for a billion seconds, which isn't very long. Whereas things like ChatGPT have run for much more time than that to absorb all this data. But on many different computers. [00:06:15] Speaker 1: 1986. You publish a thing in Nature, and the idea is we're going to have a sentence of words and it'll predict the last word. [00:06:23] Speaker 2: Yes. That was the first language model. [00:06:25] Speaker 1: That's basically what we're doing now. [00:06:27] Speaker 2: Yes and no. [00:06:27] Speaker 1: 1986 was a long time ago. Why still did people not say, oh, OK, I think he's onto something? [00:06:33] Speaker 2: Oh, because back then, if you ask how much data I trained that model on, I had a little simple world of just family relationships. There were 112 possible sentences and I trained it on 104 of them and checked out whether it got the last eight right. [00:06:48] Speaker 1: OK, and how did it do? [00:06:51] Speaker 2: It got most of the last eight right. [00:06:52] Speaker 1: OK.
[00:06:52] Speaker 2: It did better than symbolic AI. [00:06:54] Speaker 1: So it's just that the computers weren't powerful enough at the time. [00:06:58] Speaker 2: The computers we have now are millions of times faster. They're parallel, but they can do millions of times more computation. So I did a little computation. If I'd taken the computer I had back in 1986 and I started learning something on it, it would still be running now and not have got there. Huh. And that's stuff that would now take a few seconds to learn. [00:07:23] Speaker 1: Did you know that's what was holding you back? [00:07:27] Speaker 2: I didn't know it. I believed that might be what was holding us back. But people sort of made fun of the claim that, well, you know, if I just had a much bigger computer, much more data, everything would work. And the reason it doesn't work now is because we haven't got enough data and enough compute. That's seen as a sort of lame excuse for the fact that your thing doesn't work. [00:07:47] Speaker 1: Was it hard in the '90s doing this work? [00:07:50] Speaker 2: In the '90s, computers were improving, but yes. So there were other learning techniques that on small data sets worked at least as well as neural networks and were easier to explain and had much fancier mathematical theory behind them. And so people within computer science lost interest in neural networks. Within psychology, they didn't. Because within psychology, they're interested in how people might actually learn. And these other techniques looked even less plausible than backpropagation. Yeah. [00:08:23] Speaker 1: Which is an interesting part of your background. You came to this not because you were interested in computers necessarily, but because you were interested in the brain. [00:08:30] Speaker 2: Yes. I sort of decided -- I was interested in psychology originally. Then I decided we were never going to understand how people work without understanding the brain.
The idea that you could do it without worrying about the brain. That was a sort of fashionable idea back in the '70s. But I decided that wasn't on. You had to understand how the brain worked. [00:08:46] Speaker 1: So we fast forward now to the 2000s. Is there a key moment you think back to as a turning point when it's like, okay, our side is going to prevail in this? [00:08:56] Speaker 2: Around 2006, we started doing what we call deep learning. Before then, it had been hard to get neural nets with many layers of representation to learn complicated things. Right. And we found better ways of doing it. Better ways of initializing the networks called pre-training. And the P in ChatGPT stands for pre-training. [00:09:21] Speaker 1: Okay. And the T is transformer. [00:09:24] Speaker 2: And G is generative. Yeah. And it was actually generative models that provided this better way of pre-training neural nets. So the seeds of it were there in 2006. By 2009, we'd already produced something that was better than the best speech recognizers at recognizing which phoneme you were saying. [00:09:43] Speaker 1: Using different technology than all the other speech recognizers were. [00:09:47] Speaker 2: Than the standard approach, which had been tuned for 20, for 30 years. There were other people using neural nets, but they weren't using deep neural nets. And then there's a big thing that happens in 2012. Yes. Actually, two big things. [00:10:01] Speaker 1: Okay. [00:10:02] Speaker 2: One is that the research we'd done in 2009, done by two of my students over a summer, that led to better speech recognition. That got disseminated to all the big speech recognition labs at Microsoft and IBM and Google. And in 2012, Google was the first to get it into a product. And suddenly, speech recognition on the Android became as good as Siri, if not better. So that was a deployment of deep neural nets applied to speech recognition three years earlier.
At the same time as that happened, within a few months of that happening, two other students of mine developed an object recognition system that would look at images and tell you what the object was. And it worked much better than previous systems. [00:10:49] Speaker 1: How did this system work? [00:10:50] Speaker 2: Okay. There was someone called Fei-Fei Li and her collaborators who created a big database of images, like a million images of a thousand different categories. You'd have to look at an image and give your best guess about what the primary object was in the image. So the images would typically have one object in the middle. Yeah. And they'd have to say things like bullet train or husky or -- and the other systems were getting like 25 percent errors and we were getting like 15 percent errors. Okay. Within a few years, that 15 percent went down to 3 percent, which was about human level. [00:11:25] Speaker 1: And can you explain in a way people would understand the difference between the way they were doing it and the way your team did it? [00:11:31] Speaker 2: I can try. [00:11:33] Speaker 1: That's all we can hope for. [00:11:34] Speaker 2: Okay. So suppose you wanted to recognize a bird in an image. Okay. The image itself, let's suppose it's a 200 by 200 image. That's got 200 times 200 pixels. And each pixel has three values for the three colors RGB. And so you've got 200 by 200 by three numbers in the computer. It's just numbers in the computer, right? And the job is to take those numbers in the computer and convert them to a string that says bird. Okay. So how would you go about doing that? And for 50 years, people in standard AI tried to do that and couldn't. Okay. It's tricky to convert a bunch of numbers into a label that says bird. So here's a way you might go about it. At the first level of features, you might make feature detectors. Things that detect little combinations of pixels. Okay. 
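[Editor's sketch, not part of the interview: the idea of an image being "just numbers" and of a hand-wired detector for a little combination of pixels can be illustrated roughly as follows. The function names, the 8-pixel patch size, and the 0.5 threshold are all hypothetical choices made for illustration; this is not the system the speaker's students built.]

```python
def make_image(width=200, height=200):
    """Build a toy image: dark left half, bright right half.

    Each pixel is a list of three numbers (R, G, B) between 0 and 1.
    """
    return [
        [[1.0, 1.0, 1.0] if x >= width // 2 else [0.0, 0.0, 0.0]
         for x in range(width)]
        for _ in range(height)
    ]


def brightness(pixel):
    """Collapse the three colour values of one pixel into one number."""
    return sum(pixel) / 3.0


def vertical_edge_detector(image, top, left, size=8, threshold=0.5):
    """Hand-wired detector: turn on if the left half of a small patch
    is dark and the right half is bright."""
    half = size // 2
    rows = range(top, top + size)
    dark_side = [brightness(image[y][x])
                 for y in rows for x in range(left, left + half)]
    bright_side = [brightness(image[y][x])
                   for y in rows for x in range(left + half, left + size)]
    difference = (sum(bright_side) / len(bright_side)
                  - sum(dark_side) / len(dark_side))
    return difference > threshold


image = make_image()
# 200 by 200 pixels, three colour values each.
number_count = sum(len(pixel) for row in image for pixel in row)

on_edge = vertical_edge_detector(image, top=96, left=96)   # patch straddles the edge
off_edge = vertical_edge_detector(image, top=96, left=10)  # patch is uniformly dark
```

[In this toy setup the detector turns on only for the patch that straddles the dark-to-bright boundary. Wiring up every detector like this by hand is the approach the speaker goes on to contrast with learning.]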
So you might make a feature detector that said, look, if all these pixels are dark and all these pixels are bright, I'm going to turn on. Okay. And so that feature detector would represent an edge here. Okay. A vertical edge. You might have another one that said, if all these pixels are bright and all these pixels are dark, I'll turn on. That would be a feature detector that represented a horizontal edge. [00:12:47] Speaker 1: Okay. [00:12:47] Speaker 2: And you can have others for edges at different orientations. [00:12:49] Speaker 1: We had a lot of work to do. All we've done is made a box. [00:12:51] Speaker 2: Right. So we've got to have a whole lot of feature detectors like that. And that's what you actually have in your brain. [00:12:55] Speaker 1: Okay. [00:12:56] Speaker 2: So if you look in a cat or monkey cortex, it's got feature detectors like that. Then at the next level, you might say, if you were wiring it up by hand, you would create all these little feature detectors. At the next level, you would say, okay, suppose I have two edge detectors that join at a fine angle. That could just be a beak. So the next level up, we'll have a feature detector that detects two of the lower level detectors joining a fine angle. [00:13:24] Speaker ?: Okay. [00:13:25] Speaker 2: We might also notice a bunch of edges that sort of form a circle. We might have a detector for that. [00:13:30] Speaker 1: Okay. [00:13:31] Speaker 2: Then at the next level up, we might have a detector that says, hey, I find this beak-like thing and I find a circular thing in roughly the right spatial relationship to make the eye and the beak of a bird. And so at the next level up, you'd have a bird detector that says, if I see those two there, I think it might be a bird. Okay. And you could imagine wiring all that up by hand. Okay. And so the idea of backpropagation is just put in random weights to begin with. And now the feature detectors would just be rubbish. That it would be garbage. 
Okay. Okay. But look to see what it predicts. Okay. And if it happened to predict bird, it wouldn't. But if it happened to, leave the weights alone. You got it right. The connection strengths. But if it predicts cat, then what you do is you go backwards through the network and you ask the following question. And you can ask this with a branch of mathematics called calculus. But you just need to think about the question. And the question is, how should I change this connection strength so it's less likely to say cat and more likely to say bird? That's called the error, the discrepancy, right? [00:14:33] Speaker 1: Okay. [00:14:33] Speaker 2: And you figure out for every connection strength how I should change it a little bit to make it more likely to say bird and less likely to say cat. [00:14:40] Speaker 1: And a person's figuring that out or the algorithm is set to work? [00:14:44] Speaker 2: A person has said this is a bird. So a person looked at the image and said it's a bird. Yeah. It's not a cat, it's a bird. Yeah. So that's a label supplied by a person. But then the algorithm backpropagation is just a way of figuring out how to change every connection strength to make it more likely to say bird and less likely to say cat. [00:15:04] Speaker 1: It just keeps trying. Keep turning the dials. [00:15:06] Speaker 2: It just keeps doing that. And now if you showed enough birds and enough cats, when you showed a bird, it'll say bird. And when you showed a cat, it'll say cat. And it turns out that works much, much better than trying to wire everything by hand. [00:15:17] Speaker 1: And that's what your students did on this image database. [00:15:19] Speaker 2: That's what they did on the image database, yes. And they got it to work really well. Now, they were very clever students. In fact, one of them, Ilya Sutskever, is also one of the main people behind ChatGPT. So that was a huge moment in AI. And ChatGPT was another huge moment.
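[Editor's sketch, not part of the interview: the procedure described above (start with random connection strengths, look at the prediction, then nudge every strength a little so that "bird" becomes more likely and "cat" less likely) can be illustrated with a very small network. Everything specific here is an assumption made for illustration: the two made-up input features, the two hidden units, the learning rate, and the number of passes. A real image system works on pixels and is far larger.]

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Run the network forwards; returns hidden activities and P(bird)."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    p_bird = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, p_bird


def average_loss(data, params):
    """Cross-entropy between the predictions and the human-supplied labels."""
    total = 0.0
    for x, label in data:
        _, p = forward(x, *params)
        total -= label * math.log(p) + (1 - label) * math.log(1 - p)
    return total / len(data)


def train(data, hidden_units=2, epochs=2000, lr=0.5, seed=0):
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # Random connection strengths to begin with.
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                for _ in range(hidden_units)]
    b_hidden = [0.0] * hidden_units
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden_units)]
    b_out = 0.0
    initial = average_loss(data, (w_hidden, b_hidden, w_out, b_out))

    for _ in range(epochs):
        for x, label in data:
            hidden, p = forward(x, w_hidden, b_hidden, w_out, b_out)
            # Discrepancy at the output: positive if it leans too far towards bird.
            err_out = p - label
            for j, h in enumerate(hidden):
                # Go backwards: how much did hidden unit j contribute to the error?
                err_hidden = err_out * w_out[j] * h * (1 - h)
                w_out[j] -= lr * err_out * h
                for i, xi in enumerate(x):
                    w_hidden[j][i] -= lr * err_hidden * xi
                b_hidden[j] -= lr * err_hidden
            b_out -= lr * err_out

    params = (w_hidden, b_hidden, w_out, b_out)
    return params, initial, average_loss(data, params)


# Hypothetical features: [beak-like score, whisker-like score]; label 1 = bird, 0 = cat.
data = [([1.0, 0.0], 1), ([0.9, 0.1], 1), ([0.8, 0.2], 1),
        ([0.0, 1.0], 0), ([0.1, 0.9], 0), ([0.2, 0.8], 0)]

params, initial_loss, final_loss = train(data)
_, p_for_bird_example = forward([1.0, 0.0], *params)
_, p_for_cat_example = forward([0.0, 1.0], *params)
```

[The labels play the role of the person saying "this is a bird"; the loop is the "keep turning the dials" step. After training, the loss on this toy data is lower than at the start and the bird-like example gets a higher bird probability than the cat-like one.]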
And he was actually involved in both of them. [00:15:37] Speaker 1: Yeah, yeah. I don't know. Maybe it's cold in the room. You got to the end of the story, I got shivers. The idea that you do this little dial thing and it says bird. It feels like just an amazing breakthrough. [00:15:48] Speaker 2: Yeah, it was. Mainly because the other people in computer vision thought, okay, so these neural nets, they work for simple things like recognizing a handwritten digit. But that's not a real complicated image with sort of natural background and stuff. It's never going to work for these big complicated images. And then suddenly it did. [00:16:09] Speaker 1: Yeah. [00:16:10] Speaker 2: And to their credit, the people who had been really staunch critics of neural nets and said, these things are never going to work. [00:16:17] Speaker ?: Yeah. [00:16:17] Speaker 2: When they worked, they did something that scientists don't normally do. Which is, they said, oh, it worked. We'll do that. Yeah. [00:16:23] Speaker 1: People see it as a huge shift. [00:16:25] Speaker 2: Yes. It was quite impressive that they flipped very fast because they saw that it worked better than what they were doing. Yeah. [00:16:30] Speaker 1: You make this point that when people are thinking both about their machines and about ourselves and the way we think, we think language in, language out, must be language in the middle. Yes. And this is an important misunderstanding. Yeah. Can you just explain that? [00:16:47] Speaker 2: I think that's complete rubbish. Yeah. So, if that were true and it were just language in the middle, you'd have thought that approach, which is called symbolic AI, would have been really good at doing things like machine translation, which is just taking English in and producing French out or something. You'd have thought manipulating symbols was the right approach for that. But actually, neural nets work much better.
And Google Translate, when they switched from doing that kind of approach to using neural nets, they did much better. What I think you've got in the middle is you've got millions of neurons. And some of them are active and some of them aren't. And that's what's in there. The only place you'll find the symbols are at the input and at the output. [00:17:30] Speaker 1: We're not exactly at the University of Toronto. We're close to the University of Toronto. At universities here and around the world, we're teaching a lot of people to code. Does it still make sense to be teaching so many people to code? [00:17:45] Speaker 2: I don't know the answer to that. In about 2015, I famously said it didn't make sense to be teaching radiologists to recognize things in images. Because within the next five years, computers will be better at it. [00:17:59] Speaker 1: Yeah. Are we all about to be radiologists, though? [00:18:01] Speaker 2: Well, computers are not better at it. I was wrong. It's going to take ten years, not five. I wasn't wrong in spirit. I was just off by a factor of two. Computers are now comparable with radiologists with a lot of medical images. Yeah. They're not way better at all of them yet. [00:18:14] Speaker 1: But they'll only get better. [00:18:16] Speaker 2: Yeah. So I think there'll be a while when it's still worth having coders. And I don't know how long that'll be. But we'll need less of them, maybe. Or we'll need the same number and they'll be able to achieve a whole lot more. [00:18:29] Speaker 1: We're talking about Cohere. We went over and visited them yesterday. You're an investor in them. Maybe the question is just, like, how'd they convince you? What was the pitch that convinced you, I want to invest in this? [00:18:40] Speaker 2: So they're good people. And I've worked with several of them. Yeah.
And they were one of the first companies to realize that you need to take these big language models being developed at places like Google and other places, like OpenAI. And make them available to companies. So it's going to be enormously valuable to companies to be able to use these big language models. And so that's what they've been doing. And they've got a significant lead in that. So that's why I think they're going to be successful. [00:19:17] Speaker 1: Another thing you've said that I just find fascinating, so I want to get you to talk about it, is the idea that there'll be kind of a new kind of computer that will be suited to this problem. What is that idea? [00:19:30] Speaker 2: So there's the biological route to intelligence, where every brain is different. And we have to communicate knowledge from one to another by using language. And there's the current AI version of neural nets, where you have identical models running on different computers. And they can actually share the connection strengths. So they can share billions of numbers. [00:19:50] Speaker 1: This is how we make a bird. [00:19:52] Speaker 2: So they can share all the connection strengths for recognizing a bird. And one can learn to recognize cats and the other can learn to recognize birds. And they can share their connection strengths. And now each of them can do both things. And that's what's happening in these big language models they're sharing. But that only works in digital computers. Because they have to be able to do identical things. And you can't make different biological brains behave identically. So you can't share the connection strengths. [00:20:13] Speaker 1: But why wouldn't we stick with digital computers? [00:20:17] Speaker 2: Because of the power consumption. You need a lot of power. It's getting less as chips get better. But you need a lot of power to do this. To run a digital computer, you have to run it at such high power that it behaves exactly in the right way.
Whereas if you're willing to run at much lower power, like the brain is. Then you'll allow a bit of noise and so on. But that particular system will adapt to the kind of noise in that particular system. And the whole thing will work. Even though you're not running it at such high power that it behaves exactly as you intended. And the difference is the brain runs on 30 watts. A big AI system needs like a megawatt. So we're training on 30 watts. And these big AI systems are using, because they've got lots of copies of the same thing, they're using like a megawatt. So, you know, you're talking factor of the order of a thousand in the power requirements. And so I think there's going to be a phase when we train on digital computers. But once something's trained, we run it on very low power systems. So if you want your toaster to be able to have a conversation with you, and you want a chip in it that only costs a couple of dollars, but can do ChatGPT, that had better be a low power analog chip. [00:21:32] Speaker 1: What are kind of like the next things you think this technology will do that will impact people's lives? [00:21:40] Speaker 2: It's hard to pick one thing. I think this is going to be everywhere, right? It's already sort of getting to be everywhere. ChatGPT has just made a lot of people realize that it's going to be everywhere. But it's already, you know, when Google does search, it uses big neural nets to help decide what's the best thing to show you. We're at a transition point now where ChatGPT is this kind of idiot savant. And it also doesn't really understand about truth. It's been trained on lots of inconsistent data. It's trying to predict what someone will say next on the web. [00:22:14] Speaker 1: Yeah. [00:22:15] Speaker 2: And people have different opinions. And it has to have a kind of blend of all these opinions. So that it can model what anybody might say. It's very different from a person who tries to have a consistent world view. 
[00:22:29] Speaker 1: Yeah. [00:22:30] Speaker 2: Particularly if you want to act in the world. It's good to have a consistent world view. And I think one thing that's going to happen is we're going to move towards systems that can understand different world views and can understand that, okay, if you have this world view, then this is the answer. And if you have this other world view, then that's the answer. [00:22:53] Speaker 1: We get our own truths. [00:22:54] Speaker 2: Well, that's the problem, right? Because what you and I probably believe, unless you're an extreme relativist, is that there actually is a truth of the matter. [00:23:02] Speaker 1: Certainly on many topics. [00:23:04] Speaker 2: On many topics. [00:23:04] Speaker 1: Or even most topics. Yeah. [00:23:06] Speaker 2: Like the earth is actually not flat. It just looks flat. [00:23:10] Speaker ?: Right? [00:23:11] Speaker 1: Yeah. So do we really want a model that says, well, for some people, like, we don't know. [00:23:15] Speaker 2: That's going to be a big issue. And we don't know how to deal with that at present. [00:23:19] Speaker 1: Yeah. [00:23:20] Speaker 2: And I don't think Microsoft knows how to deal with it either. [00:23:23] Speaker 1: They don't. And it seems to be a huge governance challenge. Who makes these decisions? [00:23:30] Speaker 2: These are very tricky things. You don't want some big for-profit company deciding what's true. [00:23:37] Speaker 1: But they're controlling how we turn the neurons. [00:23:39] Speaker 2: Well, Google is very careful not to do that at present. What Google will do is refer you to relevant documents. Yeah. Which will have all sorts of opinions in them. [00:23:49] Speaker 1: Well, they haven't released their chat product, at least as we speak. Right. But we've seen at least the people that have released chat products feel like there are certain things they don't want to be said by their voice.
And so they go in there and meddle with it so it won't say offensive things. [00:24:05] Speaker 2: Yeah, but there's a limit to what you can do that way. There's always going to be things you didn't think of, right? [00:24:09] Speaker 1: Yeah. [00:24:10] Speaker 2: So I think Google is going to be far more careful than Microsoft when it does release a chat bot. [00:24:14] Speaker 1: Yeah. [00:24:15] Speaker 2: And it'll probably come with lots of warnings. This is just a chat bot. Yeah. And don't necessarily believe what it says. [00:24:23] Speaker 1: Careful in the labeling or careful in the way they meddle with it so it doesn't do lousy things? All of those things. [00:24:29] Speaker 2: Careful in how they present it as a product and careful in how they train it. Yeah. And do a lot of work to prevent it from saying bad things and-- But who gets to decide what a bad thing is? Some bad things are fairly obvious. [00:24:45] Speaker 1: But many of the most important ones are not. Yes. [00:24:48] Speaker 2: So that is a big open issue at present. I think Microsoft was extremely brave to release ChatGPT. [00:24:54] Speaker 1: Yeah. Do you see this as like a larger-- Some people see this as a larger societal thing. We need either regulation or big public debates about how we handle these issues. [00:25:04] Speaker 2: Well, when it comes to the issue of what's true, I mean, do you want the government to decide what's true? It's this big problem, right? [00:25:11] Speaker 1: Yeah. [00:25:12] Speaker 2: You don't want the government doing it either. [00:25:14] Speaker 1: I'm sure you've thought deeply on this question for a long time. How do we navigate the line between you just send it off into the world and we find ways to curate it? [00:25:25] Speaker 2: Like I say, I don't know the answer. Yeah. And I don't believe anybody really knows how to handle these issues. We're going to have to learn quite fast how to handle these issues because it's a big problem at present. 
Yeah. But how it's going to be done, I don't know. But I suspect as a first step, at least these big language models are going to have to understand that there are different points of view and that completions it makes are relative to a point of view. [00:25:51] Speaker 1: Some people are worried that this could take off very quickly and we just might not be ready for that. Does that concern you? [00:25:58] Speaker 2: It does a bit. Until quite recently, I thought it was going to be like 20 to 50 years before we have general purpose AI. Yeah. And now I think it may be 20 years or less. So. [00:26:11] Speaker 1: Okay. Some people think it could be like five. Is that silly? [00:26:16] Speaker 2: I wouldn't completely rule that possibility out now. Whereas a few years ago, I would have said no way. [00:26:21] Speaker 1: Okay. And then some people say AGI could be massively dangerous to humanity because we just don't know what a system that's so much smarter than us will do. Do you share that concern? [00:26:34] Speaker 2: I do a bit. I mean, obviously what we need to do is make this synergistic, have it so it helps people. And I think the main issue here, well, one of the main issues is the political systems we have. So I'm not confident that President Putin is going to use AI in ways that help people. [00:26:58] Speaker 1: Like even if, say, the U.S. and Canada and a bunch of countries say, okay, we're going to put these guardrails up, then how do you? [00:27:04] Speaker 2: Yeah, it's particularly for things like autonomous lethal weapons. [00:27:10] Speaker 1: Okay. [00:27:10] Speaker 2: We'd like to have something like Geneva Conventions, like chemical weapons. People decided they were so nasty they weren't going to use them, except just occasionally. But, I mean, basically they don't use them. People would love to get a similar treaty for autonomous lethal weapons. But I don't think there's any way they're going to get that. 
I think if Putin had autonomous lethal weapons, he would use them right away. [00:27:30] Speaker 1: This is like the most pointed version of the question. And you can just laugh it off or not answer it if you want. But what do you think the chances are of AI just wiping out humanity? [00:27:40] Speaker 2: Can we put a number on that? It's somewhere between 0% and 100%. [00:27:47] Speaker 1: Okay. [00:27:48] Speaker 2: I mean, I think it's not inconceivable. [00:27:51] Speaker 1: Okay. [00:27:52] Speaker 2: That's all I'll say. Okay. I think if we're sensible, we'll try and develop it so that it doesn't. But what worries me is the political systems we're in. [00:28:03] Speaker 1: Yeah. [00:28:03] Speaker 2: Where it needs everybody to be sensible. [00:28:05] Speaker 1: There's a massive political challenge, it seems to me. And there's a massive economic challenge in that you can have a whole lot of individuals who pursue the right course. And yet, the profit motive of corporations may not be as cautious as the individuals who work for them. [00:28:24] Speaker 2: Maybe, I mean, I only really know about Google. That's the only corporation I've worked for. [00:28:29] Speaker ?: Yeah. [00:28:30] Speaker 2: And they're extremely cautious about AI. Because they've got this wonderful search engine that gives you the answers you want to see. And they can't afford to risk that. Whereas Microsoft has Bing. If Bing disappeared, Microsoft would hardly notice. [00:28:50] Speaker 1: Yeah. But it was easy for Google to take it slow when there wasn't someone nipping at their heels. And this seems to be exactly the issue. [00:28:58] Speaker 2: Google has actually been in the lead. I mean, Transformers were invented at Google. Right. The big language models, early ones, were at Google. But-- [00:29:04] Speaker 1: And they kind of kept it in your lab. [00:29:07] Speaker 2: They're being much more conservative. [00:29:08] Speaker 1: And I think rightly so. Yes.
But now they feel this pressure. [00:29:12] Speaker 2: Yeah. And so they're trying to. They're developing a system called Bard that they're going to put out there. And they're doing lots and lots of testing of it. But they're going to be, I think, a lot more cautious than Microsoft. [00:29:25] Speaker 1: You mentioned autonomous weapons. Let me give you a chance just to tell this story. What's the connection between that and how you ended up in Canada? Okay. [00:29:32] Speaker 2: There were several reasons I came to Canada. But one of them was certainly not wanting to take money from the U.S. Defense Department. This was at the time of Reagan when they were mining the harbors in Nicaragua. And it was interesting. I was at a big university in Pittsburgh. And I was one of the few people there who thought that mining the harbors in Nicaragua was really wrong. So I felt like a fish out of water. [00:30:01] Speaker 1: And you saw that this was where the money was coming from for this kind of work. [00:30:05] Speaker 2: So at that department, almost all their money came from the Defense Department. [00:30:09] Speaker 1: You started to talk about the concerns that bringing this technology to warfare could present. What are your concerns? [00:30:17] Speaker 2: Oh, that the Americans would like to replace their soldiers by autonomous, by AI soldiers. And they're trying to work towards that. [00:30:28] Speaker 1: And what evidence do you see of that? [00:30:32] Speaker 2: I'm on a mailing list from the U.S. Defense Department. I'm not sure they know I'm on the mailing list. [00:30:38] Speaker 1: It's a big list. They didn't notice you're there. Yeah. You might be off tomorrow. I might be off tomorrow. What's on the list? [00:30:44] Speaker 2: Oh, they just describe various things they're going to do. There's some disgusting things on there. [00:30:50] Speaker 1: Okay. What disgusts you? 
[00:30:52] Speaker 2: The thing that disgusted me most was a proposal for a self-healing minefield. So the idea is, look at it from the point of view of the minefield. When some silly civilian trespasses into the minefield, they get blown up. And that makes a hole in the poor minefield. So it's got a gap in that. So it's not fit for purpose. Yeah. So the idea is maybe nearby mines could communicate and maybe they could move over a bit. And they called that healing. And it was just the idea of talking about healing for these things that blow the legs off children. I mean, and the healing being about the minefield healing. Yeah. That disgusted me. [00:31:33] Speaker 1: There is this argument that though the autonomous systems might play a role in helping the warfighter, it's ultimately a human making the decision. [00:31:42] Speaker 2: That's what worries me. If you wanted to make an effective autonomous soldier, you'd need to give it the ability to create sub goals. In other words, it has to realize things like, okay, I want to kill that person over there. But to get over there, how am I going to get over there? And then it has to realize, well, if I could get to that road, I could get there more quickly. So it has a sub goal of getting to the road. So as soon as you give it the ability to create its own sub goals, it's going to become more effective. And so people like Putin are going to want robots like that. And but as soon as it's got an ability to create sub goals, you have what's called the alignment problem, which is how are you sure it's not going to create sub goals that are going to be not good for people, not good for you? [00:32:30] Speaker 1: Who knows who's on that road? [00:32:32] Speaker 2: Who knows who's on that road? And if these systems are being developed by the military, the idea of wiring in some rule that says never hurt a person. Well, that's they're being designed to hurt people. [00:32:44] Speaker 1: Do you see any way out of this? 
Is it a treaty? Is it what is it? [00:32:48] Speaker 2: I think the best bet is something like a Geneva Convention, but it's going to be very difficult. I think if there was a lot of public outcry, that might persuade. I can imagine the Biden administration going for something like that with enough public outcry. But then you have to deal with Putin. [00:33:03] Speaker 1: Yeah. Okay, we've covered so much. I think I have like two more things. [00:33:09] Speaker 2: There's one more thing I want to say. [00:33:10] Speaker 1: Yeah, yeah, yeah, go for it. [00:33:12] Speaker 2: You could ask me the question. Some people say that these big models are just autocomplete. [00:33:17] Speaker 1: Well, on some level, the models are autocomplete. We're told that the large language models, they're just predicting the next word. Is that not so simple? [00:33:24] Speaker 2: No, that's true. They are just predicting the next word. All right. And so they're just autocomplete. But ask yourself the question of what do you need to understand about what's been said so far in order to predict the next word accurately? And basically, you have to understand what's been said to predict the next word accurately. So you're just autocomplete, too, in the same sense as they are. You can predict the next word. Maybe not as well as ChatGPT. [00:33:48] Speaker 1: Yeah. [00:33:49] Speaker 2: But to do that, you have to understand the sentence. So let me give you a little example from translation. It's a very Canadian example. [00:33:56] Speaker 1: Okay. [00:33:57] Speaker 2: Suppose I take the sentence, "The trophy would not fit in the suitcase because it was too big." And I want to translate that into French. Well, when I say the trophy would not fit in the suitcase because it was too big, you assume the "it" refers to trophy. [00:34:15] Speaker 1: I do. [00:34:16] Speaker 2: And in French, trophy has a particular gender. So you know what pronoun to use. Yeah. 
But suppose I say, "The trophy would not fit in the suitcase because it was too small." Now you think that it refers to suitcase. Right. And that has a different gender in French. So in order to translate that sentence to French, you have to know when it wouldn't fit in because it was too big, it's the trophy that's too big. And when it wouldn't fit in because it was too small, it's a suitcase that's too small. And that means you have to understand about spatial relations and containment and so on. Yeah. So you have to understand just to do machine translation or to predict that pronoun. If you want to predict that pronoun, you've got to understand what's being said. It's not enough just to treat it as a string of words. [00:34:58] Speaker 1: Yeah, yeah, yeah. I mean, this gets me to another thing you've pointed out which is kind of a either exciting or troubling idea that you working intimately in this field for as long as anyone describe the progress as, well, we had this idea and we tried it and it worked. And so we get a couple decades of back propagation. We have this idea for a transformer. Now we'll do some -- but it could -- there's hundreds of other ideas that haven't been tried out. Yes. [00:35:26] Speaker 2: So I think even if we didn't have any new ideas, just making computers go faster and getting more data will make all this stuff work better. We've seen that as they scale up ChatGPT. It's not radically new ideas there, it's just more connections and more data to train it with. Yeah. But in addition to that, there's going to be new ideas like transformers and they're going to make it work much better. [00:35:47] Speaker 1: Are we close to the computers coming up with their own ideas for improving themselves? [00:35:52] Speaker 2: Yes, we might be. [00:35:53] Speaker 1: And then it could just go fast. [00:35:56] Speaker 2: That's an issue, right. We have to think hard about how to control that. [00:36:00] Speaker 1: Yeah. Can we? 
[00:36:01] Speaker 2: We don't know. We haven't been there yet, but we can try. [00:36:05] Speaker 1: Okay. That seems kind of concerning. Yes. Do you have any -- you're seen as like a godfather of this industry. Do you have any concern about what you've wrought? [00:36:17] Speaker 2: I do a bit. On the other hand, I think whatever's going to happen is pretty much inevitable. That is, one person stopping doing research wouldn't stop this happening. If my impact is to make it happen a month earlier, that's about the limit of what one person can do. [00:36:37] Speaker 1: There's this idea of the -- and I'm going to get it wrong -- the short runway and the long takeoff. Maybe we need time to prepare. Or maybe it's better if it happens quickly because then people will have urgency around the issue rather than like creep, creep, creep. Do you have any like thoughts on this? [00:36:50] Speaker 2: I think time to prepare would be good. And so I think it's very reasonable for people to be worrying about those issues now. Even though it's not going to happen in the next year or two. People should be thinking about those issues. [00:37:01] Speaker 1: We haven't even touched on job displacement, which is just my mistake for not bringing it up. Is this just going to eat up just job after job after job after job? [00:37:10] Speaker 2: I think it's going to make jobs different. People are going to be doing the more creative end and less of the routine end. [00:37:18] Speaker 1: But what's the creative? If it can write the poem and make the movie and all of that? [00:37:22] Speaker 2: Well, if you go back in history and look at ATMs, these cash machines came along. And people said that's the end of bank tellers. It wasn't actually the end of bank tellers. The bank tellers now deal with more complicated things. And take coders. So people say, you know, these things can do simple coding and usually get it right. You just need to get it to write the program and then just check it. 
So you'll be able to work 10 times as fast. Well, either you could have 10% of the programmers or you could have the same number of the programmers producing 10 times as much stuff. [00:37:58] Speaker 1: Yeah. [00:37:59] Speaker 2: And I think there's going to be a lot of trade-offs like that. Once these things start being creative, there'll be hugely more stuff created. [00:38:07] Speaker 1: This is the biggest technological advancement since, is this another industrial revolution? What is this? How should people think of it? [00:38:15] Speaker 2: I think it's comparable in scale with the industrial revolution or electricity. [00:38:21] Speaker 1: Electricity, yeah. [00:38:22] Speaker 2: Or maybe the wheel. [00:38:23] Speaker 1: Or maybe the wheel. [00:38:25] Speaker 2: Yeah. That was earlier. [00:38:28] Speaker 1: Yeah. Okay. So buckle up. [00:38:31] Speaker 2: Yeah. One of the reasons I got a, Toronto got a big lead in AI is because of the policies of the granting agencies in Canada, which don't have much money. But they use some of that money to support curiosity-driven basic research. [00:38:49] Speaker 1: Okay. [00:38:49] Speaker 2: And so in the States, the funding comes and you have to say what products you're going to produce with it and so on. Yeah. Here, some of the government money, quite a lot of it, is given to professors to employ graduate students and other researchers to explore things they're curious about. Yeah. And if they seem to be good at that, then they get more money three years later. Yeah. And that's what supported both Yoshua Bengio and me. Yeah. It was money for curiosity-driven basic research. Yeah. And we've seen that before. [00:39:22] Speaker 1: Even through decades of not being able to show much. Yes. [00:39:25] Speaker 2: Even through decades of not being able to show much. So that's one thing that happened in Canada. 
Another thing that happened was there's a Canadian organization called the Canadian Institute for Advanced Research that provides extra money to professors in areas where Canada is good and provides money for professors to interact with each other when they're far apart, like in Vancouver and Toronto, but also to interact with researchers in other parts of the world, like America and Britain and Israel and so on. And CIFAR set up a program in AI. It set up one originally in the 1980s, which is the one that brought me to Canada, which was in symbolic AI. [00:40:02] Speaker 1: Oh. And yet you came. [00:40:05] Speaker 2: I was an oddball. Okay. I was kind of weird because I did this stuff everybody else thought was nonsense. They recognized that I was good at this kind of nonsense. And so they found me here. [00:40:14] Speaker 1: If anyone's going to do the nonsense, it might as well be him. [00:40:16] Speaker 2: One of my letters of recommendation said that. It said, you know, I don't believe in this stuff, but if you want somebody to do it, Geoff Hinton's the guy. [00:40:24] Speaker 1: Okay. [00:40:25] Speaker 2: And then after that program finished, I went back to Britain for a few years. And then when I came back to Canada, they decided to fund a program in deep learning, essentially. Yeah. [00:40:37] Speaker 1: Sentience. I think you have complaints with even just how you define that, right? [00:40:41] Speaker 2: Yeah. When it comes to sentience, I'm amazed that people can confidently pronounce these things are not sentient. And when you ask them what they mean by sentient, they say, well, they don't really know. So how can you be confident they're not sentient if you don't know what sentient means? [00:40:59] Speaker 1: So maybe they are already. [00:41:01] Speaker 2: Who knows? I think whether they're sentient or not depends on what you mean by sentient. So you better define what you mean by sentient before you try and answer the question, are they sentient? 
[00:41:09] Speaker 1: Does it matter what we think? Or does it only matter whether it effectively acts as if it is sentient? [00:41:16] Speaker 2: It's a very good question, Matt. [00:41:20] Speaker 1: And what's your answer? [00:41:21] Speaker 2: I don't have one. [00:41:22] Speaker 1: Okay. Because if it's not sentient, but it decides for whatever reason that it believes it is, and it needs to achieve some goal that is contrary to our interests, but it believes in its interests, does it really matter if in any human reflection? [00:41:36] Speaker 2: Well, I think a good context to think of this in is an autonomous lethal weapon. [00:41:41] Speaker 1: Yeah. [00:41:42] Speaker 2: Okay. So it's all very well saying it's not sentient. But when it's hunting you down to shoot you, yeah, you're going to start thinking it's sentient. [00:41:52] Speaker 1: Or not really caring. Not an important standard anymore. [00:41:55] Speaker 2: The kind of intelligence we're developing is very different from our intelligence. So it's this idiot savant kind of intelligence. Yeah. So it's quite possible if it is at all sentient, it's sentient in a somewhat different way from us. [00:42:07] Speaker 1: Yeah. But your goal is to make it more like us. And you think we'll get there? [00:42:10] Speaker 2: No, my goal is to understand us. [00:42:12] Speaker 1: Oh, okay. [00:42:13] Speaker 2: No, but you-- And I think the way you understand us is by building things like us. [00:42:17] Speaker 1: Okay. So that's, I mean-- [00:42:18] Speaker 2: The physicist called Richard Feynman said you can't understand things unless you can build them. That's the real test of do you understand it. [00:42:27] Speaker 1: And so you've been building. [00:42:28] Speaker 2: So I've been building, yeah.
