
AI Unpacked with Nobel Laureate Geoffrey Hinton

Even the “godfather of AI” Geoffrey Hinton has been surprised by the speed and scale at which AI has developed. In this keynote from Valence's AI & the Workforce Summit, he explains what is so powerful about the technology and how leaders can unlock its potential and prevent its pitfalls.


Video Transcript

Parker Mitchell: Geoff, welcome. We're so excited to have you here today. We are gathered with CHROs and heads of talent of some of the largest companies in the world. And what we're trying to do is make sense of AI. We're really wondering what it's going to be like in the future. But to understand that, I'd like to go back to the past.

If we look back to, let's say, around 2010, so almost 15 years ago: the predictions that you made then, where were you too optimistic, where too pessimistic about the speed of progress? And how has the field progressed since then?

Geoffrey Hinton: So ask me about 2016 later. So I think if you had asked people, even fairly enthusiastic people who believed in neural nets in 2010, where we'd be now, they wouldn't have believed we'd have something like GPT4. They would have said that in the next 14 years you're not going to develop something that's an expert at everything. Not a very good expert, but an expert at everything. You're not going to be able to have a system where you can just ask any question you like, some obscure question about British tax law, or some weird question about how you solve equations, and it's going to be able to give you a pretty good answer, an answer that's better than 99% of the population could give you. That's extraordinary, and we wouldn't have predicted that.

Parker Mitchell: And so progress is happening faster than you anticipated.

Geoffrey Hinton: Yes. 

Parker Mitchell: Can you share more, what's it like to experience that as one of the leading researchers in the space and watching it accelerate? 

Geoffrey Hinton: It's amazing, because back in the ’80s, when Rumelhart reinvented back propagation, he rediscovered it. And he and I worked together to use it for things. And we thought, to begin with, we thought, this is going to solve everything. We've got something that can just learn. And there didn't seem to be any limits to it. And then it was very disappointing. And we didn't understand why it didn't work better. It was partly architectural things. And for about 30 years we used an input-output function that looked like this, when we should have used one that looked like this. Um, just crazy. But it was mainly scale. And we just didn't understand that this whole idea would only really come into its own when you had a lot of connections, and a lot of training data, and a huge amount of compute. So we couldn't have done it back then. And if we'd said back then, “Yeah, but if we made one a million times bigger and had a million times more data, it would really work.” That would have just sounded like a pathetic excuse. But it turned out that was the truth. 

Parker Mitchell: That's fascinating. So one of the things that you and I talked about earlier is the underselling of what large language models do if we use the term “next word prediction.” The experience that we have is that they could be reasoning; they could have a degree of intelligence. Can you share more about how that comes about? 

Geoffrey Hinton: So there's many people who say these things are just using statistical tricks. They don't really understand what they're saying. They're just using correlations. But if you ask those people, well, what's your model of how people understand? If they're symbolic AI people, their model is we have symbolic expressions and we manipulate them with symbolic rules. And that never worked that well. It didn't work nearly as well as the large language models. If you ask cognitive scientists, they'll come up with a variety of explanations, but my initial tiny language model wasn't designed to do NLP, natural language processing. It was designed to show how people could learn the meanings of words. So it's a model of people. A very simplistic model. But the best model we have of how people understand sentences is these large language models. It's not like we have a different model of how people work, and these work differently. The only good model of how people work that we have is like this. So I think they really do understand, and they understand in the same way as we do.

Parker Mitchell: And these large language models might have that kind of embedded creativity already in them? 

Geoffrey Hinton: Yes, so many people say, you know, these language models will do routine things, but people are creative. Well, if you take a standard test of creativity, I think the large language models now do better than 90 percent of people. So the idea they're not creative is crazy. This is very relevant to the debate among artists and Silicon Valley about whether these AI models are just stealing the creations of artists. Obviously to produce a work in a genre, you have to listen to a lot of music in that genre. But it's the same with a person. Whenever a person produces new music in a genre, they are stealing the works of previous people in just the same way the AI system is. So the AI system is not stealing them any more than another musician does. 

Parker Mitchell: I mean, it's fascinating, if you read analysis of the work of Picasso, he is clearly borrowing from artistic traditions. I think he's, you know, Benin masks and many other areas, and he's merging them into a new, you know, a new approach. But he is building off of things that he's seen. I think AI, if it's seen everything, there's no reason why it can't do the same thing. 

Geoffrey Hinton: Yes. So AI can be creative. And of course, to be creative in a particular way, you look at works of art that are done in that way. But it's hard to say that it's stealing, because what it's not doing is pastiching together bits of other things. It's understanding the underlying structure the same way a person does and then generating new stuff with the same kind of underlying structure. So it's just very like a person creating something. 

Parker Mitchell: Now you also studied the psychology of the human brain in your undergrad. How does that compare to what we have in our brains? 

Geoffrey Hinton: So we have about a hundred million synapses. And even though many of them are used for other things, like breathing, the cortex, the neocortex, has most of those. And so we've got many more adaptable parameters than these big language models. Which makes it very strange that GPT4 knows thousands of times more than we do. 

Parker Mitchell: And you said a hundred million. I think you meant a hundred trillion. 

Geoffrey Hinton: Did I say a hundred million? 

Parker Mitchell: I think you said a hundred million. 

Geoffrey Hinton: I could be a politician. I can't tell the difference between millions and trillions. A hundred trillion, yes. 

Parker Mitchell: A hundred trillion synapses. And so it's fascinating. So we have large language models that are two orders of magnitude smaller than the connections in the human brain and yet know an enormous amount of information. 

Geoffrey Hinton: Yes, they're a not very good expert at everything, so they know thousands of times more than any one person. And one of the reasons it can do that is you can have many different copies of exactly the same neural net running on different hardware. So you can get one copy to look at this bit of the internet, another copy to look at that bit of the internet. They can both figure out how they'd like to change their own weights. And if you just average those changes, then both copies have learned from the experience that each of them had. So now you take a thousand of those. Imagine if we could take a thousand people. They could all go off and do a different course. And at the end, everyone knew what everyone had learned, had experienced.
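The parallel-copies scheme Hinton describes is, in machine-learning terms, synchronous data-parallel training: identical replicas of one model each process a different shard of data, propose a weight change, and then apply the average of everyone's changes, so every copy ends up knowing what all the copies learned. A minimal sketch, with synthetic gradients standing in for real backpropagation:

```python
import random

random.seed(0)
n_replicas, dim, lr = 4, 8, 0.1

# Identical starting weights on every copy of the model.
weights = [random.gauss(0, 1) for _ in range(dim)]
replicas = [list(weights) for _ in range(n_replicas)]

# Each replica looks at a different shard of data and proposes its own
# weight change (made-up gradients here, standing in for backprop).
shard_grads = [[random.gauss(0, 1) for _ in range(dim)]
               for _ in range(n_replicas)]

# Average the proposed changes and apply the mean to every copy.
mean_grad = [sum(g[j] for g in shard_grads) / n_replicas
             for j in range(dim)]
for r in replicas:
    for j in range(dim):
        r[j] -= lr * mean_grad[j]

# Every copy has absorbed all shards' experience, and they stay in sync.
assert all(r == replicas[0] for r in replicas)
```

The averaging only works because the copies share exactly the same weights, which is Hinton's point about why digital models can pool experience in a way a thousand people cannot.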

Parker Mitchell: We've talked a little bit about memory and how memory is stored in the human brain. We've talked about sort of fast weights and how those can adjust. Is there anything missing in an LLM architecture that humans still do exceptionally better, that the human brain does better? 

Geoffrey Hinton: I think we still learn better from limited data. And we don't quite know how we do that. We know the human brain has changes in connection strengths at many different timescales. So the first time I met Terry Sejnowski in 1979, that was basically the first thing we talked about: how these neural net models have just two timescales. They have the timescale of the activities of the neurons changing. And so each time you put in a different sentence, neural activities will change. And then they have the values of the weights, the connection strengths, and they change very slowly. That's where all the knowledge is. And they just have those two timescales.

Now, you could have many more timescales. Let's just suppose you have one more timescale, where you have the weights that change slowly, but you also have an overlay of weights that change much faster but decay quickly. That gives you all sorts of extra nice properties. So, for example, if I say an unexpected word to you like “cucumber,” and, a couple of minutes later, I put headphones on you, and I put lots of noise in the headphones, and I play words so you can only just hear them, most of them you can't quite make out what they are. You'll be considerably better at making out the word “cucumber.” Because you heard it two minutes ago.

So the question is, where is that stored? And it's not stored in neural activities. You can't afford to do that, you'll use up too many neurons. And it's not stored in the long-term weights, because in a few days’ time it'll be gone. It's stored in short-term changes to the synapse strengths. And we don't have that in the models at present. 
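The extra timescale Hinton proposes can be sketched as a "fast weights" overlay: the effective weight is a slow component that changes a little and persists, plus a fast component that changes a lot but decays toward zero. This is an illustrative toy under those stated assumptions, not any specific published fast-weights model:

```python
# Toy "fast weights": effective weight = slow part + fast part.
slow, fast = 0.0, 0.0
slow_lr, fast_lr, decay = 0.01, 0.5, 0.9

def step(grad):
    global slow, fast
    slow -= slow_lr * grad                  # tiny permanent change
    fast = decay * fast - fast_lr * grad    # big change that fades
    return slow + fast                      # effective weight

# One surprising input ("cucumber") leaves a large short-term trace...
w_after = step(1.0)

# ...which has mostly evaporated a hundred quiet steps later, leaving
# only the small long-term change behind.
for _ in range(100):
    w_later = step(0.0)

assert abs(w_later - slow) < 1e-4   # the fast trace has decayed away
```

Two minutes later the fast trace is still large enough to prime recognition of "cucumber"; a few days later only the slow component remains, matching the priming-then-forgetting behaviour described above.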

Parker Mitchell: My undergraduate research was actually looking at something very similar, except it was preperceptual. So you would flash the word “cucumber” very quickly. You didn't notice that you'd seen it. It was subliminal. And then you could pick it up more likely if you either saw it, you know, in a collection of words or listened to it. And so there was a question of how did you understand, how did you process the word cucumber without realizing it in such a way that your brain stored it and was able to recognize it more quickly? 

Geoffrey Hinton: I think there's also a phenomenon where you flash the word “cucumber,” and you'll be better at hearing, at recognizing the word “lettuce.”

Parker Mitchell: Yes, that was actually, in particular, it was the association of sort of similar words. 

Geoffrey Hinton: Yes, so it's not just that you got the word, you got the semantics of the word, without any consciousness. 

Parker Mitchell: Can you share some examples of how introducing new information to an LLM that it might not have had in its training data, how it can reason over that and come up with an answer that's similar to how a human might reason by analogy?

Geoffrey Hinton: Well, I can give a nice example of it doing analogies that most people can't do. 

Parker Mitchell: I would love to hear that. 

Geoffrey Hinton: So I asked GPT4 some time ago, when it wasn't hooked up to the web, why is a compost heap like an atom bomb? 

Parker Mitchell: And I would not be able to answer that question. 

Geoffrey Hinton: Excellent. So it said the time scales are very different and the energy scales are very different. And then it went on about chain reactions. It went on about how, in a compost heap, the hotter it gets, the faster it generates heat. In an atom bomb, the more neutrons it's producing, the faster it generates neutrons. And so the underlying physics similarity GPT4 had seen. Now, it probably didn't see it when I asked the question. It had probably seen it during training. 

So we see a lot of analogies, and we actually store things in the weights. And it's much easier to store things in weights if they're kind of analogous structures. Because you can use, you can share the weights. And these large language models are just the same. And so in order to store huge amounts of information, they have to see analogies between different facts that they're learning. And they will have seen many analogies that no person's ever seen. 

Parker Mitchell: So this is fascinating. So in order to compress that amount of information into that few parameters, they have to implicitly understand and codify analogies in their weighting.

Geoffrey Hinton: And many of those analogies are analogies at a deep level, like between a compost heap and an atom bomb. 

Parker Mitchell: And they might be discovering, they might have embedded in the weights right now, analogies that we as humans have not actually thought about ourselves. 

Geoffrey Hinton: Yes, because GPT4 is a not very good expert at physics, but it's also a not very good expert at ancient Greek literature. And it may well be there's something in ancient Greek literature that's rather like some weird thing in quantum mechanics, but no one person has ever seen those two things.

Parker Mitchell: And so, in 2010 you started understanding what was possible, you and Ilya [Sutskever], won ImageNet. Alex, I think was… 

Geoffrey Hinton: Alex Krizhevsky. It's called AlexNet.

Parker Mitchell: AlexNet, oh, that's right. 

Geoffrey Hinton: He was an amazing coder, and he managed to code convolutional nets on NVIDIA GPUs much more efficiently than anybody else.

Parker Mitchell: And so at that point, you had started to see that scale matters. You mentioned 2016 earlier: why is that an important moment for you?

Geoffrey Hinton: Oh, the reason I mentioned 2016 is because I made a prediction in 2016 that was wrong in the opposite direction.

I predicted that in five years’ time we wouldn't need radiologists anymore. This upset some radiologists. And it turned out it was wrong; I was off by about a factor of two, possibly even a factor of three. I meant for reading scans, and I actually think I said at the time five years, maybe ten. But in maybe ten years from now, I'm very confident that the way you'll read almost all medical scans is an AI will read them and a doctor will check it. The AI is just going to get much better than doctors. AI can see much more in scans than doctors can.

So my wife had cancer, and she'd get CAT scans every so often, and they'd say the tumor's two centimetres. And then a month later they'd say the tumor's three centimetres. Well, this thing's shaped like an octopus. Two is not a very good measure of the size of an octopus, right? You'd like to know much more about what's going on. And with AI we can do that. With doctors, they can't do that because they don't have the, they don't know what the outcomes are. But I think with AI we're going to be able to see things about cancers that'll tell you whether they're going to metastasize soon and stuff like that. We know there's lots more information in the images that isn't being used. 

Parker Mitchell: Well, it's as you said earlier, if you've got, you know, 500 doctors that can each spend a lifetime looking at 500 images and seeing the progression of them and then compress their brains, that's vastly more information than one single doctor.

Geoffrey Hinton: Yes. So no radiologist can train on enough data to compete with these things once these things are really good at vision. 

But, for example, in tuition, we're going to get very good AI tutors. And there's a lot of research that shows, take a school kid and put them in a classroom, they'll learn at a certain rate. Give them a private tutor, they'll learn twice as fast. And so we know that AI is approaching being good enough to understand what people are misunderstanding. And as soon as you get private tuition by an entity that knows what you don't understand, it's going to be a much more efficient way of learning than just sitting in a classroom and listening to a broadcast. So I think in health care and education, there's going to be huge advantages. 

Parker Mitchell: I want to spend a moment on that education example because we've been inspired by that idea of a tutor for everyone, for people learning in traditional education, a leadership coach for everyone who is at work. And so for us, this idea of personalization matters. Do you think AI could understand you in your context, almost like a librarian for the world's information, but just for you?

Geoffrey Hinton: Absolutely. So a few weeks ago, I won a Nobel prize. And I've never had a personal assistant before. And the university gave me a personal assistant, and she now understands quite a lot about me. And it's wonderful. And everybody could have that if we can do it with AI. 

Parker Mitchell: That's fascinating. And you had to bring her up to speed, give her context. And if she had infinite access to your information, she'd be even more helpful.

Geoffrey Hinton: Yeah. Yeah. But I think that's sort of the good scenario. We all get these really intelligent personal assistants that know everything about us, and help. 

Elaina Yallen: When we think about building an AI product, something that gets tossed around a lot is human-machine or human-model empathy and helping users understand what maybe they should expect from models, so they know how to channel it properly. How do you think about that for software? 

Geoffrey Hinton: Well, there's one experiment where you have AI doctors and real doctors, and they interact with patients, and then you ask the patients, “How would you rate empathy?” The AI ones do much better. The AI ones actually listen to the patients. So, already they can exhibit empathy. It may be, we think of empathy as, you think, “How would that be for me?” And then you think, “Oh my god, that would be awful for me. I'm so sorry.” And maybe they don't do that. But they nevertheless, behaviorally, they seem to exhibit empathy pretty well. And we would like AI, if you had an AI tutor, you'd like it to have empathy about the fact that the pupil’s misunderstood something. And I'm sure they're going to be able to do that. 

Parker Mitchell: And I think you would say, correct me if I'm wrong, that if it exhibits empathy, it might be doing it in the same way that we exhibit empathy. And therefore it might be, it's not just, like, performative empathy. It's going to come across as genuine empathy. Is that right? 

Geoffrey Hinton: It might be genuine empathy. I think for us to call it genuine empathy, the AIs would have to be similar enough to us so they could imagine what it would be like for them. We tend to think of empathy as the ability to imagine what it would be like for you and then see, understand how it is for the other person. And I think if you're not doing that, you're just being very, “Oh that's terrible, I'm so sorry about that,” but you're not thinking of how it would be for you, right? That seems less genuine empathy, and AI can certainly do that. 

Parker Mitchell: I mean, I definitely agree with that, but I think part of the beauty of literature is that it puts you in other people's positions, and you can experience it through that, and you can say, “Well, I've never been in that position, but I've now lived that experience.” And if you have the world's literature compressed into that, you know, model, they might be able to understand what a range of humans, even more than I would, would be going through and exhibit empathy to that. 

Geoffrey Hinton:  They might. Yes. 

Parker Mitchell:  That's really interesting. So I want to zoom out to the societal side of things. So we've seen an enormous amount of hype, an enormous amount of coverage of LLMs in the past couple of years. One of the things you and I talked about is the analogy of sort of how difficult it is to see the future when things are growing exponentially. Can you share a little bit more about how you're experiencing that?

Geoffrey Hinton: Yeah, we're not used to exponential growth. So, a good analogy is, if you're driving at night, on a winding road that you don't know, you often drive on the taillights of the car in front of you. And as the car gets further away from you, the taillights get dimmer. And they get dimmer quadratically. So, if you triple the distance, they get dimmer by a factor of nine. That's why you try to stay close.

With fog, it's not like that at all. It's totally different. With fog, if you can see clearly at, like, 100 yards, you just assume you'll be able to see something at 200 yards. But actually, you can see clearly at 100 yards and then nothing at 200 yards, because fog is exponential. Per unit distance, it removes a certain fraction of the light. It's very different from linear or quadratic things that we're used to. People don't really understand the word “exponential” because it's misused so much. People misuse the word “exponential” to mean a lot. In fact, I think the rate at which they're misusing the word “exponential” is growing quadratically.
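The two falloff laws Hinton contrasts can be put into numbers. Taillight brightness follows an inverse-square law; fog follows exponential attenuation (the Beer-Lambert law, removing a fixed fraction of light per unit distance). A sketch, assuming for illustration that each 100 yards of fog halves what gets through:

```python
# Taillights: brightness falls with the inverse square of distance.
def taillight(d, d0=100.0):
    return (d0 / d) ** 2          # relative to brightness at 100 yd

# Fog: each unit of distance removes a fixed fraction of the light
# (Beer-Lambert law). Assumed here: each 100 yd halves it.
def fog(d):
    return 0.5 ** (d / 100.0)

# Triple the distance and taillights dim by exactly 9x, as Hinton says:
assert abs(taillight(300) * 9 - 1.0) < 1e-12

# At short range the two look similar, but at long range the
# exponential collapses far below the quadratic:
for d in (100, 200, 400, 1000):
    print(d, round(taillight(d), 4), round(fog(d), 6))
```

At 1000 yards the quadratic still passes 1% of the light, while the exponential passes about 0.1%, which is why fog produces a "wall" of visibility where taillights merely fade.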

Parker Mitchell: It reminds me of a riddle that I used to love as a child, which was, if you have a pond that starts with one lily in it, and it doubles every day until the 30th day, when the lilies cover the pond and obliterate sunlight until the pond dies, which day is the pond half filled with lilies? And the answer is the 29th day. But the intuition people have is, oh, maybe it's around the 15th. And so it's hard to sometimes understand, because we don't live in that experience, what exponential growth could be like. 

Is there anything as you think about the future of work? We talked a little bit about the workforce. A world where everyone has an assistant is obviously wonderful. A world of jobs being replaced is obviously going to cause a lot of social stress. How should people who are leading large companies think about navigating the next two to three years?

Geoffrey Hinton: There's obviously the question of joblessness. So we just don't know whether AI is going to get rid of a lot of jobs. I suspect it is. Yann thinks, Yann LeCun, my friend, thinks it isn't. And in the past, things like automatic teller machines didn't cause massive unemployment among tellers. They just ended up doing more interesting, complicated things. And taking longer about it, so you have to queue for a long time. So, maybe it'll produce joblessness, maybe it won't.

I suspect there's some kinds of jobs where you could use a lot more of that. So, if, for example, they made doctors more efficient, we could all, especially old people, use a lot more doctors’ time. If you got a doctor who was 10 times as efficient, I'd just get 10 times as much healthcare. Great.

There's other things, though, that aren't like that. And what'll happen is one person with an AI assistant will be doing the jobs that 10 people used to do, and the other 9 people will be unemployed. And the problem with that is, you've got an increase in productivity. That should help people. But you get 9 people unemployed, and one rich person who gets a bit richer. And that's very bad for society. 

Obviously, we can't see very far into the future. If you take the fog analogy, I think the wall comes down at three to five years. We're fairly confident we've got some idea what's going to happen in the next few years. In 10 years’ time, we have no idea what's going to happen. And you can see that by looking 10 years back. We had no idea this was going to happen.

I think companies should navigate it by going in the direction of everybody having an intelligent AI assistant. So people feel they're going to get improved working conditions from this smart assistant. You're going to get increases in productivity. That would be great for everybody. 

Parker Mitchell: The next five years are going to be extraordinarily eventful, for lack of a better word. And you've played an enormous role helping us get here, getting through the AI winter, getting through those moments when it might not have felt like it was quite as clear as it is now. And I just wanted to say what an honor it's been to have this conversation. And thank you. 

Geoffrey Hinton: Well, thanks very much for inviting me. It's been fun. 

Parker Mitchell: Yeah, I really enjoyed it. Thank you. 

Elaina Yallen: Thank you so much. 

Parker Mitchell: You're welcome.