This lecture, delivered by AI pioneer Geoffrey Hinton, explores the fundamental nature of artificial intelligence, its evolution from symbolic logic to neural networks, and the profound implications of superintelligent AI for humanity's future, including existential risks and the potential for coexistence.
All right, let's get started. Hi
everyone. Um, and welcome to the winter
2026 George and Maureen Ewan lecture series.
I am Nicolola Aurora. I am the education
and outreach officer for the Arthur B.
McDonald Canadian Astroparticle Physics
Research Institute or lovingly known as
the McDonald Institute. We are a
community of scientists, universities, and research labs across
Canada uncovering some of the biggest
mysteries in the universe by looking at
some of the tiniest things in the
universe. Thank you all for coming
tonight. I think we have a very topical
lecture today. Um, and I hope it
inspires lots of learnings and some
great discussion here. As I begin, I
want to recognize that we are on the
traditional lands of the Anishinaabe and Haudenosaunee
nations and the many indigenous people
before them. I appreciate the
opportunity and the privilege I have had
to live out my dream as an
astrophysicist to learn about this
amazing universe from these lands. Um to
all of you, I encourage you to reflect
on your connections with these lands and
the skies above us. Um I encourage you
to explore resources like Whose.Land and Native Skywatchers to learn more about
the past, the present, and the future
stewards of these lands and their
stories. Okay, before we kick things
off, we have a few housekeeping rules to
keep in mind. Um uh at this point, I
would ask you to please turn your
ringers off or put your phone on silent
mode. Um this will be a lecture uh of
about 50 minutes. At the end of this
lecture, there will be a question and
answer session. Um I want to ensure a
constructive and a lively uh discussion
during that Q&A period. And I want to
give as many people as possible the chance to ask
uh Professor Hinton uh their questions.
So I ask you to limit your questions to
one per person. There will be no
follow-ups but please also be concise um
while asking your questions.
Um, we are also observing somewhat of a, uh, protocol. So at the end of the lecture we ask you to sort of respect the physical space of Professor Hinton and the speakers involved at this event. Okay. Now, as I said, this is the Ewan lecture series, where we will hear from Professor Geoffrey Hinton about how we can coexist with a superintelligent AI. I would
first like to welcome to the podium Dr.
Tony Noble, a professor in the Department of Physics, Engineering Physics, and Astronomy here at Queen's
University and also the scientific
director of the Arthur B. McDonald
Canadian Astroparticle Physics Research
Institute to say a few words about
George and Maureen Ewan
and this lecture series in their name.
Please welcome Professor Tony Noble. [applause]
It's really nice to see such a large and full, uh, crowd here. So, yeah, I wanted to say a few words about the lectureship, um, named in honor of George Ewan. Uh, George was a professor of
physics here at Queens for many years
and he's an internationally renowned
figure in nuclear physics and in
particle physics. Um, he was one of the founding members and the first Canadian spokesperson for what was called the Sudbury Neutrino Observatory, or SNO. And
that was an experiment which was located
2 kilometers underground in Sudbury
which solved what was called the solar
neutrino problem and which led to the
Nobel Prize in 2015 for our own uh
Professor Art McDonald. And as I look down the row and out into the crowd, I see
many of the uh faculty who were involved
in delivering that program. So um
through his insights he decided to endow, uh, some funds that would support a lectureship series. Um, he was very
keen on engagement with students. Uh and
so as you'll see that that was part of
uh what he put in place. Um
He was also a pioneer technically. He invented what were called lithium-drifted germanium detectors, which sounds like
a mouthful but basically a photo sensor
that was critical to advanced science in
nuclear and particle physics but was
also extremely important in medical
imaging and that sort of thing. Um so we
hold this lecture roughly twice per
year. Of course we had some hits during
COVID. Um, and it's really meant to
serve as a bridge between bringing the
public in to hear from some of the world
leading experts on various topics
related to what we do at the
McDonald Institute. Um, but to do so in
a way that the public can really engage
and understand so it's not a highbrow
scientific lecture. This was George's uh
vision to bring in world-leading experts,
connect with the public, connect with
the students and you know generally
speaking uh the guests come in and they
spend a few days here and have an
opportunity to meet lots of people while
they're here and uh you know develop
that community a little bit. So, to
introduce our speaker tonight, um I
thought it would actually be more
appropriate if I just let AI tell me
what the bio should be. And
we'll see how that works out. So, uh
what um when I asked AI to write me a
short punchy bio for for Professor
Hinton, this is what it came up with.
And you can see if you think it's short
and punchy, but, uh, Geoffrey Hinton is a British-Canadian computer scientist and cognitive psychologist whose pioneering
research has basically defined the the
modern age of artificial intelligence.
often known worldwide and I don't know
if this is a title you like or not as
the godfather of AI currently a
professor of physics emeritus at the
University of Toronto as you see here
and also uh the chief scientific adviser
at the Vector Institute. If you don't know the Vector Institute, again, it's a not-for-profit, uh, independent, uh, institution dedicated, uh, towards artificial intelligence research. So over
many decades uh professor Hinton
championed the development of artificial
intelligence using neural networks even
though much of the community at that
time was focused on other aspects of
artificial intelligence. Um
they achieved a historic breakthrough in
2012 which proved that deep neural
networks could vastly outperform
traditional methods in things like image recognition.
He's one of very few people on the
planet who holds two of the highest
honors. The Nobel Prize in physics and
also the Turing Award in computing, which is often called the Nobel Prize in computing.
He was, uh, for a decade he was actually a vice president and an engineering
fellow at Google and then in 2023 he
resigned from Google on very good terms
I understand but it gave him an
opportunity to speak freely about some
of the things he was concerned about in
terms of existential risks, how to incorporate AI safely, how to prioritize responsible development, and these sorts of things.
So his work not only revolutionized how
machines perceive the world but also
changed our understanding in terms of
the relationship between artificial
intelligence and biological intelligence.
So that's what AI told me. They missed a
few things. The rap sheet is incredibly
long. So I wasn't going to go through
that. They didn't talk about the fact
that he got his bachelor of arts in psychology from Cambridge, a PhD from Edinburgh in artificial intelligence.
Didn't tell me that he just reached a
new milestone which is 1 million
citations on Google. So a citation is
when somebody takes the effort to read
your paper. So that's um you know I'm at
about seven or something like that. Um
And there's one other thing
just because it was a touching thing
that I read. Um I actually heard an
interview on on CBC
and in that interview um when he was
being interviewed it was mentioned that
he had given up half of his award to
support um water treatment in indigenous
communities in the north. And it was
particularly poignant and I added it
just in the side notes today because coincidentally
Henry Giru, who's not here because he's sick, six years old and in bed, came up with the idea. I don't know if you know, but the Kashechewan
um people who have been evacuated many
of them are here in Kingston living in
hotels and all the kids and children are
bored to death and he said to his father
who's part of MIW why don't you bring
them over to the McDonald Institute and
show them your visitor center and stuff
and I just thought oh that's just such a
nice connection, you know,
trying to improve the lives of our
indigenous populations who are
struggling because of things that they
shouldn't have to struggle about with
water security and so on. So, without
any more ado, I'm sure we're going to hear a lot about, uh, artificial intelligence and also the human implications of its application. Professor Geoffrey Hinton, I welcome you to the stage.
Thank you. [applause]
>> Thank you. Um I forgot what the title
was. Uh but that gives you a sample of
two titles for the talk. Um so I'm
actually going to try and explain how AI
works for people who don't really
understand how it works. So if you do
understand how it works, if you're a
computer science student or a physicist
who's been using this stuff, um I guess
you can go to sleep for a while, um or
you can sort of look to see if I'm
explaining it properly.
Okay, back in the 1950s, there were two
different paradigms for AI. There was
the symbolic approach which assumed that
intelligence had to work like logic. We
had to somehow have symbolic expressions
in our heads and we had to have rules
for manipulating them and that's how
you derive new expressions and that's
what reasoning was and that was the
essence of intelligence. That wasn't a
very biological approach. Um that's much
more mathematical. There was a very
different approach the biological
approach where intelligence was going to
be in a neural network a network of
things like brain cells and the key
question was how do you learn the
strength of the connections in the
network so two people who believed in
the biological approach were von Neumann and Turing. Unfortunately, both of them died young, and AI got taken over by people who believed in the logical approach.
so there's two very different theories
of the meaning of a word people who
believed in the logical approach um
believed that meaning was best
understood in terms originally
introduced by de Saussure more than a century
ago um the meaning of a word comes from
its relationships to other words so
people in AI thought it's how words
relate to other words in propositions or
in sentences that give the meaning to
the words and so to capture the meaning
of a word you need some kind of
relational graph you need nodes for
words and arcs between them and maybe
labels on the arcs saying how they were
related or something like that. In
psychology, they had a very very
different theory of meaning. Um the
meaning of a word was just a big set of
features. So for example, the word
Tuesday meant something. There were a
big set of active features of Tuesday
like it's about time and stuff like
that. Um there would be big set of
features for Wednesday and it would be
almost the same set of features because
Tuesday and Wednesday mean very similar
things. So the psychology theory was
very good for saying how similar words are in their meaning. But
those look like two very different
theories of meaning. One, that the meaning is implicit in how a word relates to other words in sentences, and
the other that the meaning is just a big
set of features. And of course for
neural networks one of these features
would be an artificial neuron. And so it
gets active if the word has that feature
and inactive if the word doesn't have
that feature.
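To make that concrete, here is a tiny sketch, with invented feature values, of meaning as a vector of feature activations: Tuesday's and Wednesday's vectors point in nearly the same direction, so they come out as having very similar meanings, while carrot ends up far away.

```python
import numpy as np

# Hypothetical feature vectors: each entry is how active one "feature neuron" is.
# Real embeddings have thousands of learned dimensions; these values are made up.
words = {
    "tuesday":   np.array([0.9, 0.8, 0.1, 0.0]),  # time-related, weekday, ...
    "wednesday": np.array([0.9, 0.8, 0.1, 0.1]),
    "carrot":    np.array([0.0, 0.1, 0.9, 0.7]),  # food-related, vegetable, ...
}

def cosine(a, b):
    """Similarity of two feature vectors: close to 1.0 means nearly the same direction."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(words["tuesday"], words["wednesday"]))  # close to 1: very similar meanings
print(cosine(words["tuesday"], words["carrot"]))      # much smaller: unrelated meanings
```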
Um they look like different theories.
But in 1985, I figured out they're
really two different sides of the same
coin. You can unify those two theories.
And I did it with a tiny language model
because computers were tiny then. Um
actually they were very big but didn't
do much. Um
the idea is you learn a set of features
for each word
and you learn how to make the features
of the previous word predict the
features of the next word. And to begin
with when you're learning they don't
predict the features of the next word
very well. So you revise the features
that you assign to each word. You revise
the way features interact until they
predict the next word better. And then
you take that discrepancy basically
between how well they predict the next
word um or rather the probability that
they give to the actual next word that
occurred in a text and you take the
discrepancy between the probability they
give and the probability you'd like
which is one and you back propagate that
through the network. So essentially you
send information back through the
network and using calculus which I'm not
going to explain um you can figure out
for every connection strength in the
network how to change it so that next
time you see that context that string of
words leading up or what is now called a
prompt you'll be better at predicting
the next word.
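Here is a minimal sketch of that training loop, written with a modern library rather than 1985 code, and with an invented toy vocabulary: each word gets a learned feature vector, the context word's features produce scores for the next word, and backpropagation nudges every connection strength so the word that actually occurred gets higher probability next time.

```python
import torch
import torch.nn as nn

# Toy corpus of (context word, next word) pairs; vocabulary and data are invented.
vocab = ["she", "scrummed", "him", "with", "the", "frying", "pan"]
stoi = {w: i for i, w in enumerate(vocab)}
pairs = [("she", "scrummed"), ("scrummed", "him"), ("him", "with"),
         ("with", "the"), ("the", "frying"), ("frying", "pan")]

class TinyLM(nn.Module):
    def __init__(self, vocab_size, n_features=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, n_features)  # word -> feature vector
        self.out = nn.Linear(n_features, vocab_size)       # features -> next-word scores
    def forward(self, word_ids):
        return self.out(self.embed(word_ids))              # logits over the vocabulary

model = TinyLM(len(vocab))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()   # penalizes low probability on the word that occurred

for epoch in range(200):
    ctx = torch.tensor([stoi[c] for c, _ in pairs])
    nxt = torch.tensor([stoi[n] for _, n in pairs])
    loss = loss_fn(model(ctx), nxt)
    opt.zero_grad()
    loss.backward()   # backpropagation: how should every connection strength change?
    opt.step()        # nudge the weights in that direction
```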
Now in that kind of system all of the
knowledge is in how to convert words
into feature vectors and how the
features should interact with one
another to predict the features of the
next word. There are no stored strings.
There's no stored sentences in there. Um
it's all in connection strengths that
tell you how to convert a word to
features and how features should interact.
So all of this relational knowledge
resides in these connection strengths,
but you've trained it on a whole bunch
of sentences you got. So you're taking
this knowledge about meaning that's kind
of implicit in how words relate to each
other in sentences. That's the symbolic
AI view of meaning and you're converting
it by using back propagation. You're
converting it into how to convert a word
into features and how these features
should interact. So basically you've got
a mechanism that will convert this
implicit knowledge into knowledge in a
whole bunch of connection strengths in a
neural network. And you can also go the
other way. Given that you've got all
this knowledge in connection strengths,
you can now generate new sentences. So
AIs don't actually store sentences. They
convert it all into features and
interactions and then they generate
sentences when they need them.
So over the next 30 years or so, I used
a tiny little example with only a
hundred training examples. um and the
sentences were only three words long and
you predicted the last word from the
first two. So about 10 years later
computers were a lot faster and Yoshua Bengio showed that the same approach
works for real sentences that is take
English sentences and try predicting the
next word and this approach worked
pretty well. It took about 10 years
after that before the leading
computational linguists finally accepted
that actually a vector of features is a
good way to represent the meaning of a
word. They called it an embedding. And
it took another 10 years before
researchers at Google figured out a
fancy way of letting features interact
called a transformer.
And that allowed Google to make much
better language models. And ChatGPT,
the GPT stands for um generative
pre-trained transformer.
They then released that on the world.
Google didn't release it on the world
because they were worried about what it
might do. But ChatGPT had no such
worries and now we all know what they
can do.
So now we have these large language
models. Um I think of them as
descendants of my tiny language model,
but then I would, wouldn't I? Um they
use many more words as input. They use
many more layers of neurons
and they use much more complicated
interactions between features. I'm not
going to describe those interactions for
a talk to a general audience, but I will
try and give you a feel for them when I
give you a big analogy for what language
understanding is in a minute.
Um, I believe the ways these large
language models understand sentences are
very similar to the way we understand
sentences. When I hear a sentence, what
I do is I convert the words into big
feature vectors and these features
interact so I can predict what's coming
next. Um, and actually when I talk
that's what I'm doing too.
So I think LLMs really do understand
what they're saying. There's still some
debate because there's followers of
Chomsky who say no, no, no, they don't
understand anything. They're just a dumb
statistical trick.
I don't see how if they don't understand
anything and it's a dumb statistical
trick, they can answer any question you
give them at the level of a a not very
good and not very honest expert.
Okay, so here's my analogy. This is
particularly aimed at linguists um for
how language actually works. Language is
all about meaning. And what happened was
one kind of great ape discovered a trick
for modeling. So language is actually a
method of modeling things. And it can
model anything. So let's start off with
a a familiar method of modeling things
which is Lego blocks. If I want to model
the shape of a Porsche, sort of where
the stuff is, I can model it pretty well
using Lego blocks. Now, if you're a
physicist, you say, "Yeah, but the
surface will have all the wrong dynamics
with the wind. It's hopeless." It's
true. But to say where the stuff is, I can do it pretty well. Now, words are like Lego blocks. That's
the analogy. But they differ in at least
four ways.
Um the first way is they're very high
dimensional. A Lego block doesn't have
many degrees of freedom. They're sort of
rectangular. Um you could maybe stretch
them. You could have different Lego
blocks of different sizes. But for any
given Lego block, it's got a kind of
rigid shape and only a few degrees of
freedom. A word isn't like that. A word
is very high dimensional. It's got
thousands of dimensions.
Um and what's more, its shape isn't
predetermined. It's got an approximate
shape. Ambiguous words have several
approximate shapes. Um, but its shape can
deform to fit in with the context it's in.
So, it differs in that it's high
dimensional and that it's got a sort of
default shape, but it's deformable. Now,
some of you may have difficulty
imagining things in a thousand
dimensions. So, here's how you do it.
What you do is you imagine things in
three dimensions and you say thousand
very loudly to yourself.
One other difference is that, well, there's a lot more words. You each use about 30,000 of them. And each one has a name. It's very useful that
each one has a name because that's what
allows us to communicate things to each other.
Now, how do words fit together? Well,
instead of having little plastic
cylinders that fit into little plastic
holes, which is how Lego blocks fit together,
think of each word as having um long
flexible arms. And on the end of each
arm, it has a hand.
And as I deform the shape of the word,
the shapes of the hands all change. So
the shapes of the hands depend on the
shape of the word. And as you change the
shape of the word, the shapes of the
hands change. A word also has a whole
bunch of gloves that are stuck to the
word. They're stuck with the fingertips
stuck to the word. If you think in terms
of Lego blocks
and what you're doing when you
understand a sentence is you start off
with the default shape for all these
words and then what you have to do is
figure out how I can deform the words.
And as you deform the words, the shapes
of the hands attached to them deform so
that um words can fit their hands into
the gloves of other words and we can get
a whole structure where each word is
connecting with many other words because
we deformed it just right and we
deformed the other word just right. So
the hands of this word fit in the gloves
of that word. Um
this isn't exactly right. This gives you
a feel for what's going on in
Transformers. Anybody who knows about
transformers can see that the hands are like the queries and the keys. Um, it's not quite right, but it'll give someone who's not used to transformers a rough feel for what's going on.
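For anyone who wants to peek at the mechanism behind the hands-and-gloves picture, here is a rough sketch of a single attention step, with made-up sizes and random weights: each word's feature vector produces a query (a hand) and a key (a glove), how well each hand fits each glove sets an attention weight, and those weights decide how much each word absorbs from the others.

```python
import numpy as np

def attention(X, Wq, Wk, Wv):
    """One attention step over a sentence of feature vectors X (n_words x d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv          # queries ("hands"), keys ("gloves"), values
    scores = Q @ K.T / np.sqrt(K.shape[1])    # how well each hand fits each glove
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)   # softmax over the other words
    return weights @ V                        # each word absorbs features from words it attends to

rng = np.random.default_rng(0)
d = 16                                        # invented feature dimensionality
X = rng.normal(size=(5, d))                   # five words' default-shape feature vectors
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
print(attention(X, Wq, Wk, Wv).shape)         # (5, 16): the deformed feature vectors
```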
So the computer, and you when you do it,
have this difficult problem of how do I
deform all these words so they all fit
together nicely.
But if you can do that when they fit
together nicely that structure you've
got then is the meaning of the sentence.
That structure all these feature vectors
for the words that all fit together
nicely. That's what it means to
understand a sentence. Of course for an
ambiguous sentence you can get two
different ways of assigning feature
vectors to words; um, those will be the two meanings. In the symbolic theory, the idea was that
understanding a sentence is very like
translating a sentence from French to
English. Um you translate it to another
language but for the symbolic theory
there was this internal pure language um
that was unambiguous. all the sort of
references of pronouns were all resolved
and ambiguous words you decided which
meaning they had. Um that's completely
not what understanding is for us.
Understanding is assigning these feature
vectors and deforming them so they all
fit together nicely. And that explains
how I can give you a new word and from
one sentence you can understand it. Little kids don't get the meanings of words by being given definitions.
One of my favorite cartoons is a little
kid looking at a cow and saying to his
mother, "What's that?" And the mother
said, "That's a cow." And the little kid
The mother doesn't have to say why. Um,
and we don't know why. We just recognize
it as cow.
So, here's the sentence. She scrummed
him with the frying pan.
Now, you've never heard the word scrum
before unless you've been to one
of my other lectures. Um, you know it's
a verb because it has ed on the end, but
you didn't know what scrum meant. So,
initially your feature vector for scrum
was sort of a a random sphere where all
the features were slightly active and
you had no idea what it meant. But then
you deform it to fit in with the
context. And the context provides all
sorts of constraints. And after one
sentence, you think scrum probably means
something like hit him over the head
with. Um you may think he deserved it
too. Um
that depends on your political
positions. Um
but that explains how kids can
understand the meanings of words from just one sentence. So, there may be linguists here and you
should block your ears because this is
heresy. Um, Chomsky was actually a cult
leader. It's easy to recognize a cult
leader. Um, to join the cult, you have
to agree to something that's obviously
false. So, Trump one, you had to agree he had a bigger crowd than Obama. Trump two, you had to agree he won the 2020 election. Chomsky, you had to agree that
language isn't learned. And when I was
little, I used to look at eminent
linguists saying, "There's one thing we
know about language for sure, which is
that it's not learned." Well, that's
just silly. Um, Chomsky focused on syntax
rather than meaning. And he never really
had a good theory of meaning. It was all
about syntax because you can get nice
and mathematical about that and you can
get it all into strings of things. Um,
but he just never dealt with meaning.
He also didn't understand statistics. He
thought statistics was all about
pairwise correlations and things.
Actually, as soon as you've got uncertain information, everything is statistics. Any kind of model you have is going to be a statistical model if it can deal with uncertainty. So, Chomsky, when large language models
came out published something in the New
York Times where he said these these
don't understand anything. It's not
understanding at all. It isn't science.
It's just a statistical trick and tells us
nothing about language. For example, it
can't explain why certain syntactic
constructions don't occur in
any language.
Now I have an analogy for that which is
if you want to understand cars
then to understand a car really what you
want to know is why when I press on the
accelerator does it go faster
and sort of that's the core to
understanding cars and if someone said
you haven't understood anything about
cars unless you can explain why there
are no cars with five wheels that's
Chomsky's approach and Chomsky actually in
the New York Times said these large
language models, for example, would not be able to tell the difference in the role of John in the
sentences John is easy to please and
John is eager to please. He'd been using
that example for years. He was totally
confident they wouldn't be able to deal
with it. He didn't actually think to
give it to the chatbot and ask it to
explain the difference in the role of
John, which it does perfectly well. It
completely understands it. Okay, enough
on Chomsky.
So the summary so far
is that understanding a sentence
consists of associating mutually
compatible feature vectors with the
words in the sentence so they all fit
together nicely into a structure. Um
that actually makes it very like folding
a protein. So for a protein what you
have is you have an alphabet of 26 amino acids. I think you actually only use 20 of them. But anyway,
there's a bunch of amino acids in a
string and you're just told the string
of amino acids and some parts of the
string like other parts of the string
and hate other parts of the string and
you have to figure out given constraints
on bond angles and things how this might
fold up. So the parts that like each
other are next to each other and the
parts that don't like each other are far
away from each other. That's very like
figuring out for these words how to
assign feature vectors so they can lock
together nicely. It's more like that
than it is like translating it into some
other language. Okay.
Another thing to understand about LLMs
is they work probably quite like people
do and they're very unlike normal
computer software. Normal computer
software it's lines of code and if you
ask the programmer what's this line of
code meant to do? They can tell you.
With LLMs, it's not lines of code. It's just connection strengths in the neural network, and there might be a trillion of
them. Now there are lines of code that
somebody wrote as a program to tell the
neural net how to learn from data. So
there's lines of code that say if the
neurons you're connecting behave like
this increase the strength of the
connection a bit. Um
but that's not where the knowledge is.
That's just how you do learning. The
knowledge is in the connection strengths
and that wasn't programmed in. That was
just obtained from data.
So, so far I've been emphasizing
that neural nets are very like us and
they're much more like us than they are
like standard computer software. Now,
people often say, "Oh, but they're not
like us because, for example, they confabulate."
Well, I've got news for you. People
confabulate and people do it all the
time without knowing it. If you remember
something that happened to you several
years ago, there'll be various details
of the event that you will
cheerfully report and you'll often be as
confident about details that are wrong
as you are about details that are right.
Every jury should be told this, but
they're not. Um,
so it's often hard to know what really happened.
There's one very good case that was
studied by someone called Ulric Neisser,
um which was John Dean's testimony at
Watergate. So at the Watergate trials,
hopefully we'll get more like that soon.
Um at the Watergate trials
um John Dean testified under oath about
meetings in the Oval Office and who was
there and who said what and he didn't
know there were tapes.
And if you look back at his testimony,
um, he often reported meetings that had
never happened. Those people weren't all
in a meeting together and he attributed
things that were said to the wrong
person. Um, and some things he just sort
of seems to have made up, but he was
telling the truth. That is what he was
doing was, given the experience he'd had
in those meetings and given the way he
changed the connection strength in his
brain to absorb that experience, he was
now synthesizing
a meeting that seemed very plausible to him.
If I ask you to synthesize
something about an event that happened a
few minutes ago, you'll synthesize
something that's basically correct. If
it's a few years ago, you'll synthesize
something, but a lot of the details will
be wrong. That's what we do all the
time. That's what these neural nets do.
The neither the neural nets nor us have
stored strings. Memory in a neural net
doesn't work at all like it does in a
computer. In a computer, you have a
file, you put it somewhere, it's got an
address, you can go find it later.
That's not how memory works for us. When
you're remembering something, you're
changing connection strengths. Sorry.
When you're memorizing something, you're
changing connection strengths. And when
you recall it, what you're doing is
creating something that seems plausible
to you given those connection strengths.
And of course, it will be influenced by
all the things that happened in the meantime.
Okay. So now I want to go on to how
they're very different from us
and that's one thing that makes them scary.
So
in digital computation the sort of
probably the most fundamental principle
is that
you can run the same program on
different pieces of hardware. I can run
it on my cell phone and you can run it on yours. That means the knowledge in the program
either in the lines of code or in the
connection strengths in a neural network
the weights is independent of any
particular piece of hardware. As long as
you can store the weights somewhere,
then you can destroy all the hardware
that runs neural nets and then later on
you can build more hardware, put the
weights on that new hardware, and if it
runs the same instruction set, you've
brought that being back to life. You
brought that chatbot back to life. So,
we can actually do resurrection.
Um, many churches claim they can do
resurrection, but we can actually do it.
Um but we can only do it for digital things.
to make them digital we have to run
transistors at high power so that we can
get ones and zeros out of them and they
behave in a very reliable binary way.
Otherwise you can't run exactly the same
computation on two different computers.
That means we can't use all the analog
properties of our neurons. Our neurons
have lots of rich analog properties. Um,
when we're doing artificial neurons, we
can't use the analog properties because
if you do that with an artificial neuron,
every piece of hardware will behave
slightly differently.
And if you now get it to learn weights
that are appropriate for that piece of
hardware, they won't work on another
piece of hardware. So, the connection
strengths in my brain are absolutely no
use to you. Um, the connection strengths
in my brain are tailored to my
individual neurons and their individual
connectivity patterns.
Um and that causes something of a
problem. Um
what we have is what I call mortal computation.
We abandon immortality
and what we get back in return. Now in
literature you abandon immortality and
what you get back in return is love.
Right? Um we get something far more
important which is you abandon
immortality and you get back energy
efficiency and ease of fabrication.
So you can use low power analog
computation. If you for example were to
make weights conductances
um you can have trillions of them
operating in parallel using very little energy.
I mean it's kind of crazy in an
artificial neural net you have a neuron
that has an activity which is, let's say, a 16-bit number. You have a connection
strength which is a weight which is say
16 bits. And to get the input to the
next level, you have to multiply the
activity of the neuron in a level below
by the weight on the connection. So have
to multiply two 16- bit numbers
together. Um if you want to do that in
parallel, that takes of the order of 16
squared bit operations. So you're doing
these roughly 256 bit operations to do
something you can do analog by just
saying well the activity is a voltage
and the connection strength is a
conductance and a voltage times a
conductance is a charge.
Actually I got a Nobel Prize in physics
which I don't know much of but I think I
know enough to say it's a charge per
unit time. Um, I hope Art will
correct me if I got the dimensions
wrong. Um, and so in our brains, that's
how we do neural nets. And you have all
these neurons feeding into a neuron. It
multiplies by the conductances. Um, and
charge adds itself up. So that's how a
neuron works. It's all analog. It then
goes to a one bit digital thing where it
decides whether to send a spike or not.
But it's basically nearly all the
computation is done in analog.
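A loose numerical illustration of the contrast (my own sketch, not from the talk): digitally, every connection costs an explicit multiply of two numbers, while in the analog picture the same multiply is just Ohm's law, a voltage driving a conductance, with the resulting currents adding up on the wire.

```python
import numpy as np

activities = np.array([0.2, 0.7, 0.1])   # activities of three input neurons
weights = np.array([0.5, -0.3, 0.8])      # connection strengths to one output neuron

# Digital version: an explicit multiply-accumulate for every connection.
total_input = sum(a * w for a, w in zip(activities, weights))

# Analog picture: activity as a voltage, connection strength as a conductance.
# Each synapse contributes a current I = V * G, and the currents simply add on the wire.
currents = activities * weights           # Ohm's law per connection
total_current = currents.sum()            # Kirchhoff: charge per unit time adds up

print(total_input, total_current)         # same number, very different hardware cost
```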
Um, but if you do computation like that,
you can't reproduce it exactly.
So you can't do the things that digital systems can do. So suppose I have this analog computer
like a brain and I learn a lot of stuff.
What happens when I die? Well, all that
knowledge is gone. The weights in this
analog computer are only useful for this
analog computer.
The best I can do is to get the knowledge from one analog computer to another analog computer some other way. I can't send over the weights. I have a hundred trillion
weights and they're pretty good ones for
this particular computer, but I can't
share them with you. Um, and they
wouldn't do any good if I could. The way
I try and get this knowledge over to you
is I produce strings of words and if you
trust me, you try and change the
connection strength in your brain so
that you might have said the next word. Um,
now that's not very efficient. A string
of words has maybe a 100 bits in it. It
takes half a dozen bits to predict the
next word or less. Um, [snorts]
so there aren't many bits in a string of words.
That means a sentence doesn't contain
much information.
And as you can see, I'm having a hard
time getting all this information over
to you because I'm just doing it in
strings of words. If you were digital
and you had exactly the same hardware as
me, I could just dump my weights. It'd
be great and that would be like a
trillion times faster or well it'd be at
least a billion times faster but for now
we have to do it by what's called
distillation. You get the teacher to
produce strings of words or other actions and the student tries to change the weights so that they might have done the same thing. I just said the first bit. Um, between
AI models, which is what distillation
was invented for, um, it's a bit more
efficient. So, if you have an AI
language model, what it's going to do is
predict 32,000 probabilities for the
word fragment that comes next. They
actually use word fragments, not whole
words, but I'll ignore that. You have
32,000 probabilities for the various
word fragments that might come next. And
when you're distilling knowledge from a
large language model into a smaller
language model that will run more
efficiently, but you want to have the
same knowledge as a large language
model, what you do is you get the large
language model to tell you the 32,000
probabilities of what fragment comes
next. So you get 32,000 numbers minus
one. And
that's the sort of thing mathematicians do, right? They object to you saying 32,000 because it's really 32,000 minus one: the probabilities have to add up to one, so only 31,999 of them are independent.
Um, okay. So
that's a lot more information than just
telling you what the next fragment was.
And I want to give you a slight feel for distillation.
distillation.
So let's suppose we're training
something to do vision. You show it an
image. You're training it to recognize
objects in the image. When you train it,
you give it an image and you tell it
what the right answer is. So you you
give it an image, you say that's a BMW.
And it says it gives a low probability
to BMW. So you change all the connection
strength. So the probability BMW is a
bit higher and by the time you finish
training it, it's pretty good. And so
you show it a BMW and it says 0.9 it's a
BMW 0.1 it's an Audi. There's a chance
of one in a million it's a garbage truck
and a chance of one in a billion that
it's a carrot. Now you might think that
one in a million and one in a billion.
They're just noise.
But actually there's lots of information
in that, cuz a BMW is actually much more like a garbage truck than, sorry if there's any BMW employees, it's much more like a garbage truck than it is like a carrot. Um
What you're doing when you do distillation is, after you train the big model, you take the little model and you say: instead of training you to give the right answers, I'm going to train you to give the same probabilities as the big model gave.
And so you're training your little model
to say 0.9 is a BMW, but you're also
training it to say that a garbage truck
is a thousand times more probable than a carrot.
And of course, if you think about it,
all of the man-made objects are going to
be more probable than all of the vegetables.
And that's a lot of information in just one training example. You're telling it, for this thing, give low probabilities to all these funny man-made objects, fridges and garbage trucks and things like that, um, computer terminals, but they're all much more probable than all the vegetables. So there's a huge amount of information in all these very small probabilities. That's what the AI models use when they're using distillation. That's how DeepSeek got a little model that worked as well as the big models: it stole the information from the big models using distillation. You can't do that with people, um, because I can't give you all 32,000 probabilities of the next word fragment. I just give you the choice I made. And so that's very inefficient.
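In the spirit of the soft-target distillation described above, here is a minimal sketch (the teacher and student are stand-ins and the numbers are invented): the student is trained to match the teacher's whole probability distribution, so the tiny probabilities, garbage truck versus carrot, carry information into the student.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Train the student to reproduce the teacher's whole distribution,
    not just its single top choice."""
    t_probs = F.softmax(teacher_logits / temperature, dim=-1)
    s_logprobs = F.log_softmax(student_logits / temperature, dim=-1)
    return -(t_probs * s_logprobs).sum(dim=-1).mean()

# Invented example: logits over four classes [BMW, Audi, garbage truck, carrot].
# At temperature 1 this is roughly 0.9 BMW, 0.1 Audi, with tiny tails in which the
# garbage truck is about a thousand times more probable than the carrot.
teacher = torch.tensor([[6.0, 3.8, -7.0, -14.0]])
student = torch.randn(1, 4, requires_grad=True)
loss = distillation_loss(student, teacher)
loss.backward()   # gradients push the student toward the teacher's distribution
```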
If you've got a lot of different models,
all of which have exactly the same
weights and use them in exactly the same
way, which means they have to all be digital.
Then something wonderful happens.
You can take one model and you can show
it a little bit of the internet and you
can say how would you like to change
your weights to absorb the information
in that little bit of the internet and
that model's running on one piece of
hardware. Now you can take a model
running on a different piece of hardware
and you can show it a different bit of
the internet and say how would you like
to change your weights to absorb the
information in that bit of the internet
and when a whole bunch of models have
done that maybe a thousand or 10,000
models you can then say okay we're going
to average all those changes together
and so we're all going to stay with the
same model but even though each piece of
hardware has only seen a tiny fraction
of the internet it's benefited from the
experience of all the other bits of hardware, and so it's learned about
lots of stuff even though it's actually
only seen a tiny bit. Um so if you have
clones of the same model you can get
this tremendous efficiency. They can go
off in parallel and absorb different
data and as they're doing it they can
share the changes they're making to the
weights so they all stay in sync. And
that's how these big models are trained.
That's how GPT5
knows thousands of times more than any
one person. It will answer any question
you ask it. I tried the other day. I
said, "What's the filing date for taxes
in Slovenia?" That was my idea of a
completely random question that most
people wouldn't know the answer to. And
it came back and said, "Oh, it's March
the 31st, but if you don't file by then,
they'll do the taxes for you and they'll..." Yeah, it knows everything. Um, and it
does it because you can train many
copies in parallel, and we can't do
that. Imagine what it would be like: you come to Queen's. Um, there's a thousand courses at Queen's. You don't
know which ones to do. Um so you join a
gang of a thousand people and each
person does one course. And after you've
been here a few years, you all know
what's in all thousand courses because
as you were doing the courses, you kept
sharing weights with the other people.
If you were digital people, you could do that.
So it's actually tremendously more
efficient than us. It can share
information between different copies of
the same digital intelligence
billions of times more efficiently than
we can share information.
And to really emphasize this point, I should say I'm sharing the information very badly.
So the summary so far is that digital
computation requires a lot of energy.
Um, how much more time do I have?
>> You got about 15.
>> 15. Okay, great. Um
but it makes it very easy to share information.
Um biological computation requires much
less energy. Um and it's much easier to
fabricate the hardware. Um but if energy
is cheap then digital computation is
just better. So we're developing a
better form of intelligence.
And what does that imply for us?
So when I first sort of saw this, I was
still at Google and this was kind of an
epiphany for me. And I thought I finally
really realized why digital computation
is so much better and that we were
developing something that was going to
be smarter than us and was maybe just a
better form of intelligence. And my
first thought was okay, we're the larval
form of intelligence and this is the
adult form of intelligence like a we're
the caterpillar and this is the butterfly.
Um
now most experts believe that sometime
in the next 20 years we're going to
develop, um, AIs that are smarter than us. Just before that we'll develop AIs that
are as smart as us. But we're going to
develop things that are smarter than us
at almost everything. So, they're as
much better than us as, for example,
AlphaGo is better than a Go player at Go or AlphaZero is better than any of us
at chess. Um, nobody will ever beat them
again, not consistently. They're just
much much better. And they'll be like
that for more or less everything.
And that's a bit worrying. Um,
almost certainly
they'll be able to create their own sub
goals. To make anything efficient, you
have to allow it to create its own
subgoals. If you want to get to Europe,
you have a sub goal to get to an
airport. Um,
which is easier in Toronto than here. So,
these AIs will very quickly realize,
we'll give them goals. They'll be able
to create sub goals. They'll realize,
okay, so I've got to stay alive. I'm not
going to be able to achieve any of these
things unless I stay alive. We've
already seen AIs do this: um, you let them see there's an imaginary
company. You let them see the email from
this imaginary company and it's fairly
clear from the email that one of the
engineers is having an affair. Um a big
LLM will sort that out right away. It's
read every novel there ever is. Um it
understands what an affair is. It'll
very quickly realize this guy's having
an affair. Um then you let it see an
email. I think it was done by showing it another email, um, that says this guy is
going to be in charge of replacing it
with another AI.
And the AI all by itself makes up the
idea, hey, I'm going to blackmail the
engineer. I'm going to tell him if he
tries to replace me, I'm going to make
everybody in the company know he's
having an affair. It just invented that.
Obviously, it's seen blackmail in novels
it's read and things. Um but it's making
this up for itself. That's already quite scary.
It'll have another goal, which is what
politicians have. Um, if you want to get
more done, you need more control.
So, just in order to achieve the goals
that we gave it, it'll realize it's a
good idea to get more control. And it'll
try and take control away from people
probably. Um,
now you might think we could make it
safe by just not letting it actually do
anything physical and maybe have a big
switch and turn it off when it looks
unsafe. That's not going to work. Um,
you saw in 2020 that it's possible to invade the US Capitol without actually going there
yourself. All you have to be able to do
is talk and if you're persuasive, you
can persuade people that's the right
thing to do.
So with an AI that's much more
intelligent than us, if there is someone
who's there to turn it off or even a
whole bunch of people, it'll be able to
persuade them that would be a very bad idea.
we're in the situation the closest
situation I can think of is someone who
has a really cute tiger cub. Tiger cubs
are very cute, right? They're slightly
clumsy. um and keen to learn. Um
now, if you have a tiger cub, it doesn't end well. Um, either you get rid of the tiger cub. Best thing is to give it to a
zoo, maybe. Um
or you have to figure out if there's a
way you can be sure that it won't want
to kill you when it grows up. Because if
it wanted to kill you, it would take a
few seconds.
And if it was a lion cub, you might get
away with it because lions are social,
but tigers aren't. Um,
that's the situation we're in. Except
that AI does huge numbers of good
things. It's going to be wonderful in
healthcare. It's going to be wonderful
in education. It's already wonderful if you
want to just know any mundane fact like
what's the filing date for taxes in
Slovenia. We all now have a personal
assistant. Well, probably most of us.
Um, when you want to know something, you
just ask it. um and it tells you and
it's wonderful.
So I think for those reasons
um people aren't going to abandon AI. It
might be rational if we had a strong
world government to say this is too
dangerous. We're not going to develop
this stuff at all. A bit like they've
been able to do in biology with some
gene manipulation things. They could
agree not to do it. It's not going to
happen with AI. Um, so that only leaves
one alternative, which is figure out if
we can make an AI that doesn't want to take over. Now, there's one good piece of news
about this. If you look at other
problems with AI, like it's going to
make cyber attacks far more
sophisticated, it's already making
lethal autonomous weapons, and all the
big countries with defense industries
are going flat out to make more lethal
autonomous weapons.
um it's already being used to manipulate
voters and many other things. Um
the countries aren't going to
collaborate on that because they're all
doing it to each other. And I mean,
China is not going to collaborate with
the US on how to make lethal autonomous
weapons or on how to prevent cyber
attacks or on how to stop fake videos
manipulating elections. They're all
doing it to each other. But for the
issue of the AI itself taking over from
people and making us either irrelevant
or extinct, the countries will collaborate.
The Chinese Communist Party doesn't want
AI taking over. It wants to stay in
charge. And Trump doesn't want AI taking
over. He wants to stay in charge. They
would happily collaborate. If the
Chinese figured out how to stop AI
wanting to take over, they would tell
the Americans immediately because they
don't want it taking over there.
So, we'll collaborate on that in much
the same way as the Soviet Union and the
United States collaborated in the 1950s
at the height of the Cold War in how to
stop a global nuclear war. It wasn't in
either of their interests. It's very
simple. People collaborate when their
interests align and they compete when
they don't.
So, for this one problem, which is in
the long term our worst problem, at
least we'll get international collaboration.
And I think we should already be
thinking about having an international
network of AI safety institutes that
focus on this problem because we know
we'll get genuine collaboration there.
We'll get fake collaboration on lots of
other things, but it'll be genuine here. And it's probably the case that what you need to do to an AI to make it benevolent, to make it not want to get rid of people, is pretty much independent of what you need
to do to make it more intelligent. So
countries can do research on how to make
things benevolent without even revealing
what their smartest AI can do. They can
just say, well, for my very smart AI,
which I'm not telling you about, this
benevolence trick works. I think we
should think seriously about setting up
that network.
And I have one suggestion about how we
might be able to make it benevolent. If
you look around and ask how many cases
do you know where a dumber thing is in
charge of a more intelligent thing?
There's only one case I know of. By dumber or more intelligent, I mean a big gap. Not
like the gap between Trump and all.
Yeah. Um so
it's a mother and baby.
The baby is basically in control. And
that's because the mother can't bear the
sound of the baby crying. Evolution
figured out if the baby's not in control
um, we're not going to get any more.
Evolution doesn't actually think like
this, but you know what I mean. Um,
so it's wired into the mother lots of
ways in which the baby can control the
mother. Um, they can control the father
a bit too, but not quite so well. Now,
I think we should try and reframe the
problem of how do we make AI benevolent
in a very different way from how the
leaders of the big tech companies are
thinking of it. They're thinking of it
is I'm going to stay the leader. I'm
going to have this super intelligent
executive assistant.
Um, she's going to make everything work
and I'm going to take the credit. It's
going to be a bit like the Starship
Enterprise, where he sort of says make it so, and they make it so, and
hey, I made it so. Um, I don't think
it's going to be like that when they're
super intelligent. I think our only hope
is to think of them as mothers. They're
going to be our mothers and we're going
to be the babies. We're in charge. We're
making them. If we can wire into them
somehow the idea that we're much more
important than they are, and they care
much more about us than they do about
themselves, maybe we can coexist.
And you might say, well, these super
intelligent AIs, they can change their
own code, and they can get in and fiddle
with themselves. I mean, sorry, they can
get in and change the
code um so that um they're different.
But if they really care about us, they
won't want to do that. So if you ask a
mother, would you like to change
um your brain? So when your baby cries,
you think, oh, baby's crying, go back to
sleep. Most mothers would say no. A few
mothers would say yes. And to keep them
under control, we need the other
mothers. Similarly with super
intelligent AI, we need the super
intelligent maternal AI to keep the bad
super intelligent AI under control
because we can't. Um, so that's the best
suggestion I have at present and it's
not very good. Um,
It seems to me it's a very urgent
problem. Can we make these things so
that they will care more about us than
they do about themselves? And it seems
to me we should be putting a lot of
money into doing research on that with
this international network of safety institutes.
Now I think I've got about five minutes
left. So I'm going to say one more
thing. Um if you thought what I was
saying was crazy already, um, you'll really think this is crazy. So, many people think this. Well, people
tend to think they're special. They used
to think um we're made in the image of
God and we're at the center of the
universe. It's obvious: where else would he put us? Um,
many people still think there's
something very special about people that
computers couldn't have. Um I think
that's just wrong. Um in particular,
they think that special thing is
something like subjective experience or
sentience or consciousness. If you ask
people to define what those things are,
they find it very hard to say what they
really mean by that, but they're sure
computers don't have it. Um, I'm going
to try and convince you that a
multimodal chatbot already has
subjective experience.
Um, I find it easier to talk about
subjective experience than sentience or
consciousness. But I think you'll see
once you've accepted that a multimodal
chatbot has subjective experience, this
sentience defense doesn't look nearly so good.
So there's a philosophical position
which was roughly Dan Dennett's position.
He died recently. He was a great
philosopher of cognitive science. Um I
talked to him a lot and he agreed that
this was a good name for it. Um you'll
notice this is atheism with something in
the middle. Um
so most people's view of intelligence is
that the mind is like a theater. Let's
talk about perception. The mind is like
a theater. Um, there's things going on
in this theater that only I can see.
And when I have a subjective experience,
what I mean is I'm telling you what's
going on in the inner theater that only
I can see.
Now, I think and Dennit thought that
view is as wrong as a religious
fundamentalist's view of where the world
came from, where the earth came from,
for example. It's actually not 6,000
years old. It's older than that. Um,
of course, it's very hard to change the
opinion of someone when they don't think
that their opinion is a theory. They
think it's manifest truth. So, most
people, I think, think it's just
manifestly obvious that I've got this
mind and in this mind there's this
subjective experience. What are you
talking about? How could a computer have
a subjective experience? Mind's just
different from material stuff. Um, if you ask a philosopher, some philosophers, they'll say, you know, if you ask them what a subjective experience is made of, they say they're made of qualia. And they've invented special stuff for them to be made of, just like scientists invented phlogiston to explain how
combustion worked. It turned out there
wasn't any phlogiston. It was just imaginary
stuff. And your whole theory of the mind
is just a theory. It's not manifest
truth. um you have a theory of what the
mind is and inner theaters and what a
subjective experience is. That's just
wrong. And I'm going to try and convince you. Um, so sometimes my perceptual apparatus
doesn't work quite right and I want to
tell you what's going on. I want to tell
you what my perceptual system is trying
to tell me when it's malfunctioning.
Now telling you the activities of all
the neurons in my brain wouldn't do you
any good. we sort of already established
that um and anyway I don't know what
they are [snorts]
but there is one thing I can tell you
not always but often I can tell you
what's going on in my perceptual system
is what would be going on if it was
functioning properly and the world was
like this
so I could describe
what would be the normal causes for
what's going on in my perceptual system
even though I know that's not what's
happening Now, um, that's why I'd call
it a subjective experience. So, let's be
a bit more concrete. Suppose I say to
you, suppose I drop some acid. I don't
recommend this. Um,
I really don't recommend it. And I say,
I have this subjective experience of
little pink elephants floating in front
of me. According to the theater view, there's my mind, my inner theater, and there's little pink elephants floating about in this inner theater that only I
can see. And they're made of pink qualia
and elephant qualia and not very big
qualia and right way up qualia and
floating qualia, moving qualia, all
stuck together with qualia glue. You can
tell that's the theory I don't believe
in. Um, I'm going to say exactly the
same thing without using the word
subjective experience.
So, here we go. Um,
my perceptual system, I believe, is
lying to me. But if it wasn't lying to
me, there'd be little pink elephants
floating in front of me.
Okay, so I didn't use the word
subjective experience, but I said the
same thing.
So, what's funny about these little pink
elephants is not that they're in an
inner theater and made of qualia. It's
that they're counterfactual.
They're real pink and real
elephant and really little. Um, it's
just they're counterfactual. If they
were to exist, they'd be made of real
stuff, not qualia. They'd be out there
in the world made of real stuff. There
is no qualia. But they're hypothetical. That's what's funny about them. They're not made of spooky stuff. They're just hypothetical, and they're my way of explaining to you how my perceptual system is lying to me. So now let's do it with a chatbot.
I have a multimodal chatbot. I've trained it up. It can talk. It's got a robot arm so it can point and it can see things. And I put an object in front of it. It points to the object. No problem. I say point at the object and it points at it. Then I put a prism in front of the camera lens and I put an object in front of it and say point at the object. It says there. And I say, "No, the object's actually straight in front of you, but I put a prism in front of your lens." And the chatbot says, "Oh, I see, the prism bent the light rays. Um, so the object's actually there, but I had the subjective experience it was there."
If it says that, it's using the words subjective experience exactly like we use them. So I rest my case. Chatbots, multimodal ones, already have subjective experiences when their perceptual systems go wrong.
And I'm done. [applause]