Bell's theorem demonstrates that quantum mechanics is fundamentally non-local, meaning that entangled particles can influence each other instantaneously regardless of distance, a concept that challenges classical notions of space, time, and causality.
Hey everyone. Today I have for you a
genuine glitch in reality that's going
to blow your mind and change the whole
way you think about everything. So it's
called Bell's theorem and this is one of
the most mysterious, unsettling,
magnificent results in all of
theoretical physics. So let's talk about it.
Bell's theorem demonstrates that quantum
mechanics is weirdly non-local.
That is, there's something going on with
quantum physics that doesn't seem to be
bothered by the limitations of space and
time. Now, of course, much has been said
about this, including in various popular
science uh articles and videos and all
that sort of thing. You often hear about
quantum entanglement, spooky action at a
distance, and all that kind of stuff.
And there's often some crossover with uh
sci-fi, about communication systems that
work faster than light and all that. And
there's also kind of this woo woo
connotation about consciousness and all
that sort of thing. And those are all
really fanciful notions, but in many
cases, what you hear about Bell's
theorem and quantum entanglement and all
that is not well grounded in the actual
physics and the math of quantum mechanics.
And so I wanted to make a video where we
actually really get into the technical
details of what exactly did Bell teach
us about the nature of reality. And so I
wanted to go through his famous
legendary 1964 paper, you know, word for
word, equation for equation. I want to
really dive into it and explore with you
exactly what is his argument and what
does it imply about the nature of reality.
I should point out in case you don't
know, I recently made a video on the
Einstein-Podolsky-Rosen paradox, which is
definitely a prequel to this video.
In fact, Bell's legendary 1964 paper is
called "On the Einstein-Podolsky-Rosen
Paradox." Okay, so this is a follow-up to
the argument that Einstein, Podolsky,
and Rosen put forward back in 1935 in
which they looked at quantum mechanics
and said, "Hey, wait a minute.
Something's wrong here. Something's
paradoxical. Either quantum mechanics is
super weird or maybe it's just incomplete."
And so almost 30 years after that, John
Stewart Bell thought about it real hard
and was like, "You know what? Sorry
Einstein and friends, actually quantum
mechanics is not incomplete, but rather
it's just really weird and genuinely
non-local, at least in some subtle
ways." So that's the context in which
Bell wrote this paper. It's a follow-up
to the argument put forward by Einstein,
Podolsky, and Rosen. So before watching
this video, I do recommend watching my
video on the EPR paradox. Or if you
haven't seen that video, but you're just
familiar with the EPR paradox, then
that's cool, too. You don't have to get
your info from me. I'm just one of many
sources on this beautiful internet.
All right, then let's get into the
paper. Well, first of all, this paper is
broken up into six parts. Part one is
the introduction.
Part two is the formulation where we
sort of define our terms and think about
what it is we're going to be thinking
about. Part three is an illustration of
some examples.
And part four has the main argument of
the paper in which we find that if you
try to explain quantum physics using a
local hidden variable theory, you run
into a contradiction. In part five, the
ideas are generalized. And in part six,
we have our conclusion. So those are the
six parts of this paper. We're going to
go through them one at a time. And in
between these, I'm also going to have
some animations and some information and
equations that provide context because
one thing you got to know about this
paper is it is so cryptic and it is so
dense with equations and very few words
that if you just try to read it, it's
really hard actually. You really got to
take your time with this one. And so
we're going to take our time and I'm
going to have related animations and
equations to help us along and to fill
in the gaps in the paper where it's
assumed that the reader is going to be
imagining a certain thing in mind when
they read it. Oh, and speaking of, I've
put a link to the PDF in the description
below the video. And I definitely
recommend printing out this paper so
that you have it for reference as we go
through it. If you don't have a printer,
that's fine, but then you should open it
up on another screen or another tab or something.
All right. So now it's time to get into
the introduction of the paper. The paper
begins. The paradox of Einstein,
Podolsky, and Rosen was advanced as an
argument that quantum mechanics could
not be a complete theory, but should be
supplemented by additional variables.
Remember at the end of the EPR paper
they talked about how quantum physics is
incomplete and it's missing something
and you have to put variables into
quantum physics in order to have it
provide a complete description of reality.
These additional variables were to
restore to the theory causality and locality
and that's often called local causality.
It's just the idea that cause and effect
should propagate such that an object is
only affected by its immediate
surroundings, as opposed to some kind of
weird teleportation or spooky action at
a distance. So Einstein and friends
argued that you have to put some kind of
additional variables into quantum
mechanics in order to resolve the EPR
paradox and give quantum mechanics local causality.
In this note that is Bell's paper, that
idea will be formulated mathematically
and shown to be incompatible with the
statistical predictions of quantum
mechanics. So that's what we're going to
do today. We're going to mathematically
explore the concept of hidden additional
variables in quantum mechanics and show
that it doesn't work and that therefore
quantum mechanics genuinely does exhibit
non-local phenomena which is crazy. Like
that goes against everything we think we
know about the nature of reality.
Anyway, it is the requirement of
locality or more precisely that the
result of a measurement on one system be
unaffected by operations on a distant
system with which it has interacted in
the past. That creates the essential
difficulty. So the hidden variable story
doesn't work if you require the theory
to be local. There have been attempts to
show that even without such a
separability or locality requirement, no
hidden variable interpretation of
quantum mechanics is possible.
These attempts have been examined
elsewhere and found wanting. That is to
say, actually, you can make a hidden
variable interpretation of quantum
mechanics work if you relax the
constraint of locality. But then it's
like what's the point, right? Moreover,
a hidden variable interpretation of
elementary quantum theory has been
explicitly constructed. Here he's
referring to Bohmian mechanics. That
particular interpretation, Bohmian mechanics,
has indeed a grossly non-local
structure. Famously, Bohmian mechanics is a
non-local theory. This non-locality
is characteristic, according to the
results to be proved here of any such
theory which reproduces exactly the
quantum mechanical predictions. That is
to say, what we're going to show in this
paper is that if you want a theory that
matches the quantum mechanical
statistics and you want it to involve
hidden variables as advocated for by
Einstein, Podolsky, and Rosen, then
necessarily you're going to end up with
a non-local theory. And of course, that
non-locality is the same kind of dilemma
that you end up having to confront if
you just take quantum mechanics at face
value in which it does appear to be a
non-local theory. So, no matter how you
look at it, there's some weird non-local
stuff going on in quantum mechanics.
All right. Now, before going further, I
want to say a few words about spin 1/2
particles because spin 1/2 particles are
the main characters of this paper. And
so, it'll be helpful to review some of
the main points regarding the experiment
and theory of spin 1/2 particles.
So on the experimental side for sure the
most important and famous spin 1/2
experiment is the Stern-Gerlach
experiment. The way this experiment
works is imagine that you have an oven
and inside the oven you put some silver
and the oven is so hot that the silver
atoms start to evaporate and fly around
with crazy high speeds and some of them
are going to fly out of a hole in the
oven. And then suppose you have some
kind of apparatus called a collimator so
that we end up with a line of silver
atoms flying in a particular direction.
And also suppose this whole experiment
happens in a vacuum so that the silver
atoms aren't bumping into air as they
fly along. Now then this beam of atoms
is directed to fly through a strong
non-uniform magnetic field. And
amazingly, what happens is that magnetic
field somehow splits the beam of atoms
into two beams. And it's like, what's
going on with that? Two beams? Why
do we have two beams? How can it be that
you have one beam of atoms coming in and
you have two beams going out? Well, the
key to understanding this is that a
silver atom is electrically neutral.
Its 47 protons perfectly cancel out
its 47 electrons, because it's just a
neutral atom. It's not ionized. But if
you look at the electrons in a silver
atom, you find that all of the electrons
are paired up in their various orbitals,
but there remains a single unpaired
electron in the 5s orbital.
And so for all of the paired electrons
in the silver atom, their spins cancel
each other out. But the unpaired 5s
electron has a spin of 1/2 because an
electron is a spin 1/2 particle. And as
a result, it's sort of like the whole
silver atom behaves like an electrically
neutral spin 1/2 particle. So that
unpaired electron spin gives the whole
atom a tiny magnetic moment. That is it
makes the silver atom sort of like a
tiny little magnet.
I should also say the nucleus of the
silver atom also has a net spin of 1/2.
But because the nucleus is so tightly
packed compared to the electrons, the
magnetic effect of the nuclear spin is
thousands of times smaller than the
magnetic effect of the electron spin. So
for all intents and purposes, it doesn't
matter in this experiment.
So then what happens to the silver atoms
as they're flying through this apparatus
is that the initial beam is totally
thermally random. I mean, you're talking
about evaporated silver atoms. There's
no preferred directionality to the spin.
It's all a random distribution over the
spin directions. But then as they fly
through the Stern-Gerlach magnet, for some
reason the spins get projected either
onto purely spin up or purely spin down.
And that's really weird because it's not
a distribution over some continuous
quantity. No, it's quantized: either
up or down. There are only two options
that it can be, which is super weird,
right? This is a very quantum effect.
And so then if we want to say okay well
these two states are going to be
separated by one quantum unit then you
realize that given the symmetry of the
situation since both beams are deflected
by equal amounts we can say that spin up
is associated with a quantity of plus
1/2 and spin down is associated with a
quantity of -1/2, so that the
difference between +1/2 and -1/2
is one quantum unit. And so that's
why we call this a spin 1/2 particle.
Okay. So we have two discrete beams. And
clearly there's something weirdly
quantum going on here. But what's really
going on here? You know, cuz the story I
just told about spin 1/2 and the
electron, it's like a little magnet and
it separates out. What does that really
mean? Like physically, how should we
imagine that? Well, in a moment I'll
tell you a little bit of the quantum
theory and then we'll also imagine some
kind of speculative hidden variable
theory and we'll see that those don't
really work. So, we'll get into the
theory in a moment, but for now I
actually want to stick on the
experimental side of things so that we
can learn a little bit more about how
spin 1/2 particles actually behave.
So, imagine we do a Stern-Gerlach
experiment where we have a beam of
silver atoms flying through. It goes
through the Stern-Gerlach magnet and it
splits into two beams, spin up and spin
down. Now suppose we put a wall so that
all the spin down atoms hit the wall and
they stop going. But then the spin up
atoms, they can fly right through and
they can keep going. And now we have a
beam of spin up atoms. So then we line
it up and pass it through another
Stern-Gerlach magnet that's oriented along
the same axis, the same direction in
space. Well, then an amazing thing
happens, which is that in the second
Stern-Gerlach magnet, we only see a spin
up beam. There's no spin down. And I
guess that's not too surprising. It kind
of makes sense because we start off with
a random beam of silver atoms. We split
that into a spin up and a spin down. And
then we re-measure and we find, okay,
there's only spin up. Yeah. Okay, that's
not too mind-blowing. That kind of makes
a lot of sense, right? And remember, all
of this is happening in a vacuum
chamber. So there's no air molecules
that the silver atoms are bumping into
cuz if there were, then we could imagine
the beam kind of rerandomizing. You
know, eventually the silver atoms are
slamming into air molecules and getting
all reoriented and all that sort of
thing. So this is all happening inside a
vacuum chamber. What this two-stage
Stern-Gerlach experiment shows is that
spin is a state that the atom is in,
right? It's a property that persists
with the atom and has some continuity
across time. So that it makes sense to
say this is a spin up atom at least for
now. You know, I mean, it can bump into
something and change its spin. But
supposing it doesn't, then it can
continue on in that spin-up state for
some amount of time. So that's cool.
That gives us some sense of the
physicality of spin. But we're still
left with the mysterious question of why
do we have two discrete options for a
spin measurement anyway as opposed to
some continuous range of outcomes? And
how should we visualize a spin state?
Well, again, we'll talk about the theory
of that in just a moment, but there's
one more experimental thing I want to
show you before we get there. What we're
going to do now is imagine slightly
rotating the second magnet by some small
angle theta. And then a magical thing
happens. The second beam now mostly
comes out as spin up. But now there's
also a spin down beam as well. And it's
very subtle because all the spin up
atoms that are flying through the second
detector, most of them are going to come
out spin up. But every now and then
there is a chance that it'll come out
spin down. And so if you think about
many atoms flying through and so it's
sort of like a continuous beam
situation, then imagine a very bright
spin up beam and a dull but nonzero spin
down beam. And so then the question
becomes what is the probability of it
being spin up versus spin down in this
kind of an experiment? And there's
actually a very good agreement between
quantum mechanics and experimental
results which show that for the atoms
passing through the second magnet they
have a cos²(θ/2) probability of being
spin up and likewise a sin²(θ/2)
probability of being spin down. Remember
that cos² + sin² = 1, so those
probabilities add up to 100%. And
we're going to take that as sort of a
ground truth for this video.
This cos²(θ/2) and sin²(θ/2). We're
going to take that as an absolute fact
about reality because it has been
measured in many experiments and it is a
pretty direct result of quantum theory.
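As a quick sanity check on this ground truth, here's a minimal Python sketch of the cos²(θ/2) and sin²(θ/2) rule. The function names are my own, just for illustration:

```python
import math

def spin_up_probability(theta):
    """Quantum prediction: probability that a spin-up particle is
    measured spin up again along an axis tilted by angle theta."""
    return math.cos(theta / 2) ** 2

def spin_down_probability(theta):
    """Complementary probability of it coming out spin down."""
    return math.sin(theta / 2) ** 2

for deg in (0, 45, 90, 180):
    theta = math.radians(deg)
    up, down = spin_up_probability(theta), spin_down_probability(theta)
    print(f"theta = {deg:3d} deg: P(up) = {up:.3f}, P(down) = {down:.3f}")
    # the two probabilities always sum to 1, since cos^2 + sin^2 = 1
    assert abs(up + down - 1.0) < 1e-12
```

Notice it reproduces the extremes discussed above: certainty at 0°, a 50/50 split at 90°, and a guaranteed flip at 180°.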
Oh, and one thing I should say in this
diagram, you see that second beam is
still horizontal even though I tilted
the picture of the detector. In reality,
if you're doing an experiment like this,
you would want to realign the second
beam so that it comes in parallel to the
detector. But there are ways of doing
that without modifying the spin state of
the particle. So I just didn't show that
in this diagram because I wanted to keep
things simple. Actually, let me show you
this. This is a cool much better
diagram. So this comes from Wikipedia.
Shout outs to Clara Kate Jones for
making this beautiful diagram. What this
diagram shows is a two-stage Stern-Gerlach
experiment. The particle beam
comes in. You get a 50-50 split between
spin up and spin down denoted as Z plus
and Z minus. You know, because we're
measuring along the Z-axis.
Then we send that second beam through
the second detector. The second detector
appears to be tilted, but is actually
just in alignment with the way the Z
plus beam comes out of the first detector.
But now I want to look at something
really cool, which is what if the second
detector measures along a whole
different axis. So, for example, if the
second detector measures along the
x-axis, the spin up particle beam goes
through the second detector and then
splits into a 50/50 probability mix of
being spin left or spin right. By the
way, instead of spin left and spin
right, let's use the language spin up
along x and spin down along x. So you
see when we say spin up and spin down,
it's always with reference to a
measurement axis and spin up is going to
be the beam which goes up relative to
that axis. Okay? So we can always use
the words spin up and spin down. But in
this experiment, you can also think
about it as spin left and spin right
when we're measuring along the x-axis.
I suppose this experiment is not too
surprising either because we see that
the particles come in spin up. We
wouldn't really expect any kind of
probabilistic biases as far as spin
left, spin right because all we know is
that the particles are all spin up and
up is perpendicular to left and right.
So it'd be kind of weird if the second
particle beam had some kind of bias
towards left and right, right? Like
where would that come from? We should
still expect some kind of randomness
along the x direction. Okay, so that
doesn't really blow your mind, but this
next one might. See, imagine we have a three-stage
experiment where the particle beam comes
in, the first detector splits into spin
up and spin down. We send only the spin
up through. Then the second detector
measures along X. So we get our spin
left and our spin right, or in other
words, we can talk about it in
terms of spin up and spin down along X.
And then suppose we only allow the spin
up along X beam to go through. Then we
measure again along the Z-axis. And the
craziest thing happens. Look what we
get. We get a 50/50 particle beam of
spin up or spin down along Z. Well, how
can that be? Because the first magnet
already filtered out all of the spin
down along Z. So, shouldn't we expect
that the outgoing beam should have
only spin-ups, right? Isn't that what
we should expect, only spin up along Z,
because the first magnet already
filtered out the spin down? But no, in
reality, in experiments, you get a 50/50
split of spin up and spin down along Z.
So what is going on
there? That's very strange. And the
reason this is so strange is that we
know that spin is a property of the
atom. We know that it's a physical thing
that the atom carries with it as it
moves along. Right? I mean, we
thought about this earlier and we
realized, yeah, the Stern-Gerlach
experiment shows us that spin is a state
that the atom can be in, and it's a property
of the atom at some moment in time. And
so, how can it be that if we've filtered
out the spin down along Z atoms, somehow
after the third detector, we get spin
down along Z? Like, what's happening
there? How can spin be a conserved
quantity if it comes back like that?
Like, what's going on? Now, what I'm
showing here, this is just an
experimental fact. This is the reality.
And then as people, it's on us to figure
out how do we tell a story that makes
sense of this reality. And so in just a
moment, I'm going to tell you the
quantum story, which is going to explain
what's happening here. And the long
story short of that is when you measure
the spin along some axis, the particle
forgets its spin information along the
other axis because you're resetting the
spin state of the particle. You're
projecting it into a spin eigenstate of
whatever axis you most recently measured
it on. And so once you measure it spin
up or spin down along X, now all of a
sudden, if it's in a spin-up-along-X
eigenstate, that has equal 50/50 odds of
being measured spin up or spin down
along Z. But then of course when you
learn quantum physics you're always
thinking about this is so weird and so
strange and I don't like it and surely
there's some kind of more classical
explanation with some kind of hidden
variable. Surely there's some kind of
secret behavior happening inside the
atom or to do with these detectors.
Maybe the detectors are modifying the
atom in such a way as to flip them up
and flip them down and kind of reset
their state. All right. So when you
learn quantum physics, you yearn for a
more sane explanation.
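Before we try those more classical explanations, the quantum projection story just described can be sketched numerically. This is a minimal Monte Carlo sketch under the textbook projection rule; `measure` is a hypothetical helper name of my own, not anything from the paper:

```python
import math
import random

def measure(state_angle, axis_angle, rng):
    """Project a spin state (pointing along state_angle after its last
    measurement) onto a new axis. Returns (+1 or -1, new state angle).
    Measuring resets the state: the particle 'forgets' its old axis."""
    theta = axis_angle - state_angle
    if rng.random() < math.cos(theta / 2) ** 2:
        return +1, axis_angle              # projected into the spin-up eigenstate
    return -1, axis_angle + math.pi        # projected into the spin-down eigenstate

rng = random.Random(0)
n = z_up_count = 0
for _ in range(100_000):
    state = 0.0                            # spin up along Z (survived the first filter)
    r, state = measure(state, math.pi / 2, rng)  # second stage: measure along X
    if r != +1:
        continue                           # keep only spin-up-along-X atoms
    r, state = measure(state, 0.0, rng)    # third stage: measure along Z again
    n += 1
    z_up_count += (r == +1)
print(f"P(up along Z after the X filter) ~ {z_up_count / n:.3f}")  # ~0.5
```

Because the X measurement resets the state, the final Z measurement comes out roughly 50/50, which is exactly the surprising three-stage result described above.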
And especially, you know what would be
really nice is if we didn't have all
these weird quantum probabilities,
right? So wouldn't it be cool if we can
come up with some kind of explanation
for what's going on in the Stern-Gerlach
experiment, but rather than this
confusing quantum story with wave
functions and states, what if we can
come up with some kind of more classical
deterministic model of what's going on
here? Even though such models don't
work, it's still very helpful to give it
a try, see what we can come up with, and
then when we figure out the way in which
the model doesn't work, that'll help us
appreciate why we need quantum
mechanics, even though it's super weird.
And seeing the failure of these local
hidden variable models is going to segue
very nicely into the core argument of
Bell's paper. All right. So, I want to
return to this picture of the two-stage
Stern-Gerlach experiment where we use
the first magnet just to filter out the
spin down atoms and give us a beam of
nice pure spin up atoms. Then, we're
going to send those through a second
detector tilted relative to the first by
an angle of theta. And as we talked
about earlier, the probability of the
atom being spin up in the second
detector is going to be cos²(θ/2).
In this plot, we put the
theta angle along the x-axis and we put
the percentage probability that it'll be
spin up on the y-axis.
So on the far left of this plot, you can
see that we have a 100% chance of
measuring spin up when the second
detector is tilted 0° relative to the
first. That is when they're in
alignment. A spin up coming in is always
a spin up going out. On the opposite
extreme, if you imagine we put the
second detector all the way upside down,
180 degrees tilted, then relative to
that orientation, the detector is going
to say, "Hey, every particle spin down."
And now that's not too surprising
because all that is is we're flipping
the second detector around. So what was
defined as spin up is now relative to
the second detector spin down. And so
really, we don't have to think about an
angle of all the way up to 180° because
the interesting stuff happens with a
tilt angle between 0 and 90°. And beyond
that point, there's a kind of symmetry
where it's the same thing, but it's just
everything's flipped relative to before.
And speaking of 90°, if we tilted the
second detector 90°, then we'd have a
50/50 chance of an incoming spin up atom
going out as either spin up or spin down.
Here's an animation, and this will give
us a more dynamic picture of what's
going on here. So, we have our incoming
beam of silver atoms coming in from the
left. They go through the first
detector. We split out, spin up, spin
down. The spin- ups keep going. And on
the right, what I'm showing here, and
this is just a rectangle, so it's kind
of abstract, but all I mean to indicate
there is we're doing a spin measurement
along the axis symbolized by the
orientation of that rectangle.
And as the rectangle goes back and
forth, you can kind of get a feel for
how the relative probability of
measuring spin up and spin down along
that second measurement axis changes as
a function of the angle.
On one extreme, when the detectors are
aligned, spin up is always spin up. On
the other hand, when the detector is
90°, we get a 50/50 split. And in
between, we get a probability which goes
with this cos²(θ/2) curve.
Now, this equation, the cos²(θ/2),
comes from the spinor math of
what happens when you project a spin
state relative to one axis onto another
axis. But all of that spinor math and
projection, that's the weird
quantum stuff we don't want to have to
deal with if we don't have to. So when
we're trying to come up with a hidden
variable explanation, we want to think
in terms of some kind of quantity that
we can attach to each particle, maybe
some kind of arrow that indicates some
sort of direction. And you know, one of
the first things that comes to mind when
you think about the Stern-Gerlach
experiment is maybe each incoming atom
has some kind of vector-like directional
quantity associated with it and then
maybe the detector sort of flips that
vector up or down as the particle passes through.
Now, I'm not saying that's the case. I'm
just saying that's kind of something
that we might instinctively or
intuitively think might be the case. And
so let's go ahead and test our intuition
against logic and reason and see if it
actually holds up. So what I'm showing
here is an animation where we have these
atoms coming in and there's a yellow
vector associated with each one of them
which encodes some sort of orientational
direction like thing that goes with the
atom. And so for the sake of argument,
we can say our incoming beam should have
a random distribution over those vector
angles because these are evaporated
silver atoms and it's all thermally
random. Then suppose we claim that what
a Stern-Gerlach magnet does is it's going
to flip that arrow either up or down.
And then if it flips it up, it sends it
upwards. If it flips it down, it sends
it downwards.
Well, at first glance, an explanation
like this seems like it could possibly
be kind of what's going on here. This is
a model where the Stern-Gerlach magnet plays
a really active role in aligning the
particle a certain way. And whether or
not it flips up or flips down, we can
say the rule there is just if the vector
is pointing even a little bit up, it
goes up. If it's pointing even a little
bit down, it goes down. If it's pointing
perfectly horizontal, well, in reality,
nothing's perfectly horizontal. There's
probability zero of that happening. And
even if it did happen, it happens so
rarely you'd never even notice.
You know, the cool thing about physics
is that you can put an idea forward and
you can really propose it like, hey,
maybe this is how it is. But one of the
rules of physics is you have to stick to
whatever principles you propose. But
then if you can show that your own
principle leads to a contradiction, well
then sorry, but you have to redesign
your model. Okay. So what I want to show
now is that this assumption that the
Stern-Gerlach magnet flips up or flips
down the atom is actually not consistent
with the experimental data. And the
reason is actually very simple and you
can totally see it, which is this: if you
have a two-stage Stern-Gerlach
experiment where the second detector is
tilted, we know from the experimental
data that some of the particles should
sometimes come out spin down even if
they went in as spin up.
But if we tilt the detector anywhere
between 0° and all the way up to 89.9°,
then by this rule that the Stern-Gerlach
magnet is going to flip the particle in
whichever way it was already kind of
pointing, we're led to see
that an incoming beam of spin up is
always going to come out spin up.
And so right there you see that this
model doesn't actually work by our own
principle that we put forward about
these arrows getting flipped up or
flipped down and all that, it doesn't
work. It just doesn't match the
two-stage Stern-Gerlach experiment.
And so whatever is going on with spin,
it's not that. It's something else.
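That failure is easy to make concrete. Here's a sketch of the flip model in Python (2D, with angles measured from "up"; the function name is my own invention):

```python
import math

def flip_model_outcome(vector_angle, detector_angle):
    """Hidden-variable 'flip' model: the magnet flips the vector fully up
    or fully down along its own axis, depending on which hemisphere the
    vector already points into. Returns +1 for up, -1 for down."""
    # signed angle between the vector and the detector axis, wrapped to (-pi, pi]
    delta = (vector_angle - detector_angle + math.pi) % (2 * math.pi) - math.pi
    return +1 if abs(delta) < math.pi / 2 else -1

# After the first magnet, every surviving atom's vector points exactly up (angle 0).
for deg in (10, 45, 80):
    outcome = flip_model_outcome(0.0, math.radians(deg))
    print(f"tilt {deg:2d} deg -> spin {'up' if outcome == +1 else 'down'}")
# The model predicts spin up for every tilt below 90 degrees, but the
# experiment sees spin down with probability sin^2(theta/2) > 0.
```

No randomness ever enters: a spin-up beam stays 100% spin up at any tilt under 90°, which directly contradicts the two-stage data.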
So what do we do? Well, just because our
model didn't work doesn't mean we can't
massage it into something that might work.
So let's go ahead and see if we can
massage our model into something which
matches the experimental data at least
better than our first attempt which kind
of matched the data in the case of one
Stern-Gerlach magnet but failed miserably
when we had two and the second one was
tilted. Well, okay. So what if we did
this? Let's say that a Stern-Gerlach
magnet doesn't actually flip the
particle up or down, right? Because if
it does that, then as we've seen, the
second detector is just going to give us
a bunch of spin ups and no spin downs.
So let's say instead of flipping the
arrow up or down, the Stern-Gerlach
magnet just kind of passively sorts
these particles based on whether their
vector points a little bit up or a
little bit down.
And so any vector that points even a
little bit up, that gets sent towards
the up beam. And any vector that points
a little bit down, that atom goes in the
down beam. But the Stern-Gerlach magnet
doesn't change the direction of that vector.
So maybe this vector represents a kind
of classical spin axis. Then in this
model, the angular momentum of the
particle would be conserved as it passes
through the detector. But somehow and
for some reason, the detector is just
sorting the incoming particles into two
beams depending on whether they're a
little bit up or a little bit down.
Well, you know, there's a problem with
this model, which is that
philosophically, it's starting to feel a
bit contrived because it's hard to
reconcile the fact that we see two
discrete beams with such a passive thing
going on at the detector.
Because at least before when we thought
that maybe the magnet just flips the
thing up or flips the thing down, there
you have kind of a naturally, physically
dichotomous situation where, yeah, it's a
sorter, but then it's also an action
where the particles are really separated
out in a binary way.
So if you have a more passive situation
where it's just a sorter, you kind of
have to wonder, well then how is it that
we get two sharp beams? But never mind
all that because even though it seems
implausible, that's different than it
being illogical or impossible or
incoherent. You know, nature is weird.
So maybe this is how it is. But now if
we take this model and pass it through a
second Stern-Gerlach magnet, the question
comes up of does this model match the
data? In particular, do we find a
cos²(θ/2) probability of an incoming
spin up remaining spin up versus a
sin²(θ/2) probability of it going
spin down? Well, if you just look at the
animation shown here, you can see that
at first glance it kind of does seem to
work because when the second detector is
not tilted at all, anything coming in
spin up is going to go out spin up. So
that's good. At θ = 0, this
model matches experiment.
And then if you imagine at 90°, well,
there it's a 50/50 because coming in the
spin up beam, that's just going to be a
vector that's pointing up a little bit,
but the distribution is totally random
as far as left and right. And so when
the detector is tilted at 90°, that
could go either way at that point, you
know. And so there again, we find
another angle at which our model matches
the data. And another wonderful thing
about this model is that for
intermediate angles, it kind of seems
like it would fit the data. You know, if
you tilt the detector like 45°, you can
see there's kind of a chance that it
would be spin down versus spin up. And
so at first, this feels very exciting
and very promising.
But when you think through it carefully, you realize that this model actually doesn't quite match the cos²(θ/2) statistics that we get from the experiment and from quantum physics, because instead of a cosine-squared function, it's actually just a linear function in theta. And that's
actually a very important point. So I
want to linger on that for a moment and
I want to see exactly why this model
gives us a probability which is linear
in theta. So you think about the fact
that we have evaporated silver atoms
coming in and presumably they're all
going to be randomly oriented. And so if
we want to come up with a picture that
involves this hidden variable of an
orientational vector-like degree of
freedom, call it lambda, then the
situation we're describing here begins
with lambda vectors chosen totally at
random as far as their direction is
concerned. And if you like, you can imagine lambda as being selected uniformly from the unit circle, or, if you want to be fully three-dimensional, from the unit sphere. Although, as we're
about to see, it actually really doesn't
matter whether we think about it in
terms of a two-dimensional situation or
a three-dimensional situation. In either
case, we find the same linear trend. All
right, then. So, the particle passes through the first Stern-Gerlach magnet, and all of these vectors lambda that were pointing a little bit downwards get filtered out. They go into the spin-down beam, and we block that. But if the vector is pointing even a little bit up, then it keeps passing through, and it moves on to the next Stern-Gerlach detector.
So let's go ahead and use the vector P
to symbolize the polarization vector
that is the axis of measurement for the
first Stern-Gerlach magnet. You see here, based on the diagram, that all of the particles that have made it through our filter are going to be measured spin up if they're measured again perfectly along the direction P, with no tilt angle.
And so that's what it means experimentally to prepare some spin-1/2 particles, some fermions, with the spin polarization along the vector P. It means that for sure we know, if we measure the spin along P, we're going to get spin up.
Now then what can we say about that
hidden variable vector lambda? Well, we
can say that the particles that are
allowed through necessarily have lambda
which is somewhere in the northern hemisphere, that is, the hemisphere that points in the same kind of direction as the polarization vector P. Or in other words, these are the lambda such that lambda · P is greater than zero. And the lambda are still going to be uniformly distributed around that hemisphere, because they came in uniformly distributed around the sphere and we've just cut it in half. So now we want to
ask the question of what is the
probability of a particle with some
lambda vector being measured spin up in
the second detector which would happen
in our local hidden variable model if lambda · A is greater than zero, that is, if the lambda vector happens to be pointing in the same hemisphere as the measurement axis A. And when you think
about it, you realize that the
probability of lambda measuring spin up
depends on the overlap of the lambda
hemisphere and the a hemisphere.
See, because if we draw A, and then we think about the hemisphere of vectors that point in kind of the same direction as A, that is, those whose dot product with A is positive, you realize that the set of all lambdas which are going to be measured spin up is precisely the overlap between the lambda hemisphere and the A hemisphere. And given that
lambda is going to have a uniform
probability distribution, we can see
then that the probability of measuring
spin up is just going to be the fraction
of the lambda hemisphere that overlaps
with A. And the probability of it
measuring spin down is going to be the
fraction of lambda's hemisphere that
does not overlap with A. And if you see
that, then you see one of the core
concepts of Bell's paper. We're going to
describe this slightly differently in a
moment when we get into the paper and
it's going to be a little bit more
complicated, but this right here is a
very fundamental insight. Imagining
rotating hemispheres and seeing how the
overlap varies linearly. That is a
mental image that you want to keep in
mind as we get into parts three and four
of the paper. All right, then. So, just
to be really formal about this, let's go
ahead and say that theta is the tilt
angle between our polarization vector P
and our measurement axis vector A. And
then I want you to go ahead and imagine
rotating theta from 0 to pi or 180 if
you want to talk in terms of degrees.
Well, when you start off with theta
equals 0, p and a are aligned the same
way. And there's a complete overlap
between the lambda hemisphere and the a
hemisphere. And so you have a 100%
chance, guaranteed chance that when
theta is zero, you're going to measure
the particle spin up. But now imagine
theta growing and growing until theta equals 90°, or π/2 radians. Well,
at that point you're going to have a
50/50 overlap between the lambda
hemisphere and the a hemisphere. And so
then you're going to have a 50/50 chance
of measuring spin up versus spin down.
And then if you go ahead and flip it all
the way around 180° A and P are
perfectly antiparallel, then it'll be
guaranteed that you'll measure spin down
for a theta of 180°. Bearing in mind
that spin down is relative to that
upside down vector a. Now these three
points for which theta is 0, theta is 90
and theta is 180° all of those actually
do match the experimental data and
quantum mechanics. So that's all good.
But what's not all good is that linear
dependence on the probability of
measuring spin up as a function of the
angle theta. And you can see that linear
dependence just based on the way the
area fraction changes as you slide theta
around and you change the overlap
between these two hemispheres.
You know, one way to think about the
probability logic here is just imagine
you're playing one of those board games
that has the spinner thing and you spin
the thing and then the probability that
it lands on some wedge is just going to
be the wedge area. Well, yeah. So when
you think about that kind of logic and
then you think about the wedge area of
the overlap between the hemispheres and
the way it changes you can see that the
probability is indeed linear in theta.
But now that linearity is actually a
real problem because from experiments
and from quantum mechanics we can very
confidently say that the probability of
measuring the particle spin up is not
linear in the tilt angle theta, but rather it's cos²(θ/2). And that cosine-squared, curvy fact makes our linear model very hard to believe, because the math is wrong: the statistical predictions of our model are not the true statistics of the situation.
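To see that linear trend concretely, here's a small Monte Carlo sketch of the hidden-variable model described above (my own illustration, not code from Bell's paper): lambda vectors are drawn uniformly on the unit sphere, filtered by lambda · P > 0 at the first magnet, and counted as spin up at the second magnet whenever lambda · a > 0.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample hidden-variable vectors lambda uniformly on the unit sphere
# (normalized Gaussian triples are uniform in direction).
lam = rng.normal(size=(500_000, 3))
lam /= np.linalg.norm(lam, axis=1, keepdims=True)

p = np.array([0.0, 0.0, 1.0])   # polarization axis of the first magnet
lam = lam[lam @ p > 0]          # keep only the spin-up (northern) hemisphere

def prob_up(theta):
    """Fraction measured spin up along an axis a tilted by theta from p."""
    a = np.array([np.sin(theta), 0.0, np.cos(theta)])
    return np.mean(lam @ a > 0)

for theta in [0.0, np.pi/4, np.pi/2, 3*np.pi/4, np.pi]:
    linear = 1 - theta/np.pi    # hemisphere-overlap prediction: linear in theta
    qm = np.cos(theta/2)**2     # quantum / experimental statistics
    print(f"theta={theta:.2f}  model={prob_up(theta):.3f}  "
          f"linear={linear:.3f}  cos^2(theta/2)={qm:.3f}")
```

The model matches at 0, 90°, and 180°, but at θ = π/4 it gives about 0.75 while quantum mechanics predicts cos²(π/8) ≈ 0.854, which is exactly the discrepancy being discussed.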
So what do we do? Do we just give up? Well, we actually should give up, because, as we'll see, the whole paper is about how local hidden variable models don't work. But let's not give up yet. Let's be very stubborn, okay?
Because technically there is a way that
we can fix this particular model for
this particular situation.
And the way in which we do that is going
to involve a concept which we'll see
later on in the paper. So we're going to
try to save this model somehow. And the
way that we're going to try to do that
is going to be illustrative and teach us
something about the situation. Even
though ultimately this fix is going to
break down when we later on start
looking at quantum entanglement.
All right, then. So the way to fix the model is to define an effective measurement axis, call it A′, and define that as the measurement axis A tilted towards the polarization vector P such that the equation 1 − 2θ′/π = cos θ is satisfied. Now, here by θ′ I mean the tilt angle between the polarization vector P and the effective measurement axis A′, which has been magically tilted in towards the polarization vector P. And
when you look at this equation here, the 1 − 2θ′/π, that is a linear expression; and then you look on the right-hand side, and that's a cosine. Now, with this equation, it's not immediately obvious what it has to do with cos²(θ/2). In just a minute,
though, we're going to talk about
expectation values and cosine of theta.
And then when we come back to this
equation later on in the paper, it'll
make more sense why exactly it has the
form that it does. But I don't want to
get into that just now because it's a
bit of a tangent. For now, all I want to
say is that this equation involving θ′ and θ is going to warp the linear-in-theta probability dependence of our model into the cos²(θ/2) curve that we expect from quantum mechanics. And in fact, that is the definition of where this θ′-and-θ equation comes from. So this trick
is actually a lot simpler than it seems
because when you think about what we
have here, as we've seen, our model
works when theta is 0, when theta is
90°, when theta is 180, but it breaks
down in between because we have a line
instead of a cosine squared curve. And so all this trick is, is just saying that we can
go ahead and warp that line into that
cosine squared curve simply by saying
that the effective measurement axis that
the particle is actually being measured
along is not the A that we thought it
was but is actually this A tilted
slightly towards the polarization vector
P. And by doing that we can go ahead and
bend the statistical predictions of our
model in such a way as to make it match
the experimental data and also quantum mechanics.
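Here's a quick numerical check of that warp (a sketch of the trick just described, with θ′ solved from the defining equation 1 − 2θ′/π = cos θ): feeding the effective angle into the linear overlap probability reproduces cos²(θ/2) exactly.

```python
import numpy as np

def theta_prime(theta):
    """Effective tilt angle defined by 1 - 2*theta_prime/pi = cos(theta)."""
    return (np.pi / 2) * (1 - np.cos(theta))

def linear_prob_up(t):
    """Spin-up probability of the hemisphere-overlap model: linear in the tilt."""
    return 1 - t / np.pi

# Feeding the warped angle into the linear model reproduces cos^2(theta/2)
for theta in np.linspace(0, np.pi, 5):
    warped = linear_prob_up(theta_prime(theta))
    qm = np.cos(theta / 2) ** 2
    print(f"theta={theta:.2f}  warped model={warped:.4f}  cos^2(theta/2)={qm:.4f}")
```

Algebraically, 1 − θ′/π = (1 + cos θ)/2 = cos²(θ/2), which is why this definition of θ′ does the job.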
Now, the first time you hear this, I
mean, you should be thinking, "Rich,
come on now. What? This is absurd. We
should not tolerate this. We should not
go along with this." Your eyebrows should rise skeptically to the point where your forehead starts to get sore. Like,
there's just no credible way to justify
this move, this little trick that we're
doing. And so, for that reason, I want
to go ahead and call this the sketchy
move. I know it's kind of a playful
terminology, but there's a couple of
good reasons why we want to call it
this. First of all, it's a concept that
we're going to see a couple more times
throughout the paper. And then secondly,
I want to emphasize that this move is
not illegal. It's not logically
impossible. Technically, it doesn't
violate locality. There's nothing uh
physically impossible going on when we
put forward this model. But it's
extremely sketchy and hard to believe
because it raises so many questions. Why
should the effective measurement axis be
a prime? And also, how is it then that
we have the polarization vector and also
our hidden variable lambda vector that
we both have to take into account?
Because the polarization vector bends
the effective measurement axis. Then we
also have this lambda vector and what's
going on there? And our whole model
starts to become complicated and
contrived and very very hard to believe.
But we're not going to dismiss it just yet, because later, when we think about
quantum entanglement, we're going to
prove that even the sketchy move is no
longer enough to save our model or any
local hidden variable model. And that's
really at the heart of Bell's theorem.
So in summary, by going along with the
sketchy move for now, we're being
maximally open-minded, we're giving the
local hidden variable perspective every
benefit of the doubt. So that later on
when we absolutely destroy local hidden
variables, when we crush this idea,
we'll say, "Look, we even allowed the
sketchy move and that still wasn't
enough to make it work."
Now, I want to take just a moment to
talk about the kind of mathematical
vocabulary we use in quantum physics
when we're describing measuring the spin
of a spin 1/2 particle along some
direction, call it a. And to do that, you often see the expression σ·a. Let me tell you what that is. So we have the famous Pauli matrices: σx = [[0, 1], [1, 0]], σy = [[0, −i], [i, 0]], and σz = [[1, 0], [0, −1]].
And you can find the definition of these Pauli matrices in Griffiths' Introduction to Elementary Particles, equation 4.26. Although, honestly, if you just Google "Pauli matrices," you'll find them all over the place. They're super famous. And these Pauli matrices are generators of su(2), the Lie algebra of SU(2), which is the group that has to do with transformations of two-component spinors. It's the special unitary group of degree 2. Anyway, today we don't need
to get into the group theory of SU(2), but I just bring up the Pauli matrices in a sort of vocabulary-like context. We're not actually going to have to explore their mathematical properties; I just want to show you why it is that these matrices are associated with measuring the spin of a spin-1/2 particle.
You often see sigma with an arrow over it, and you can think of that as a vector whose components are the three Pauli matrices. So you have σx, σy, σz all packaged into this vector-like quantity. And with that sigma vector, we can go ahead and define the spin operator along the unit vector a as Ŝ = (ħ/2) σ·a.
And what we mean by σ·a is that we multiply each component of our measurement direction a with the corresponding Pauli matrix: σ·a = a_x σx + a_y σy + a_z σz. So when you pick out a particular direction in three-dimensional space and you want to measure the spin of a particle along that direction, the components of that direction unit vector are like weights for how much of each of the Pauli matrices we're going to bake into our spin operator along that direction.
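As a concrete sketch of this construction (using NumPy for the matrix algebra; the direction a below is just an arbitrary example), here is σ·a built exactly as that weighted sum, with a check that the resulting spin operator has eigenvalues ±ħ/2 no matter which unit vector we pick:

```python
import numpy as np

# The Pauli matrices
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def sigma_dot(a):
    """sigma . a: each component of the unit vector a weights one Pauli matrix."""
    return a[0] * sx + a[1] * sy + a[2] * sz

# Spin operator along an arbitrary direction, in units where hbar = 1
a = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)
S = 0.5 * sigma_dot(a)          # S = (hbar/2) sigma.a

# Hermitian 2x2 matrix -> two real eigenvalues, always +/- hbar/2
print(np.linalg.eigvalsh(S))
```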
Now, why do we care about a spin operator?
Well, as we talked about in the EPR paper, when you have an observable quantity like spin, the value of the quantity is going to be the eigenvalue corresponding to the eigenstate of the operator. So if we have a spin-1/2 particle and its state is represented by the two-component spinor s, then the spin operator acts on s as Ŝ s = (ħ/2)(σ·a) s.
And bear in mind, σ·a is going to be a 2×2 matrix. In fact, if you want to think about it in terms of the Lie algebra su(2), that matrix is going to live at the coordinates (a_x, a_y, a_z) within the Lie algebra, which is spanned by the Pauli matrices σx, σy, σz. If that makes sense, great. If it doesn't, don't worry about it. That's a level of group theory that we don't have to get into today.
Instead, I want to give you a specific example of what it means for a particle to be in an eigenstate of the spin operator.
So if a particle has definite spin, that is, we've measured the spin and it's either spin up or spin down along some axis, then it is going to be in an eigenstate of the spin operator along that axis. That's what the measurement does: you measure the spin of a particle, and you're projecting its wave function onto an eigenstate of the spin operator along that axis. And so therefore, s is going to be a solution to the equation Ŝ s = λ s for some real value λ, which is going to be the spin of the particle.
As a concrete example, let's suppose we're measuring the spin of a particle along the z-axis. Well, in that case, our direction vector becomes (0, 0, 1), because the vector doesn't point in x, it doesn't point in y, it points entirely in z. And so, if we evaluate the quantity σ·a, we find that we have no σx, no σy, and all σz. And so our spin operator along the z direction becomes Ŝ_z = (ħ/2) [[1, 0], [0, −1]].
And so now, if we want to solve for the eigenstates of spin up and spin down along z, all we have to do is solve the equation (ħ/2) σz s = λ s for some real eigenvalue λ. And this eigenvector-eigenvalue equation has the solutions s = (1, 0) or s = (0, 1), with eigenvalues of +ħ/2 and −ħ/2, respectively. And you can verify that for yourself if you plug these different options for s and λ into that eigenvector-eigenvalue equation.
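You can do that plugging-in numerically too; this sketch (illustrative NumPy, in units where ħ = 1) verifies that (1, 0) and (0, 1) solve the eigenvector-eigenvalue equation with λ = ±ħ/2:

```python
import numpy as np

# S_z = (hbar/2) sigma_z, in units where hbar = 1
Sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

# The eigenvector-eigenvalue equation S s = lambda s for both spinors
assert np.allclose(Sz @ up, +0.5 * up)      # spin up:   lambda = +hbar/2
assert np.allclose(Sz @ down, -0.5 * down)  # spin down: lambda = -hbar/2

# Multiplying by a global complex phase keeps an eigenstate an eigenstate
phase = np.exp(0.7j)
assert np.allclose(Sz @ (phase * up), +0.5 * (phase * up))
print("eigenvalue equation verified")
```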
Oh, and one other thing I'll say is that for these eigenvectors, you can go ahead and slap a complex phase factor onto both components, and they remain eigenstates. In a moment, I'll show you a picture which makes that point obvious. But for now, I'll just leave that as an algebraic statement. All right.
Now, instead of the spin operator Ŝ, we may as well just talk in terms of σ·a, which is conceptually exactly the same thing as Ŝ. The only difference is that it's not scaled by that factor of ħ/2. And so this sigma operator has nice dimensionless eigenvalues of plus or minus one for spin up versus spin down. Therefore, the sentence "the particle was measured spin up along the a axis" can be said as "measuring σ·a yielded a value of +1." Or, if you want to say the particle was measured spin down along the a axis, you can say σ·a yielded a value of −1. Or, if you want to say the particle was measured spin up along the b axis, you say σ·b yielded a value of +1. Right? So what we have here is a very concise mathematical way of saying that a spin-1/2 particle was measured along some axis, and the result of that measurement is simply the eigenvalue +1 or −1.
So in Bell's paper, he's going to use this a lot, and that's why I wanted to show you where σ·a comes from and what it means. And we don't really have to get too deep today into the theory of SU(2) and spinors and Pauli matrices and all that. So if you're not super familiar with all of these algebraic details, that's actually totally fine. For the purpose of understanding Bell's paper, you really just have to know, from a vocabulary point of view, that σ·a means measuring the particle's spin along the a axis, and that the results are going to be +1 or −1, depending on whether it turns out to be spin up or spin down, respectively.
Before we move on, I do want to give you
just a couple more examples of this
concept just to make the idea a little
bit more intuitive, a little bit more
familiar. So suppose we had measured, instead of along z, along the x direction. Well, then we find that the spin operator along x is going to be (ħ/2) σx. And when you think about the solutions to (ħ/2) σx s = λ s, you find the eigenstates (1/√2)(1, ±1), corresponding to eigenvalues of ±ħ/2. That is to say, we find the same exact kind of situation as before, when we measured along z, as far as the eigenvalues: you have two options, spin up or spin down, and the magnitude of the observable is ħ/2. But now you have a spinor that's in a different state; it's pointing in a different direction. And by the way, the 1/√2 is just a normalization constant. And likewise, we
can repeat exactly the same procedure. We can measure along y. We find that the spin operator along the y direction is (ħ/2) σy. You solve that eigenvector-eigenvalue equation, and you find the eigenstates (1/√2)(1, ±i), with the same old eigenvalues of ±ħ/2.
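And the same check works for the x and y eigenstates just quoted (an illustrative sketch; working with the bare Pauli matrices, whose eigenvalues ±1 correspond to ±ħ/2 for the spin operator):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)

# The eigenstates (1/sqrt 2)(1, +/-1) for x and (1/sqrt 2)(1, +/-i) for y
x_up = np.array([1, 1]) / np.sqrt(2)
x_dn = np.array([1, -1]) / np.sqrt(2)
y_up = np.array([1, 1j]) / np.sqrt(2)
y_dn = np.array([1, -1j]) / np.sqrt(2)

# Each satisfies sigma s = (+/-1) s
for op, state, val in [(sx, x_up, 1), (sx, x_dn, -1),
                       (sy, y_up, 1), (sy, y_dn, -1)]:
    assert np.allclose(op @ state, val * state)
print("sigma_x and sigma_y eigenstates verified")
```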
And I know all of this feels very abstract, but there is a visual story that goes with this algebra. I've touched on it in my previous videos about the mystery of spinors, electromagnetism as a gauge theory, and also deriving the Dirac equation, where there's a way of drawing a two-component spinor as a flag in three dimensions. So, for example, let's take the eigenstate for a particle that's in a spin-up state relative to the z-axis, that is, the spinor (1, 0). Well, if we plot that
using this flag picture diagram and
we'll go ahead and slap on a time
evolution phase factor corresponding to
the energy of the particle, we see that
we have a flag that points straight up
along Z. And then the time evolution
phase factor, that is the rotation in
the complex plane, is going to twirl
that flag around.
If you're curious as to the algebraic machinery that's happening behind the scenes, definitely check out the paper "An Introduction to Spinors" by Andrew Steane. That paper explains in depth how exactly the two-component spinors map onto these flag diagrams. But now, if we plot the spin-down-along-z spinor, (0, 1), you see, hey, it's a flag that's pointing down along z. So that
makes sense. And now notice that the time-evolution phase factor, which rotates the flag in the complex plane, has the effect of twirling the flag, but in the opposite way as before. Although really it's the same way; it's just that the flag is pointing in the opposite direction. The
way to see this is point your right
thumb along the direction that the flag
pole is pointing and then you find that
the phase factor is going to twirl the
flag in the same way that your fingers
go around on your right hand.
So we find in these spinors a picture of a thing, some kind of quantity, that has an orientation and that spins around under a complex-phase time evolution. And so that gives you a feel for some of the algebraic machinery that's happening behind the scenes when we talk about spinors and Pauli matrices and all of that.
And so now I want you to imagine in your mind: what would the eigenstate of spin up along the x-axis look like?
Well, there it is. Makes sense, right?
So, this is (1/√2)(1, 1), with the time-evolution phase factor. We can go ahead and also add on the spin-down-along-x eigenstate, and that's exactly as you would expect. Now, let's also add in the spin-up-along-y eigenstate, and there it is, pointing along y, spinning around. And if you add in the spin-down-along-y eigenstate, well, then there it is.
So without going into too much detail about the algebra of spinors and all that, I just wanted to show you that there is a picture corresponding to all of this algebra. And that's something that I would definitely encourage you to read more about and to explore. But for the purposes of Bell's paper, we actually don't need to get too into the details there. I hope this has been useful context.
All right. So before returning to the paper, I want to say a couple of words about the concept of the expectation value of these spin measurements, because we're going to see that concept later on
in the paper. So remember earlier we
were looking at the slide shown here and
we thought about how if we rotate the
second magnet by an angle theta for a
particle beam, which we know is going to
be spin up if we measure it vertically,
then the beam is going to split into two
beams. And for a small angle theta, it's
going to be mostly spin up. But there's
some probability of that also being spin
down. And then, as we talked about before, the probability of spin up is going to be cos²(θ/2), where theta is the tilt angle. And likewise, the probability of it being spin down is going to be 1 minus that, so sin²(θ/2). And
that's all fine and good and that's
totally true and that's one way to talk
about it. But there's another way we can
talk about it in terms of expectation
value, which is in some ways more convenient.
So, to be really technical about this, suppose we go ahead and call the second magnet's axis the vector a, and then, as we talked about, we can use the notation σ·a as shorthand for the result of measuring the spin along the axis a. Because, as you know, when you dot the sigma vector comprised of the Pauli matrices with some unit vector a, you end up with something that's directly proportional to the spin operator, but which has eigenvalues of +1 if the particle is measured spin up and −1 if the particle is measured spin down. So now we ask the question: what is the expectation value of σ·a? And all we mean by expectation value is the average over many measurements, holding the a vector constant. Let me give you an analogy.
Let's say you're a gambler, and somehow you have the opportunity to play a game where you have a 60% chance of winning a dollar and a 40% chance of losing a dollar. Well, in that case, the expectation value is going to be 20 cents, because you have 0.6 × 1, which is 0.6, and then you add on to that 0.4 × (−1), which is −0.4, and so you have a net expectation value of 0.2, a 20-cent profit. And so you should play that game. Now, the reason I bring up this analogy is because, of course, if you play the game once, you're not going to get 20 cents. You're either going to make a dollar or you're going to lose a dollar. So we should not expect one game to yield 20 cents.
However, if you play that game a hundred times, you're going to have about 20 bucks; that's what you should expect to have. And so that's exactly the sense in
which we use the term expectation value
when thinking about these spin
measurements. In every case, when you
measure the spin, it's going to be a
plus one or a minus one. But depending
on the tilt angle and depending on the
probability that depends on the tilt
angle, there's going to be some average
number that we'll find for that tilt
angle over many subsequent measurements
along that axis. And if you work out the math, as we'll do in just a moment, you end up with the plot shown here, where on the x-axis we have the tilt angle theta, and then there's this curve for the expectation value; by the way, we use the bracket notation ⟨σ·a⟩ to indicate the expectation value. Well, as a sanity check, let's go ahead and look at a few points and see if this curve kind of makes sense.
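The gambler's logic gives a quick way to compute this curve numerically (a sketch assuming the cos²(θ/2) and sin²(θ/2) probabilities quoted earlier): weight +1 by the spin-up probability and −1 by the spin-down probability.

```python
import numpy as np

def expectation(theta):
    """Gambler's average: P(up)*(+1) + P(down)*(-1), quantum probabilities."""
    p_up = np.cos(theta / 2) ** 2
    p_down = np.sin(theta / 2) ** 2
    return p_up * (+1) + p_down * (-1)

for theta in [0.0, np.pi / 2, np.pi]:
    print(f"theta={theta:.2f}  <sigma.a>={expectation(theta):+.3f}  "
          f"cos(theta)={np.cos(theta):+.3f}")
```

By the identity cos²(θ/2) − sin²(θ/2) = cos θ, the whole curve is simply cos θ, passing through +1, 0, and −1 at the three sanity-check angles.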
So, first of all, when theta is zero, and a is aligned with the polarization of those incoming spin-up atoms, then we find an expectation value of one. And that makes sense, because when the second detector is not tilted, then every single time, a spin up coming in is going to be a spin up going out, and so σ·a is
going to yield an igen value of plus one all the time. So you do it 100 times
all the time. So you do it 100 times you're going to get 100 plus ones. And
you're going to get 100 plus ones. And then conversely, if we flip a all the
then conversely, if we flip a all the way upside down, then you have a spin up
way upside down, then you have a spin up coming in relative to the upside down
coming in relative to the upside down second detector. That's always going to
second detector. That's always going to come out as a spin down. And so in that
come out as a spin down. And so in that extreme case, you always have a negative
extreme case, you always have a negative 1 for sigma. A, therefore, the
1 for sigma. A, therefore, the expectation value is precisely -1. Now,
expectation value is precisely -1. Now, if you check out this point in the
if you check out this point in the middle of the plot when theta is 90° and
middle of the plot when theta is 90° and the measurement axis A is perfectly
the measurement axis A is perfectly perpendicular to the incoming spin up
perpendicular to the incoming spin up polarization, well, in that case, sigma.
polarization, well, in that case, sigma. A is going to be a +1 or a minus1, you
A is going to be a +1 or a minus1, you know, each with a 50% probability. And
know, each with a 50% probability. And so if you have a set of 100 numbers
so if you have a set of 100 numbers which are either +1 or minus1 with equal
which are either +1 or minus1 with equal probability, well, you add those all up
probability, well, you add those all up and on average you're going to get zero.
and on average you're going to get zero. All right, then. So based on the three
All right, then. So based on the three points we've looked at, the curve seems
points we've looked at, the curve seems to make sense. But how do we calculate
to make sense. But how do we calculate the exact form of this curve? Well, all
the exact form of this curve? Well, all you have to do is think like a gambler
you have to do is think like a gambler and say the expectation value is going
and say the expectation value is going to be the probability of measuring spin
to be the probability of measuring spin up along the axis A times a plus one
up along the axis A times a plus one corresponding to spin up plus the
corresponding to spin up plus the probability of measuring spin down along
probability of measuring spin down along the axis A time the negative 1 that
the axis A time the negative 1 that corresponds to spin down. This is just
corresponds to spin down. This is just like in that game where you have 60%
like in that game where you have 60% chance of winning a dollar, 40% chance
chance of winning a dollar, 40% chance of losing a dollar. So the expectation
of losing a dollar. So the expectation value is $0.2. So it's the same
value is $0.2. So it's the same reasoning as a gambling calculation. And
reasoning as a gambling calculation. And as we saw earlier, we already know the
as we saw earlier, we already know the probability of measuring spin up versus
probability of measuring spin up versus spin down. In the first case, we have a
spin down. In the first case, we have a cosine^ 2 / 2 probability of measuring
cosine^ 2 / 2 probability of measuring spin up. And then we have a sin^ 2 thet
spin up. And then we have a sin^ 2 thet / 2 probability of measuring spin down.
/ 2 probability of measuring spin down. Now, if you are a trig identity
Now, if you are a trig identity enthusiast, you'll recognize this form
enthusiast, you'll recognize this form as having a delightful simplification,
as having a delightful simplification, which is that cosine^ 2 / 2us theta / 2
which is that cosine^ 2 / 2us theta / 2 equals cosine of theta. Isn't that
equals cosine of theta. Isn't that wonderful how that simplifies? So that's
wonderful how that simplifies? So that's a super nice result. And we're going to
a super nice result. And we're going to see the same result in Belle's paper in
see the same result in Belle's paper in equation 3 in a slightly different
equation 3 in a slightly different context, but it's the same exact
context, but it's the same exact reasoning. So anyway, that's all I
reasoning. So anyway, that's all I wanted to say about the expectation
wanted to say about the expectation value. So just think about this as a
value. So just think about this as a pretty common and useful way of putting
pretty common and useful way of putting a statistical handle on this kind of
a statistical handle on this kind of probabilistic situation.
probabilistic situation. All right, then. So now I think we've
All right, then. So now I think we've discussed all of the prerequisites that
discussed all of the prerequisites that we need for the remainder of the paper.
we need for the remainder of the paper. So now let's go ahead and get into part
So now let's go ahead and get into part two formulation.
So remember how in the EPR paper they gave a specific example of a two-particle wave function with anti-correlated momenta and correlated positions. And with that wave function, we saw how if we measure the momentum of one of the particles, we end up putting the other particle in a momentum state. And conversely, if we choose to measure the position of the particle, then we put the other one into a position state. So that specific wave function in the EPR paper was a very mathematically convenient example to illustrate the point. However, of course, the EPR paradox is more general than just a single specific two-particle wave function. And if you look at equations 7 and 8 of the EPR paper, you can see that, more generically, whenever you have two particles in an entangled state and you think about representing that wave function as a sum over states of the first particle, then when you measure the first particle and put it into an eigenstate, that's going to have an impact on the state of the second particle. And so really the EPR paradox is just the observation that because we have the freedom to choose which observable we measure on the first particle, we have the ability to affect the quantum state of the second particle in a way that somehow appears to violate the constraint of local causality.
So anyway, the reason I bring that up is because in Bell's paper, we're going to use a different two-particle state to get at the same fundamental paradoxical nature of quantum physics. So instead of the particles having anti-correlated momenta and correlated positions, we're going to imagine a pair of spin-1/2 particles whose spins are going to be in an entangled state. And this configuration for thinking about the EPR paradox is actually not original to Bell. It was first put forward by Bohm and Aharonov in 1957.
So part two of Bell's paper begins with the example advocated by Bohm and Aharonov. The EPR argument is the following. Consider a pair of spin-1/2 particles formed somehow in the singlet spin state. Now, I want to pause here and ask: what exactly is the singlet spin state? Well, it means that the spins of the two particles have no preferred direction a priori. If you think about either of the particles and you're going to measure its spin, there's total rotational symmetry, in that neither of the particles has a preferred spin axis. It's totally uniformly distributed over all possibilities.
However, the spins of the particles exhibit perfectly anti-correlated outcomes when measured along the same axis. And this is a very bizarre state of affairs. Intuitively, you would think that such a state is not possible. And yet the singlet state has been measured in all kinds of experiments. So this really is possible. This is something that is real. And as we'll talk about later in the paper, even though it's very hard to imagine and it seems kind of surreal, the experimental data very strongly indicates that the singlet state is actually a legit thing that can exist. Now, you sometimes hear the singlet state described as the particles having equal and opposite spins. But that's not exactly true, or rather, that's too narrow a description.
It is true that if you measure the two particles along the same axis, you'll always find that their spins are equal and opposite. But, and this is a really super important fact about the singlet spin state, so I want to re-emphasize it: before the measurement, neither of the particles has a preferred spin direction. This is very hard to imagine, but it is a super important aspect of what it is for the particles to be in the singlet state.
All right. So that's the singlet state. Now imagine that we have some process which produces pairs of spin-1/2 particles in the singlet state, and then each particle goes its separate way, both moving freely in opposite directions. Now then, suppose we send each particle into a detector, say a Stern-Gerlach magnet, and we measure the spin of both particles to get a sense of the kind of thing that happens here. At first, we're going to say that the detectors are measuring along the same axis.
Let's go ahead and denote the two measurement axes with the unit vectors a and b respectively. And for starters, those unit vectors are going to be precisely aligned, so that we're measuring both particles along the same spin axis. And now, because the particles are in the singlet spin state, if we measure the spin of particle 1 along the direction a and we get the value of +1, that is, suppose particle 1 measures spin up along a, then according to quantum mechanics and what it means for the particles to be in the singlet state, it's 100% guaranteed that measuring the spin of particle 2 along the same axis is going to yield a value of -1, that is, spin down. And vice versa: had we measured particle 1 in the spin-down state, then we would know for sure that particle 2 would be spin up along the same axis.
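To make that behavior concrete, here's a tiny simulation sketch (my own illustration, not from Bell's paper; the function name is made up). It samples outcome pairs using the standard quantum joint probabilities for the singlet state, in which P(++) = P(--) = (1 - cos θ)/4 and P(+-) = P(-+) = (1 + cos θ)/4.

```python
import math
import random

def sample_singlet(theta, rng=random):
    """Sample one pair of spin outcomes (each +1 or -1) for a singlet pair,
    with the two detector axes separated by angle theta (radians).

    Standard quantum joint probabilities for the singlet state:
      P(++) = P(--) = (1 - cos theta)/4,  P(+-) = P(-+) = (1 + cos theta)/4
    """
    p_same = (1 - math.cos(theta)) / 2  # probability the two outcomes agree
    a = rng.choice((+1, -1))            # either side on its own is a fair coin
    b = a if rng.random() < p_same else -a
    return a, b

# Same axis (theta = 0): the outcomes are always equal and opposite.
pairs = [sample_singlet(0.0) for _ in range(1000)]
assert all(a * b == -1 for a, b in pairs)
```

Notice the two faces of the singlet state in one place: each detector by itself just sees a 50/50 coin flip (no preferred direction), yet with aligned axes the pair is perfectly anti-correlated every single time.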
By the way, just a comment on the notation here. As we talked about earlier, the expression σ·a is shorthand for measuring the spin of the particle along the axis a. And this operator returns +1 if the particle is spin up along a, and -1 if it's spin down along a. Now, the subscripts 1 and 2 just indicate that in the first case we're measuring particle 1 and in the second case we're measuring particle 2. So it's not like we have two different sigma vectors. No, it's the same Pauli matrices, the same operator. It's just that in the first case, σ₁, we apply it to the first particle, and in the second case, σ₂, we apply it to the second particle.
So now we make the hypothesis of local causality: it seems one at least worth considering that, if the two measurements are made at places remote from one another, the orientation of one magnet does not influence the result obtained with the other. And just to really emphasize that point, imagine that detector A and detector B are separated so far apart, and that the measurement of particle 1 and the measurement of particle 2 happen so closely together in time, that not even light could travel between detectors A and B during whatever tiny time difference there is between the two measurements. So we imagine that the measurements going on at detector A and detector B are completely causally disconnected, if local causality is to be believed.
But here's where we run into the EPR paradox. Since we can predict in advance the result of measuring any chosen component of the spin of particle 2, by previously measuring the same component of the spin of particle 1, it follows that the result of any such measurement must actually be predetermined.
That is to say, the particles start off in the singlet state with no preferred spin direction. Then imagine particle 1 is measured in detector A ever so slightly before particle 2 is measured in detector B, you know, by 0.0001 ns or whatever. Well, as soon as we've measured particle 1 along the axis a, we can predict with certainty the component of the spin of particle 2 along the same axis. And yet that certainty does not exist in quantum physics. Now, we can tell a story about non-local wave function collapse, where you measure particle 1 along axis a and the wave function instantly collapses, so that particle 2 is no longer in the singlet state but is now for sure going to be polarized in accordance with that measurement direction a. But assuming that we don't allow for non-local wave function collapse, because we want to preserve our sanity and hold on to this concept of local causality, then we find here an apparent contradiction, because the spin of particle 2 along the axis a should definitely not be predictable with certainty given the wave function of the singlet state. Quantum physics just doesn't allow for that level of predictability, unless we allow for the possibility of instantaneous wave function collapse. So then, since the initial quantum mechanical wave function, that is, the singlet state, does not determine the result of an individual measurement, this predetermination implies the possibility of a more complete specification of the state. And so that is, apparently, the EPR paradox, this time thought of in terms of spins rather than momentum and position
states. And so, in other words, all of this thought process leads us to think that surely there must be some kind of hidden variables that go along with particles 1 and 2 in a way that quantum mechanics doesn't account for. And if only we had some kind of more complete model, where we could figure out what those hidden variables are, what their dynamics are, and how they influence the spin measurements, then surely we could find a more complete, more sane, more understandable explanation of what's going on here than what quantum mechanics currently has to offer. Well, all right then. So we want a more complete theory involving some kind of hidden variables. So let this more complete specification be effected by means of parameters lambda. These are going to be our hidden variables. In this video, whenever you see this yellow lambda, it stands for whatever hidden variables we want to put into our model to give us a more complete description of what's happening. You know, earlier we were looking at the Stern-Gerlach experiment, and we were trying to explain it in terms of particles carrying with them this yellow vector. That was an example of lambda. But now we're going to broaden that up a little bit. Or actually, we're going to broaden it up all the way and say lambda can be whatever you want it to be, whatever you can imagine: a vector, a scalar, a tensor, a function, a set, whatever you want it to be. It is a matter of indifference in the following whether lambda denotes a single variable or a set, or even a set of functions, and whether the variables are discrete or continuous. The beautiful thing about Bell's paper is that it accounts for all possible hidden variable models in one fell swoop, because it's such a generic argument, as we'll see. However, we write as if lambda were a single continuous parameter. So in the notation that we'll be using, for example, we'll integrate over all possible lambda, and it'll look like we're assuming that lambda is a continuous parameter. However, what Bell is saying here is that if you want to modify the argument so that lambda is not a continuous parameter but is rather a discrete parameter or a set or whatever contrived thing you want to come up with, you can trivially modify the argument to account for that.
Replace an integral with a sum, or whatever you have to do. Those kinds of modifications won't have any effect on the logical structure of the argument put forward in this paper. So now let's think about what's happening in these detectors. And at this moment, we can go ahead and say that the axis of measurement in detector B does not have to be the same as the axis of measurement in detector A. So we're going to make this more generic. Oh, and one thing that I'll point out is that, in everything we're about to talk about, what matters as far as the orientations of the unit vectors a and b is only the angle between those two vectors, the extent to which they're aligned or misaligned.
And when you think about two vectors in three-dimensional space, the two vectors are going to span a plane, and there's going to be some angle between them in that plane. And that angle between them, that theta angle, is the relevant quantity when we're thinking about how the orientations of these two measurement axes matter. And so, if you want, you can imagine a fully generic three-dimensional situation where a and b point whichever ways you want to imagine them pointing. But because it's only the theta angle between them that matters, in whatever plane they happen to span, we may as well imagine the a vector pointing straight up, and then we can imagine the b vector having some random orientation in the plane. And so, in the diagram shown here on your two-dimensional screen, with a pointing up and b pointing wherever, imagine rotating the b axis a full 360°. For all intents and purposes, that 360° sweep is going to span all of the possibilities as far as the ways in which we can misorient our detectors relative to each other. And actually, you only need 180°, because once you tilt past 180°, theta starts to come back in. See what I mean? And then, technically, by symmetry, all the interesting stuff happens between 0 and 90°.
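If you want to check that only the relative angle matters, here's a small sketch (my own illustration): rotating both detector axes by one and the same rotation leaves the angle between them unchanged, which is why we're free to draw a pointing straight up and keep track of theta alone.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def angle_between(u, v):
    """Angle (radians) between two unit vectors."""
    return math.acos(max(-1.0, min(1.0, dot(u, v))))

def rotate_z(v, phi):
    """Rotate a 3D vector by angle phi about the z axis."""
    x, y, z = v
    return (x * math.cos(phi) - y * math.sin(phi),
            x * math.sin(phi) + y * math.cos(phi),
            z)

a = (1.0, 0.0, 0.0)
b = (math.cos(0.7), math.sin(0.7), 0.0)  # 0.7 rad away from a

# Rotating BOTH axes by the same rotation leaves theta unchanged.
theta_before = angle_between(a, b)
theta_after = angle_between(rotate_z(a, 1.9), rotate_z(b, 1.9))
assert abs(theta_before - theta_after) < 1e-9
```

The same holds for any common rotation, not just about z; that rotational symmetry is exactly what lets us collapse the whole 3D setup down to a single misalignment angle theta.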
Okay. So then, what is actually going on in these detectors? Well, if we assume this hidden variable model, then the result A of measuring the spin of particle 1 along the a axis is determined by the a axis and the hidden variable lambda. So, particle 1 is coming in, carrying with it some kind of hidden variable, maybe some vector, some scalar, some tensor, whatever hidden variable we want to imagine. And as particle 1 goes into detector A, and detector A is oriented along the a axis, the only things that are going to affect the spin measurement of particle 1 are the orientation of that a vector and the hidden variable lambda that goes with particle 1. Because particles 1 and 2 are in the singlet state, they don't have any a priori preferred directions. So the result of the spin measurement is going to be deterministically well determined by however the hidden variable lambda interacts with the detector oriented along a. And likewise, the result B of measuring the spin of particle 2 along the b axis in the same instance is determined by the b axis and lambda, for exactly the same reason. And so we can write that the measurement outcome at A, as a function of the measurement direction a and the hidden variables lambda, can take on a value of +1 or -1, depending on whether particle 1 is measured spin up or spin down respectively. And likewise, the measurement result at detector B, which is a function of the b axis and the hidden variables lambda, is also going to take on a value of +1 or -1 for spin up and spin down respectively.
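As one concrete toy instance of such functions (my own choice, in the spirit of the yellow-vector picture from earlier; certainly not the only possibility), take lambda to be a random unit vector carried by the pair, and let each detector output the sign of the dot product with its own axis:

```python
import math
import random

def random_unit_vector(rng=random):
    """One possible hidden variable lambda: a random direction in 3D."""
    while True:  # rejection-sample inside the unit ball, then normalize
        v = (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
        n = math.sqrt(sum(x * x for x in v))
        if 0 < n <= 1:
            return tuple(x / n for x in v)

def A(a, lam):
    """Outcome at detector A: a function of its own axis a and lambda only."""
    return 1 if sum(p * q for p, q in zip(a, lam)) >= 0 else -1

def B(b, lam):
    """Outcome at detector B: likewise a function of b and lambda only."""
    return -A(b, lam)  # built so that equal axes give opposite outcomes

axis = (0.0, 0.0, 1.0)
lam = random_unit_vector()
assert A(axis, lam) in (+1, -1) and B(axis, lam) in (+1, -1)
assert A(axis, lam) * B(axis, lam) == -1  # same axis: always anti-correlated
```

Note how this toy model has both properties just described: each outcome is a deterministic ±1 function of its own axis and lambda, and neither function takes the other detector's axis as an input.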
And we're going to leave this fully generic as far as in what way, or by what function, the hidden variables interact with the measurement axis. Whatever you can imagine, whatever principle you want to go ahead and postulate, it's still for sure the case, whatever these functions actually are, that by definition they're going to have values of plus or minus one, depending on the outcome of the spin measurement. Now, the vital assumption of local causality is that the result B for particle 2 does not depend on the setting a of the magnet for particle 1, nor A on b. So in equation 1, you see that A is a function of the a vector and the hidden variables lambda, and B is a function of the b vector and the hidden variables lambda. But notice that A is not a function of the b vector, nor is B a function of the a vector. The reason being, detectors A and B are separated so far apart, and these measurements happen so quickly, that there's no way the information about which way one detector is oriented can propagate over to the other detector and affect the measurement result in any way. No, these two measurements happen in different light cones. And so, by local causality, you can't have the measurement result at A depending on the b vector, or vice versa.
And one of the things that we're going to show in this paper is that any hidden variable model is going to have to violate that assumption. The only way to get it to work is to relax that constraint and say, okay, the measurement outcome at A depends on the orientation at B, and vice versa. And then it's like, oh, that's weird. That's non-local. That is absurd. And at that point, there's no advantage to using a hidden variable model, because whether you take ordinary quantum mechanics or some speculative hidden variable model, in both cases you're going to have a non-local model. And so, no matter how you look at it, it's a glitch in reality. All right. Then suppose we define rho of lambda as the probability distribution of the hidden variables lambda.
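For example (again my own illustration, not Bell's): take rho of lambda to be the uniform distribution over unit vectors, pair it with simple sign-rule detectors (A returns the sign of a·λ, B the opposite sign of b·λ), and estimate the average of the product A·B by Monte Carlo:

```python
import math
import random

def random_unit_vector(rng=random):
    """Draw lambda from rho(lambda): here, uniform over 3D directions."""
    while True:  # rejection-sample inside the unit ball, then normalize
        v = (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
        n = math.sqrt(sum(x * x for x in v))
        if 0 < n <= 1:
            return tuple(x / n for x in v)

def A(a, lam):
    return 1 if sum(p * q for p, q in zip(a, lam)) >= 0 else -1

def B(b, lam):
    return -A(b, lam)

def expectation(theta, n=20000):
    """Monte Carlo average of A*B with the detector axes theta apart."""
    a = (0.0, 0.0, 1.0)
    b = (math.sin(theta), 0.0, math.cos(theta))
    total = 0
    for _ in range(n):
        lam = random_unit_vector()  # one shared lambda per particle pair
        total += A(a, lam) * B(b, lam)
    return total / n

# Aligned axes reproduce the singlet anti-correlation exactly.
assert expectation(0.0, 1000) == -1.0
```

For this particular model the average comes out to roughly -1 + 2θ/π, which agrees with quantum mechanics at θ = 0 and θ = 90° but departs from the quantum prediction of -cos θ at intermediate angles; that kind of gap between hidden variable models and quantum mechanics is exactly what Bell's argument goes on to make precise.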
So, in other words, imagine all possible configurations of our hidden variables lambda, whether they're vectors or scalars or tensors or functions or sets, whatever you want to imagine for lambda. There's going to be some space of configurations, some space of possibilities that lambda can take on. And you can assign a probability to each and every configuration. And so rho of lambda is precisely the distribution which defines how likely our hidden variables are to exist in whatever state we can imagine them existing in. So this is quite a generic thing, and as we go through the paper, we'll imagine some specific cases with some simple functions for rho of lambda. But notice the power in keeping this generic. See, so far we haven't narrowed down what lambda can be. Our hidden variables can be whatever you can imagine. And then rho of lambda, as a probability distribution on those hidden variables, can also be whatever you want to imagine, whatever distribution you want to take over whatever space of variables you want to define. And even though our setup is so generic, one of the things we can still say for sure is that the expectation value of the product of the two components, measuring particle 1 along the a axis and measuring particle 2 along the b axis, is going to be P of a
2 along the B ais is going to be P of A and B where here P is the expectation
and B where here P is the expectation value of the products of A and B that is
value of the products of A and B that is the plus or minus one that's recorded at
the plus or minus one that's recorded at each detector. We can say that P of A
each detector. We can say that P of A and B is going to be the integral over
and B is going to be the integral over all possible configurations of hidden
all possible configurations of hidden variables. Each one weighted by row of
variables. Each one weighted by row of lambda that is how likely that
lambda that is how likely that configuration is to be. And then as
configuration is to be. And then as we're integrating over that space of
we're integrating over that space of possible hidden variables for each
possible hidden variables for each possibility, we simply multiply the
possibility, we simply multiply the outcome of the measurement at detector
outcome of the measurement at detector A, that is A of A and lambda times the
A, that is A of A and lambda times the measurement outcome at detector B, that
measurement outcome at detector B, that is B of B and lambda.
is B of B and lambda. By the way, in Belle's paper, he writes
By the way, in Belle's paper, he writes this integral as integral row lambda D
this integral as integral row lambda D lambda A * B. I like to write it in the
lambda A * B. I like to write it in the sandwich notation where you have the
sandwich notation where you have the integral sign on the left and the
integral sign on the left and the differential element on the right and
differential element on the right and then whatever you're integrating over in
then whatever you're integrating over in between. It doesn't matter either way.
between. It doesn't matter either way. It's just a stylistic choice. So, well,
It's just a stylistic choice. So, well, anyway, I want to reflect on exactly
anyway, I want to reflect on exactly what this equation means, equation two,
what this equation means, equation two, because it is of central importance to
because it is of central importance to everything that follows. So, this
everything that follows. So, this parameter P, we're going to go ahead and
parameter P, we're going to go ahead and call that the correlation between our
call that the correlation between our measurements.
And this correlation has a really intuitive meaning. The first thing to notice is that P, the correlation, has to be somewhere in between -1 and 1.

When it's -1, the measurement outcomes at detector A are going to be perfectly anti-correlated with the measurement outcomes at detector B. For example, this happens when detector A and detector B are aligned along precisely the same axis, because if we have a pair of particles in the singlet state and we measure them both along the same axis, then if one is spin up, the other is spin down, and vice versa. So if A is +1, then B is -1, and vice versa. When we're measuring the singlet state along the same axis, the product of A and B is always going to be -1, because 1 times -1 is -1, and -1 times 1 is -1. And in that configuration, if the product of A and B is always -1, then equation 2 is simply the negative of the integral of ρ(λ) dλ. Now, ρ is a normalized probability distribution, so when you integrate over all possibilities, each one weighted by the probability distribution, the result is always going to equal 1, because there's a 100% chance that the hidden variables are in some kind of configuration. And so we find that P(a, b), when a and b are the same vector, is equal to -1.

Conversely, if we flip b around so that now b equals -a, and our measurement axes are pointing in equal and opposite directions, then we find a correlation of +1. That is, the product of A and B is always going to equal 1, because if we measure the particle spin up in detector A, but detector B is flipped upside down relative to detector A, then the other particle is also going to be measured spin up in detector B, but along the upside-down axis. The singlet correlation is still there; it's just that flipping the vector b upside down is effectively a redefinition of what spin up and spin down mean in detector B. And in that case, if the product of A and B is always equal to 1, because 1 times 1 is 1 and -1 times -1 is 1, then equation 2 simply reduces to the integral of ρ(λ) dλ, which, because ρ is a normalized probability distribution, equals 1.

Now, there's one more special case we can imagine, which is when a and b are perpendicular. Suppose a is pointing straight up and b is pointing straight to the right. In that case, we should expect a correlation of zero. The reason is that in the singlet state, say you measure spin up along a; if b is perpendicular to a, then the other measurement could go either way, spin up or spin down. So on average, the product of A and B is going to be +1 or -1 about 50/50, and that averages out to zero. So if we have a value of P equal to zero, there is no correlation between the two detectors.
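These special cases are easy to check numerically. Here's a minimal sketch (my own illustration, not from the paper) using a hypothetical toy hidden-variable model: λ is a unit vector distributed uniformly over the sphere, detector A reports sign(a · λ), and detector B reports -sign(b · λ), with the singlet anti-correlation built in by hand.

```python
import math
import random

random.seed(0)

def rand_unit_vector():
    """Uniform random point on the unit sphere (Gaussian trick)."""
    while True:
        v = [random.gauss(0, 1) for _ in range(3)]
        r = math.sqrt(sum(x * x for x in v))
        if r > 1e-12:
            return [x / r for x in v]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def sign(x):
    return 1.0 if x >= 0 else -1.0

def correlation(a, b, n=100_000):
    """Monte Carlo estimate of P(a, b) = E[A * B] for the toy model
    A = sign(a . lam), B = -sign(b . lam), lam uniform on the sphere."""
    total = 0.0
    for _ in range(n):
        lam = rand_unit_vector()
        total += sign(dot(a, lam)) * -sign(dot(b, lam))
    return total / n

z = [0.0, 0.0, 1.0]
x = [1.0, 0.0, 0.0]
print(correlation(z, z))                  # same axis: product is always -1
print(correlation(z, [0.0, 0.0, -1.0]))   # opposite axes: product is always +1
print(correlation(z, x))                  # perpendicular axes: about 0
```

Even this crude model reproduces the three special values; the question the paper turns to is whether any such model can match the correlation at intermediate angles as well.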
Okay, so that's equation 2. The correlation between our measurement outcomes is found simply by integrating, over the space of all possible hidden variables weighted by the probability of each configuration, the product of the ±1 outcome at A times the ±1 outcome at B.

Now, that correlation given by equation 2, based on a hidden variable model, should equal the quantum mechanical expectation value, which for the singlet state is -a · b, or as we saw earlier, -cos θ, where θ is the angle between the two measurement axis vectors a and b. And the way to see that equation 3 is true, that this is the quantum mechanical expectation value and that it matches the experimental data, is to imagine that particle 1 gets to detector A ever so slightly before particle 2 gets to detector B. Particle 1 is measured along the a axis and the wave function instantly collapses, so now particle 2 is polarized opposite to the a axis. Then, when you measure the spin of particle 2 along the direction b, you can think about it sort of like the two-stage Stern-Gerlach experiment, where we create a beam of purely polarized spin-up particles and send it through a second detector which is tilted by some angle θ. As we know, there's a cos²(θ/2) probability of measuring spin up and a sin²(θ/2) probability of measuring spin down. And if you think like a gambler and calculate the expectation value, you end up with cos θ for the measurement outcome at the second detector, if spin up is +1 and spin down is -1. We saw that earlier. The minus sign here simply comes from the fact that the two particles in the singlet state are anti-correlated: if particle 1 is spin up along the axis a, then particle 2 is actually polarized spin down along a. That's where the minus sign comes from. It's basically just a 180° flip of the two-stage Stern-Gerlach experiment that we were looking at earlier.
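The gambler's arithmetic is short enough to verify directly: an outcome of +1 with probability cos²(θ/2) and -1 with probability sin²(θ/2) averages to cos²(θ/2) - sin²(θ/2) = cos θ, and the singlet's anti-correlation flips the sign to -cos θ. A quick check (my own, not from the paper):

```python
import math

def tilted_detector_expectation(theta):
    """Expected outcome for a spin-up beam measured along an axis tilted
    by theta: +1 with probability cos^2(theta/2), -1 with sin^2(theta/2)."""
    return math.cos(theta / 2) ** 2 * (+1) + math.sin(theta / 2) ** 2 * (-1)

for theta in [0.0, math.pi / 3, math.pi / 2, 2 * math.pi / 3, math.pi]:
    # double-angle identity: cos^2(t/2) - sin^2(t/2) = cos(t)
    assert abs(tilted_detector_expectation(theta) - math.cos(theta)) < 1e-12
    # singlet: particle 2 starts opposite to a, so P(a, b) = -cos(theta)
    print(f"theta = {theta:.3f}  P(a, b) = {-tilted_detector_expectation(theta):+.3f}")
```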
Well, anyway, all that's to say, quantum mechanics tells us that the correlation of the measurement outcomes, for unit vector a at detector A and unit vector b at detector B, for two particles in the singlet state, should be -cos θ, where θ is the angle between the two vectors.

And so the main question of this paper is: is it possible to have some hidden variable model, based on some set of possible λs and some probability distribution describing the likelihood of each λ, such that equation 2 matches the quantum mechanical and experimental value of -cos θ between the vectors a and b? If so, then such a hidden variable model might be plausible, because it would match the data. It would match quantum theory, and yet it would be an alternate way of looking at things. So that's cool. But what we're going to show in this paper, in particular in part four, the contradiction, is that no local hidden variable model can produce an equation 2 correlation which matches the quantum mechanical correlation and the experimental data. And therefore we cannot have a local hidden variable explanation of what's going on here, and we have to confront the fact that quantum mechanics genuinely is super weird and non-local and a glitch in reality.
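As a preview of that contradiction: a simple toy model of this sketchy-sign type (A = sign(a · λ), B = -sign(b · λ), with λ uniform on the sphere; the linear formula below is a standard property of that toy model, not something stated in the paper's text here) gives a correlation that is linear in the angle, -1 + 2θ/π. That agrees with -cos θ at θ = 0, π/2, and π, but nowhere in between:

```python
import math

for deg in [0, 30, 45, 60, 90, 120, 150, 180]:
    theta = math.radians(deg)
    qm = -math.cos(theta)            # quantum mechanics / experiment
    hv = -1 + 2 * theta / math.pi    # linear toy hidden-variable correlation
    print(f"{deg:3d} deg   qm = {qm:+.3f}   hv = {hv:+.3f}   gap = {qm - hv:+.3f}")
```

The gap at intermediate angles (about -0.21 at 45°) is the sort of mismatch that part four turns into a fully general no-go argument.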
Oh, and then one little caveat on the way we've formulated things here. Some might prefer a formulation in which the hidden variables fall into two sets, with the measurement outcome at A dependent on one set of hidden variables and the measurement at B depending on another set. However, this possibility is contained in the above, since λ stands for any number of variables and the dependencies thereon of A and B are unrestricted. In other words, if you want a hidden variable model where particle 1 carries with it some set of hidden variables and particle 2 carries with it a whole other set of hidden variables, go right ahead. That's fine. We're not ruling out that possibility. When we use this character λ to stand for any imaginable hidden variables, you can imagine that in whatever way you want, including the situation where you have two sets of hidden variables, one for each particle. Go for it. That's totally fine. We're not restricting that possibility at all.

And likewise, in a complete physical theory of the type envisaged by Einstein, the hidden variables would have dynamical significance and laws of motion; our λ can then be thought of as initial values of these variables at some suitable instant. In other words, if you want to think about hidden variables as some kind of fields with dynamical significance, that's cool too. Everything we're about to argue doesn't rule out that possibility at all. If you want, you can imagine λ representing a snapshot in time of those fields, and then imagine those fields evolving in accordance with some dynamical equations. But none of that time evolution is going to take this thought experiment outside the framework we're setting up, because our argument is fully generic. Anything you can imagine for λ, λ can be.

You know, I just noticed this yellow λ kind of looks like a banana peel. You wouldn't want that as a hidden variable.

[laughter]

Hey, that would affect the measurement of your spin state. All right, moving on.
Part three of the paper begins: the proof of the main result is quite simple. Well, according to Bell, at least. I don't know if I would say it's quite simple, but anyway: before giving it in part four, however, a number of illustrations may serve to put it in perspective.

So part three is all about establishing some context for part four, looking at some specific examples which we're then going to generalize in part four, when we give the formal argument that local hidden variable models don't work. Now, I'm going to break part three up into three parts, 3A, 3B, and 3C, because this part of the paper naturally falls into those three pieces anyway, and I want to take the time to zoom in on each one individually.
So the first part of part three says that for a single particle, we can make up a hidden variable story of what's going on with the spin, and it's okay; it seems to work. Firstly, there is no difficulty in giving a hidden variable account of spin measurements on a single particle. Suppose we have a spin-1/2 particle in a pure spin state with polarization denoted by a unit vector p. All that means is: imagine we send a beam of spin-1/2 particles through a Stern-Gerlach magnet and then filter it, like what we saw before, where we allow only the spin-up particles through. If the axis of that Stern-Gerlach magnet is the vector p, then the outgoing beam of particles is polarized with reference to that vector p. That is to say, if you were to do a subsequent spin measurement on such a particle along the direction p, then for sure the result of that measurement is going to be spin up. So that's what it means for the particle to be polarized along the direction p.
All right. Now suppose we let our hidden variable be, for example, a unit vector λ with uniform probability distribution over the hemisphere λ · p > 0. That is to say, λ is going to be some additional directional or orientational degree of freedom that travels along with the particle. We don't know exactly what λ is going to be; all we know about it is that it has a uniform probability distribution over the hemisphere which points in the same direction as p. And this constraint, that the dot product of λ and p is greater than zero, just means that λ kind of points towards p rather than away from p.

Now, if you think back to what we saw earlier in this video, where we sent our particle through a two-stage Stern-Gerlach experiment, and we supposed that all the magnet does is filter out the particles that point a little up versus a little down, without actively flipping the arrow up or down, you'll see that that thought experiment actually gives us a beam of exactly this kind of particle. We start off with the assumption that the incoming particles, those evaporated silver atoms, have totally randomly oriented λ vectors, but then we send them through the first Stern-Gerlach magnet to get a beam that's purely polarized along the axis of that magnet. At that point, what we know about the λ vector is that it's still totally random, but only on the half of the sphere that points along the direction p, because the particles for which λ pointed away from p were sent into the spin-down beam, and those didn't go forward.
And so the question comes up: what happens if we measure the spin of this kind of particle along some axis a? Well, we already know what the expectation value is going to be. The expectation value of the spin of this kind of particle, from quantum mechanics and from experiment, is the cosine of the tilt angle of the second detector relative to the first. In this language, we would say it's the cosine of the angle θ between the polarization vector p and the measurement vector a.
So then suppose that, as we're building our hidden variable model, we speculate that the result of measuring along some axis a is going to be the sign of the hidden variable vector λ dotted with an effective measurement axis a′. See, we're going to have to make a sketchy move here of the kind we talked about earlier. This a′ is a unit vector which depends on a and p in a way to be specified; we'll talk about exactly what it has to be in a moment, but this is exactly the same kind of sketchy move we looked at earlier, when we were thinking about how to modify our hidden variable model into something that matches the data. In fact, the example we looked at earlier in the video is mathematically equivalent to what we're talking about now.

Oh, and the sign function here simply takes on the values +1 or -1 according to the sign of its argument. So the sign of the dot product of the vector λ and the effective measurement axis a′ is positive if λ kind of points along a′, and negative if λ kind of points away from a′. All this is to say: the measurement result is going to be spin up if λ is in the hemisphere whose pole is a′, and otherwise, if λ is outside that hemisphere, it'll be spin down.
And then you might ask: what if λ is right on the equator relative to the north pole of a′? Well, the probability of λ lying perfectly on the equator is zero, so we don't have to worry about it. As Bell says in his paper: actually this leaves the result undetermined when λ · a′ = 0, but as the probability of this is zero, we will not make special prescriptions for it. So we don't have to worry about that.

Now, if you average over all possible hidden variable vectors λ, in accordance with the setup we've described here, the expectation value of the spin measurement is going to be 1 - 2θ′/π. Call that equation 5, where θ′ is the angle between the effective measurement axis a′ and the polarization vector p; that's the same θ′ from the sketchy move we talked about earlier. So let's go ahead and see where equation 5 comes from. Why does this model give us an expectation value of 1 - 2θ′/π?
Well, the reason is that the expectation value of the spin measurement along the measurement axis a, in accordance with the rule we've stipulated here, is going to be the probability that the lambda vector is in the hemisphere defined with a prime at the pole, times the +1 for the spin-up result, plus the probability of lambda not being in a prime's hemisphere, times the -1 value which goes along with the spin-down measurement.
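To make this concrete, here's a minimal Monte Carlo sketch of the single-particle model just described (my own illustration, not from Bell's paper): lambda is sampled uniformly over the hemisphere around the polarization p, the outcome is the sign of lambda dot a prime, and the average lands on the linear expectation value 1 - 2 theta prime / pi of equation 5. The function name and the choice of p along +z are assumptions of the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def model_expectation(theta_prime, n=400_000):
    """Average outcome sign(lambda . a') with lambda uniform over the
    hemisphere around the polarization p (taken here along +z)."""
    v = rng.normal(size=(n, 3))
    lam = v / np.linalg.norm(v, axis=1, keepdims=True)  # uniform on the sphere
    lam[lam[:, 2] < 0] *= -1.0                          # fold onto p's hemisphere
    # Effective measurement axis a' at angle theta_prime from p, in the x-z plane.
    a_prime = np.array([np.sin(theta_prime), 0.0, np.cos(theta_prime)])
    return np.sign(lam @ a_prime).mean()

for tp in (0.0, np.pi / 4, np.pi / 2, np.pi):
    print(f"theta'={tp:.2f}  model={model_expectation(tp):+.3f}  "
          f"linear={1 - 2 * tp / np.pi:+.3f}")
```

Each printed pair should agree to within Monte Carlo noise.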
So we're thinking like a gambler here, and we're calculating that expectation value. And when we think about this, what we realize is that the expectation value of the spin measurement is going to be 1, its maximum value, when the theta prime angle is zero. That is, when our polarization vector is exactly aligned with the effective measurement axis a prime, then we're always going to get spin up, like for sure, 100% guaranteed, because when you think about the hemisphere of possible lambda vectors, well, those are going to be in the same hemisphere as the polarization vector. So if the polarization vector and the a prime vector point in exactly the same direction, then lambda is guaranteed to be in a prime's hemisphere, so you're always going to get a +1 in that case. And conversely, if the polarization vector p is completely antiparallel to the effective measurement axis a prime, that is, if theta prime is pi, or 180°, then we're always going to get a -1, a spin-down measurement. If the polarization vector is pointing completely away from a prime, then the space of possible lambda vectors is precisely the opposite of a prime's hemisphere, and so you're always going to get a spin-down measurement in that case.

And then if you think about rotating the polarization vector p relative to a prime, and think about the overlap in the hemispheres of p and a prime, you see that the overlap varies linearly with the angle theta prime. This goes back to what we were talking about earlier: when you imagine the board game with the spinner and you spin the needle, the probability of it landing somewhere simply has to do with the area of the wedge it's going to land on. Well, as you rotate theta prime, you see that our expectation value is going to vary linearly with the angle theta prime for precisely the same reason. And you can think about that as a two-dimensional circle with a board game spinner, or you can think about it in the full three dimensions, as if it's an orange, with the volume of the orange slice going along with the wedge angle. But in any case, this model is going to give us an expectation value of the spin measurement which is linearly dependent on theta prime. And so if you consider the two boundary conditions we've looked at, theta prime = 0 and theta prime = pi, then apply the fact that this is a linear function, and just think in terms of y = mx + b, you see that our equation for the expectation value of the spin measurement is necessarily 1 - 2 theta prime / pi. And as we know, this linear function is not what quantum mechanics predicts, and it's not a match for the experimental data, because in both cases that's going to be the cosine of the angle, not a linear function of the angle.

But here's where the sketchy move comes in. Right? Here's why we have a prime instead of just a. Suppose then that a prime is obtained from a by rotation towards the polarization vector p, until 1 - 2 theta prime / pi equals cosine of theta. Call that equation 6, where theta is the angle between the measurement axis a and the polarization vector p. So that's the sketchy move that we use in order to warp the linear function into a cosine function. Well then, if we do that, if we apply equation 6, then we have the desired result that the expectation value of the spin measurement is cosine of theta, which is in alignment with quantum physics and with the experimental data.

And so technically we haven't done anything illegal here. We haven't broken any rules, and this model therefore cannot be completely dismissed, though it is contrived and implausible, and we don't want to have to believe it. Because if we have a detector which is oriented along the vector a, and we have to stipulate that, no, actually, what's happening there is that the effective measurement axis is bent a little bit in towards the polarization vector, it's like, well, you can say that, but why would that be the case? This is not a very convincing model, but we will not dismiss it on the basis that it's not convincing. Instead, we're going to go ahead and say, look, it's possible. We're not going to rule it out just yet. And by lowering the epistemic standards for the hidden variable model, that's going to hold us to a higher standard later on, when we rule out all possible local hidden variable models. Because then we'll be able to say, look, we went along with the sketchy move. We allowed it. But even allowing that, our proof later on is going to be so strong that, despite our generosity here, despite being maximally charitable to the local hidden variable perspective, we're going to show that it just doesn't work.

All right. So in this simple case, there is no difficulty in the view that the result of every measurement is determined by the value of an extra variable lambda, and that the statistical features of quantum mechanics arise because the value of this variable is unknown in individual instances. That is, in this particular case we can come up with a story involving local hidden variables, and it kind of appears to work, even though it is a little bit sketchy.
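And the sketchy move itself is just algebra: solving equation 6 for the effective angle gives theta prime = pi (1 - cos theta) / 2. Here's a tiny sketch of my own (the function name is mine) showing that the linear expectation of equation 5, evaluated at that warped angle, lands exactly on cosine of theta:

```python
import numpy as np

def warped_angle(theta):
    # Equation 6: choose theta' so that 1 - 2*theta'/pi = cos(theta).
    return np.pi * (1.0 - np.cos(theta)) / 2.0

for theta in np.linspace(0.0, np.pi, 7):
    tp = warped_angle(theta)
    print(f"theta={theta:.2f}  theta'={tp:.2f}  "
          f"linear(theta')={1 - 2 * tp / np.pi:+.3f}  cos(theta)={np.cos(theta):+.3f}")
```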
Okay, so part 3 of the paper then goes on to show that hidden variables also seem to work for special cases in which the two detectors have special orientations for their measurement axes. Secondly, there is no difficulty in reproducing, in the form of equation 2 (that is, the correlation function based on local hidden variables), the only features of the quantum mechanical and experimental correlation function, equation 3, commonly used in verbal discussions of this problem. That is, when our two measurement directions are the same, in which case we have P(a, a), since a and b are the same vector when they're aligned the same way, and that equals the negative of the correlation we would find when b is equal to negative a, and that's equal to -1.

So when the unit vectors a and b are aligned the same way, we get a perfect anti-correlation of -1. And when a and b are oppositely aligned, then we get a perfect correlation of 1. And the other special case is when the dot product of a and b equals zero, that is, when a and b are perfectly perpendicular to each other, in which case we have no correlation.

So: aligned the same way, P is -1. Aligned opposite ways, P is 1. Perpendicular, P is 0. And these three special cases can be explained by a local hidden variable model. For example, let lambda now be a unit vector with uniform probability distribution over all directions, and take the rule that the measurement outcome A, as a function of the unit vector a and this hidden variable vector lambda, is going to be the sign of a dot lambda. And conversely, the measurement outcome B, as a function of the unit vector b and the hidden variable vector lambda, is going to be the negative sign of b dot lambda. By the way, in Bell's paper there's a typo here: in the paper it's written as B being a function of a and b, but that should be B as a function of b and lambda.

All right, so what are we doing here? Well, what we're saying is that we have the two particles in the singlet state, and we're going to stick a unit vector onto this pair of particles. So you can imagine particles 1 and 2 both carrying along this orientational piece of information, this unit vector lambda, which is chosen totally randomly out of all possible directions. And then when particle 1 gets to detector A, if lambda is pointing roughly along the direction of a, that is, if the dot product of a and lambda is positive, then you measure a spin up of particle 1 in detector A. And likewise, as particle 2 is measured in detector B, if the lambda vector is pointing in the same kind of direction as b, then you measure a spin down at B. So this model is kind of what we might instinctively expect to be happening with a pair of particles that have entangled spins, because you might expect that there is some kind of orientational quantity that each particle intrinsically has, but that quantum mechanics doesn't account for, and that this hidden variable, which carries with it a kind of orientation, is what predetermines how particles 1 and 2 are going to be measured at A and B respectively.

And so the claim is that this rule, given by equation 9, works in the special cases that the vectors a and b are perfectly parallel, perfectly antiparallel, or perfectly perpendicular. And you can show that that's the case. In the first case, imagine a and b being perfectly parallel. Well then, in equation 9 you see that the rules for the measurement outcomes at A and B are going to be equal and opposite, because for A we have the sign of a dot lambda, but if a and b are the same vector, then for B the rule is that it's the negative sign of b dot lambda, which equals the negative sign of a dot lambda. So you have the negative of the outcome at A. Therefore we find perfect anti-correlation in the case that the unit vector a equals the unit vector b.

Likewise, if you reverse that logic and look at rule 9 in the case that a and b are antiparallel, so b equals negative a, then the measurement outcome at detector A is the sign of a dot lambda, and the measurement outcome at detector B is the negative sign of b dot lambda. But b dot lambda in this case equals negative a dot lambda, and you can carry that negative sign outside of the sign function, so the two negatives cancel out, and we find for the measurement outcome at B the sign of a dot lambda, which is precisely the same as the measurement outcome at A. So in the case that the measurement directions a and b are perfectly antiparallel, we find a perfect correlation of 1 for the measurement outcomes with this local hidden variable model. And so in that case this model works just fine. And then finally, for the case that a and b are perpendicular, whatever the measurement outcome is at A, you're going to have a 50/50 chance of it being the same or the opposite at B. And so in that case too, this model works just fine.
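Here's a quick numerical check of those three special cases, a sketch of my own assuming equation 9 as stated: A = sign(a dot lambda), B = negative sign of b dot lambda, with lambda uniform over the sphere (the function name is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)

def correlation(a, b, n=400_000):
    """Monte Carlo estimate of E[A*B] for the equation-9 model:
    A = sign(a . lambda), B = -sign(b . lambda), lambda uniform on the sphere."""
    v = rng.normal(size=(n, 3))
    lam = v / np.linalg.norm(v, axis=1, keepdims=True)
    A = np.sign(lam @ np.asarray(a, dtype=float))
    B = -np.sign(lam @ np.asarray(b, dtype=float))
    return (A * B).mean()

z = [0.0, 0.0, 1.0]
x = [1.0, 0.0, 0.0]
print(correlation(z, z))                  # parallel: perfect anti-correlation, -1
print(correlation(z, [0.0, 0.0, -1.0]))   # antiparallel: perfect correlation, +1
print(correlation(z, x))                  # perpendicular: no correlation, ~0
```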
But again, this model has a flaw, which is that, just like what we saw before in part 3A, the dependence of the measurement correlation on the angle theta between the vectors a and b is linear in theta. It's not the negative cosine of theta that we expect from quantum physics and that is shown in experiments.

And to see that, let's draw a picture where we imagine all possibilities for lambda, selected uniformly across all possible directions, and then we draw the measurement direction a, and you consider the hemisphere of all possible vectors that point in the same general direction as a, that is, all vectors whose dot product with a is positive. Well then, the measurement result at detector A is going to be spin up if lambda is in the same hemisphere as a, or spin down if lambda is in the opposite hemisphere. So we have a 50/50 chance of measuring spin up or spin down, which is in agreement with experiment. But then things get a little tricky when you also draw the measurement direction b for detector B, and apply the same reasoning about what the measurement result is going to be in detector B. In this case, the result is going to be spin down if lambda is in the same hemisphere as b. Spin down, because we're in the singlet state, where the spins are anti-correlated, and that's encoded in the minus sign in the second part of equation 9. And conversely, detector B will measure spin up if lambda is not in the same hemisphere as the measurement direction b.

And then if we want to go ahead and imagine this as an animation, where we're sweeping the theta angle and considering simultaneously all possibilities for the hidden variable lambda, uniformly distributed over the sphere, which you may as well imagine as a circle or a sphere, because in either case the area or the volume, respectively, changes the same way as a function of the theta angle. Well then, just think about the probability of having the same outcome at both detectors versus the probability of having opposite outcomes. And what you realize is that you're going to have the same outcome at both detectors when lambda is in the hemisphere of one of the measurement directions, but not in the hemisphere of the other. So in this animation, if you look at the two sectors with the blue arc, for both of those sectors you're going to have the same measurement outcome at both A and B. And so the product of the outcomes at A and B is going to equal 1 if lambda lies in one of the two blue sectors shown here. And on the other hand, if lambda is in the hemispheres of both measurement directions, or of neither, then you're going to have opposite outcomes at the two detectors, and so the product of the outcomes at A and B is going to be -1.

And so to find the correlation, all we have to do is compare the area of the blue sectors to the area of the red sectors. The formula is just +1 times the fraction of the circle taken up by the blue sectors, minus 1 times the fraction of the circle taken up by the red sectors.
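That sector-counting formula can be checked numerically too: sweeping theta and averaging the product of equation-9 outcomes traces out the straight line -1 + 2 theta / pi rather than the quantum mechanical negative cosine. A rough sketch, with my own names and with a taken along +z:

```python
import numpy as np

rng = np.random.default_rng(2)

def model_correlation(theta, n=400_000):
    # Equation-9 outcomes, with a along +z and b at angle theta from a.
    v = rng.normal(size=(n, 3))
    lam = v / np.linalg.norm(v, axis=1, keepdims=True)  # uniform on the sphere
    a = np.array([0.0, 0.0, 1.0])
    b = np.array([np.sin(theta), 0.0, np.cos(theta)])
    return (np.sign(lam @ a) * -np.sign(lam @ b)).mean()

for theta in np.linspace(0.0, np.pi, 5):
    print(f"theta={theta:.2f}  model={model_correlation(theta):+.3f}  "
          f"line={-1 + 2 * theta / np.pi:+.3f}  qm={-np.cos(theta):+.3f}")
```

The model column hugs the straight line and only touches the negative cosine at the three special angles.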
sectors. And then as we sweep theta around, we
And then as we sweep theta around, we can see the linear dependence of the
can see the linear dependence of the correlation on the theta angle. And this
correlation on the theta angle. And this linear dependence of the correlation on
linear dependence of the correlation on theta, which now we've seen a few times
theta, which now we've seen a few times in a few different contexts, is really
in a few different contexts, is really at the heart of Bell's argument, as
at the heart of Bell's argument, as we're going to see in part four.
we're going to see in part four. And so in part 3B of this paper, Bell
And so in part 3B of this paper, Bell shows us that the local hidden variable model does work for the three special cases where a and b are either parallel, antiparallel, or perpendicular. And when you look at the plot of the correlation that we get from our local hidden variable model, that is this blue line, and you compare it to the quantum mechanical correlation that we would expect, namely negative cosine of theta, you see that even though these two curves are different, they do in fact intersect at precisely these three special cases. And so part 3B of Bell's paper is all about saying: yeah, the local hidden variable model does seem to work for those three special cases. But nonetheless, the local hidden variable model breaks down for anything other than those three special cases, because a line is not a cosine.

And there are actually a couple of ways in which a line is not a cosine. The most obvious one is that there's just a mismatch between these two curves for most values. So pick a theta value at random, and negative cosine of theta is just not the same value as what our linear correlation gives us. So it doesn't match. But the other noticeable thing that differs between this linear correlation from our local hidden variable model and the quantum mechanical correlation is that the linear correlation has a nonzero slope at a theta angle of 0, whereas the quantum mechanical correlation has a flat slope of zero at theta = 0. And this is kind of a subtle difference between these two correlation functions, but nonetheless it is a difference, and it's a difference that's totally generic to all local hidden variable models. So one of the things that we're going to prove in part 4A of this paper is that any local hidden variable model is going to have a nonzero slope at a theta angle of zero.
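To make the difference between the two curves concrete, here's a quick numeric sketch (my own illustration, not code from the video): the linear correlation −1 + 2θ/π described above agrees with the quantum correlation −cos θ at the three special cases, but the two slopes at θ = 0 differ.

```python
import numpy as np

def p_lhv(theta):
    """Linear correlation from the local hidden variable model: -1 + 2*theta/pi."""
    return -1 + 2 * theta / np.pi

def p_qm(theta):
    """Quantum mechanical singlet correlation: -cos(theta)."""
    return -np.cos(theta)

# The two curves agree at the three special cases (parallel, perpendicular, antiparallel)...
for t in (0.0, np.pi / 2, np.pi):
    assert np.isclose(p_lhv(t), p_qm(t))

# ...but their slopes at the minimum theta = 0 differ:
h = 1e-6
slope_lhv = (p_lhv(h) - p_lhv(0)) / h   # 2/pi, nonzero
slope_qm = (p_qm(h) - p_qm(0)) / h      # sin(0) = 0, stationary
print(slope_lhv, slope_qm)
```

The nonzero slope at the minimum is exactly the generic feature of local hidden variable models that part 4A makes rigorous.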
So this animation gives us a great intuition for how the local hidden variable model gives us a correlation which depends linearly on the angle theta between the vectors a and b. And therefore Bell goes on to say: this gives a correlation, as a function of a and b, of −1 + 2θ/π. Call that equation 10, where theta is the angle between the vectors a and b, and 10 has the properties of equation 8; that is, it works for the three special cases. And of course, the precise form of equation 10, this 2/π, that's just y = mx + b. That's just what it has to be in order to be a line that goes through the boundary conditions given by equation 8. But noticeably, the blue curve and the purple curve are not the same in general. Not only do their values not match in general, but also, at theta equals zero, the blue line has a nonzero slope, whereas the purple quantum curve has a slope of zero.

Now here Bell abruptly brings up a very important point (although it is kind of jarring the way in which he brings it up so abruptly), but in any case, following the paper: for comparison, consider the result of a modified theory in which the pure singlet state is replaced in the course of time by an isotropic mixture of product states. This gives the correlation function −(a·b)/3. Call that equation 11. Now, what does that mean? I mean, that sentence just comes out of nowhere, right? And there is a lot that Bell is communicating in this one sentence. So I want to take a moment to unpack exactly what he means, because this is actually a really profound point.

So when we have our purple curve of negative cosine theta for the correlation between the measurement outcome at detector A and detector B, this is based on the two particles being in the singlet spin state, where before the measurement neither particle has a preferred spin direction, but the spin measurement outcomes for the two particles are guaranteed to be anti-correlated along the same measurement axis, whatever that measurement axis may be. On the other hand, if instead of the singlet state we imagine that the two particles already have some preferred spin direction before they're measured, but still their spins are equal and opposite relative to that particular spin direction, then we would expect anti-correlated spin measurements if the particles are measured along that particular spin direction. But if the particles are measured perpendicularly to that spin direction, then in that case we would expect no correlation between the spin outcomes of those two particles.

And so what Bell means by an isotropic mixture of product states is this: imagine that when we're producing these particles, instead of being in the singlet state, with pure rotational symmetry and no preferred spin axis a priori, the particle pairs do have an intrinsic preferred spin direction, relative to which they're equal and opposite. And then by isotropic, all that means is that that direction, call it n hat, is selected uniformly from the sphere. So the particles' preferred direction is going to be totally random. And so now imagine measuring over many such pairs of particles, and for the sake of argument suppose we measure them along the same measurement axis a. Well, sometimes that spin axis n is going to be aligned with a, but usually it's not going to be very aligned, in which case we won't really see much of a correlation. And when you work out the math of what correlation strength we would expect on average, you find a correlation strength which is the same as for the singlet state, but divided by a factor of three, which represents the fact that when you average over all three dimensions of space, more often than not our measurement directions are not going to be aligned with the spin direction n.
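The factor of three can be checked directly with a small Monte Carlo sketch (my own illustration, not code from the video; I'm assuming the standard quantum expectation values for a product state polarized along n̂, namely ⟨A⟩ = a·n̂ and ⟨B⟩ = −b·n̂, with independent outcomes at the two detectors):

```python
import numpy as np

rng = np.random.default_rng(42)

def random_unit_vectors(m):
    """Directions distributed uniformly over the unit sphere."""
    v = rng.normal(size=(m, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

a = np.array([0.0, 0.0, 1.0])
theta = 1.2  # angle between the measurement axes a and b
b = np.array([np.sin(theta), 0.0, np.cos(theta)])

# each pair gets its own random preferred spin axis n (the "isotropic" part)
n = random_unit_vectors(1_000_000)

# for a product state polarized along n, <A> = a.n and <B> = -b.n, and the
# two outcomes are independent, so the per-pair expectation of A*B is their product
corr = np.mean((n @ a) * (-(n @ b)))

print(corr, -np.dot(a, b) / 3)  # both close to -cos(theta)/3
```

The averaging over n̂ is what produces the 1/3: the mean of (a·n̂)(b·n̂) over the sphere is (a·b)/3, one third for each spatial dimension.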
And so we actually see a very strong theoretical and experimental difference between the singlet state and a situation where the particles have equal and opposite spin along some random axis. The correlation we get from the singlet state is weirdly strong, in a surreal kind of way. And this reflects the fact that in the singlet state neither particle has a preferred direction before it's measured. And so if you think in terms of one of the particles being measured ever so slightly before the other, then you're guaranteed to collapse the wave function along that measurement direction. And so in the singlet state, the pair's spin direction always ends up aligned with your measurement axis, whereas for an isotropic mixture of product states, in general, you're not going to have this kind of alignment.

All right. So Bell then goes on to say: it is probably less easy, experimentally, to distinguish equation 10 from equation 3 than equation 11 from equation 3. So equation 10 is the linear correlation that we get from our local hidden variable model, and equation 11 is the −(a·b)/3, that is, negative cosine theta over 3, correlation that we get from a quantum mechanical model in which the two particles are not in the singlet state, but rather are in a product state with some preferred direction. And what Bell is saying here is that there's really a big contrast in the experimental data between a singlet state and an isotropic mixture of product states, whereas the linear correlation from a local hidden variable model is going to be a better approximation to the actual quantum mechanical singlet correlation. So that's just a point about experimental practicality.

Now, before moving on from part 3B, Bell makes one final comment, which is that unlike equation 3, the quantum mechanical correlation negative cosine of theta, the function of equation 10, this linear correlation we get from the local hidden variable model, is not stationary. That is, the slope is nonzero at the minimum value −1, where theta equals 0. So we talked about that earlier when thinking about the differences between the blue line and the magenta curve, that is, between the local hidden variable model and the quantum mechanical correlation. One of the differences is that the values in general are not the same. But another difference is that the quantum mechanical correlation has a slope of zero at its minimum value, whereas the local hidden variable line does not. It'll be seen in part 4A that this is characteristic of functions of type (2), that is, where the correlation is given by a local hidden variable model. So in part 4A we're going to prove that any local hidden variable model is going to have a nonzero slope in its correlation function at the minimum value, which is incompatible with quantum mechanics and with the experimental data. And then in part 4B, we're going to prove that, in general, the two correlation curves for a local hidden variable model and for quantum mechanics cannot take on the same values everywhere.

So in part four, we're going to prove in two different ways that local hidden variable models are not compatible with quantum mechanics and not compatible with the experimental data.

Okay, so then Bell wraps up part three by talking about how a hidden variable model could work if we allow for non-locality. Thirdly and finally: there is no difficulty in reproducing the quantum mechanical correlation of equation 3 if the results of the spin measurements at A and B in equation 2, the correlation function of the local hidden variable model, are allowed to depend on the measurement directions b and a respectively, as well as on a and b. And Bell shows this by saying: if we do a non-local sketchy move, we can warp the blue line into the magenta curve. So the reasoning here is exactly the same as what we've seen before, when we thought about doing a sketchy move to warp the line into the curve. But the key difference now is that when you have two entangled particles that are separated in space, you can't do this sketchy move unless you know the angle between the measurement directions a and b, which are in different light cones. And so this is a non-local sketchy move, because somehow what's happening at detector A depends on the measurement axis at detector B, and vice versa. So as a concrete example of this, we can replace the vector a in equation 9 by an effective measurement axis a′, obtained from a by rotation towards the measurement vector b until 1 − 2θ′/π = cos θ, where θ′ is the angle between the effective measurement axis a′ and b. So if you make that sketchy move, then the blue line is going to warp into the magenta quantum curve, and in that case we would have a match between our hidden variable model and quantum mechanics and the experimental data. And so this is exactly the same reasoning as the sketchy moves that we looked at before. In fact, it's exactly the same mathematical maneuver. However, for given values of the hidden variables, the results of measurements with one magnet now depend on the setting of the distant magnet, which is just what we would wish to avoid: that is, non-locality.
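We can verify the warp with a short algebraic sketch (my own check, assuming the substitution is exactly as stated): solve 1 − 2θ′/π = cos θ for the effective angle θ′, and confirm that the linear correlation, evaluated at θ′, lands exactly on the quantum curve.

```python
import numpy as np

theta = np.linspace(0.0, np.pi, 181)  # true angle between the measurement axes a and b

# Bell's substitution: rotate a toward b until 1 - 2*theta'/pi = cos(theta),
# where theta' is the angle between the effective axis a' and b
theta_prime = (np.pi / 2) * (1 - np.cos(theta))

# the linear hidden-variable correlation, evaluated at the warped angle...
p_warped = -1 + 2 * theta_prime / np.pi

# ...exactly reproduces the quantum singlet correlation -cos(theta)
print(np.allclose(p_warped, -np.cos(theta)))  # True
```

The catch, of course, is that computing θ′ requires knowing θ, the angle between both measurement settings at once, and that is precisely the non-locality.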
And there's really no way around that. If you look at the example shown here, where we replaced a with a′, and you think maybe there's some way to do the sketchy move differently, in a way that doesn't violate locality, well, try to do that and you'll find it doesn't work. So for example, what if instead of rotating a into a′, we leave a alone and rotate b into b′ in a way that gives us the same result? Well, that would require b′ to be a vector that's slightly rotated towards a. And again, it's the same thing. And in fact, by symmetry, that reasoning is the same as before, where now we're just saying that what's happening at detector B is somehow bent towards the measurement direction a. And so really, it's the same kind of nonsense.

And then also, philosophically, we might expect there to be some symmetry here. So if we wanted an idea like this to work, maybe we should actually bend a to a′ and b to b′, where a′ is bent towards b and b′ is bent towards a, in an equal and opposite kind of way. But in that case, both detectors know something about how the other detector is configured. And so fundamentally it's exactly the same problem, no matter how you look at it.

So, reflecting on part three, we've seen some specific examples of how hidden variable models don't really work. They just don't match the experimental data, whereas quantum mechanics does. And so what follows in part four is going to be very abstract, very mathematical, very algebraic, and we're going to take our time with it, because it's a whole lot of equations and symbols and all that. But if you followed along with part three, then you already have the fundamental insight required to make sense of part four. All we're doing in part four is generalizing from this specific example to show, first, that every local hidden variable model is going to have a correlation function with nonzero slope at its minimum value, which is in contradiction with quantum mechanics and the experimental data. And then second, in part 4B, we're going to show that in general, the correlation function given by a local hidden variable model cannot take on the same value as the correlation given by quantum mechanics and experiment at every theta point, that is, for every possible configuration of the measurement axes a and b. And so it's the same kind of reasoning that we've seen in part three, but just in a much more abstract and generic kind of way. And the abstraction is worth it. Even though it is somewhat impenetrable, and it takes a lot of time to digest, it's going to be a very powerful result. And so, as usual, ask not for easier equations, but for stronger coffee. You've got to prepare yourself for this, because it's going to be a bit of work, but it is well worth the effort.

All right, my friends. We're now ready to approach the core argument of Bell's paper: part four, contradiction.

Okay, so in the first part of part four, we're going to show that the correlation function that we get from a local hidden variable model cannot be stationary at its minimum value when theta equals 0, unlike the quantum correlation, which is stationary, that is, does have zero slope at its minimum value for theta equals 0. And so this is going to be a generic difference between the kinds of correlations that local hidden variable models can give us and the correlation that we expect from quantum mechanics, which is also the correlation measured in experiments.

All right: the main result will now be proved. Because rho is a normalized probability distribution, the integral of rho d-lambda equals 1. And we saw that before. That just means that if you consider every possible configuration of hidden variables and add them all up, each one weighted by its probability, then the result is going to be one. In other words, the hidden variables have to be in some configuration or other. And next, because of the properties of equation 1, where we saw that the measurement outcomes at detectors A and B can only take on the values of +1 or −1, depending on whether that detector measured spin up or spin down respectively, then if we consider the definition of our local hidden variable correlation function in equation 2, where we found that P is going to be the integral, over all possible configurations of the hidden variables, of the measurement result at A times the measurement result at B, and this correlation is going to be a function of the measurement axes a and b, then, as you can see, this correlation P cannot be less than −1. That is, the lowest value our correlation can take is a perfectly anti-correlated value of negative 1.
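Here's a tiny discrete toy (my own illustration, replacing the integral over lambda with a weighted sum over a handful of lambda values) showing why P can never dip below −1, and that the bound is saturated exactly when B = −A everywhere:

```python
import numpy as np

rng = np.random.default_rng(1)

# toy hidden-variable space: 10 discrete values of lambda with random weights
rho = rng.random(10)
rho /= rho.sum()                  # normalized: the "integral" of rho equals 1
A = rng.choice([-1, 1], size=10)  # outcome at detector A for each lambda
B = rng.choice([-1, 1], size=10)  # outcome at detector B for each lambda

P = np.sum(rho * A * B)           # equation (2) as a discrete sum
print(P)                          # always between -1 and +1, since |A*B| = 1

# perfect anti-correlation is reached only when B = -A for every lambda
print(np.sum(rho * A * -A))       # -1, up to floating point
```

Since each term contributes rho(lambda) times ±1 and the weights sum to one, the sum is squeezed into [−1, 1]; forcing it to sit at −1 is what pins down B = −A except on a set of zero probability.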
value of negative 1. And when can it take on that value?
And when can it take on that value? Well, as we've seen, the correlation
Well, as we've seen, the correlation function can only reach -1 at a equals
function can only reach -1 at a equals b. That is when the two measurements are
b. That is when the two measurements are aligned along the same axis. Then for
aligned along the same axis. Then for the singlet state, you're going to have
the singlet state, you're going to have perfectly anti-correlated results.
perfectly anti-correlated results. Measure spin up at detector A along the
Measure spin up at detector A along the axis A. And for sure you know you're
axis A. And for sure you know you're going to measure spin down at detector B
going to measure spin down at detector B for an axis B which is equal to A. So
for an axis B which is equal to A. So we've seen that before. That's nothing
we've seen that before. That's nothing new. And now Belle makes a technically
new. And now Belle makes a technically nuanced comment which is that this is
nuanced comment which is that this is only the case if A as a function of A
only the case if A as a function of A and lambda is equal to B as a function
of A and lambda except at a set of points lambda of zero probability. Now this is a technical caveat that is designed to keep this argument fully generic. We know from experiments that for the singlet state, it is going to be true that the measurement result at A for measurement axis A is indeed going to be equal to the negative of the measurement result at B for measurement along the same axis A. But because we're trying to rule out the possibility of all imaginable hidden variable models, you could in theory imagine a model where A as a function of A and lambda is not necessarily equal to the negative of B as a function of A and lambda, but where you have some superfluous configurations of hidden variables. And that's technically fine as long as those configurations of hidden variables have zero probability. So this is a really minor point, and honestly it probably goes without saying, because we know from the experimental data that for sure the result at detector A is going to be the negative of the result at detector B when you're measuring along the same axis. So you can think of that
the same axis. So you can think of that as an experimental boundary condition.
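For reference, here is that condition as I read equation 13 in the 1964 paper (my transcription, writing the detector settings as vectors and the hidden variables as lambda):

```latex
% Bell's equation (13): perfect anticorrelation for a common setting,
% required to hold except on a set of lambda of zero probability
A(\vec{b}, \lambda) = -B(\vec{b}, \lambda)
```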
And if any local hidden variable model disagrees with that, that is, if you have a local hidden variable model that goes against equation 13, well, that can only match the experiment if the lambda which violate equation 13 have zero probability of occurring. Anyway, I think the paper probably could have gone without that little comment about a set of points lambda of zero probability, but it's in there just for the reader who's going to be very pedantic about that. So, all right then. If we assume equation 13, which is really less of an assumption and more of an experimental boundary condition, then equation 2, the correlation for a local hidden variable model, can be written as P as a function of A and B is equal to the negative of the integral over all possible configurations of hidden variables of the result at detector A as a function of A and lambda times the hypothetical result at detector A as a function of B and lambda. Now let's linger on that for a second. What is this term, A as a function of B and lambda? Well, what that is is, imagine a generic case where we have our detectors A and B, and A is aligned with some axis A, and the alignment of detector B is some axis B. Well, we know that our correlation is going to depend on the product of the measurement results at detectors A and B. And all equation 14 says is that the result at detector B can be thought of as the negative of the result that detector A would measure if A were aligned along the B axis. And so you see, the only difference between equation 14 and equation 2 is that the measurement result at B aligned along the axis B, as a function of our hidden variables lambda, has been replaced with what would have been the result of the measurement at A if A were aligned along the same axis B, and we had the same hidden variables lambda. So this is just a way
of writing our correlation in terms of measurement results at detector A.
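In symbols, the rewritten correlation (equation 14 of the paper, in my notation) is:

```latex
% Equation (14): the correlation with B eliminated via A(b,lambda) = -B(b,lambda)
P(\vec{a}, \vec{b}) = -\int d\lambda \,\rho(\lambda)\, A(\vec{a}, \lambda)\, A(\vec{b}, \lambda)
```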
All right. And next, what we're going to do is we're going to let C be another unit vector which is an alternative option for B. So imagine C as the alignment of detector B. In fact, at first imagine that C is the same thing as B, and then give it just a little nudge so that C is just a little different than B. And then the question we can ask is, if you imagine two hypothetical scenarios, one where you had the measurement axes A and B, and another where you had the measurement axes A and C, where C is just a little nudge away from B, then how do we calculate the difference in the correlations P of A and B and P of A and C? In other words, what kind of difference in the correlation do we get when we apply a small little nudge on the axis of detector B? Well, all we have to do is replace P with the integral formula given by equation 14. And we can go ahead and smush these together into one integral. And we see that we have the negative integral over all possibilities for the hidden variables of A as a function of A and lambda times A as a function of B and lambda, minus A as a function of A and lambda times A as a function of C and lambda, which is how the correlation function would change if we slightly changed the measurement axis at detector B from the vector B to the very similar vector C. So now Bell goes on to algebraically massage this integral
expression into a different form shown here.
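Written out, the two forms of that integral are (my rendering of the step; the key fact used is that A(b, lambda) squared equals 1):

```latex
% Difference of correlations for settings (a,b) versus (a,c):
P(\vec{a},\vec{b}) - P(\vec{a},\vec{c})
  = -\int d\lambda\,\rho(\lambda)\,
     \bigl[ A(\vec{a},\lambda)A(\vec{b},\lambda) - A(\vec{a},\lambda)A(\vec{c},\lambda) \bigr]
% Factoring out A(a,lambda)A(b,lambda) and using A(b,lambda)^2 = 1:
  = \int d\lambda\,\rho(\lambda)\,
     A(\vec{a},\lambda)A(\vec{b},\lambda)
     \bigl[ A(\vec{b},\lambda)A(\vec{c},\lambda) - 1 \bigr]
```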
And to see what he's done here, let's go ahead and color code this like so. So first of all, you see that both parts of the integrand have in common this factor of A as a function of A and lambda. So we can go ahead and factor that out and pull that to the left. And the next thing you want to look at is, in the top expression there, we have that factor of A of B and lambda. And we also have a minus sign. So now, what we're going to do to bring that into the bottom expression is we're going to factor out that term A of B and lambda. So we're going to bring that to the left. And then what remains is just the number one. But then we're going to go ahead and pull in that minus sign from the outside of the integral to the inside. And so that term is just going to be a -1 inside of the brackets from which we factored out A of B and lambda. And then the final thing that we have to prove is that, in that top expression, the term on the right involving the A of C and lambda can be brought down below and turned into this expression A of B and lambda times A of C and lambda. And to show that this is in fact a legitimate move, first of all, in the top equation, notice how we have two minus signs. And so those are going to cancel each other out. And then the only question that remains is: is the product of these two purple expressions times A of C and lambda equal to A of C and lambda? Well, yeah, it is. The reason being, that purple expression is the square of A of B and lambda. But remember, this capital A, this is the measurement result at detector A. And the only values it can take on are either plus or minus one. But in either case, the square of plus or minus one equals one. And so yeah, the purple expression then collapses onto the number one. And we see that this was in fact a legitimate move, the way we've factored things out here. So what we end up with is the same
integral we had before, but just massaged into a different form.
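As a quick sanity check on that factoring step, here's a small Python sketch (my own, not from the paper) that verifies the identity -(AaAb - AaAc) = AaAb(AbAc - 1) for every configuration of a toy hidden-variable model in which each measurement result is plus or minus one:

```python
import random

random.seed(0)

# Toy hidden-variable model: lambda ranges over 1000 discrete values, and
# the measurement result A(axis, lam) at detector A is always +1 or -1.
lambdas = range(1000)
A = {(axis, lam): random.choice((-1, 1))
     for axis in ("a", "b", "c") for lam in lambdas}

for lam in lambdas:
    Aa, Ab, Ac = A[("a", lam)], A[("b", lam)], A[("c", lam)]
    # Integrand of P(a,b) - P(a,c) before the rearrangement...
    before = -(Aa * Ab - Aa * Ac)
    # ...and after factoring out Aa*Ab and using the fact that Ab**2 == 1.
    after = Aa * Ab * (Ab * Ac - 1)
    assert before == after

print("identity holds for every sampled lambda")
```

The assertion never fires, regardless of the random seed: since the two forms differ only by a factor of Ab squared, and Ab is always plus or minus one, they agree configuration by configuration, which is exactly why the rearranged integral equals the original one.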
All right. So now Bell is going to claim that this integral expression is less than another integral. So using equation 1, which is where we specified that the measurement outcomes at detectors A and B can only either be +1 or -1, then we can show that our integral expression is going to be less than or equal to the integral over rho of lambda d lambda of 1 minus A as a function of B and lambda times A of C
and lambda. Now, when I got to this part of the
Now, when I got to this part of the paper, I was looking at it and I was
paper, I was looking at it and I was like, "Uh, hm. Okay, [clears throat]