YouTube Transcript: But what is quantum computing? (Grover's Algorithm)
A lot of pop science outlets give a certain summary of quantum
computing that I can almost guarantee leads to misconceptions.
The summary goes something like this.
In a classical computer, data is stored with bits, some sequence of ones and zeros,
but in a quantum computer, you are able to represent every possible sequence of
bits of some fixed length all at once in one big thing known as a superposition.
And sometimes the implication of these summaries is that quantum
computers can be faster by basically doing whatever a classical computer would do,
but to all of these sequences in parallel.
Now, this does gesture at something that's kind of true,
but let me see if I can prove to you why I think this leads to misconceptions,
and I'll prove it using a quiz.
To set it up, I want you to imagine that I have a mystery function,
and I tell you that there's a certain secret number among all the numbers
from 0 up to n-1, where if you plug that value into my function,
it returns true, but if you were to plug in any other value, it returns false.
And let's say you can't look at the innards of the function to learn anything about it,
the only thing you're allowed to do with it is just try it out on numbers.
The warmup question is, how many times, on average,
would you have to apply this mystery function in order to find the secret key?
Well, if the setup was in an ordinary classical computer,
there's really nothing better that you can do than guess and check.
You go through all the numbers, and maybe you're lucky and find it early,
maybe you're unlucky and it doesn't come up until later, but on average,
with a list of n possibilities, it takes one half of n attempts to find the key.
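If you'd like to see that guess-and-check search written out, here's a minimal Python sketch of the classical setup; the names make_mystery_function and classical_search are just placeholders for this illustration, not anything from the video.

```python
import random

def make_mystery_function(n):
    """Hypothetical stand-in: an opaque predicate that is True
    for exactly one secret value among 0..n-1."""
    secret = random.randrange(n)
    return lambda x: x == secret

def classical_search(f, n):
    """Guess-and-check: try values until the predicate returns True.
    On average this takes about n/2 calls, i.e. O(n)."""
    calls = 0
    for x in range(n):
        calls += 1
        if f(x):
            return x, calls
    raise ValueError("no value triggered the function")

key, calls = classical_search(make_mystery_function(1000), 1000)
print(key, calls)
```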
Now, in computer science, people care about how runtimes scale.
If you had a list ten times as big, how much longer would it take?
And computer scientists have a way of categorizing runtimes.
They would call this one O of n, where that big O communicates that maybe
there are some constants like the one half, or maybe some factors that grow slower than n,
but the factor of n is what explains how quickly it scales as n grows.
If n goes up by a factor of ten, the runtime also goes up by a factor of ten.
Now, here's your quiz.
For the equivalent of this setup, but in a quantum computer,
what's the best runtime for finding the secret key?
I've asked a variant of this quiz many times in a certain lecture that I've given
over the years, and the options I typically offer are O of square root of n,
O of log of n, O of log of log of n, and O of one,
where here O of one would mean that the runtime is just some constant that
doesn't actually grow as n grows.
Now, to be fair, I have not defined quantum computing.
In fact, doing so from the ground up is going to be the goal of this video.
So without showing you what this mystery function would look like in that setting,
it's kind of an incoherent question.
But it's not really meant as a quiz that I'm grading you on or anything,
it's just meant to be a gut check of intuition before we dive in.
In principle, it's the same task.
Finding a needle in a haystack, where you want to discover
which value out of many options uniquely triggers some function.
I asked this as a YouTube post last month, which 100,000 of you kindly answered.
The most recent time I gave it live was to a group of Stanford students.
I also posed it to those attending the International Math Olympiad.
And in all of these and many other instances, the answer distribution looks very similar.
The most common answer is always O of one.
And this is wrong, and I'm pretty sure that it stems from that misleading summary.
That summary implies that you would put all of the n values that you need
to search into this mysterious superposition,
and then you would process them all in parallel,
and then somehow the answer would be revealed.
The second most common answer is typically O of log of n, and this is also wrong.
You would call this an exponential speedup.
For example, if you increase the size of the list by factors of 10,
an O of log n runtime would only tick up by the same additive increment each time.
Now this wrong answer I suspect stems from a misconception
about how much better quantum computers are in general.
There are certain very special problems where you can achieve an exponential speedup.
The most famous case is probably Shor's algorithm for factoring large numbers.
But most problems are not like that.
In this case, the correct answer is O of square root of n.
And this is a lot more representative of the typical
speedup that you could get with a quantum computer.
In 1994, it was proven that a quantum computer could not
possibly do any better than O of square root of n on this task.
And then two years later, Lov Grover found a specific
procedure that actually achieves that runtime.
So searching through a bag of a million options takes on the order of a thousand steps.
A bag of a trillion options takes on the order of a million.
This big O actually hides something that's pretty fun,
which is the constant of pi fourths in the precise runtime,
and that pi has its own whole fun story to tell, which I'll get to later.
You might think that this puzzle with a mystery function
triggered by one specific value is super contrived.
But I want you to keep in mind this is meant to be a generic stand-in
for any problem where you know how to quickly verify a solution,
even if you don't know how to find that solution in the first place.
This describes an enormous class of problems in computer science known as NP problems.
So while a square root speedup is frankly not as earth-shattering as an exponential
speedup would be, and while big O runtimes are often a lot less important than other
practical considerations, it is thought-provoking that something like Grover's algorithm
is even possible at all, providing this catch-all method for speeding up any NP problem.
My goal with this lesson is to build up to a step-by-step
walkthrough of how that algorithm works.
It's actually very geometric and very beautiful.
But we need to build up a lot of background in order to get there.
The first two-thirds or so of the video will be spent building up the fundamentals
of quantum computing, not with a set of analogies,
which as we've seen can lead to misconceptions, but as a piece of math,
which I think offers you a pair of glasses through which you can see a much more
honest depiction of the whole field.
This is one of those topics that has a few premises that are just going to feel a
little bit strange at first, and I should warn you they take a little getting used to.
My current plan is to follow this lesson with another one
about some of the underlying physics, which hopefully can
help motivate a few of the odd-looking rules that you'll see here.
But today the goal is to provide the minimal viable path to seeing a genuine,
bona fide quantum algorithm.
Let me pull up again that contrast between classical computing and quantum computing,
and let's see if we can build up something of a more representative mental model.
It is true of course that data in a classical computer looks like a series
of ones and zeros, and at a higher layer of abstraction that might represent
an actual data type, like an integer or some text, and at a lower layer of abstraction,
those ones and zeros represent some actual thing in the physical world,
like voltages across a capacitor or something like that.
These same layers of abstraction provide a pretty
helpful framing when we discuss quantum computing.
Over there, there's also some underlying physical measurement,
and again you represent the outcome of that measurement with some sequence of ones
and zeros, and again this might implement some actual data type that you care about,
like a number.
This symbol that I'm showing by the way is called a ket.
I'll explain it properly in a couple minutes, but for right now just
think of it as conveying that something is coming from a quantum computer.
Let's start things off with a description of quantum computing through the lens of that
middle layer of abstraction, meaning we're going to postpone all of the underlying
physics for now, which is a little bit like teaching computer science without discussing
hardware.
In a classical computer, there's no need to distinguish between the state of
memory and what you read out from the memory,
both of them just look like the same sequence of bits,
but it's a very different story in a quantum computer.
Our main job today is going to be to understand something called the state vector,
which is continuous, this is the thing the computer actually operates on,
but it has a very unusual relationship with the values that you actually read out,
those discrete sequences of bits.
Before I can define this state vector, you need to know one other key
difference from classical computers, which is that this value that you read out,
which again just looks like some sequence of ones and zeros, is random.
Or to be a little more accurate, I should say it's typically random.
The way you can think about this is that if you run a program on a quantum computer,
that program doesn't necessarily determine a particular output,
instead it determines a probability distribution across all possible outputs.
So for the example I'm showing on screen, this would be a very small quantum
computer where the thing that you read out has four bits,
meaning that there are 2 to the 4, or 16 possible outputs,
and the specific program you run determines some kind of distribution across
all those possible outputs.
Some programs might manage to concentrate more probability on just one of those outputs,
but other programs might give a more even spread across everything.
This example, by the way, where the thing you read out has four bits,
would be called a 4-qubit quantum computer, and more generally,
if you have a k-qubit quantum computer, that means there are 2 to the k distinct
possible outputs, and any program gives a distribution across all of those,
and the thing that you read out has k distinct bits.
That word qubit, by the way, is another thing I'm
going to define more precisely in just a minute.
I do want to emphasize that this distribution is implicit.
You never actually see it directly, you instead infer
what it must be based on the program that you run.
You never see all bit strings coexisting at once in some kind of way.
You just see one of them, drawn at random, according to this distribution.
At a lower layer of abstraction, what I'm describing as reading out from memory looks
like a physical measurement, and the randomness stems from the laws of quantum mechanics.
If you're curious about the physics, that lower layer of abstraction,
that's exactly what the next video is for.
Up in this layer, you just think about probability
distributions over all possible bit strings.
One more funny rule here, which does bubble up from the underlying quantum mechanics,
is that after you read out from memory, and you see some particular value,
the underlying state of the computer changes such that now all of the
probability is concentrated on whatever value you read out.
So if you kept reading out from memory over and over,
you would just keep seeing that same value.
You might imagine these programs as creating a very delicate and
sensitive probability distribution, where the moment you look at it,
sampling from that distribution, the whole thing collapses to one value.
Now you might be wondering, where does this distribution come from?
This is both the most important and the most confusing part.
You think of the state of the computer as being described by a big vector.
Right now when I say the word vector, you can just think big list of numbers,
although as you'll see later on, it can be helpful to think of this
as a direction in some super high dimensional space.
Each component of this vector corresponds to one of the possible
values you might read out, one of those distinct bit strings.
So in this example where what you read out has 4 bits,
the state vector would have 16 distinct components.
The state vector is not the same thing as the probability distribution
over all possible outputs, but it is very closely related.
The fundamental rule, which I admit is going to look very strange at first,
is that if you take the magnitude of each component in that state vector and you square
it, that gives you the probability of seeing the corresponding output,
the corresponding bit string.
Let me just say up front, a lot of people learning
quantum computing find this state vector a bit weird.
What is it actually, and why are we squaring things to get probability?
As it is, for the sake of simplicity, there's a certain important
detail that I'm neglecting until the end of the video here.
I just want to flag that for most people, this takes a little getting used to.
And to be super clear on what I mean with the fundamental rule here,
let's suppose that after this program processes this vector,
maybe the component of it associated with some specific bit string, like 0011,
happened to be 0.5.
Then when you square that value, 0.5 squared is 0.25,
so the observable implication of this is that when you read out from memory,
you have a 25% chance of seeing that bit string, 0011.
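To make that fundamental rule concrete, here's a small classical simulation sketch of what was just described: a toy 4-qubit state vector of 16 amplitudes (randomly chosen here just for illustration), squared magnitudes giving the readout distribution, one sampled outcome, and the collapse rule from a moment ago, where reading out again just repeats the same value.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy 4-qubit state vector: 16 amplitudes, normalized so the
# squared magnitudes sum to 1.
state = rng.normal(size=16)
state /= np.linalg.norm(state)

def measure(state):
    """Born rule: the probability of each bit string is the squared
    magnitude of its amplitude. After readout, the state collapses
    onto the observed outcome."""
    probs = np.abs(state) ** 2
    outcome = int(rng.choice(len(state), p=probs))
    collapsed = np.zeros_like(state)
    collapsed[outcome] = 1.0
    return format(outcome, "04b"), collapsed

bits, state = measure(state)
print(bits)        # e.g. '0011', seen with probability |amplitude|^2
bits_again, state = measure(state)
print(bits_again)  # the same string every time after the collapse
```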
One thing I'll highlight is that it is perfectly valid for the values in this state
vector to be negative, and at first you might think that has no real impact,
since flipping the sign doesn't change the square,
and therefore all the probabilities stay the same.
It is true that the probabilities stay the same,
but we absolutely consider this to be a distinct state, and as you'll see,
the idea of flipping signs plays a very central role in Grover's algorithm.
Here, this example with four qubits has kind of a lot on screen,
with not a lot of visualization to back it up,
so let's scale things down to the smallest possible case where the computer has
just two possible outputs, represented with a 0 and a 1.
In this simplest possible case, the state vector would only be two-dimensional,
so we can actually represent it geometrically as an arrow inside a 2D space.
In this case, the x-coordinate corresponds to the outcome 0,
in the sense that the square of that coordinate tells you the
probability that when you read out from the computer, you would read a 0.
Maybe it's helpful if I add a little bar to show that probability.
You'll notice that as the vector points more in the horizontal direction,
more of that probability mass is concentrated on the 0.
Similarly, the y-coordinate corresponds to a 1 in the same way.
A more vertical state vector means you're more likely
to see a 1 when you read out from the computer.
Now, notice, because the two probabilities should add up to 1, after all,
something is going to happen, x squared plus y squared should equal 1.
Geometrically, this means that the state vector has a length of 1,
so you could think of it as being confined to a unit circle.
More generally, the state vector for a quantum computer will always have a length of 1,
and you can think of it as living on some very high-dimensional unit sphere.
This two-dimensional example has a special name, which I've already mentioned.
It's called a qubit, short for quantum bit.
The analogy with a classical bit is that when you read out from the computer,
you see either a 0 or a 1, but other than that, it is a completely different animal.
Mathematically, a qubit is a unit vector in a two-dimensional space,
together with a coordinate system where these two perpendicular x and y
directions correspond to the two values that you might read out when you measure.
You should know there is that added bit of complexity that I'm postponing,
but this is 90% of the right idea.
And again, you have this funny rule where when you measure the qubit,
seeing either a 0 or a 1, the vector then collapses to fall onto that corresponding
direction.
So unless something is done to prepare that qubit back into a diagonal direction,
any follow-on observations that you make are always going to show the same outcome.
It's very possible that at this point you're thinking something like, okay, Grant,
this is a super bizarre set of premises you're asking me to accept, and if so,
you are not alone.
What I'm describing, as you can probably tell,
are basically the postulates of quantum mechanics.
There are many systems throughout physics, like the spin of an electron or the
polarization of a photon, that have this property,
where the outcome of a measurement is random,
and our best laws of physics have us model the state of that system using a vector,
just like the one I'm describing here, where squaring the magnitudes of that vector's
components gives you the probabilities for seeing various possible outcomes.
That actually has a special name, it's called the Born rule.
This idea of a qubit is basically meant to be an abstraction over many possible
systems like this, in just the same way that a bit is meant to be an abstraction
over many possible physical systems that can toggle between two distinct states.
The symbol I've been showing, by the way, is used throughout any subject with
the word quantum in its name to refer to a unit vector in this state space.
What you put inside that ket is often going to give some
kind of readable meaning for what that vector represents.
So in our example with a qubit, the unit vector to the right is often
shown with a zero inside the ket, because if that's the state vector,
it means you deterministically read out a zero from the computer.
Likewise, the unit vector in the vertical direction is
represented with a ket that has a one inside of it.
And if you go on and read more about this, something that you'll very commonly
see is that instead of writing down a general qubit with a column vector,
the way I've been showing you, a lot of people like to write it as an explicit
weighted sum of these two unit vectors in the two coordinate directions.
That's a very physicist kind of convention.
Now classical computing has this idea of logic gates,
certain basic operations like AND, OR, and NOT that you can use to
process bits and that you can string together to create arbitrarily
complicated functions.
Analogously we have what are called quantum gates,
which are certain fundamental operations that you can apply to a qubit or to
a system of multiple qubits.
And they always look like somehow flipping or rotating the state vector.
Now I'm not going to delve too deeply into the details of all the different quantum
gates, but if you are curious, I'll show you an example of what one of them looks like.
Here's a very standard one known as a Hadamard gate.
What it does is it maps the unit vector in that horizontal zero direction
into the diagonal northeast direction, and it maps the unit vector in
the vertical one direction into that kind of diagonal southeast direction.
You would very commonly use this to take a deterministic state,
something that's either a zero or a one, and turn it into something with a 50-50 equal
balance, or vice versa, too.
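For the curious, here's what that looks like as a small matrix computation, using the standard 2x2 Hadamard matrix; this is an ordinary classical simulation sketch of the gate's effect, nothing more.

```python
import numpy as np

# The Hadamard gate as a 2x2 matrix acting on a single-qubit state vector.
H = np.array([[1,  1],
              [1, -1]]) / np.sqrt(2)

ket0 = np.array([1.0, 0.0])   # deterministic 0
ket1 = np.array([0.0, 1.0])   # deterministic 1

print(H @ ket0)   # ≈ [0.707, 0.707]  -> the diagonal "northeast" direction
print(H @ ket1)   # ≈ [0.707, -0.707] -> the diagonal "southeast" direction

# Either output gives a 50-50 readout, since 0.707^2 ≈ 0.5 for both components.
# Applying H twice returns the original state, so it also works "vice versa".
print(H @ (H @ ket0))   # ≈ [1, 0]
```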
This is just one example, but there are a number of others forming the building blocks
for quantum computing, and the art of writing an algorithm in this setting is to somehow
compose a bunch of different quantum gates together that will progressively manipulate
and flip and massage this vector until it points almost entirely in one particular
coordinate direction, presumably one that actually answers a question you care about.
Now, in the simplest example of a qubit,
you only have two coordinate directions to work with,
so you would be constrained to answer simple yes-no questions.
And although I can't illustrate a geometric vector with more than three dimensions,
in principle, a system with k qubits is going to have 2 to the k
distinct coordinate directions, one for each bit string.
So if you can somehow manage to coerce this vector to point
along just one of those directions, you could potentially
answer some more interesting question, carrying more information.
Maybe one of them represents a prime divisor in a very large
number you're trying to factor, or maybe one of them represents
that secret key value from the opening puzzle of this video.
Even though the potential power of quantum computers has a long tradition now of
being greatly exaggerated, insofar as there really is potentially more power there,
one of the key reasons is that the size of the state vector grows exponentially.
As few as 100 qubits would already imply a mind-bogglingly massive state vector.
But the catch is that you have no direct access to the values inside this vector.
It's effectively invisible to you.
The only way it can be useful is if you have a way to manipulate it in such a way that
all of the probability, or at least most of it,
gets concentrated on one single component, and if that component corresponds to an
answer to a question that you care about.
Grover's algorithm offers us a really great example to actually see how this looks,
and it's high time that we get there.
Let me offer you a very high-level preview for how it looks.
I promise I will explain all of this in more detail, but here's the bird's eye view.
It initializes this state vector in such a way that there's
an equal balance of probability across all possible outcomes.
One of those outcomes is going to be the secret key that you're searching for,
and the tool that you'll have available, which I promise to motivate later,
is to flip the sign of the state vector at that coordinate.
This doesn't immediately affect the probabilities,
but when you interleave this with a certain other operation,
and you kind of go back and forth between the two of these,
what happens is that the probability mass slowly starts to get concentrated over that
secret key value, and at a certain point, almost all of it will be there,
so when you read out from the computer, you will almost certainly see the secret key
you're looking for.
Okay, so that's the high level, but let's unpack it in some more detail.
The first thing to address is this idea of flipping the
sign of the component associated with the secret key.
That might feel a little bit weird.
Why would we assume that that operation is available to us?
Backing up, remember that Grover's algorithm is meant to apply to any problem where you
can verify a solution quickly, even if finding a solution in the first place is hard.
Examples here would include solving Sudokus, finding a valid coloring of
a map where no two border regions share a color,
or countless tasks throughout cryptography, where security often depends
on a certain value being hard to find, even though for practical reasons it has
to be easily verifiable.
We began this video with a generic stand-in for all of these problems,
where you imagine some function that takes in any number from
0 to n-1 and returns true on one and only one of those.
In principle, we'll think of such a function as
being built out of a bunch of classical logic gates.
Those logic gates act on some binary representation of the inputs,
and the final output is either 0 or 1.
Now here's the key point.
Grover knew that given any ensemble of logic gates like this,
you can translate it into a system of quantum gates so that if in the classical case the
function takes in some binary input and returns a 1, for true,
then in the quantum case the effect of all of these gates is to flip the sign of that
state, the state associated with the same bit string.
Similarly, if in the classical case the function maps some binary input to 0,
for false, then in this quantum translation the effect on the
corresponding state would be to leave it unchanged.
More generally, because all of these quantum operations are linear,
if the state is a combination of multiple pure coordinate directions,
then the effect is to simply flip the sign for the component associated with whatever
bit string triggers that classical function.
It means that if you have any NP problem, anything where you can quickly verify
solutions, you're able to create an operation on a quantum computer that flips the
sign of a state vector at the position corresponding to a solution of that problem.
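Here's a rough sketch of what that sign flip does to a simulated state vector. The verifier is_key is a hypothetical stand-in for whatever NP verification function you started with; a real quantum implementation would compile it into gates, but the net effect on the amplitudes is just the flip shown here.

```python
import numpy as np

def oracle(state, is_key):
    """Flip the sign of every component whose index is accepted by the
    classical verifier; leave all other components unchanged."""
    out = state.copy()
    for index in range(len(state)):
        if is_key(index):
            out[index] = -out[index]
    return out

# Toy example: n = 8 possible values, with the secret key arbitrarily set to 5.
n = 8
is_key = lambda x: x == 5
balanced = np.full(n, 1 / np.sqrt(n))   # equal balance over all values
print(oracle(balanced, is_key))         # same vector, but component 5 is negated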
This might feel kind of useless at first.
After all, flipping signs doesn't affect the probabilities.
But Grover realized that this could be used in conjunction with
another step that slowly amplifies the probability of that key value.
There's actually a really nice way to visualize his algorithm.
To set it up, let's imagine that our state vector has only three dimensions.
Obviously in principle it would be way bigger, but this lets me draw the first picture.
The three directions here would correspond to the values 0,
1, and 2, and the whole problem statement here is that one
of those values would be a secret key that we're searching for.
Many different quantum algorithms will begin by putting that state vector into
a kind of equal balance, where all of the components have the same value.
I want to give that equal balance vector a name.
Let's call it b.
I hope you don't object too much to me simply declaring that this is possible
without dwelling on the underlying quantum gates that make it happen.
In this case it essentially looks like a big pile of Hadamard gates,
but all you need to know is that this equal balance direction is abundantly accessible.
So starting from here, the goal is to somehow coerce this
vector to instead point up in that secret key direction.
And what's very helpful for the visualization purposes here
is that throughout Grover's algorithm, the vector only ever
moves inside the 2D plane that's spanned by these two vectors.
So what I'm going to do is draw everything on that two-dimensional slice.
And this is going to give us a faithful representation even when
the full dimension is way too big for me to draw literally.
The convention I'll use here in drawing that slice will be to put the secret key
direction, whatever it is, along this y-axis,
and then the x-axis is going to represent something that's an equal balance of all
of the other states, the non-key states.
So in our very small three-dimensional example,
if that secret key was the 2 state up in the z direction,
that would mean this perpendicular direction is an equal balance of 0 and 1,
which sits on the xy plane perpendicular to that z-axis.
Notice the fully equally balanced state, b, has some component in that secret key
direction, since by its definition it has a little bit of that secret key within it.
Instead of drawing this slice from three dimensions,
here's what it would look like if it was taken from some larger number of dimensions.
It's almost identical, but the main difference is that that equal balance state
vector gets closer and closer to being perpendicular to the secret key direction.
Crucially though, this angle is never quite 90 degrees,
since that balance state always has a little bit of that secret key value inside of it.
In fact, calculating this angle is going to be essential
for understanding the runtime of Grover's algorithm.
This is the main bit of math you actually have to do for it.
You can find this angle by taking a dot product
between the balance state and the key direction.
The components of that balance state vector are all going to look like
1 divided by the square root of n, since remember it needs to be true
that when you add the squares of all of these, it would give you 1.
Since the key vector is 0 almost everywhere but 1 in one of the components,
it means that the dot product between these two is 1 divided by the square root of n.
If you know a little about dot products, you'll know that this
is also the cosine of the angle between those two vectors.
I'm going to translate this fact a little to make it more useful for our purposes later.
This is the same as saying the sine of the complementary angle down over here,
which I'll call theta, is 1 over the square root of n.
For really small angles, the sine of theta and theta are approximately the same,
so if n is very large, we can safely say that this angle is about 1
divided by the square root of n, as long as it's measured in radians.
This value for theta is ultimately where the square
root in the runtime is going to come from.
So let's remember it.
I'll throw it up in the corner here.
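Written out as equations, just restating the calculation above with b as the equal-balance state and k as the secret-key direction:

```latex
\cos(90^\circ - \theta) \;=\; b \cdot k \;=\; \frac{1}{\sqrt{n}}
\quad\Longrightarrow\quad
\sin\theta = \frac{1}{\sqrt{n}}
\quad\Longrightarrow\quad
\theta \approx \frac{1}{\sqrt{n}} \ \text{radians, for large } n.
```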
Okay, having spent a while setting up the actual picture,
let me show you the procedure here.
It's surprisingly simple.
Remember that key operation we were looking at earlier?
The one where you take the verification function for whatever NP problem
you're trying to solve, and you translate it into some quantum gates,
and the effect is to flip the sign associated with that key value?
Well, what does that look like inside our diagram?
In this diagram, if you flip the sign for the component associated with the key value,
but all other components stay the same, what it looks like is flipping about the x-axis.
And the final ingredient you need to know is that it is also possible
to flip this state vector around this equal balance direction.
I realize it might be a little unsatisfying for me to keep declaring that certain
operations are available, but one thing to know is that in general,
if you can clearly describe and access one of these state vectors,
it's also perfectly possible to reflect around it.
There are some quantum gates that let you flip around this balance direction,
but trust me that dwelling on the details of how those look isn't really
going to add much clarity or intuition to the whole algorithm.
The insight really just comes from geometry.
Notice how if you first flip around the x-axis,
and then flip around this off-diagonal direction,
the state now points slightly more in the vertical direction.
If you were to read out from memory now, you would have a slightly
larger chance of seeing the secret key value than all the others.
Here, maybe it's helpful if I show the actual coordinates,
in this case where n equals 100, along with some probability bars based on the
squares of those coordinates.
Notice how each time I flip around the x-axis,
and then flip around this off-diagonal axis, the component associated with that
secret key gets a little bit larger.
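If you want to reproduce those numbers yourself, here's a small classical simulation of the same demo, with the key arbitrarily placed at index 0: each pass flips the sign of the key component and then reflects the whole vector about the equal-balance direction b, using the standard reflection formula v -> 2(b·v)b - v, and the key's probability creeps upward exactly as described.

```python
import numpy as np

n = 100
key = 0                                  # arbitrary choice for this demo
b = np.full(n, 1 / np.sqrt(n))           # equal-balance state
state = b.copy()

def flip_key_sign(v):
    """Reflection about the x-axis of the diagram: negate the key component."""
    out = v.copy()
    out[key] = -out[key]
    return out

def reflect_about_balance(v):
    """Reflection about the equal-balance direction b: v -> 2(b·v)b - v."""
    return 2 * np.dot(b, v) * b - v

for step in range(1, 9):
    state = reflect_about_balance(flip_key_sign(state))
    print(step, state[key], state[key] ** 2)   # key amplitude and probability
```

After about pi fourths times the square root of n of these double flips, roughly 8 here, the printed probability is close to 1.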
So Grover's algorithm is remarkably simple.
All you do is repeat this over and over until the vector
points as close as it can get to the secret key direction.
The final bit of reasoning you have to do is to
figure out how many repetitions that should be.
A wonderful fact from geometry is that if you successively flip about two
different lines like this, the overall effect is the same as a rotation,
more specifically a rotation by two times the angle between those two lines.
So in our case, applying the two operations we have available one after the other,
the net effect is to rotate the state vector by two times theta,
where again theta is that little angle that we calculated earlier,
approximately one over the square root of n.
The ultimate goal is to rotate our initial state a little under 90 degrees,
or about pi halves radians.
This means the optimal number of repetitions looks like pi halves divided by two theta,
which is pi fourths times one over theta, and critically,
because theta is about one divided by the square root of n,
it means the total number of steps looks like pi fourths times the square root of n.
So what Grover's algorithm says is first find whatever whole number is closest to
this value and then repeat your two available flips that specific number of times.
As a concrete example, let's say that n was two to the power of twenty,
meaning you're searching for a secret key out of about a million options.
You would be running this on a computer with at least twenty qubits,
and what the algorithm would say is first compute pi fourths times the square
root of this number, which is around eight hundred and four,
meaning you now repeat those two operations you have available,
the one flipping the sign of the key state and the one that's flipping around
the balance direction eight hundred and four times.
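As a quick sanity check on that arithmetic, a trivial snippet and nothing more:

```python
import math

n = 2 ** 20                                   # about a million options
repetitions = round(math.pi / 4 * math.sqrt(n))
print(repetitions)                            # 804
```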
Now remember, this state vector is invisible to you.
There is nothing that you can do that will read out the values that it has.
You instead have to infer where it must be through reasoning, and in this case,
through all the geometric reasoning, you can conclude that after this specific number
of times, the vector should be pointed almost entirely in that secret key direction.
So when you read out from the computer, you are
almost guaranteed to see that secret key value.
Now to be clear, you're not guaranteed to see it.
There is some small chance that after reading out,
you would see something that's not the secret key.
So to be sure, you could always quickly verify the answer, since after all,
the whole premise of this situation is that you have some quick way of verifying answers.
You can just do that on a classical computer.
Worst case, if you got unlucky and sampled a different number,
you run the whole thing again, and it becomes vanishingly unlikely
that you would ever need to run this more than just a couple times.
To wrap up, I want to come clean about a lie that I've been telling you,
and also reflect a little bit about where this speedup came from,
and then highlight a surprising analogy.
Before I do, now might be as good a time as any to say a
thanks to the community of channel supporters on Patreon.
Putting together visualized lessons like this takes an enormous amount of time.
As you probably know, most YouTubers monetize their content with in-video sponsorships,
but for many years now that's something that I've opted to decline.
I think it makes the videos better, and the only reason it's not a wildly costly decision
is that enough viewers who agree with that directly support the channel through Patreon.
In exchange, I offer supporters early views of new content,
which is actually very helpful for developing it, and there's other perks in there too.
In general though, if you like this content, it
would mean a lot if you considered joining.
No pressure though, one of the big values is that the content can be free.
Anyway, back to those three finishing points.
The lie is a lie by omission.
I've been showing these state vectors with positive and negative real number values,
but more generally, these components can be complex numbers.
Now my hope is to motivate why that's the case in the follow-on video about physics.
The general idea is that any time you're working with waves,
you care about both amplitude and phase, and a complex number
is a really elegant way to encode an amplitude and phase together.
So if you look at one of the components inside the state vector,
the fuller picture for how to think about it is that it has some magnitude
and some phase.
The magnitude is the thing that you square to get the probability,
and the phase is essentially a more general version of our
whole discussion here around positive and negative values.
Changing phase doesn't immediately affect the probabilities,
but it does affect the state, which in turn affects how it gets processed and how it
interacts with the world.
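To make that concrete in the same toy-simulation spirit as before, here's a tiny illustration with one made-up complex amplitude: only the magnitude enters the probability, and changing the phase alone leaves that probability untouched.

```python
import cmath

amplitude = 0.3 + 0.4j                      # one complex component of a state vector
magnitude = abs(amplitude)                  # 0.5
phase = cmath.phase(amplitude)              # about 0.927 radians

probability = magnitude ** 2                # 0.25 -- the same squaring rule as before
rotated = amplitude * cmath.exp(1j * 1.0)   # change the phase only
print(abs(rotated) ** 2)                    # still 0.25: probabilities unchanged
```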
Now needless to say, throwing in a bunch of complex numbers adds
a lot of potential confusion to an already complicated topic.
That's why I avoided it.
But we were safe to ignore complex values for the purposes of Grover's algorithm,
since very mercifully, during that algorithm,
you only ever see positive and negative values.
I want you to know, though, that this availability of complex values plays a
crucial role in other quantum algorithms, like Shor's for factoring numbers.
Next, even if you understood everything that I described with this whole algorithm,
it's not easy to summarize where exactly the speedup came from.
The fact that you begin by applying a certain operation to this
equal balance state makes it very tempting to say that the speedup
comes from parallelizing the operation over all possible inputs.
But like I mentioned at the start, that summary, at least to me,
just really doesn't feel right, and it definitely leads to misconceptions.
As you now know, that step, on its own, does nothing to reveal the key value.
I'll leave it up to your interpretation whether it feels apt to describe this
first step as applying a function to many inputs in parallel,
or if it feels better to say that the balance state is just its own new thing,
and the function we have always applies to individual inputs one at a time,
never multiple at once.
It's just that those inputs are now a new kind of thing.
In my view, for this algorithm, if you want the one-word summary for
where the speedup comes from, I think a better choice would be Pythagoras.
As a loose analogy, if you want to get from one corner of a unit square to the opposite,
if you're limited to move only in the x and y directions, you have to walk two units.
But if you're allowed to move diagonally, you can get there in the square root of two.
And more generally, if you're up in n dimensions,
you would have to walk n units to get from one corner of a cube to the opposite
one if you can only move along the edges, but if you can go diagonally,
you can get there in the square root of n.
In the worldview of quantum mechanics, different observable states
all represent perpendicular directions in some state space.
So viewed from this framework, what the classical deterministic world looks
like is one where you only have access to these pure coordinate directions.
If you think about it, anytime you're doing computation,
your computer is somehow walking through a series of different states
that are available to it, and algorithmic runtime is all about
understanding how many steps you have to walk in the space of all possible states.
So from this quantum worldview, where classical states look like pure coordinate
directions, the key difference with quantum computing is that you now also have
available to you a panoply of additional diagonal directions that you can work with.
Now, to be clear, the analogy is not meant too literally; you should take it with a grain of salt.
It's not like runtime necessarily looks like a distance
travelled through this particular state space.
But it is true that if you follow the state vector throughout Grover's algorithm,
what it's doing is slowly walking along a quarter circle arc from an initial
condition to the target condition, tracing a path that would be entirely
unavailable if you were limited to move only in pure coordinate directions.
And the effect is to provide this square root-sized shortcut.
And as the very last point before I let you go,
regular viewers will know that one of my reasons for covering this topic is because of
a promised analogy between this algorithm and something we covered last video,
about two colliding blocks that can compute pi.
In that video, we are also studying a point in a certain two-dimensional state
space bouncing around a circle, and in fact, the series of bounces that it
took is essentially identical to what we just saw here with Grover's algorithm.
The story here is that when a physicist friend of mine, Adam Brown,
saw the first version of that video, he had recently been reading up on
Grover's algorithm and immediately realized that both processes were identical.
Now I had this whole cockamamie plan for this video where I was
going to explain quantum computing and Grover's algorithm in the
context of that analogy, but it turned out to be just a terrible idea.
The whole thing only really made sense if you already understood Grover's algorithm.
So instead, now that you have seen both topics in isolation,
what I'll do is leave up a rough outline of how the analogy looks.
Think of this as an open-ended homework puzzle,
where the task is to draw the connection for yourself.
As an answer key of sorts, I will link to the very delightful paper
by Adam Brown outlining precisely what that analogy looks like.
If you want to learn more of the fundamentals of quantum computing,
several years ago two very smart friends of mine, Andy Matuschak and Michael Nielsen,
put together a really nice resource for learning the topic,
offering a pretty unique approach to making sure that you actually remember
it in the long term.
To learn some of the fundamental quantum mechanics,
Mithuna Yoganathan from the channel Looking Glass Universe
has been putting together a very beginner-friendly course on the topic.
She was actually the one to teach me how Grover's algorithm works many years ago,
and honestly I owe a lot of the content of this video to many helpful conversations with
her.
And as a very last note to end on, one of the conversations I had
while making this video was with Scott Aaronson,
a very widely respected researcher and utterly delightful author on the topic.
I recorded the Zoom call for notes, and there's one
little piece of it which I just wanted to share with you.
You know, I have this dream of writing a science fiction novel where I know
it's going to be the climactic scene, where the heroes will be, you know,
running Grover's algorithm to try to find this cryptographic key, right?
The whole fate of the world will depend on whether they can find it, OK?
The bad guys have surrounded their base, you know, they're bashing down the walls.
But Grover's algorithm is still running, and it has only like a
30 percent probability of giving you the solution if you measure.
So the question is, like, do you measure now or do you let it run for another minute?
Right?
And if you measure now, you know, and you don't get it, then you've lost everything.
Then you have to restart from the beginning.
So this is not a plot that you could have with any classical algorithm.
Right.