YouTube Transcript:
But what is quantum computing? (Grover's Algorithm)
Video Transcript
A lot of pop science outlets give a certain summary of quantum computing that I can almost guarantee leads to misconceptions. The summary goes something like this. In a classical computer, data is stored with bits, some sequence of ones and zeros, but in a quantum computer, you are able to represent every possible sequence of bits of some fixed length all at once in one big thing known as a superposition. And sometimes the implication of these summaries is that quantum computers can be faster by basically doing whatever a classical computer would do, but to all of these sequences in parallel. Now, this does gesture at something that's kind of true, but let me see if I can prove to you why I think this leads to misconceptions, and I'll prove it using a quiz. To set it up, I want you to imagine that I have a mystery function, and I tell you that there's a certain secret number among all the numbers from 0 up to n-1, where if you plug that value into my function, it returns true, but if you were to plug in any other value, it returns false. And let's say you can't look at the innards of the function to learn anything about it, the only thing you're allowed to do with it is just try it out on numbers. The warmup question is, how many times, on average, would you have to apply this mystery function in order to find the secret key? Well, if the setup was in an ordinary classical computer, there's really nothing better that you can do than guess and check. You go through all the numbers, and maybe you're lucky and find it early, maybe you're unlucky and it doesn't come up until later, but on average, with a list of n possibilities, it takes one half of n attempts to find the key. Now, in computer science, people care about how runtimes scale. If you had a list ten times as big, how much longer would it take? And computer scientists have a way of categorizing runtimes. They would call this one O of n, where that big O communicates that maybe there's some constants like the one half, or maybe some factors that grow slower than n, but the factor of n is what explains how quickly it scales as n grows. If n goes up by a factor of ten, the runtime also goes up by a factor of ten. Now, here's your quiz. For the equivalent of this setup, but in a quantum computer, what's the best runtime for finding the secret key? I've asked a variant of this quiz many times in a certain lecture that I've given over the years, and the options I typically offer are O of square root of n, O of log of n, O of log of log of n, and O of one, where here O of one would mean that the runtime is just some constant that doesn't actually grow as n grows. Now, to be fair, I have not defined quantum computing. In fact, doing so from the ground up is going to be the goal of this video. So without showing you what this mystery function would look like in that setting, it's kind of an incoherent question. But it's not really meant as a quiz that I'm grading you on or anything, it's just meant to be a gut check of intuition before we dive in. In principle, it's the same task. Finding a needle in a haystack, where you want to discover which value out of many options uniquely triggers some function. I asked this as a YouTube post last month, which 100,000 of you kindly answered. The most recent time I gave it live was to a group of Stanford students. I also posed it to those attending the International Math Olympiad. And in all of these and many other instances, the answer distribution looks very similar. 
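As a quick aside before the answers, here is what that classical guess-and-check baseline might look like as a small Python sketch; the mystery function and its secret value are made up for the example, and on average the loop runs about n over 2 times.

```python
import random

def make_mystery(n):
    """Return a black-box function that is True for exactly one secret value in [0, n)."""
    secret = random.randrange(n)
    return (lambda x: x == secret), secret

n = 1_000_000
mystery, secret = make_mystery(n)

# Classical guess-and-check: nothing better to do than try values one at a time.
calls = 0
for guess in range(n):
    calls += 1
    if mystery(guess):
        break

print(f"found {guess} after {calls} calls (on average about n/2 = {n // 2})")
```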
The most common answer is always O of one. And this is wrong, and I'm pretty sure that it stems from that misleading summary. That summary implies that you would put all of the n values that you need to search into this mysterious superposition, and then you would process them all in parallel, and then somehow the answer would be revealed. The second most common answer is typically O of log of n, and this is also wrong. You would call this an exponential speedup. For example, if you increase the size of the list by factors of 10, an O of log n runtime would only tick up by the same additive increment each time. Now this wrong answer I suspect stems from a misconception about how much better quantum computers are in general. There are certain very special problems where you can achieve an exponential speedup. The most famous case is probably Shor's algorithm for factoring large numbers. But most problems are not like that. In this case, the correct answer is O of square root of n. And this is a lot more representative of the typical speedup that you could get with a quantum computer. In 1994, it was proven that a quantum computer could not possibly do any better than O of square root of n on this task. And then two years later, Lov Grover found a specific procedure that actually achieves that runtime. So searching through a bag of a million options takes on the order of a thousand steps. A bag of a trillion options takes on the order of a million. This big O actually hides something that's pretty fun, which is the constant of pi fourths in the precise runtime, and that pi has its own whole fun story to tell, which I'll get to later. You might think that this puzzle with a mystery function triggered by one specific value is super contrived. But I want you to keep in mind this is meant to be a generic stand-in for any problem where you know how to quickly verify a solution, even if you don't know how to find that solution in the first place. This describes an enormous class of problems in computer science known as NP problems. So while a square root speedup is frankly not as earth-shattering as an exponential speedup would be, and while big O runtimes are often a lot less important than other practical considerations, it is thought-provoking that something like Grover's algorithm is even possible at all, providing this catch-all method for speeding up any NP problem. My goal with this lesson is to build up to a step-by-step walkthrough of how that algorithm works. It's actually very geometric and very beautiful. But we need to build up a lot of background in order to get there. The first two-thirds or so of the video will be spent building up the fundamentals of quantum computing, not with a set of analogies, which as we've seen can lead to misconceptions, but as a piece of math, which I think offers you a pair of glasses through which you can see a much more honest depiction of the whole field. This is one of those topics that has a few premises that are just going to feel a little bit strange at first, and I should warn you they take a little getting used to. My current plan is to follow this lesson with another one about some of the underlying physics, which hopefully can help motivate a few of the odd-looking rules that you'll see here. But today the goal is to provide the minimal viable path to seeing a genuine, bona fide quantum algorithm.
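To put that scaling contrast in concrete numbers, here is a quick back-of-the-envelope comparison (just arithmetic, not a quantum simulation), using the average classical count of n over 2 against Grover's pi fourths times the square root of n:

```python
import math

for n in [10**6, 10**9, 10**12]:
    classical = n / 2                      # average guess-and-check calls
    grover = (math.pi / 4) * math.sqrt(n)  # Grover's query count, pi/4 * sqrt(n)
    print(f"n = {n:>15,}:  classical ~ {classical:>15,.0f},  Grover ~ {grover:>12,.0f}")
```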
Let me pull up again that contrast between classical computing and quantum computing, and let's see if we can build up something of a more representative mental model. It is true of course that data in a classical computer looks like a series of ones and zeros, and at a higher layer of abstraction that might represent an actual data type, like an integer or some text, and at a lower layer of abstraction, those ones and zeros represent some actual thing in the physical world, like voltages across a capacitor or something like that. These same layers of abstraction provide a pretty helpful framing when we discuss quantum computing. Over there, there's also some underlying physical measurement, and again you represent the outcome of that measurement with some sequence of ones and zeros, and again this might implement some actual data type that you care about, like a number. This symbol that I'm showing by the way is called a ket. I'll explain it properly in a couple minutes, but for right now just think of it as conveying that something is coming from a quantum computer. Let's start things off with a description of quantum computing through the lens of that middle layer of abstraction, meaning we're going to postpone all of the underlying physics for now, which is a little bit like teaching computer science without discussing hardware. In a classical computer, there's no need to distinguish between the state of memory and what you read out from the memory, both of them just look like the same sequence of bits, but it's a very different story in a quantum computer. Our main job today is going to be to understand something called the state vector, which is continuous, this is the thing the computer actually operates on, but it has a very unusual relationship with the values that you actually read out, those discrete sequences of bits. Before I can define this state vector, you need to know one other key difference from classical computers, which is that this value that you read out, which again just looks like some sequence of ones and zeros, is random. Or to be a little more accurate, I should say it's typically random. The way you can think about this is that if you run a program on a quantum computer, that program doesn't necessarily determine a particular output, instead it determines a probability distribution across all possible outputs. So for the example I'm showing on screen, this would be a very small quantum computer where the thing that you read out has four bits, meaning that there are 2 to the 4, or 16 possible outputs, and the specific program you run determines some kind of distribution across all those possible outputs. Some programs might manage to concentrate more probability on just one of those outputs, but other programs might give a more even spread across everything. This example, by the way, where the thing you read out has four bits, would be called a 4-qubit quantum computer, and more generally, if you have a k-qubit quantum computer, that means there are 2 to the k distinct possible outputs, and any program gives a distribution across all of those, and the thing that you read out has k distinct bits. That word qubit, by the way, is another thing I'm going to define more precisely in just a minute. I do want to emphasize that this distribution is implicit. You never actually see it directly, you instead infer what it must be based on the program that you run. You never see all bit strings coexisting at once in some kind of way. 
You just see one of them, drawn at random, according to this distribution. At a lower layer of abstraction, what I'm describing as reading out from memory looks like a physical measurement, and the randomness stems from the laws of quantum mechanics. If you're curious about the physics, that lower layer of abstraction, that's exactly what the next video is for. Up in this layer, you just think about probability distributions over all possible bit strings. One more funny rule here, which does bubble up from the underlying quantum mechanics, is that after you read out from memory, and you see some particular value, the underlying state of the computer changes such that now all of the probability is concentrated on whatever value you read out. So if you kept reading out from memory over and over, you would just keep seeing that same value. You might imagine these programs as creating a very delicate and sensitive probability distribution, where the moment you look at it, sampling from that distribution, the whole thing collapses to one value. Now you might be wondering, where does this distribution come from? This is both the most important and the most confusing part. You think of the state of the computer as being described by a big vector. Right now when I say the word vector, you can just think big list of numbers, although as you'll see later on, it can be helpful to think of this as a direction in some super high dimensional space. Each component of this vector corresponds to one of the possible values you might read out, one of those distinct bit strings. So in this example where what you read out has 4 bits, the state vector would have 16 distinct components. The state vector is not the same thing as the probability distribution over all possible outputs, but it is very closely related. The fundamental rule, which I admit is going to look very strange at first, is that if you take the magnitude of each component in that state vector and you square it, that gives you the probability of seeing the corresponding output, the corresponding bit string. Let me just say up front, a lot of people learning quantum computing find this state vector a bit weird. What is it actually, and why are we squaring things to get probability? As it is, for the sake of simplicity, there's a certain important detail that I'm neglecting until the end of the video here. I just want to flag that for most people, this takes a little getting used to. And to be super clear on what I mean with the fundamental rule here, let's suppose that after this program processes this vector, maybe the component of it associated with some specific bit string, like 0011, happened to be 0.5. Then when you square that value, 0.5 squared is 0.25, so the observable implication of this is that when you read out from memory, you have a 25% chance of seeing that bit string, 0011. One thing I'll highlight is that it is perfectly valid for the values in this state vector to be negative, and at first you might think that has no real impact, since flipping the sign doesn't change the square, and therefore all the probabilities stay the same. It is true that the probabilities stay the same, but we absolutely consider this to be a distinct state, and as you'll see, the idea of flipping signs plays a very central role in Grover's algorithm. 
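Here is a small numpy sketch of that fundamental rule for the 4-qubit example: a made-up 16-component state vector, probabilities from squaring the magnitudes of its components, one sampled readout, and then the collapse onto whatever was read.

```python
import numpy as np

rng = np.random.default_rng(0)

# A made-up 16-component state vector (4 qubits => 2**4 possible readouts), normalized to length 1.
state = rng.normal(size=16)
state /= np.linalg.norm(state)

# The fundamental rule: probability of each bit string = squared magnitude of its component.
probs = np.abs(state) ** 2
assert np.isclose(probs.sum(), 1.0)

# "Reading out from memory" samples one bit string according to that distribution...
outcome = int(rng.choice(16, p=probs))
print("read out:", format(outcome, "04b"), "with probability", round(probs[outcome], 3))

# ...and afterwards the state collapses, so repeated readouts keep showing the same value.
state = np.zeros(16)
state[outcome] = 1.0
```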
Here, this example with four qubits has kind of a lot on screen, with not a lot of visualization to back it up, so let's scale things down to the smallest possible case where the computer has just two possible outputs, represented with a 0 and a 1. In this simplest possible case, the state vector would only be two-dimensional, so we can actually represent it geometrically as an arrow inside a 2D space. In this case, the x-coordinate corresponds to the outcome 0, in the sense that the square of that coordinate tells you the probability that when you read out from the computer, you would read a 0. Maybe it's helpful if I add a little bar to show that probability. You'll notice that as the vector points more in the horizontal direction, more of that probability mass is concentrated on the 0. Similarly, the y-coordinate corresponds to a 1 in the same way. A more vertical state vector means you're more likely to see a 1 when you read out from the computer. Now, notice, because the two probabilities should add up to 1, after all, something is going to happen, x squared plus y squared should equal 1. Geometrically, this means that the state vector has a length of 1, so you could think of it as being confined to a unit circle. More generally, the state vector for a quantum computer will always have a length of 1, and you can think of it as living on some very high-dimensional unit sphere. This two-dimensional example has a special name, which I've already mentioned. It's called a qubit, short for quantum bit. The analogy with a classical bit is that when you read out from the computer, you see either a 0 or a 1, but other than that, it is a completely different animal. Mathematically, a qubit is a unit vector in a two-dimensional space, together with a coordinate system where these two perpendicular x and y directions correspond to the two values that you might read out when you measure. You should know there is that added bit of complexity that I'm postponing, but this is 90% of the right idea. And again, you have this funny rule where when you measure the qubit, seeing either a 0 or a 1, the vector then collapses to fall onto that corresponding direction. So unless something is done to prepare that qubit back into a diagonal direction, any follow-on observations that you make are always going to show the same outcome. It's very possible that at this point you're thinking something like, okay, Grant, this is a super bizarre set of premises you're asking me to accept, and if so, you are not alone. What I'm describing, as you can no doubt probably tell, are basically the postulates of quantum mechanics. There are many systems throughout physics, like the spin of an electron or the polarization of a photon, that have this property, where the outcome of a measurement is random, and our best laws of physics have us model the state of that system using a vector, just like the one I'm describing here, where squaring the magnitudes of that vector's components gives you the probabilities for seeing various possible outcomes. That actually has a special name, it's called the Born rule. This idea of a qubit is basically meant to be an abstraction over many possible systems like this, in just the same way that a bit is meant to be an abstraction over many possible physical systems that can toggle one of two directions. The symbol I've been showing, by the way, is used throughout any subject with the word quantum in its name to refer to a unit vector in this state space.
What you put inside that ket is often going to give some kind of readable meaning for what that vector represents. So in our example with a qubit, the unit vector to the right is often shown with a zero inside the ket, because if that's the state vector, it means you deterministically read out a zero from the computer. Likewise, the unit vector in the vertical direction is represented with a ket that has a one inside of it. And if you go on and read more about this, something that you'll very commonly see is that instead of writing down a general qubit with a column vector, the way I've been showing you, a lot of people like to write it as an explicit weighted sum of these two unit vectors in the two coordinate directions. That's a very physicist kind of convention. Now classical computing has this idea of logic gates, certain basic operations like AND, OR, and NOT that you can use to process bits and that you can string together to create arbitrarily complicated functions. Analogously we have what are called quantum gates, which are certain fundamental operations that you can apply to a qubit or to a system of multiple qubits. And they always look like somehow flipping or rotating the state vector. Now I'm not going to delve too deeply into the details of all the different quantum gates, but if you are curious, I'll show you an example of what one of them looks like. Here's a very standard one known as a Hadamard gate. What it does is it maps the unit vector in that horizontal zero direction into the diagonal northeast direction, and it maps the unit vector in the vertical one direction into that kind of diagonal southeast direction. You would very commonly use this to take a deterministic state, something that's either a zero or a one, and turn it into something with a 50-50 equal balance, or vice versa, too. This is just one example, but there are a number of others forming the building blocks for quantum computing, and the art of writing an algorithm in this setting is to somehow compose a bunch of different quantum gates together that will progressively manipulate and flip and massage this vector until it points almost entirely in one particular coordinate direction, presumably one that actually answers a question you care about. Now, down at the level of the simplest example of a qubit, you only have two coordinate directions to work with, so you would be constrained to answer simple yes-no questions. And although I can't illustrate a geometric vector with more than three dimensions, in principle, a system with k qubits is going to have 2 to the k distinct coordinate directions, one for each bit string. So if you can somehow manage to coerce this vector to point along just one of those directions, you could potentially answer some more interesting question, carrying more information. Maybe one of them represents a prime divisor in a very large number you're trying to factor, or maybe one of them represents that secret key value from the opening puzzle of this video. Even though the potential power of quantum computers has a long tradition now of being greatly exaggerated, insofar as there really is potentially more power there, one of the key reasons is that the size of the state vector grows exponentially. As few as 100 qubits would already imply a mind-bogglingly massive state vector. But the catch is that you have no direct access to the values inside this vector. It's effectively invisible to you.
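Before moving on to how this effectively invisible vector can still be put to use, here is the Hadamard example from a moment ago written out concretely, as nothing more than a 2-by-2 matrix acting on the two basis kets:

```python
import numpy as np

H = np.array([[1,  1],
              [1, -1]]) / np.sqrt(2)   # the Hadamard gate

ket0 = np.array([1.0, 0.0])            # |0>, the unit vector in the horizontal direction
ket1 = np.array([0.0, 1.0])            # |1>, the unit vector in the vertical direction

print(H @ ket0)        # [0.707  0.707]  -> the diagonal "northeast" direction, a 50-50 balance
print(H @ ket1)        # [0.707 -0.707]  -> the "southeast" direction, 50-50 with a flipped sign
print(H @ (H @ ket0))  # back to [1. 0.] (up to floating point), since the gate undoes itself
```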
The only way it can be useful is if you have a way to manipulate it in such a way that all of the probability, or at least most of it, gets concentrated on one single component, and if that component corresponds to an answer to a question that you care about. Grover's algorithm offers us a really great example to actually see how this looks, and it's high time that we get there. Let me offer you a very high-level preview for how it looks. I promise I will explain all of this in more detail, but here's the bird's eye view. It initializes this state vector in such a way that there's an equal balance of probability across all possible outcomes. One of those outcomes is going to be the secret key that you're searching for, and the tool that you'll have available, which I promise to motivate later, is to flip the sign of the state vector at that coordinate. This doesn't immediately affect the probabilities, but when you interleave this with a certain other operation, and you kind of go back and forth between the two of these, what happens is that the probability mass slowly starts to get concentrated over that secret key value, and at a certain point, almost all of it will be there, so when you read out from the computer, you will almost certainly see the secret key you're looking for. Okay, so that's the high level, but let's unpack it in some more detail. The first thing to address is this idea of flipping the sign of the component associated with the secret key. That might feel a little bit weird. Why would we assume that that operation is available to us? Backing up, remember that Grover's algorithm is meant to apply to any problem where you can verify a solution quickly, even if finding a solution in the first place is hard. Examples here would include solving Sudokus, finding a valid coloring of a map where no two border regions share a color, or countless tasks throughout cryptography, where security often depends on a certain value being hard to find, even though for pragmatism it has to be easily verifiable. We began this video with a generic stand-in for all of these problems, where you imagine some function that takes in any number from 0 to n-1 and returns true on one and only one of those. In principle, we'll think of such a function as being built out of a bunch of classical logic gates. Those logic gates act on some binary representation of the inputs, and the final output is either 0 or 1. Now here's the key point. Grover knew that given any ensemble of logic gates like this, you can translate it into a system of quantum gates so that if in the classical case the function takes in some binary input and returns a 1, for true, then in the quantum case the effect of all of these gates is to flip the sign of that state, the state associated with the same bit string. Similarly, if in the classical case the function maps some binary input to 0, for false, then in this quantum translation the effect on the corresponding state would be to leave it unchanged. More generally, because all of these quantum operations are linear, if the state is a combination of multiple pure coordinate directions, then the effect is to simply flip the sign for the component associated with whatever bit string triggers that classical function. It means that if you have any NP problem, anything where you can quickly verify solutions, you're able to create an operation on a quantum computer that flips the sign of a state vector at the position corresponding to a solution of that problem. 
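Here is a sketch of that sign-flipping operation treated purely as a linear map on the state vector; in an actual quantum computer it would be compiled into quantum gates from the classical verifier, and the key index below is just made up for illustration.

```python
import numpy as np

n = 8
key = 5  # the secret value the verifier accepts (made up for this example)

def flip_sign_of_key(state):
    """Flip the sign of the component associated with the key; leave every other component alone."""
    out = state.copy()
    out[key] *= -1
    return out

# On the pure basis state for the key, the effect is just an overall minus sign...
basis_key = np.zeros(n); basis_key[key] = 1.0
print(flip_sign_of_key(basis_key))

# ...and by linearity, on a superposition only that one component gets negated.
balance = np.ones(n) / np.sqrt(n)
print(flip_sign_of_key(balance))
```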
This might feel kind of useless at first. After all, flipping signs doesn't affect the probabilities. But Grover realized that this could be used in conjunction with another step that slowly amplifies the probability of that key value. There's actually a really nice way to visualize his algorithm. To set it up, let's imagine that our state vector has only three dimensions. Obviously in principle it would be way bigger, but this lets me draw the first picture. The three directions here would correspond to the values 0, 1, and 2, and the whole problem statement here is that one of those values would be a secret key that we're searching for. Many different quantum algorithms will begin by putting that state vector into a kind of equal balance, where all of the components have the same value. I want to give that equal balance vector a name. Let's call it B. I hope you don't object too much to me simply declaring that this is possible without dwelling on the underlying quantum gates that make it happen. In this case it essentially looks like a big pile of Hadamard gates, but all you need to know is that this equal balance direction is abundantly accessible. So starting from here, the goal is to somehow coerce this vector to instead point up in that secret key direction. And what's very helpful for the visualization purposes here is that throughout Grover's algorithm, the vector only ever moves inside the 2D plane that's spanned by these two vectors. So what I'm going to do is draw everything on that two-dimensional slice. And this is going to give us a faithful representation even when the full dimension is way too big for me to draw literally. The convention I'll use here in drawing that slice will be to put the secret key direction, whatever it is, along this y-axis, and then the x-axis is going to represent something that's an equal balance of all of the other states, the non-key states. So in our very small three-dimensional example, if that secret key was the 2 state up in the z direction, that would mean this perpendicular is an equal balance of 0 and 1, which sits on the xy plane perpendicular to that z-axis. Notice the fully equally balanced state, b, has some component in that secret key direction, since by its definition it has a little bit of that secret key within it. Instead of drawing this slice from three dimensions, here's what it would look like if it was taken from some larger number of dimensions. It's almost identical, but the main difference is that that equal balance state vector gets closer and closer to being perpendicular to the secret key direction. Crucially though, this angle is never quite 90 degrees, since that balance state always has a little bit of that secret key value inside of it. In fact, calculating this angle is going to be essential for understanding the runtime of Grover's algorithm. This is the main bit of math you actually have to do for it. You can find this angle by taking a dot product between the balance state and the key direction. The components of that balance state vector are all going to look like 1 divided by the square root of n, since remember it needs to be true that when you add the squares of all of these, it would give you 1. Since the key vector is 0 almost everywhere but 1 in one of the components, it means that the dot product between these two is 1 divided by the square root of n. If you know a little about dot products, you'll know that this is also the cosine of the angle between those two vectors. 
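That little dot product computation, checked numerically for a made-up n and key index:

```python
import numpy as np

n = 100
key = 42  # arbitrary secret key index, just for the check

balance = np.ones(n) / np.sqrt(n)           # the equal-balance state b, every component 1/sqrt(n)
key_dir = np.zeros(n); key_dir[key] = 1.0   # the pure key direction

cos_angle = balance @ key_dir               # dot product between the two unit vectors
print(cos_angle, 1 / np.sqrt(n))            # both come out to 0.1, the cosine of the angle between them
```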
I'm going to translate this fact a little to make it more useful for our purposes later. This is the same as saying the sine of the complementary angle down over here, which I'll call theta, is 1 over the square root of n. For really small angles, the sine of theta and theta are approximately the same, so if n is very large, we can safely say that this angle is about 1 divided by the square root of n, as long as it's measured in radians. This value for theta is ultimately where the square root in the runtime is going to come from. So let's remember it. I'll throw it up in the corner here. Okay, having spent a while setting up the actual picture, let me show you the procedure here. It's surprisingly simple. Remember that key operation we were looking at earlier? The one where you take the verification function for whatever NP problem you're trying to solve, and you translate it into some quantum gates, and the effect is to flip the sign associated with that key value? Well, what does that look like inside our diagram? In this diagram, if you flip the sign for the component associated with the key value, but all other components stay the same, what it looks like is flipping about the x-axis. And the final ingredient you need to know is that it is also possible to flip this state vector around this equal balance direction. I realize it might be a little unsatisfying for me to keep declaring that certain operations are available, but one thing to know is that in general, if you can clearly describe and access one of these state vectors, it's also perfectly possible to reflect around it. There are some quantum gates, they let you flip around this balance direction, but trust me that dwelling on the details of how those look isn't really going to add much clarity or intuition to the whole algorithm. The insight really just comes from geometry. Notice how if you first flip around the x-axis, and then flip around this off-diagonal direction, the state now points slightly more in the vertical direction. If you were to read out from memory now, you would have a slightly larger chance of seeing the secret key value than all the others. Here, maybe it's helpful if I show the actual coordinates, in this case where n equals 100, along with some probability bars based on the squares of those coordinates. Notice how each time I flip around the x-axis, and then flip around this off-diagonal axis, the component associated with that secret key gets a little bit larger. So Grover's algorithm is remarkably simple. All you do is repeat this over and over until the vector points as close as it can get to the secret key direction. The final bit of reasoning you have to do is to figure out how many repetitions that should be. A wonderful fact from geometry is that if you successively flip about two different lines like this, the overall effect is the same as a rotation, more specifically a rotation by two times the angle between those two lines. So in our case, applying the two operations we have available one after the other, the net effect is to rotate the state vector by two times theta, where again theta is that little angle that we calculated earlier, approximately one over the square root of n. The ultimate goal is to rotate our initial state a little under 90 degrees, or about pi halves radians. 
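Before counting repetitions, here is a small numerical sketch of that pair of flips inside the 2D slice, with n equal to 100 as in the example above; the reflection about the balance direction uses the standard formula 2(b·v)b − v, and each round of the two flips nudges the key component up.

```python
import numpy as np

n = 100
theta = np.arcsin(1 / np.sqrt(n))             # the small angle between b and the x-axis

# Work inside the 2D slice: x = equal balance of the non-key states, y = the key direction.
b = np.array([np.cos(theta), np.sin(theta)])  # the equal-balance state within this slice
v = b.copy()

def flip_about_x(v):   # the sign flip on the key component
    return np.array([v[0], -v[1]])

def flip_about_b(v):   # reflection about the balance direction b
    return 2 * (b @ v) * b - v

for step in range(1, 4):
    v = flip_about_b(flip_about_x(v))
    print(f"after {step} repetition(s): key component = {v[1]:.3f}, P(key) = {v[1]**2:.3f}")
```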
This means the optimal number of repetitions looks like pi halves divided by two theta, which is pi fourths times one over theta, and critically, because theta is about one divided by the square root of n, it means the total number of steps looks like pi fourths times the square root of n. So what Grover's algorithm says is first find whatever whole number is closest to this value and then repeat your two available flips that specific number of times. As a concrete example, let's say that n was two to the power of twenty, meaning you're searching for a secret key out of about a million options. You would be running this on a computer with at least twenty qubits, and what the algorithm would say is first compute pi fourths times the square root of this number, which is around eight hundred and four, meaning you now repeat those two operations you have available, the one flipping the sign of the key state and the one that's flipping around the balance direction eight hundred and four times. Now remember, this state vector is invisible to you. There is nothing that you can do that will read out the values that it has. You instead have to infer where it must be through reasoning, and in this case, through all the geometric reasoning, you can conclude that after this specific number of times, the vector should be pointed almost entirely in that secret key direction. So when you read out from the computer, you are almost guaranteed to see that secret key value. Now to be clear, you're not guaranteed to see it. There is some small chance that after reading out, you would see something that's not the secret key. So to be sure, you could always quickly verify the answer, since after all, the whole premise of this situation is that you have some quick way of verifying answers. You can just do that on a classical computer. Worst case, if you got unlucky and sampled a different number, you run the whole thing again, and it becomes vanishingly unlikely that you would ever need to run this more than just a couple times. To wrap up, I want to come clean about a lie that I've been telling you, and also reflect a little bit about where this speedup came from, and then highlight a surprising analogy. Before I do, now might be as good a time as any to say a thanks to the community of channel supporters on Patreon. Putting together visualized lessons like this takes an enormous amount of time. As you probably know, most YouTubers monetize their content with in-video sponsorships, but for many years now that's something that I've opted to decline. I think it makes the videos better, and the only reason it's not a wildly costly decision is that enough viewers who agree with that directly support the channel through Patreon. In exchange, I offer supporters early views of new content, which is actually very helpful for developing it, and there's other perks in there too. In general though, if you like this content, it would mean a lot if you considered joining. No pressure though, one of the big values is that the content can be free. Anyway, back to those three finishing points. The lie is a lie by omission. I've been showing these state vectors with positive and negative real number values, but more generally, these components can be complex numbers. Now my hope is to motivate why that's the case in the follow-on video about physics. 
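Before coming back to that omitted detail, here is the whole procedure stitched together as a classical simulation; running it this way defeats the purpose of the algorithm, of course, but it lets you check the geometric reasoning, using n equal to 2 to the 20 and a made-up key index.

```python
import numpy as np

n = 2 ** 20        # about a million options, i.e. a 20-qubit readout
key = 123_456      # the secret value (made up for the simulation)

state = np.full(n, 1 / np.sqrt(n))        # start in the equal-balance state b

reps = round((np.pi / 4) * np.sqrt(n))    # optimal repetition count, 804 here
for _ in range(reps):
    state[key] *= -1                      # flip the sign of the key component
    state = 2 * state.mean() - state      # reflect about the balance direction b

probs = state ** 2
print("repetitions:", reps)
print("probability of reading out the key:", round(float(probs[key]), 6))
print("most likely readout:", int(np.argmax(probs)), "== key?", int(np.argmax(probs)) == key)
```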
The general idea is that any time you're working with waves, you care about both amplitude and phase, and a complex number is a really elegant way to encode an amplitude and phase together. So if you look at one of the components inside the state vector, the fuller picture for how to think about it is that it has some magnitude and some phase. The magnitude is the thing that you square to get the probability, and the phase is essentially a more general version of our whole discussion here around positive and negative values. Changing phase doesn't immediately affect the probabilities, but it does affect the state, which in turn affects how it gets processed and how it interacts with the world. Now needless to say, throwing in a bunch of complex numbers adds a lot of potential confusion to an already complicated topic. That's why I avoided it. But we were safe to ignore complex values for the purposes of Grover's algorithm, since very mercifully, during that algorithm, you only ever see positive and negative values. I want you to know, though, that this availability of complex values plays a crucial role in other quantum algorithms, like Shor's for factoring numbers. Next, even if you understood everything that I described with this whole algorithm, it's not easy to summarize where exactly the speedup came from. The fact that you begin by applying a certain operation to this equal balance state makes it very tempting to say that the speedup comes from parallelizing the operation over all possible inputs. But like I mentioned at the start, that summary, at least to me, just really doesn't feel right, and it definitely leads to misconceptions. As you now know, that step, on its own, does nothing to reveal the key value. I'll leave it up to your interpretation whether it feels apt to describe this first step as applying a function to many inputs in parallel, or if it feels better to say that the balance state is just its own new thing, and the function we have always applies to individual inputs one at a time, never multiple at once. It's just that those inputs are now a new kind of thing. In my view, for this algorithm, if you want the one-word summary for where the speedup comes from, I think a better choice would be Pythagoras. As a loose analogy, if you want to get from one corner of a unit square to the opposite, if you're limited to move only in the x and y directions, you have to walk two units. But if you're allowed to move diagonally, you can get there in the square root of two. And more generally, if you're up in n dimensions, you would have to walk n units to get from one corner of a cube to the opposite one if you can only move along the edges, but if you can go diagonally, you can get there in the square root of n. In the worldview of quantum mechanics, different observable states all represent perpendicular directions in some state space. So viewed from this framework, what the classical deterministic world looks like is one where you only have access to these pure coordinate directions. If you think about it, anytime you're doing computation, your computer is somehow walking through a series of different states that are available to it, and algorithmic runtime is all about understanding how many steps you have to walk in the space of all possible states. 
So from this quantum worldview, where classical states look like pure coordinate directions, the key difference with quantum computing is that you now also have available to you a panoply of additional diagonal directions that you can work with. Now, to be clear, the analogy is not too literal, you should take it with a grain of salt. It's not like runtime necessarily looks like a distance travelled through this particular state space. But it is true that if you follow the state vector throughout Grover's algorithm, what it's doing is slowly walking along a quarter circle arc from an initial condition to the target condition, tracing a path that would be entirely unavailable if you were limited to move only in pure coordinate directions. And the effect is to provide this square root-sized shortcut. And as the very last point before I let you go, regular viewers will know that one of my reasons for covering this topic is because of a promised analogy between this algorithm and something we covered last video, about two colliding blocks that can compute pi. In that video, we were also studying a point in a certain two-dimensional state space bouncing around a circle, and in fact, the series of bounces that it took is essentially identical to what we just saw here with Grover's algorithm. The story here is that when a physicist friend of mine, Adam Brown, saw the first version of that video, he had recently been reading up on Grover's algorithm and immediately realized that both processes were identical. Now I had this whole cockamamie plan for this video where I was going to explain quantum computing and Grover's algorithm in the context of that analogy, but it turned out to be just a terrible idea. The whole thing only really made sense if you already understood Grover's algorithm. So instead, now that you have seen both topics in isolation, what I'll do is leave up a rough outline of how the analogy looks. Think of this as an open-ended homework puzzle, where the task is to draw the connection for yourself. As an answer key of sorts, I will link to the very delightful paper by Adam Brown outlining precisely what that analogy looks like. If you want to learn more of the fundamentals of quantum computing, several years ago two very smart friends of mine, Andy Matuschak and Michael Nielsen, put together a really nice resource for learning the topic, offering a pretty unique approach to making sure that you actually remember it in the long term. To learn some of the fundamental quantum mechanics, Mithuna Yoganathan from the channel Looking Glass Universe has been putting together a very beginner-friendly course on the topic. She was actually the one to teach me how Grover's algorithm works many years ago, and honestly I owe a lot of the content of this video to many helpful conversations with her. And as a very last note to end on, one of the conversations I had while making this video was with Scott Aaronson, a very widely respected researcher and utterly delightful author on the topic. I recorded the Zoom call for notes, and there's one little piece of it which I just wanted to share with you. You know, I have this dream of writing a science fiction novel where I know it's going to be the climactic scene, where the heroes will be, you know, running Grover's algorithm to try to find this cryptographic key, right? The whole fate of the world will depend on whether they can find it, OK? The bad guys have surrounded their base, you know, they're bashing down the walls.
But Grover's algorithm is still running, and it has only like a 30 percent probability of giving you the solution if you measure. So the question is, like, do you measure now or do you let it run for another minute? Right? And if you measure now, you know, and you don't get it, then you've lost everything. Then you have to restart from the beginning. So this is not a plot that you could have with any classical algorithm. Right.