Bill Fefferman. Bill did his PhD at Caltech, then did postdocs at UMD and at Berkeley, and now he's a professor at the University of Chicago. And he has lots of contributions in quantum complexity theory, especially with this focus on trying to formalize the notion of quantum advantage and trying to prove a formal statement about it. Okay, thanks. You guys hear me, right? Okay, phenomenal. So yeah, it's certainly a pleasure to be here and my thanks to the organizers for the kind invitation to lecture in this really beautiful venue. I'm gonna tell you in the next few lectures about the theory of near-term quantum advantage. Let me also say that I really would love this to be interactive. So if I don't get comments, if I don't get questions, I'm just going to stop and wait for them. So please feel free to ask questions at any point. I really don't want to wait until the very end and then have hundreds of questions. Just ask them whenever you like. Interrupt me, no big deal. That's the goal of the lecture, okay? All right, so the starting point for this talk, which I think will surprise no one here, right, is that the first quantum advantage claims have now been made, and so we're living in a really exciting time where it's at least feasible, all right, that near-term quantum experiments that are being built in laboratories around the world are in principle capable of solving problems that can't be solved with a classical computer in a reasonable amount of time. Now we've seen these claims several times, right? First of course, very famously by Google in late 2019, who implemented their random quantum circuit sampling proposal on 53 superconducting qubits. More recently, a random circuit sampling experiment by USTC, and even very recently in the last few months, a second experiment by Google which sort of fixed some of the loopholes in their first experiment, a larger experiment that has better gate fidelities and so on. Then separately, I would say, there's a second proposal for quantum advantage, which goes back to Aaronson and Arkhipov, and it's called the Gaussian boson sampling proposal for quantum advantage. And this has now been implemented several times by a group at USTC. And in fact, this slide is a little bit out of date. There were some 2023 experiments by USTC, and there's a Canadian company called Xanadu which implemented Gaussian boson sampling in 2022. But the point is there's a whole lot of these and they're getting better. But in these lectures, the goal is really to understand these claims in a rigorous way. So I'm going to give the latest complexity theoretic arguments to believe that there might be some hardness here, that these experiments might indeed be classically hard. But then we're also going to think about the flip side of the same coin. We're going to think about classical simulation algorithms, which have also been improving. And so rather than just telling you right away, oh, this is quantum supremacy, we've solved all our problems, it's very clear that these experiments are solving some problem that's difficult for a classical computer, the goal of these lectures really is going to be to rigorously classify the difficulty of these experiments and to be very clear about the shortcomings of these experiments as well as the improvements that have been made over the past few years. Okay, great. And oh, by the way, one more thing I want to say.
In fact, both of these are different from an experimental point of view; these experiments are very different, these random circuit experiments on the one hand and then these Gaussian boson sampling experiments. But fundamentally, these are just two different special cases of what we call random quantum circuit sampling. So they have a lot of similarities, especially theoretical similarities, similarities in the theory we use to analyze these experiments, even though the experiments are quite different. Okay, but who cares, right? Why am I standing up here for the next five lectures telling you about incredible details about these experiments? Well, I think there are sort of two reasons to care. One is from a computer science point of view. And from a computer science perspective, the importance of experimental quantum advantage extends to trying to pinpoint the very foundations of computation, right? To answer questions like, what does computation even mean? That's because we can think about it as the first experimental violation of the so-called Extended Church-Turing Thesis. And what this violation would mean is that if we want to model computation by some theoretical universal model, right, then we must consider quantum mechanics. I think this is actually a really profound statement, right? This is saying that the classical Turing machine, the theoretical model of computation that has served us well in computer science for nearly 100 years, right, is not the right model of computation. It's not the right model of computation because fundamentally it's not taking into account the nature of physical reality, right? It doesn't address quantum mechanical processes. And if we were to address quantum mechanics, right, we would get exponential speed-ups over the classical Turing machine. And of course, this would complement the theoretical evidence that this Extended Church-Turing Thesis is false, starting from the early 90s with results like the Bernstein-Vazirani algorithm, Simon's algorithm, and of course, Shor's factoring algorithm. We've had theoretical evidence that quantum computers are a theoretical model that can get speed-ups over the classical Turing machine. But quantum advantage is the first time we'll see an experiment performing an exponential quantum speed-up, okay? So this has radical implications for the very foundations of computation. Okay, but for the physicists in the audience, I think this is also important from a physical perspective, right? Where we can think about experimental quantum advantage as validating quantum physics in a regime that we haven't seen before, right? Now, I think we all know that the exponential growth of the Hilbert space, the fact that if I want to describe to you the state of an n-qubit quantum system, in principle that would require specifying 2^n separate parameters, right? This is something we take for granted in quantum mechanics, but in fact, it's really quite counter-intuitive. I would even say it's one of the most counter-intuitive aspects of quantum mechanics, and we all know there are quite a few counter-intuitive aspects of quantum mechanics, so much so that, from my perspective, ever since the very foundations of quantum theory, which is now over 100 years ago, right?
People have asked essentially the same question from various different perspectives, which is, is this exponential description of a quantum state really necessary, or is there a succinct, polynomial-sized description hiding behind the scenes that we just haven't discovered yet, right? And so I think this experimental quantum advantage gives us a new limit in which to test physics, which we call the regime of high complexity, okay? But in this regime, there's this incredible difficulty that we're really learning to deal with for the first time in the context of these experiments, which is, how do we verify something that's intrinsically exponentially complex, and how can we do that as efficiently as possible, and preferably on a classical computer if we don't trust our quantum computer? And in fact, I think this is what people often miss about quantum advantage: I think this might be the enduring legacy of the entire subject area, which is that at some point, I'm an optimist, so I think that we'll finally pinpoint that quantum advantage has been achieved, right? By some experiment, not today's experiment, but some future experiment. But even after that, even when everyone agrees that quantum advantage has been achieved, I think these tools that we're building to verify that quantum advantage has been achieved and to benchmark these near-term, high-complexity quantum systems will be with us for the indefinite future, okay? Okay, great, so now, what are we trying to achieve? Okay, so what's the ideal goal of quantum advantage? Well, I think we wanna find a problem that has three properties. So first, we want the problem to be solved efficiently, in polynomial time, using a near-term quantum experiment. Now, near-term is sort of the key word here, right? You know, the entire premise is that we don't want to assume things like error correction or fault tolerance, which, while in principle they might be feasible several years from now, right, are sort of things that are well beyond the current state of the art experimentally. And so we're really trying to show a rigorous advantage, you know, in the very near term, okay? Then we want a problem that's classically hard to solve. And here we have to be a little careful. We wanna hold ourselves to a really high standard, okay? It's not sufficient to say, you know, the best classical algorithm that we know of, like a tensor network method, running on the fastest classical supercomputer that Google has at the moment doesn't work. That's nice, right? That's impressive. But we want more than that, okay? We want to prove rigorously that this problem cannot be solved in polynomial time by any classical algorithm as the system scales, okay? So, you know, this requires complexity theory, right? Complexity theory is the language by which we express these sorts of algorithm-independent lower bounds. And so it's not optional. We're gonna talk about a lot of complexity classes, but this is required if we want to make such a strong statement, okay? And then finally, we want the solution to the problem to be efficiently verified with a classical computer, with minimal trust in the experiment. You know, this minimal trust is really, really important. And it's not because we don't trust our experimental colleagues, right? Generally speaking, we do. But, you know, we want to think of quantum advantage as a test of quantum mechanics.
And if we're thinking about it that way, it's really important that we then don't go and make some very strong assumption, for example, about the noise model of the experiment to then prove quantum advantage, right? From our perspective, you know, that's like making a very strong assumption about quantum physics to try to test quantum physics, right? It gets circular. And if you're not careful, essentially meaningless, okay? So we're going to be operating in a very paranoid regime where we don't want to assume too much about the nature of the experiment. And yet we want to make a very rigorous claim about the power of that experiment. This is a very difficult and very subtle goal. Yeah, yeah, please, please. We'll get there. I promise you you'll be satisfied by the end of this talk. But yes, absolutely. Okay, now, like so many other things in life, there's a gigantic gap between the ideal goal that we might hope to achieve on the one hand and what I'll call the current goal, the thing that I think may actually be within our reach with near-term experiments. So let me give you a little bit of the status quo, okay? Before I even get into the technical detail of any of this, let me tell you what I think has actually been achieved and what we might be in range of achieving in another year or two. Okay, so current quantum advantage experiments solve sampling problems, in which the goal, that is, the problem that they're trying to solve, is to sample from a complicated distribution, okay? The distribution is specified implicitly by a quantum circuit followed by measurement, okay? Now, we have rigorous but imperfect, not completely waterproof, evidence that these problems cannot be solved classically in polynomial time as the system size scales, okay? And I'll spend a lot of time talking about this evidence in these lectures. But one thing I want to get out right away and be very frank about is that there's a giant gap between, on the one hand, these complexity theoretic arguments that live in asymptopia, that are fundamentally about the hardness of simulating a system that scales, right, whose input size is growing, right, and then the actual experiments, which are fundamentally not scalable, okay? And they're not scalable for at least two reasons. One, the current quantum advantage experiments require exponential time on a classical supercomputer to verify the answer to the problem, okay? So in principle, if we made the system size larger and larger and larger, at some point, even if you had a really good experiment, you know, the gate fidelities were really, really high and the qubits were really great, and the number of qubits was very high as well, we wouldn't actually know how to check that this experiment was solving the right problem, okay? The second thing is that uncorrected noise, which is really a defining characteristic of this near-term era, the fact that we're not applying fault tolerance or error correction and so on, gets worse and worse as the system size grows, right? So if you take the system, like, say, Google's current 53-qubit experiment, or their 60-qubit one, whatever it is, and you add qubits without proportionally decreasing the error rate, okay, the quantum signal is going to be exponentially diminished, okay, and eventually we're not even going to be able to find the signal. We're not gonna be able to measure it in any efficient way, all right?
And so this is a fundamental gap, and what I would say is the current hope, which is very different from the ideal hope I outlined in the last slide, is to find what I call a Goldilocks system size. Now, if you know the Goldilocks story, you'll understand immediately what I'm talking about. If you don't, let me summarize it for you very briefly. There's this little girl, right? And, I think somewhat inadvisably, she goes into the woods, right? And she meets her friends, the three bears, okay? One is a big bear, and one is a medium bear, and one is a small bear. And then somehow the moral of this story is that the medium bear is always the right choice, okay? So this is exactly what we're trying to achieve here in quantum advantage. On the one hand, we need the quantum system, the number of qubits, to be large enough so that the system is classically challenging to simulate, all right? So we need it to be big enough, but on the other hand, it can't be too large. If it's too large, the effects of noise overwhelm us; we can't find the signal, okay? And furthermore, we can't verify the experiment, okay? So even if the signal was there, we wouldn't know how to tell that it was the right signal in the first place, okay? So this is very, very important in the near-term era. I'll be talking a lot about this. Now, what I'll say is that at the moment there's sort of measured optimism that current quantum experiments have reached this Goldilocks system size, okay? Like Google's recent experiment, for example. But this is very much a work in progress, okay? Classical simulation algorithms continually improve. Classical computers get faster, okay, and are able to simulate larger and larger quantum systems. But also the quantum experiments are getting better, right? And so if there's one thing that you get out of this, and you decide to fall asleep right after this slide, the thing I really want to get across is that so much is still unknown, okay? This is a field that has been dominated, I think rather unfortunately, by very categorical statements. Statements like, okay, we've now experimentally implemented a quantum system that will take thousands of years to simulate. Or, on the other hand, we now have a new classical algorithm that takes advantage of some noise and therefore all these quantum advantage experiments are completely bunk. And what I want to get across right away is that neither of these two claims, which you hear again and again and again from many different people, is completely correct, okay? And if there's nothing else you get out of this, it's that this discussion is incredibly nuanced. And in many ways, when you say that quantum advantage has been achieved or has not been achieved, so much depends on what you even mean by quantum advantage. And we're gonna talk a lot about that in these lectures. Okay, now I'm gonna start with a little bit of technical material. Can I ask, yeah, questions are great at this point, please. I wonder if the regime is so different. Any comments on that? Yeah, I mean, yes, I think your comment is exactly what I'm saying, in fact, right? There are sort of two things. On the one hand, we would hope that there are theoretical arguments that are resistant, in some sense, to changes in the current experiment, right? And I'll describe those to you, right? But fundamentally, at the end of the day, those are not going to be enough.
Just some argument that says, you know, there's no polynomial time algorithm for simulating a system that's scaling, right? You know, an infinitely sized system. That's never going to be enough. I think of it as necessary in some sense, but not sufficient, right? At the end of the day, what we really care about when, say, Google implements their 53- or 70-qubit experiment is how hard is it, in human time, for the largest supercomputer that Google has to do something similar? That's not fundamentally a question that complexity theory can directly answer. It's a question that complexity theory can give us intuition about. But it's not really a question that it answers directly, right? In some sense, we don't really care if the algorithm that the classical supercomputer is using to simulate the system is an exponential time algorithm if the system is not scaling. We care about the time on the wall clock, right? How many seconds did it take, and was that comparable to the experiment? So I would say all of these things are balanced. It's not to say that the theory is not important. You know, I spent much of the last 10 years working this theory out, so clearly I think it's important. I just think it gives us a starting point for our belief that these experiments might be difficult, but it's not the end of the story. And this question about the wall clock time it takes for a supercomputer to simulate these systems, that's something that's going to change wildly with the technology, right? Yeah, please. Yeah. Yeah. Achieve the goal you set? Yeah, I mean, good. If you talk about fault tolerance, then yes, you can make a lot of this discussion much easier, but then you could also say, well, haven't we already achieved quantum advantage with Shor's algorithm? Right? You know, I think that's a compelling point. So in some sense, the whole point of this is that we're not trying for error correction. Of course, error correction would make our life much, much easier, but then we would maybe have already been done before we started, back in the 90s, right? Can I get more questions about this? It's funny, this is not a technical introduction, but I think it resolves a lot of issues that people are very confused about. So, yes, yeah. Okay, I'm not sure I completely understand the question, but what I'm sensing is that the spirit of the question is, how do we know when we're done with this, right? How do we know when the goal has been achieved? I think there's a hope in the community that eventually, when the systems get large enough and noiseless enough, there's sort of an exponential curve, and that eventually, no matter how much engineering is going to be put into building faster classical supercomputers and better algorithms for simulation, the quantum computer will just pull away. That's the hope; we'll see what happens. Certainly the current experiments have not reached that point, for sure. Can I get more questions, though? These are really good and important. Yeah, why wouldn't we prefer to have classical machines doing all this stuff? Well, I guess it's a fair question, but would any of the motivations that I outlined about testing quantum mechanics, would any of that be true with a classical computer? Yeah, probably not, right? It's certainly not gonna be a test of quantum mechanics. It depends.
I would say though that, in fairness, we've been learning a huge amount from these experiments about new classical simulation techniques. That's actually pretty awesome. I said that there was this sort of enduring legacy, which is about benchmarking or verifying these systems. I think another legacy that I didn't mention might be that we're learning a lot about how to classically simulate quantum systems, things that we didn't know before these experiments, in part inspired by trying to prove that these experiments are wrong. But that's pretty awesome, and that's something that's going to be useful well after the quantum advantage experiments have either been successful or whatever the endpoint is. That's really, really interesting. These are good questions, can we keep them coming? In the back. Yeah, if I knew, I'd be much happier, right? It's not even completely clear that quantum computers will pull ahead. I happen to be a believer. I think there will be a point at which they will. But it's not something that anything from these slides will tell you, right? Because no one really knows that. We're learning a lot about this, and there's so much that still has to be discovered. Anything else? Oh yeah, go ahead. You know what? The next four lectures are going to be devoted to that question. You're going to be tired of hearing answers to that question. Yeah, yes. Uh-huh. Yeah. The bottleneck is in absolutely every aspect of verification, and we'll talk about it. So I think now I'm hearing more and more technical questions, so let's get on with the talk, okay? But they're good questions. So what do I mean by random circuit sampling, first of all, okay? Okay, so what we want to do is generate a quantum circuit C on n qubits on a 2D lattice, with d layers of Haar-random nearest-neighbor gates, where d is some parameter which can scale with the system size in some way that I'll talk about. So in other words, each layer might look something like this picture here, where the circles are qubits and the edges are Haar-random nearest-neighbor two-qubit gates. This could be the first layer of gates, and the second layer might look different. So for example, the gates might interact with horizontal neighbors rather than vertical neighbors in the next layer, right? But they're still nearest neighbor, right? We start with the all-zero input state, where there are n qubits and each of the qubits is initialized to zero. We apply d of these different layers, and then we measure all n qubits in the computational basis. We think about this as getting a sample from a distribution I'm gonna call D sub C over n-bit strings, okay? And that's the hard problem. That's it, okay? The hard problem is to sample from the output distribution D sub C of this random quantum circuit. Yeah, yeah, yeah. Ah, yeah, good question, good question. Yes, at the end of the day, the theory is going to demand Haar randomness, and I don't know that any sort of fixed moment is going to help us, any sort of k-design for any k. But the experiment is not gonna be Haar random at all. It's going to be chosen from some discrete distribution over gates. That's one gap between the experiment and the theory. It's not at all the sort of leading-order gap, okay? So it's not something that I'm particularly worried about, but strictly speaking, in the theory we're always going to be assuming Haar-random gates.
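To make the sampling task concrete, here is a minimal brute-force sketch in Python. It is not from the lecture: the 1D brick-work layout (rather than the 2D lattice on the slide) and all function names are illustrative assumptions, and it simulates the full statevector, so it only runs for tiny n. The whole premise of quantum advantage is that nothing like this should work at scale.

```python
import numpy as np

def haar_random_unitary(dim, rng):
    # QR decomposition of a complex Gaussian matrix, with the phases of R's
    # diagonal fixed, gives an exactly Haar-distributed unitary.
    z = (rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))

def apply_two_qubit_gate(state, gate, i, j, n):
    # Contract a 4x4 gate onto qubits i < j of an n-qubit statevector.
    psi = state.reshape([2] * n)
    psi = np.tensordot(gate.reshape(2, 2, 2, 2), psi, axes=[[2, 3], [i, j]])
    psi = np.moveaxis(psi, [0, 1], [i, j])  # put the output legs back in place
    return psi.reshape(-1)

def sample_random_circuit(n=6, depth=8, seed=0):
    rng = np.random.default_rng(seed)
    state = np.zeros(2 ** n, dtype=complex)
    state[0] = 1.0  # the all-zero input state |0...0>
    for layer in range(depth):
        # Brick-work pattern of Haar-random nearest-neighbor gates on a 1D chain.
        for i in range(layer % 2, n - 1, 2):
            state = apply_two_qubit_gate(state, haar_random_unitary(4, rng), i, i + 1, n)
    # Measuring all n qubits in the computational basis = one sample from D_C.
    probs = np.abs(state) ** 2
    outcome = rng.choice(2 ** n, p=probs / probs.sum())
    return format(int(outcome), f"0{n}b")

if __name__ == "__main__":
    print(sample_random_circuit())
```

Swapping in a discrete gate set, as a real experiment would, only changes haar_random_unitary; the 2^n-entry statevector is where a naive classical simulation gives out as n grows.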
We probably can get away with other distributions, but we haven't worked that hard to generalize it. That would be a great project, in fact, yeah. Yeah, yeah. Well, okay, so first of all, yeah, this is a great question, I'm happy to take it. So the first thing I want to say is that it's true. One thing you could do, of course, is take whatever circuit you want, choose it randomly, and then compile it into two-qubit gates, right, using some universal gate set, right? The problem, though, is that we really want to preserve the distribution that we're considering here. So if you take Haar-random two-qubit gates and you choose your circuit and so on and then you compile it, right, it's not going to be a random circuit with respect to that gate set anymore, right? Yeah. Precisely, fixed architecture, right. Now, we definitely will be considering a fixed architecture, just like here, and then choosing the gates according to some distribution. Again, we're going to be assuming, for theoretical reasons, that it's a Haar-random two-qubit gate. In some cases, that's going to be important. In some cases, it's not going to be that important, and it could be weakened quite a bit. Yeah, sure. Yeah, okay, for now, let's say uniformly, sure. But this is not going to come up, to be clear. This is not even very important, right? The important thing is that we have some continuous distribution over our gates, and all my arguments are going to work with respect to almost any of the choices that you could imagine, okay? You'll see exactly why in a moment, okay? So the precise way in which I sample from the Haar measure is not going to matter. Okay, great, now this has been implemented many times now. The first, oops, yeah. Yes, uh-huh. Yeah, yeah. You know, again, the gate set question is not that important, as long as you avoid sort of easy gate sets, like Cliffords. One thing that I'm sure you know is that if you choose your circuit randomly, according to the distribution I just outlined, the chance that you get an easy circuit, like a Clifford circuit, is incredibly low, right? And that's why we're happy with the hardness, right? Now, there's another question you were asking, I think, which is: could you choose another gate set and still get randomness? I think it's very similar to what he was saying and other people were saying. The answer is generally yes. Our arguments are not going to be super specific to the actual gate set that's going to be applied, but it's not going to work with all gate sets, and it's a little difficult for me to tell you which gate sets work and which don't until I tell you the arguments, okay? And then I think it would be pretty clear, actually. But I'm happy to keep answering questions. Yeah, yeah, please. Sorry, can you say it a little louder? I'm not sure, is it what? Is the goal to approximate? Ah, ah, right, okay, good. Is the goal to approximate a truly Haar-random n-qubit unitary? Very interesting question, actually. Originally, the answer was certainly yes, okay? All the theory was saying was, look, wouldn't it be great if we could implement an n-qubit Haar-random unitary? But of course, doing that is going to have exponential circuit complexity, and so we're not going to be able to do it. And that was the spirit of this entire endeavor. That was the original reason that people might have thought that this was a hard problem.
What we're going to see is that the actual hardness arguments, the way the proofs work, don't really rely so much on any sort of approximation to the Haar measure in the n-qubit sense, right? But maybe the right way to think about it intuitively is to think that this is scrambling very quickly, that we're trying to implement something that's a little bit like a two-design, or an approximate two-design, or something like that. And so if you want to think about it that way, go ahead. It turns out to be a little more subtle than that, and a lot of the theory will not actually require having implemented an approximate unitary design of any sort, but that's surprising. That's not something that was completely obvious. Yeah, do you have any, yeah, follow-up? Right, right, precisely, precisely. So in fact, here's what's sort of interesting. The depth is crucially important if we care about t-designs, as you were saying. So for example, we start to see approximate two-designs at around square root n depth. And for many, many years, people thought that was probably where the hardness would start, for precisely the reason you're saying: because we want to implement an approximate two-design, something like that, right? Here, when I say square root n, I really mean n to the one over the spatial dimension. So if it's a 2D system, then it's like square root n. It turns out, though, that a lot of the arguments, in fact almost all the arguments that we have for trying to be rigorous about how hard these systems are, work even at log n depth in two dimensions. So that's actually well before we have this two-design property or anything like it, okay? But that might be because there are still conjectures and we haven't proven everything. It could be that when we do prove everything, we realize we require larger depth. That's perfectly reasonable. The reason we think log n might suffice at the moment is precisely because we haven't proven everything we need to prove. Perfectly possible. Let me take maybe one more question if you have it, and then I'm gonna keep going. Any more questions? Yes? Yes. Is that square root n? Yes. Oh yeah, that's actually a really great question. I'm gonna spend a little time talking about that. So she just asked: log n or square root n depth, is that what's experimentally observed? Aha. What does that even mean, right? Hold on, this is a really good point. What does n really mean? n is the input size, right? The number of qubits in this case, right? And when we say square root n, like in these two-design arguments or whatever, d is square root n, that means as n gets very, very large, we need square root n depth. Is n getting really large here? In fact, this is one of the biggest issues in the theory of quantum computing and the gap between the theory and the practice. n is 53. What does O of square root n mean? Does it mean two? Does it mean five? I have no idea what it means, right? So you have to be a little careful. Now, when we talk about the theory argument, we can absolutely say, and that's what I was just discussing, okay, if we're theorists, we put on our theory hats and we talk about asymptotics, then we can say: does it require square root n, or does it require log n? But when we talk about the actual experiment, it's gonna be really hard to make these sorts of claims. What we really want, at the end of the day, is for the experiment at finite size to generate enough entanglement so that tensor network algorithms, for example, don't work. Is that square root n?
Is that log n? I don't know, these experiments aren't scaling, you tell me. Yeah, but numerically, it doesn't even mean anything. That's the whole point, right? Oh, because this is not square root n as in the square root of 53, that's not what we mean. Never what we mean. We mean O of square root n, right? Some constant times square root n, but at a fixed size the whole thing is just a constant. Now, it is true, and I'm giving you a hard time, but I don't mean to. It's true that when you talk to experimentalists, they'll say that you need to scramble quickly enough, in particular, to prevent certain algorithms like tensor network algorithms. Generally, a good heuristic is something like literally the square root of the system size, but the theory doesn't really correspond to that so well, and it's a really important point to realize. Is that clear to everyone? Yes, there is, okay, it's a good question. You definitely can compute the constants, but then the problem is it depends on the implementation of the system: the actual constant depends on what the gate set is, what the dimensionality of the system is, all of these things, and you change one parameter and the constant could be different. So yes, in principle, you could work it out. You could say, I worked it out for Google's current gate set. I think it's actually gonna be somewhat depressing; it's gonna be larger than you'd want, but I can't say for sure, I don't really do this. But fundamentally, I think the really important point is that the theory is working in one regime, which is asymptotics. The experiment is really working in a different regime. There's always going to be some hand waving when we're trying to bridge these gaps. And that's a great example of it, actually. We want the depth to be as large as possible so that we don't lose the signal. That's certainly a goal, right? We want the depth as large as possible so that current tensor network algorithms don't work. We want the depth to be as large as possible so that we generate enough entanglement, but not too large, because if it's too large, the noise hurts us badly. But these are all questions that I think complexity theory doesn't directly answer. One more question, these are good ones, and then I promised I'd go on. Is it good? Okay, awesome. Okay, so now Google's random circuit sampling experiment is going to be the focus of this talk, but I wanna say a little bit about boson sampling, because this is another quantum advantage experiment you hear a lot about. This is a quantum optics experiment; the experimental architecture is very different. What do we do? We start with an n-photon state in m modes, where m is something like quadratically greater than n. We call that a Fock state. So it's just a state in which the first n modes have exactly one photon, and the rest of the modes have no photons. And then we evolve under a Haar-random linear optical unitary composed of beam splitters and phase shifters. If you've seen optics before, you'll know what this means. If you haven't, these beam splitters and phase shifters are the rough equivalent of gates in a qubit system. Then we take photon-number-resolving measurements in each mode. So after you apply the unitary, you ask: how many photons are in mode one? How many photons are in mode two, and so on? That's your measurement. That gives you a sample, okay? That's the hard problem. It's very similar actually to random circuits in one sense, which is that we're starting in some simple initial state.
We're applying a random unitary of some sort, right, which has some circuit decomposition in some way. And then we take some measurement to get a sample. This is true for superconducting random circuits. This is true for bosonic random circuits. Okay, now, recent experiments use a slightly tweaked idea where the input state is changed. They use something called a Gaussian input state rather than the Fock state that I described. For the purposes of this talk, since we're not going to talk too much about boson sampling, it's not that important. But what I wanted to say is that this has now been implemented several times. For example, by Xanadu with 216 modes and as many as 219 photons. And the one thing I wanna tell you about this is you can already see that there's a giant gap between the theory and the experiment. The theory is working in the regime in which the number of modes m is greater than or equal to n squared, where n is the number of photons. And you can see the number of photons and the number of modes clearly do not have that relationship in this experiment. So a major open question in the theory of these experiments is: how hard, even when we scale, is an experiment with m roughly equal to n? If the number of modes is roughly equal to the number of photons, the answer is we don't know, okay? Okay, great. So let me tell you the rough agenda. I want to first talk about the most basic, most foundational hardness arguments that you can possibly have, which are these hardness of quantum sampling arguments. It's gonna take us, I think, probably two lectures to get through. Then I want to talk to you about hardness of benchmarks. So this is much closer to how the experiment is actually being verified: running certain benchmarks and asking, is it hard to score larger than some predefined score on these benchmarks? But we also know a lot less about this, okay? So proportionally, this will be a shorter section, okay? And then I'm going to change gears. So in the first two lectures, I'll really talk about what I'd call good news for the experiment: potential hardness arguments, reasons to believe that there's a quantum signal in these near-term experiments. And then I'll shift gears and talk about easiness arguments, right? So I'll tell you about a classical algorithm for one particularly important benchmark called XEB, the linear cross-entropy benchmark. And then finally, in the last part of the lectures, assuming I get there, I want to talk about uncorrected noise, and the fact that with uncorrected noise, we can sometimes have surprisingly clever algorithms that are made to use this noise, to take advantage of this noise, and the system ends up being a lot easier than we might have expected if we didn't take into account the noise. Until this last step, though, I'm not talking about noise at all in the system. The first three lectures, to be really clear, will simply talk about the hardness of the ideal or noiseless circuits. And then I want to talk a lot about noise in this last lecture, okay? All right, great, questions before I go on. Yeah, yeah. No, no, no, that would be true in any current boson sampling experiment, even if we tried to implement it with Fock states. The main difference is the input state.
So rather than what we call a Fock state, which is exactly what I described, the simple input state in which you have one photon in each of the first n modes and then zero photons in the rest of the modes, which, if m is n squared, is like n squared minus n modes, right? That was the proposal due to Aaronson and Arkhipov. Instead, in Gaussian boson sampling, the input state is something else called a Gaussian state. It's a superposition over photon numbers. So it's not like you have a deterministic number of photons in each mode, like one or zero. And that's the main difference. On the other hand, the theory of these two experiments is strikingly similar, and I have work trying to bridge the divide, showing that the input state doesn't affect the theory very much. It might affect the practice very much, you know, the classical simulation algorithms and things like that. Good question, though. Anything else? Is there a? Yeah. Yeah, yeah. I mean, stay tuned. Yeah, yeah. Good. Okay, how much time do I have left, at least? Ten, fifteen minutes. Okay, I'm gonna start and we'll probably get like halfway into it, okay. So now, I wanna talk about the hardness of worst-case quantum circuit sampling, okay. So there are no random circuits here. We're just talking about, you know, how hard is it to implement a certain fixed or arbitrary quantum circuit and then measure from its output distribution to get a sample, okay. Okay, now, let's be more formal. So what do we mean by quantum sampling? Well, current quantum advantage experiments sample from the output distribution of a quantum circuit. In other words, on input C, that's the description of the quantum circuit, the gates in that circuit, the experiment is supposed to run C on the all-zero state and then measure all n qubits in the computational basis to get a sample, an n-bit string y, from the distribution we called D sub C, okay. That's the problem we're studying. Now, a very important definition, and it's very simple. We're going to call the output probability of this circuit C, we're gonna label it p_y(C), exactly this quantity: it's the probability that if we start in the all-zero state, we apply the circuit, and we measure all n qubits, we get y, okay. That's p_y(C). We'll often be talking about p_0(C), which is the probability that when we measure, we get all zeros, okay. Now, the first goal is to prove the impossibility of an efficient classical sampler algorithm S that samples from the same distribution as the quantum circuit, okay. So we're gonna try to prove that no such classical sampler algorithm exists. That's the goal. So to be a little bit more formal, what is this algorithm S? Well, it should have two inputs, right. The first input is going to be the classical description of the circuit, which specifies what the output distribution is that it's supposed to sample from. And then the second input is going to be a sequence of random coin flips. We're gonna call that r. That's just a bit string chosen uniformly at random, okay. And then given that input C and the random coin flips r, the algorithm should output an n-bit string y with the quantum probability, okay, with the probability p_y(C), okay. So this is a classical algorithm that is essentially simulating an idealized version of the quantum experiment, and our goal is to show that this classical algorithm does not exist. That's the goal. Questions about this?
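Just to have the definitions from this part written down in one place, in symbols (a restatement, nothing new; m here is just notation for the number of random coin flips in r):

```latex
\begin{gather*}
p_y(C) \;=\; \bigl|\langle y \,|\, C \,|\, 0^n \rangle\bigr|^2 , \qquad y \in \{0,1\}^n ,\\
\text{and the sampler must satisfy}\quad
\Pr_{r \sim \{0,1\}^m}\bigl[\, S(C, r) = y \,\bigr] \;=\; p_y(C) \quad \text{for every } y \in \{0,1\}^n .
\end{gather*}
```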
It's pretty clear? Great. Okay, how do we analyze this? Well, the starting point is actually really simple. I'm going to describe two problems to you. They're both very easy to describe. I'll call one the classical sum problem and the other the quantum sum problem. Here's the classical sum problem. We're given as input a classical circuit that computes a Boolean function f, and so f maps n-bit strings to zero or one. That's a Boolean function in the standard sense. And then the goal is to compute the sum over all inputs x, that's all possible n-bit strings, of f(x). So help me out. How hard is this problem and why? Is it, you know, is it BQP? Is it NP? Is it P to the NP? Is it co-NP? No, no. How hard is it? Yeah. #P-hard, absolutely. How do you see it? Yeah, exactly. So the way I see it is that if f is a Boolean formula, right, with n variables, then this sum over x of f(x) is counting the number of satisfying assignments to the Boolean formula, okay? So this is clearly as hard as #P. Okay, now that's sort of the clear description of the complexity of this problem. All right, now I'm gonna change it up a little bit. We're gonna call this the quantum sum problem, okay? And by the way, when I say quantum sum here, there's nothing actually intrinsically quantum about this. I'm just being sort of provocative, okay? So here's the problem. It's exactly the same thing, but now the input function g is going to map to plus or minus one rather than zero or one. That's the only difference, okay? You might think this makes no difference at all, and we'll see that it matters a lot. But we ask for exactly the same thing, which is the sum over all inputs x of g(x), okay? How hard again? Yeah, #P. We're just relabeling the outputs. This is not a fundamental difference in that sense. So the worst-case complexity of this problem is clearly still #P-hard. Yes, in both cases, I just want the function to be efficiently computable. That's all I mean. So I'm thinking about a classical circuit; it's just a poly-sized circuit. And by the way, why do I do that? I do that in particular because I want the input size to be polynomial in n, the number of variables. I don't want it to be exponential in n, like the truth table, right? That's the only reason the input is a classical circuit. I just want a succinct description of this function, because otherwise, naively, someone could say, oh wait a second, isn't the input to this function already exponential in n? So we don't want that. Other than that, it's not that important. Okay, great. So both are #P-hard to exactly compute, since they're at least as hard as counting the number of satisfying assignments to a Boolean formula. All right, now, so far this discussion is uninteresting, in my opinion. But things get much more interesting when we relax the problem and consider approximations. Okay, so here's what I'm gonna consider now. I'm gonna weaken the task a little bit. I'm gonna ask for a multiplicative approximation of this sum, so the input is the same: a Boolean function specified by a circuit. But now, rather than asking you to exactly compute the sum over all inputs, I'm going to ask you to compute a number alpha that's between one minus epsilon times the sum and one plus epsilon times the sum, okay? So rather than exactly computing the sum, I'm asking for a multiplicative estimate.
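In symbols, here are the two problems and the relaxed task just described; writing the multiplicative condition as |α − Z| ≤ ε|Z| is simply a form that also covers sums that can be negative:

```latex
\begin{gather*}
\text{Classical sum: given a poly-size circuit for } f:\{0,1\}^n \to \{0,1\},\ \text{compute } Z_f = \sum_{x \in \{0,1\}^n} f(x).\\
\text{Quantum sum: given a poly-size circuit for } g:\{0,1\}^n \to \{+1,-1\},\ \text{compute } Z_g = \sum_{x \in \{0,1\}^n} g(x).\\
\text{Approximate version: output } \alpha \text{ with } |\alpha - Z| \le \varepsilon\,|Z|,\ \text{i.e. } (1-\varepsilon)Z \le \alpha \le (1+\varepsilon)Z \text{ when } Z \ge 0,\ \text{for } \varepsilon = 1/\mathrm{poly}(n).
\end{gather*}
```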
It's multiplicative because, if I multiply this out, the error scales directly with the size of the quantity itself, the sum itself, okay? It would be like epsilon times the sum. Okay, now, here's a really famous result in complexity theory. It's called Stockmeyer's algorithm, and what it says is the following. It says that there's an algorithm for this problem, this classical approximate sum problem, that runs in time polynomial in n and one over epsilon, okay? But it's not an efficient algorithm in the sense that it's not something we're going to run on our laptop, right? Because it uses an NP oracle, okay? It uses the ability to solve an NP-complete problem in unit time, right? But the point of this, and the reason it's going to be very interesting to us, is that even though Stockmeyer's algorithm is talking about an algorithm that's sort of fantastical, it doesn't really exist, okay, it turns out that that same sort of ridiculous resource that we're using, an NP oracle, we have complexity theoretic reason to believe is not enough to solve the exact sum problem, right? So in other words, even though we're using this crazy NP oracle, it's still quite interesting that that's the only thing we need to be able to multiplicatively approximate this sum, because we know, unless the polynomial hierarchy (PH) collapses, which is sort of a complexity theorist's way of saying it's incredibly unlikely, unless PH collapses, we don't expect to have such an algorithm for #P-hard problems, okay? There shouldn't be an algorithm, even with an NP oracle, for a #P-hard problem. That means in particular that this classical approximate sum problem should be strictly easier than #P, the exact case, unless PH collapses. Okay, sorry, questions about that before I go on? Okay, now, here's how we're going to use this. I call this Consequence One, okay? It's the direct application of Stockmeyer's algorithm, and it's going to be how we use this in the proof of sampling hardness. So suppose a classical sampler does exist. Remember, that's the algorithm that takes as input the quantum circuit and then outputs a sample from the distribution of the quantum circuit. Well, let's assume that it does exist. Then outputting a multiplicative estimate for the probability of any outcome y, the probability that that sampler outputs y, right, is strictly easier than #P. Does anyone see why that is? So I have a classical sampler, yes. Ah, yes, because you can write the probability that that classical sampler outputs a certain outcome y as a classical sum, and therefore approximating the probability that it outputs that particular outcome is an instance of this classical approximate sum problem. So let's see how we do that. It's because the output probability of the classical sampler is fundamentally a classical sum problem. How do I do this? Well, I can define a Boolean function F, which maps to zero or one, so it's a classical sum problem, and it's a function of the randomness r, right, and we're going to define the value of F to be one if, on that input r, the sampler outputs y, and otherwise it's gonna be zero. Okay, and now it's not hard to see at all that the probability, over the choice of random coins, that the sampler outputs a certain outcome y is directly proportional to this sum, the sum over all r of F(r), right? Now, this is a question for you, so here's the thing.
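For reference, here is that "directly proportional" statement written out with the normalization made explicit; the discussion that follows is exactly about why the 2^{-m} factor is harmless:

```latex
\begin{gather*}
F_y(r) \;=\; \begin{cases} 1 & \text{if } S(C,r) = y \\ 0 & \text{otherwise,} \end{cases}
\qquad
\Pr_{r \sim \{0,1\}^m}\bigl[\, S(C,r) = y \,\bigr]
\;=\; \frac{1}{2^{m}} \sum_{r \in \{0,1\}^m} F_y(r) .
\end{gather*}
```

Since 2^m is a known constant once the circuit and sampler are fixed, a multiplicative estimate of the left-hand side is exactly a multiplicative estimate of the sum on the right, which Stockmeyer's algorithm delivers in poly(n, 1/ε) time with an NP oracle, given the code of S.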
I said directly proportional, and I was purposely sort of dodging the fact that it's not exactly the sum; it's the sum divided by this exponential term, two to the length of r, the number of coins that we're flipping; that's what I mean by |r|, right? Now, why does that not matter for me? Why am I still okay? What kind of error am I considering, and why does it matter? Think about it for a second. Well, hold on, what's the worry? The worry is that, you know, what I really want to get out of this sampler is this sum, the sum over r of F(r), right? I want to obtain a multiplicative approximation to that, right? But if the error were additive rather than multiplicative, right, like say an additive one-over-polynomial quantity, right, then the worry might be that if I can only estimate this output probability to within an additive one-over-polynomial error, that might not be so interesting. Why might it not be so interesting? Ah, yes, because then you could just output zero; this additive error vastly swamps the quantity that you're trying to estimate. But that's not the case, that doesn't happen, because we're considering multiplicative error. How does it help me, though? Come on, how does that help me? The fact that I'm considering, say, epsilon being inverse polynomial, right, multiplicative error, how does it help me? The problem that you just described, the fact that the error can overwhelm the signal, doesn't happen because we're considering multiplicative error. That's very important, but can someone say a little bit more about that, please? Precisely, precisely. Or another way to say that, the way I like to think about it, is that because we're considering multiplicative errors, even though epsilon is one over polynomial, if the size of the quantity we're trying to estimate is really small, then the error scales with that size, right? So in particular, the really amazing thing about Stockmeyer's algorithm is that even though the quantity itself can be exponentially small, the error never overwhelms the signal, by definition, because this is multiplicative error, okay? And that's key here. Okay, oh yes, five minutes, all right, good, awesome. Okay, now, things are very different when we consider the quantum approximate sum problem, okay? This is gonna be the whole point of these results, in fact, all of these results. So what do I mean by that? I'm gonna do exactly the same thing. I'm going to consider the quantum variant of this problem. By that, I simply mean I'm gonna change the output from zero-one to plus-minus one. The rest is the same: still a multiplicative error, one over poly is what we're thinking about for epsilon, and so on. Here's what I'm gonna claim, though. Unlike the classical sum problem, this problem is exactly as hard as the exact sum itself. Okay, so I think it's actually sort of surprising: when we thought about the exact complexity, the plus-minus-one change meant nothing. But when we think about approximate complexity, it seems to mean everything, okay? And the intuition here is very simple. It's just that, you know, the main difference is that we have these sort of large cancellations in plus-minus-one sums that we don't seem to have in zero-one sums. Okay, that's an intuition, it's not a proof. Let me try to prove it in five minutes, okay? Probably three minutes by now, but here it goes. Computer scientists would call this a binary search and padding argument, okay? So what do I mean by that?
Well, the first claim is that even computing the sign, spelled S-I-G-N, that is, whether the sum is positive or negative, is already #P-hard. That's my claim. And notice that that suffices to show that the multiplicative estimate is #P-hard, because multiplicative estimates preserve the sign, okay? So we're only making the problem easier by considering the sign, and yet we're going to show it's hard. How do we do this? We're gonna start with what we call a padding argument. What do I mean by this? Well, by adding dummy variables to the circuit computing G, we can compute another function, G prime, so that we still don't know what the sum over inputs of G prime is, but we know where it is relative to the original sum, okay? Let me describe how I do this. So, let's say we take our original function G, and it acts on n inputs. Now I add c binary Boolean variables, so now G prime acts on n plus c binary variables, and here's how I'm going to define G prime. I'm gonna look at those c variables at the very end, right? And if they're all zero, I'm going to output the value of the original function G on the first n bits, okay? But if they're not all zero, what am I gonna do? I'll just output minus one, for example, right? And so notice that now this relationship that I have on the bottom right-hand side holds: I don't know what the sum over all inputs of this new function is, but what I definitely know is its relationship to the old sum, the sum over all inputs of G, okay? Now what's the next step? Well, now I compute the sign, because I'm assuming that I have the ability to do that for any quantum function, for any plus-minus-one-valued Boolean function. I ask, is this sum greater than zero or not? Aha, now I get the answer, and now I know if the original sum was greater than the known offset K coming from the padding, right? What do I do now? Okay, I said there was padding, and I just described the padding. What is the next step? Binary search, thank you. What do I binary search on? On K, that's it. We binary search on K and we repeat. Now, here's an exercise for you. I claim that what I really care about, because I'm sort of seeing into the future as to how this argument is gonna work, is not the sum over all inputs x of G(x) itself, but rather multiplicatively estimating the square of the sum over all inputs x of G(x). Now I wanna show that that's #P-hard. Well, now it seems like I get in trouble, doesn't it? Because remember, the argument relied on computing the sign, but the sign of a square is trivial; even I can compute the sign of the square of the sum over x of G(x).
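Here is the padding-plus-binary-search reduction just sketched, written out with one reasonable way of doing the bookkeeping (the exact accounting on the slides may differ slightly):

```latex
\begin{gather*}
G'(x, z) \;=\; \begin{cases} G(x) & \text{if } z = 0^c \\ -1 & \text{otherwise,} \end{cases}
\qquad
\sum_{x \in \{0,1\}^n,\; z \in \{0,1\}^c} G'(x,z)
\;=\; \sum_{x} G(x) \;-\; 2^{n}\bigl(2^{c} - 1\bigr) .
\end{gather*}
```

So one sign query on G′ reveals whether the sum over x of G(x) exceeds the known offset 2^n(2^c − 1). Hard-wiring any chosen t of the padded inputs to output +1 instead of −1 changes the offset to 2^n(2^c − 1) − 2t, so a single sign query answers "is the sum greater than K?" for essentially any threshold K we like (the sum equals 2^n minus twice the number of x with G(x) = −1, so it is always even and thresholds in steps of two suffice). Binary searching over K with O(n + c) sign queries then pins down the sum exactly, which is #P-hard; and since a multiplicative estimate preserves the sign, the multiplicative quantum approximate sum problem inherits that hardness.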