Good morning everyone, and thanks to IBM for inviting me to come and speak at this conference. I'm Michael Bremner, from the University of Technology Sydney, from the Centre for Quantum Software and Information and the Australian Research Council Centre for Quantum Computation and Communication Technology. I guess like everyone here this week, I'll be talking about things we can try to do with near-term, and maybe not-so-near-term, quantum computers. This is work I've mostly been doing with Ashley Montanaro, who's here somewhere in the audience, Richard Jozsa, Sergio Boixo, Dan Shepherd, and a bunch of other people from Google and elsewhere. I think Ryan's in the audience here too, so those are all people you should talk to about these topics as well.

I think we're going to see variations on this slide in almost every talk this week, but we can imagine a potential timeline over the next ten-plus years in quantum computing. Over the next one to two years, various groups have proposed that they should be able to get to around this 50-qubit limit in their devices; IBM, Google and others are talking about it. The hope is that at around that point it establishes some sort of classical-to-quantum frontier. In this talk, I'm going to talk about how we can begin to think of this as the transition point where quantum computations begin to get very hard to perform on a classical computer. I'm hoping that by the end of the talk you'll have a sense of how we can begin to develop applications which go beyond this, into the regime over here where the performance of the quantum computer is much less in question than it will be at the 50-qubit scale. So I'm going to talk through some applications where we have pretty good arguments that at around 50 qubits it should be very hard to classically simulate these devices. But what I'm hoping to give you is a sense that, as we go to higher qubit numbers with slightly more structured problems, we can make these sorts of arguments much less ambiguous.

I'll focus on two aspects of this. You can imagine that over the next few years there's a range of problems people are going to focus on, and I imagine we'll hear more about this later today: people will focus on things like approximate optimizers or quantum simulators. These are really good things to focus on, obviously, because there are really clear imperatives for those applications. What I'm going to talk about is probably neither of those things; it's maybe a little less clearly motivated, but maybe a little easier to prove things about mathematically. I'm going to focus on some level of error mitigation for classes of quantum algorithms, or classes of quantum circuits, and talk about ways in which you can develop a testable advantage over classical computers.

Now, in order to develop applications that can outperform classical computers in the near future, we have to work pretty hard as theorists. What we need to do is identify problems where the complexity of the best classical algorithms radically diverges from the complexity of the quantum algorithms we have for that application. If we're going to develop such an application, we have to address a few key issues. The first one, the big elephant in the room, is: are quantum computers more powerful than classical computers at all?
Any good application which shows such a radical divergence between the classical and quantum complexity of a problem would be a good candidate for use in a proof that quantum computers can outperform classical computers. Right now, we don't know if that's true, and any such proof would separate P from PSPACE, which would be about the hugest result in complexity theory ever. So we're coming hard up against some very difficult problems in computational complexity, and we have to move into the realm of conjecture. But at the same time, it turns out we can say a few things, which is nice.

The second question is, of course: for which computations do the classical and quantum runtimes diverge? And importantly, at how many qubits and on what instances does this actually happen? And the question which Jay raised earlier: can we perform any of these applications in the near term without any notion of fault tolerance? Typically, when we're studying the complexity of quantum computers, we imagine we actually have a fault-tolerant device; when we're talking about complexity, we're talking about asymptotic scaling, not fixed finite-size devices. That said, there will be a crossover point, or we expect there will be, and the question is where that is and whether you can achieve it without fault tolerance. That's still up in the air, and we're trying to address it.

You can imagine there is a maximum quantum runtime — by runtime I mean depth, or whatever measure you want to use for the complexity of your circuit — beyond which the quantum computer is not going to work anymore, because there's too much decoherence, too much noise, or whatever. The classical runtime we can tolerate is, of course, way longer than that, because we can do a lot with classical computers. The point is that we want to aim for applications which land us in the regime before we hit that error limit, while we're still on the right side of the classical curve, and this is a challenge. So in order to do this, we need to find the hardest quantum computations we can perform with the least number of qubits and the least number of gates. As theorists, our goals in analyzing such applications are to minimize the gate count and the qubit count, understand the influence of errors in these systems, mitigate those errors, and, of course, improve the classical simulation algorithms — the classical algorithms for performing that application. The first three things are pushing the quantum curve down towards the left, and that final one is pushing the classical curve over towards the right, and we need to understand where the crossing points really are.

So this is the work we've been doing, and our approach is not necessarily to focus on the things that are most interesting, or to develop applications that industry cares about, but rather to focus on things we can actually prove. And to do that, we've been studying randomized circuits and the complexity of randomized circuits. At a very high level, the goal of this research program has been to classify the complexity of quantum circuits, up to reasonable notions of error, by studying the complexity of the probability distributions that describe the outputs of quantum circuits.
So, if we have a quantum circuit — here's an example of a quantum circuit down here — you run it and it outputs bit strings of ones and zeros. In a sense, a quantum circuit is sampling from these strings of ones and zeros: it produces a sample x with probability p(x), and we're going to study this for random circuits. As I said, the idea is to bound the complexity of this sampling process by studying the properties of p(x). In a sense, this is what we've been doing since the beginning of quantum computing. What's changed in the last few years is that we've been able to study the complexity of these probability distributions and develop a series of techniques that allow us to say things about the complexity of the output of a quantum circuit with respect to notions of error which are actually reasonable for experimental quantum computing.

Now, random circuits are good candidates for this sort of thing for a number of reasons, but one intuition — and it's kind of a physics intuition — is that random circuits quickly generate a lot of long-range entanglement. Of course, you can have non-random circuits that do that as well, but the examples you might think of, say Clifford circuits and things like that, rapidly build up entanglement yet are also very, very structured and very, very easy to simulate. Randomized circuits don't have as much structure, which makes them more of a challenge for classical computers to simulate.

All right, so the big breakthrough in this area was made by Aaronson and Arkhipov when they studied linear optical systems. The problem they studied was: what is the complexity of sampling from a randomly chosen linear optical system? The input is some single photons, and the output is what you get by measuring the output of this linear optical circuit in the Fock basis. Their argument established a potential advantage over classical computers based around this sampling problem for optical networks. And really importantly, this advantage holds for approximate sampling. So imagine your actual device outputs samples from a probability distribution r(x), whereas the ideal distribution is p(x). What they were able to show, assuming some conjectures, is that you can define a reasonable total variation distance, or one-norm distance, between these two distributions, and if a classical computer could sample to within that distance, there would be consequences for computational complexity theory which are pretty serious.

A few years ago, we generalized that argument, or improved on it, by mapping it in some sense to spin systems. The improvement was essentially that we were able to make the argument a lot simpler, which has been very useful for finding generalizations of this problem. The other sense in which we improved it is that we managed to resolve one of the open conjectures — it just turned out to be a lot easier to resolve in spin systems. The problem we studied has become known as IQP sampling. An IQP circuit is one in which you have a standard input state, you apply a bunch of Hadamards, and the non-trivial part of the circuit is completely diagonal in the Z basis.
So, in this case, up here we have a circuit which is composed of square root of CZ gates — square root of controlled-phase gates — and T gates, and it's a randomly chosen circuit from that set of possible circuits. You perform Hadamards, then the diagonal part, then Hadamards again, and then you measure; it has this Hadamard–diagonal–Hadamard structure. What we were able to show is that if you believe two reasonable conjectures — one about the average-case complexity of certain statistical mechanical models, and two, that the polynomial hierarchy does not collapse — then classical computers cannot simulate this class of circuits to within constant total variation distance, in general.

Okay, so I guess the most talked-about generalization of this problem is what I'm going to call the Google proposal throughout this talk. In that proposal, we considered the problem of the complexity of sampling from randomized circuits. Specifically, in that paper we considered a seven-by-seven lattice of qubits on which we perform a series of randomly chosen gates. The actual sequence is something like this: you perform Hadamard gates on everything to create a big superposition state, then you apply a round of controlled-phase gates, a round of single-qubit gates — which are T, square root of X, and square root of Y gates — then another round of controlled-phase gates, and so on. In this particular circuit these are randomly chosen; there are a few rules in the way we choose them, but basically they're random. The depth of the circuit is, ideally, greater than 40. And ideally, the error in each of these operations is like one over the size of the circuit; if the error rate is roughly that, it roughly guarantees that the total variation distance from the ideal distribution is roughly a constant. Okay, so that's the goal. In this proposal, the best known classical complexity of the problem can be studied by studying the complexity of computing tensor network contractions on a network generated by this kind of circuit, and the best known classical complexity is exponential in the number of qubits and the circuit depth.

In that paper we also introduced this notion of cross-entropy benchmarking to try to establish the validity of these circuits. Now, this is a big issue: establishing the validity of these randomized circuits is something I'll talk about a bit more later in the talk. These are very complex circuits from which we're sampling randomized outputs, which occur with probabilities we can't actually compute easily — because that's precisely the thing which is defined to be the hard part of the circuit. So validating the circuits is not trivial; it's actually quite an issue for this kind of problem. The cross-entropy benchmarking works by doing heavy numerical testing on smaller sizes of these problems, and then doing basically an inference test to make sure the larger distribution is commensurate with what we'd expect from the smaller distribution. In that paper there's also heavy numerical testing to establish the levels of randomness in the circuit, which is important for the complexity-theoretic argument and for the benchmarking, and also to understand the point at which the classical problem begins to get really hard for this class of circuits.
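Before I step back to the theory, here's roughly what that object looks like in code: a minimal brute-force sketch (numpy, a handful of qubits, with arbitrary illustrative gate choices rather than the precise ensembles from either paper) of the Hadamard–diagonal–Hadamard structure of an IQP circuit and the sampling task it defines. The whole point of the proposals, of course, is that at around 50 qubits this brute-force route stops being available.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4                                   # tiny, for illustration only
dim = 2 ** n

# Bit values of every computational basis state: bits[i, q] = q-th bit of i.
bits = (np.arange(dim)[:, None] >> np.arange(n)) & 1

# Randomly chosen diagonal part: T-like phases (pi/4) on random qubits,
# sqrt(CZ)-like phases (pi/2) on random pairs (a stand-in for a random instance).
t_qubits = rng.integers(0, 2, size=n)
phase = bits @ (t_qubits * np.pi / 4)
for j in range(n):
    for k in range(j + 1, n):
        if rng.integers(0, 2):
            phase = phase + (np.pi / 2) * bits[:, j] * bits[:, k]

# |psi> = H^{x n} . diag(exp(i*phase)) . H^{x n} |0...0>
H1 = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
Hn = np.array([[1.0]])
for _ in range(n):
    Hn = np.kron(Hn, H1)

psi = np.full(dim, 1.0 / np.sqrt(dim), dtype=complex)   # H^{x n}|0...0>
psi = np.exp(1j * phase) * psi                          # diagonal layer
psi = Hn @ psi                                          # final Hadamard layer

p = np.abs(psi) ** 2                    # ideal output distribution p(x)
samples = rng.choice(dim, size=8, p=p)  # what the hardware actually hands you
print(np.round(p, 3))
print([format(int(s), f"0{n}b") for s in samples])
```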
Now, I'm going to take a step back and try to explain the theory behind some of this. I'm probably going to aim this at a level which is too high for the experimentalists and too low for the theorists, so everyone's going to be disappointed — sorry about that in advance. But let's begin at the beginning: what is a quantum computer? I find it helpful just to have this slide up so everyone is on the same page about terminology. Of course, everyone here knows what a quantum computer is. But a quantum computer is a simple thing: its input is a classically easy-to-describe unitary and an input state which is just, for example, the all-zero state. The output is just strings of bits, which are output by the ideal quantum circuit with probability p(x), given by the usual probability rule. Normally we'll consider entangling gates to be available between any pair of qubits — I mean, that's the thing you can do in the lab, right? That's not a big deal. Gates are drawn from a finite gate set; for instance, T, Hadamard, CZ would be a universal gate set family you'd study. Universal gate sets have some construction that allows you to implement any unitary, assuming arbitrary runtime and ignoring errors. The problem we're studying is: what is the complexity of p(x)? Sounds really easy.

All right, so let's begin to talk about the complexity of p(x). Pretty much since the beginning of quantum computing, we've known that quantum circuits can define p(x)'s whose complexity can be all the way up here — the complexity classes GapP and #P. I'm going to use the two somewhat interchangeably in a lot of contexts in this talk, though actually the difference between them is subtle and important for this talk, so hopefully I don't mess it up too much. When you consider the problem of exactly computing a probability amplitude of a quantum circuit, the complexity really falls into two categories. It can either be very, very easy, which puts the complexity in P — examples would be Clifford circuits, products of single-qubit unitaries, things like that; they sit squarely in P. The other option is that the problem ends up being #P-hard, or GapP-complete.

Now, #P and GapP are considered to be much more powerful than P: P is all the way down here, we think bounded-error quantum computing lives down here, NP is over here, QMA — the quantum generalization of NP — is down here, and we expect #P and GapP to sit above all of these. But really importantly, while these hard functions emerge in quantum circuits, we don't believe that quantum circuits can actually compute such functions to very high accuracy. In fact, you can almost take it as a definition that a quantum circuit can only produce estimates of such functions to inverse-polynomial additive precision; that's certainly the case for BQP — it's basically the definition of BQP. The reason is that the only way we can compute p(x), given the unitary, is basically by repeated measurement: you get samples out of your circuit and have to somehow infer the probability with which samples occur, and you typically have to take exponentially many samples to do this precisely.
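For reference, the p(x) in question is just the Born-rule probability of the circuit C, and estimating it by counting over N runs gives only an additive error of order one over the square root of N — which is why pinning it down to exponential precision costs exponentially many samples. Written out (standard facts, just collected here):

```latex
p(x) \;=\; \bigl|\langle x \,|\, C \,|\, 0^{n}\rangle\bigr|^{2},
\qquad x \in \{0,1\}^{n},
\\[6pt]
\hat{p}(x) \;=\; \frac{\#\{\text{runs returning } x\}}{N}
\quad\text{satisfies}\quad
\bigl|\hat{p}(x) - p(x)\bigr| \;=\; O\!\bigl(1/\sqrt{N}\bigr)
\;\;\text{with high probability.}
```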
The best we can do with polynomially many samples is typically an error of the one-over-poly kind. OK, so these are a bunch of complexity classes over here on the right; the important ones to focus on are #P and GapP. GapP is basically the generalization of #P that allows you to subtract counting functions from one another. It turns out that for the problem of exactly computing things, the two are basically the same if you're allowed polynomial-time reductions; however, when you consider approximations, there's a subtle difference between these two classes, and that's what we're going to talk about a little.

Relative error approximation. A relative error approximation a(x) of a function f is one that agrees with f(x) up to a multiplicative constant — basically, an approximation whose error scales with the size of the thing you're trying to evaluate. It turns out that there are classes of functions defined by quantum circuit amplitudes that stay GapP-hard even under relative error approximation; in fact this happens for all the classes which are GapP-complete — it's really a property of GapP-completeness. This does not happen for #P: relative error approximations of #P problems actually fall down into the third level of the polynomial hierarchy. The reason for that is something called Stockmeyer's algorithm. Stockmeyer's algorithm delivers, in BPP with an NP oracle — strictly it's a slightly different class, but whatever — a relative error approximation to such functions.

OK, so I'm lumping together a whole bunch of results into one statement here, and morally it's an argument that began with Terhal and DiVincenzo in 2002 — it's been refined since, and it wasn't stated in exactly these words, but the basic gist was the same. And that is the GapP-hardness of relative error approximations of quantum circuit amplitudes for, say, constant-depth circuits, IQP circuits, linear optical circuits, and any family of circuits which is universal for quantum computation under post-selection; these things are GapP-hard. This implies that for such families there are no efficient classical algorithms for simulating them to a multiplicative error bound without a collapse of the polynomial hierarchy. You can infer this from the relationship between Stockmeyer's algorithm and GapP — that's not how all the proofs in these arguments actually work, but with a bit of work you can see that they're interrelated around this fact.

Now, importantly, this statement is not about randomized circuits, and it's also not about a notion of error which is really commensurate with what you would see in a lab. Quantum computers don't deliver multiplicative error approximations of themselves; they deliver, say, total variation distance approximations, or diamond-norm-based approximations, of themselves — not multiplicative error approximations, not typically anyway. In order to make a statement about this, we need to introduce another concept, and that's the idea of considering randomized circuits. This is what was introduced by Aaronson and Arkhipov.
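These are the two notions of error in play over the last couple of slides and the next one: relative (multiplicative) error on a single amplitude, versus the additive, whole-distribution error a device actually achieves. Side by side:

```latex
\text{relative (multiplicative) error:}\qquad
  \bigl|\,a(x) - f(x)\,\bigr| \;\le\; \epsilon\,\bigl|f(x)\bigr| ,
\\[6pt]
\text{additive (sampling) error:}\qquad
  \tfrac{1}{2}\sum_{x}\bigl|\,r(x) - p(x)\,\bigr| \;\le\; \epsilon ,
```

where p is the ideal output distribution and r is what the device (or a would-be classical sampler) actually produces.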
So the quantum random circuit sampling argument essentially says that if there exist sufficiently accurate, efficient classical samplers for the output of sufficiently random families of unitaries, then Stockmeyer's algorithm can be used to approximate the sorts of functions which would be #P-hard, and this would cause a collapse of the polynomial hierarchy. Like I said, sufficiently accurate here means to within constant one-norm distance, or constant total variation distance. Roughly, the argument says: suppose you have one of these hard output probabilities of a quantum circuit, or a family which defines such hard probabilities. Then you find a set with the property that, through a random choice over that family, the output probabilities are not completely uniform but vary just around uniform — they're quite small, but there's some variation; it's not exactly uniform. Technically, people say that the output probability distribution anti-concentrates. The other property we want is that a relatively large set of these output probabilities is #P-hard, and we're going to argue this by randomization. If both of these sets are large enough — so that, on average, when you randomly choose from the family you get the anti-concentration property and the #P-hardness property at the same time — we get an overlap here. And if there is an overlap — if the average-case complexity is as hard as the worst-case complexity, which is what implies the overlap — then these are families of circuits which are definitely hard to simulate on classical computers, assuming no collapse of the polynomial hierarchy.

What we can actually prove at the moment looks like this. We can identify families which have this #P-hardness property for the output probabilities. We can also show that the same families anti-concentrate. But all we can do is conjecture that the average-case complexity and the worst-case complexity actually line up. So there's a gap here; it's still a big open problem in this area, and it's unlikely to be solved easily.

Okay, so just to get a bit more concrete, let's talk about what these sorts of theorems actually look like. Let's return to the IQP sampling problem I mentioned before. We have a randomly chosen circuit where the substantive part of the circuit is made up of, say, T and square root of CZ gates that have been randomly chosen. The sort of statement we make is: if the average-case complexity of the complex-temperature Ising model partition function — I'll get to that in a second — is #P-hard, then quantum computers cannot be efficiently classically simulated to within constant total variation distance for this class of circuits, assuming no collapse of the polynomial hierarchy; and over random choices of these circuits, if you choose enough of them, you'll hit hard instances. Now, why do I say complex temperature when I'm talking about partition functions up here? It's because these output probabilities are always equivalent to a partition function of an Ising model at a complex temperature.
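Schematically (suppressing the exact normalization and the precise relation between gate angles and couplings), the identity behind that statement is that an IQP amplitude is a sum of pure phases over spin configurations, i.e. an Ising partition function evaluated at an imaginary (complex) temperature:

```latex
\langle 0^{n} | \, H^{\otimes n} D \, H^{\otimes n} \, | 0^{n} \rangle
\;=\; \frac{1}{2^{n}} \sum_{z \in \{0,1\}^{n}}
      e^{\, i \left( \sum_{j<k} w_{jk} z_j z_k \;+\; \sum_j v_j z_j \right)}
\;\;\propto\;\; Z_{\mathrm{Ising}}(i\beta),
\qquad
p(0^{n}) \;=\; \bigl| \langle 0^{n} | H^{\otimes n} D H^{\otimes n} | 0^{n} \rangle \bigr|^{2},
```

with the couplings w and fields v determined by which two-qubit and single-qubit diagonal gates appear, and similar expressions for the other outcomes x.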
So in order to establish the complexity of these things, what we're studying is the complexity of complex-temperature Ising models. In this case, if we choose these sets of gates, what we're actually considering is this: imagine a complete graph, assign edge and vertex weights to it, and use that to define an Ising model on that graph. The complexity of this problem depends on the average-case complexity of evaluating the partition functions of these models. Okay, there's another version here which is based on the gap of degree-three polynomials; I won't get into that in this talk.

This year, or late last year, we improved this argument. Instead of drawing things over a complete graph and randomly adding edge and vertex weights, we now consider sparse graphs. There are two advantages to that. One, it's a different complexity conjecture, and actually it might be one which is stronger. And the second advantage, in terms of implementation, is actually a pretty big saving: the complete-graph versions typically have n-squared gates, whereas the sparse one has more like n log n gates. That's a pretty huge saving in the number of gates required to actually implement this.

Okay, now let's take a look at the Google proposal — what is it actually saying? The complexity of this problem rests on two conjectures. One is that the output probabilities, when you choose from this class of circuits, anti-concentrate; that's been heavily numerically tested, and you can actually prove it more concretely for larger-depth circuits. I don't think this is really in question — you can really easily test numerically that these circuits have this anti-concentration property across these families. The second is that the average-case complexity of the complex-temperature partition function of a class of three-dimensional Ising models is as hard as the worst-case complexity. Each of the output probabilities from this circuit can be related to a three-dimensional Ising model; these are chosen at random by your choices of gates. It's a bit hard to write down — that's why I haven't written it up there — but it's three-dimensional: two dimensions are defined by the lattice of qubits, and the third dimension is basically a time dimension, which comes from the choices of random gates in the circuit. Likewise, you can relate similar classes of circuits to other problems which are capable of describing universal families of circuits — for instance, the Jones polynomial. I won't really get into that, but it's something we did in a paper recently.

You can play this game for many families of circuits and many variations of #P-hard functions. In the last couple of years, people have really begun to populate this table, based on studying the anti-concentration properties of families of circuits — these randomization properties — and also the complexity of the output probabilities of these circuits. Basically, the starting point is that you need to prove the output probabilities are #P-hard, and then what you want to do is go away and prove that your family of circuits has sufficient randomization properties of a certain kind.
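The kind of numerical anti-concentration check I keep mentioning is easy to sketch. As a stand-in for a deep random circuit ensemble, the toy below uses Haar-ish random states (normalized complex Gaussian vectors), whose output probabilities follow a Porter-Thomas distribution, and measures the fraction of outcomes with probability above the uniform value 1/2^n; anti-concentration asks that this fraction stays bounded away from zero (for Porter-Thomas it's 1/e, about 0.37).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8                    # qubits in the toy
dim = 2 ** n
trials = 200

frac_above_uniform = []
for _ in range(trials):
    # Haar-ish random state: a stand-in for the output of a deep random circuit.
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    v /= np.linalg.norm(v)
    p = np.abs(v) ** 2
    # Fraction of outcomes with p(x) >= 1/2^n.
    frac_above_uniform.append(np.mean(p >= 1.0 / dim))

print(np.mean(frac_above_uniform))   # ~ 1/e = 0.37 for Porter-Thomas statistics
```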
A few really noteworthy examples, which I'll talk a little more about, are the cases where we have, say, constant-depth circuits, or, like I said, the Google case, where we have randomized circuits whose output probability distributions are chaotic — they look like Porter-Thomas distributions. Okay. Really importantly, none of these models has made any progress on the problem of the average-case #P-hardness of these problems, up to relative error approximations. And we know from Scott Aaronson's paper last year with Lijie Chen that any proof of this is likely to require non-relativizing techniques, which pretty firmly says this is going to be quite a hard problem. So progress on mapping the worst-case complexity of a problem to the average-case complexity of these problems, over these random choices of circuits, is likely to be difficult.

Okay, so I'm going to change tack a little now and talk about implementation, and how these different notions of space and time trade off for these sorts of problems. I mentioned earlier that we showed recently that you can run these randomized sampling arguments for what we call sparse IQP circuits. Sparse IQP circuits are those defined by sparsely chosen Ising models. The first thing we had to do was to prove that such circuits have this anti-concentration property — that they're sufficiently random. That's something you can do by convincing Ashley to compute a lot of roots of unity or something; he did that, and we proved that they anti-concentrate. That argument, combined with the fact that we already knew this family of circuits has the #P-hardness property — the relative-error-approximation-is-#P-hard property — gives you the argument that they're hard to classically simulate, assuming the average-case and worst-case complexities line up.

Now, importantly, it turns out this construction has n log n two-qubit gates; that's just what pops out of the argument. They're defined in terms of potentially long-range gates, with all the interactions commuting. So let's consider what it looks like if you implement that with a universal set of gates. What you can do is use an edge colouring algorithm to decompose the circuit into order log n partitions of simultaneous gates, and in each of these partitions you're doing about order n gates simultaneously. Then, by running fairly standard arguments from the classical theory of sorting networks, each of these partitions can be implemented with depth square root of n in a universal nearest-neighbour architecture. So you have an architecture like this, where all the interactions are nearest-neighbour — the qubits only talk to their neighbours — and the overall depth is like square root of n times log n. This is neat from an implementation point of view, but from a theoretical point of view it's also somewhat interesting, and gives you a sense of how the time-space trade-offs work for these arguments.
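The scheduling step I just described — partition the commuting two-qubit gates into rounds in which no qubit is touched twice — is simple to sketch. The toy below uses a greedy first-fit partition rather than a proper edge-colouring algorithm, and it doesn't show the sorting-network routing onto a nearest-neighbour layout, so it's illustrative only.

```python
import numpy as np

def schedule_rounds(edges):
    """Greedily partition a list of two-qubit gates (edges) into rounds
    in which no qubit appears twice, so each round runs as one layer."""
    rounds = []                      # list of (list of edges, set of busy qubits)
    for (a, b) in edges:
        for gates, busy in rounds:
            if a not in busy and b not in busy:
                gates.append((a, b))
                busy.update((a, b))
                break
        else:
            rounds.append(([(a, b)], {a, b}))
    return [gates for gates, _ in rounds]

# A sparse random instance: roughly n*log2(n) randomly chosen pairs on n qubits.
rng = np.random.default_rng(2)
n = 16
m = int(n * np.log2(n))
edges = [tuple(sorted(rng.choice(n, size=2, replace=False))) for _ in range(m)]

rounds = schedule_rounds(edges)
print(len(edges), "gates scheduled into", len(rounds), "simultaneous rounds")
```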
All right, so if you were to take a standard tensor network algorithm for computing the output probabilities of these circuits, you can compute these probabilities in time which is roughly the minimum of order 2^n and order 2^(d times D), where little d is the depth of the circuit and big D is the diameter of the circuit. Now, if there were a sub-square-root-of-n depth circuit for implementing this particular class of circuits, then these algorithms would give a sub-exponential time algorithm for computing the output probabilities of these circuits. But we know, particularly for this family of sparse IQP circuits, that they can be related to Ising models which turn out to be hard under the exponential time hypothesis. What that means is that if we assume there are problems in NP that take exponential time, then these do too — basically, there are instances where these do too. So if there were sub-square-root-of-n depth circuits implementing these models, you would have a sub-exponential time algorithm for computing these hard functions. Hence, this bound is essentially optimal for IQP circuits: that's the lowest-depth circuit you can have that implements this sort of thing.

All right, now you can also consider implementing these models via measurement-based quantum computing or some other technique like that. So we can start considering, say, constant-depth circuits for running these sampling arguments, and there's been a series of papers doing exactly that. One way to think about it is to take the Google proposal and consider this random family of circuits as emerging from randomized measurements on something like a cluster state; there'll be some subset of qubits on which these circuits appear at random. The complexity of the overall system is then basically bounded by the complexity of this subsystem. Now, the exponential time hypothesis is again bounding our capacity to simulate these systems. What I'm trying to say is that if you move to a 2D nearest-neighbour architecture, say, with constant depth, you're going to need a polynomial increase in the number of qubits in order to solve the same sorts of problems with the same sort of complexity. So we always have this trade-off between the depth of the circuit and the size of the system for these problems. For instance, below square-root-of-n depth you begin to trade qubits against gates; if you go all the way to constant depth, you have a polynomial increase in the number of qubits for these classes of problems.

So this raises the question again: where is this quantum frontier? Where is the divide between classical and quantum computing? One thing we can do is test what the best classical simulation algorithms we have can actually do on these problems, and you end up with an argument of around 50 qubits, for instance, for the Google model. It actually turns out that for the sparse IQP case it's around 60 or 70 qubits, and we'll hear later in the week that it's around 50 photons, or above, for boson sampling problems as well. Okay, now, these numbers are affected by the choice of the hard problem we're considering and by the best classical runtimes.
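To put that classical runtime in symbols — this is the conjectured best-known tensor-network contraction cost from a moment ago, with n the number of qubits, d the circuit depth, and D the diameter of the qubit layout:

```latex
T_{\text{classical}} \;\approx\; \min\!\bigl( \, 2^{\,n},\;\; 2^{\,d \cdot D} \, \bigr).
```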
But they're also going to depend on the noise — the amount of noise in the circuit. So we considered the problem of measurement depolarizing noise in IQP circuits, and we were able to show that there's a quasi-polynomial time algorithm for simulating these circuits if you have a constant depolarizing rate on each qubit. Now, physically, a constant rate of depolarization, or something like that, is kind of the best-case scenario you can expect for a noise model. So this is basically an argument that the physically realistic situation is one you'd be able to simulate classically, asymptotically. It's important to understand, though, that asymptotically, if you have a constant rate of, say, depolarizing noise like this, the distance between your ideal quantum circuit and the thing you have in the lab is also radically diverging: the distance is going to grow at least like n for this kind of error model on this kind of system. In order to get inside the regime where you can argue you have a constant total variation distance between these circuits, you need to set a threshold which says that above this number of qubits it doesn't make sense to be doing this kind of thing without something like fault tolerance. And that threshold noise level will be something like the inverse of the number of qubits, or of the circuit size — one over the number of elements that could go wrong in your circuit. Importantly, this argument can actually be extended to the Google model as well; Ashley's going to talk a bit more about that later, on Friday, so I won't go into the details of that algorithm too much. Oh yeah, I've got an advertisement there.

What I do want to talk about, though, is that under this noise model it turns out that in these IQP circuits you can have a semi-classical notion of error correction which completely corrects this family of noise. This idea basically came from a representation you can have for these IQP circuits which is closely related to binary linear codes. Let me try to talk you through that a little. Imagine you have any of these IQP circuits. You can write down a binary matrix — a matrix of ones and zeros — which describes where the gates are in your circuit. One thing you can do is assign the same angle theta to every gate — in general each gate could have its own theta, but you can take them all the same — and in the case of these circuits that theta would be pi over eight. So this matrix C_jk is a matrix of ones and zeros: when gate j acts on qubit k — say it's a T gate — you put a one there. So if you have n qubits, it'll be an L by n matrix: you have L rows, and each row basically describes a gate in the system. It turns out that if you apply a classical error-correcting code to that matrix, then that classical code will be able to correct this noise family in this class of circuits. And what do I mean by applying a classical code? You're basically multiplying this matrix by the generator matrix of the classical error-correcting code that you're considering.
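To make that concrete, here's a small toy — my own illustrative reading of the construction, not the one from the paper: the binary matrix C of a tiny IQP circuit (one row per gate, one column per qubit), an "encoding" step taken here as multiplying C by the generator matrix of a [3,1] repetition code over GF(2), and the purely classical decoder, which for a repetition code is just a majority vote undoing bit flips on the output samples.

```python
import numpy as np

rng = np.random.default_rng(3)
n, rep = 4, 3                        # logical qubits, repetition factor

# Binary matrix C of a toy IQP circuit: C[j, k] = 1 iff gate j touches qubit k.
C = np.array([[1, 0, 0, 0],          # single-qubit gate on qubit 0
              [1, 1, 0, 0],          # two-qubit gate on qubits 0, 1
              [0, 1, 1, 0],
              [0, 0, 1, 1],
              [1, 0, 0, 1]])

# Generator matrix of the [3,1] repetition code applied to each logical qubit:
# G is n x (n*rep), so the encoded circuit matrix is C_enc = C G (mod 2),
# a bigger circuit acting on n*rep physical qubits.
G = np.kron(np.eye(n, dtype=int), np.ones((1, rep), dtype=int))
C_enc = (C @ G) % 2
print("original gates x qubits:", C.shape, " encoded:", C_enc.shape)

# Depolarizing noise at the end of the circuit acts like bit flips on the
# samples, so decoding is purely classical.  Pretend the encoded device
# produced these samples (here simply repetition-encoded logical strings):
logical = rng.integers(0, 2, size=(10, n))
physical = np.repeat(logical, rep, axis=1)
flips = (rng.random(physical.shape) < 0.1).astype(int)   # 10% bit-flip noise
noisy = physical ^ flips

# Majority-vote decoder per logical bit.
decoded = (noisy.reshape(10, n, rep).sum(axis=2) > rep // 2).astype(int)
print("decoding errors:", int(np.sum(decoded != logical)))
```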
If you do this — multiply the circuit matrix by the code's generator matrix — you end up with a bigger, more complex circuit, but that circuit has the interesting property that you can use a completely classical decoder on its output. So basically, for any IQP circuit, you can run a classical error-correcting code over the description of the circuit, and basically any good code will work — by good, I mean a code that can actually correct errors. This is specifically for this noise model, I should point out. The reason it works is, in part, that if you have depolarizing noise at the end of one of these circuits, it basically looks like bit-flip errors on the sample space of the circuit, and bit-flip errors are, in some sense, just a classical error on the system. Like I said, any good code will work. If you use, say, a bit-flip repetition code, you pick up a log n factor in overhead; you could instead attack this noise with a coding-theorem style argument and make it a constant overhead, but that's for a randomized coding construction which may not be so physical to implement. And, like I said, the encoded circuit is potentially much more complicated than the original, because the encoding process might involve creating more complicated gates — you might go from a circuit with two-qubit gates to one that calls for multi-qubit gates — but the overhead is not going to be dramatic.

Okay, now this brings me to the problem of verification. I said earlier that for these randomized circuit sampling arguments, one of the issues we're going to have is actually verifying that our randomized circuit sampler is producing the thing it's claimed to produce. In general, we can expect that complete black-box verification is likely to be too hard to do, so beyond circuit sizes for which we can do a lot of numerical testing, it doesn't make heaps of sense to think about trying to verify these circuits that way. However, what we can do is not treat the systems as black boxes, because they're not: they're things we can actually probe and test in other ways, so we can get a degree of confidence that the system is doing what it's supposed to be doing. And if you don't want to run that sort of argument, there are other things we can consider as well. As we said in the Google proposal, you can make fidelity estimates via this cross-entropy benchmarking, though, as I said earlier, that requires a lot of numerical testing, and you have to compute very hard functions along the way, for a large number of instances, which is a problem. Likewise, Aaronson and Chen considered this heavy output generation problem, which runs into a similar issue: it requires computing the median of the output probabilities of a complex circuit, which can be computationally difficult to do.

What I want to talk to you about now, as I'm finishing up this talk, is a different approach based on sampling from what are basically pseudorandom circuits, where we're going to hide something — basically run a cryptographic protocol which will not exactly verify your circuit completely, but can give you a degree of confidence that your quantum system is actually behaving as planned.
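Since I won't come back to it, here's roughly what a cross-entropy-style fidelity estimate looks like in a toy model. This sketch uses the simpler "linear" variant rather than the logarithmic cross-entropy of the proposal, with a Haar-random state standing in for the ideal circuit and a device modelled as the ideal distribution mixed with uniform noise. The point to notice is that the estimator needs the ideal probabilities of the observed samples — exactly the classically hard quantity at large qubit numbers.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 10
dim = 2 ** n

# Ideal output distribution of a "deep random circuit": Porter-Thomas-like,
# modelled here by a Haar-ish random state (illustration only).
v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
v /= np.linalg.norm(v)
p_ideal = np.abs(v) ** 2

# Toy noisy device: with probability F it samples ideally, otherwise uniformly.
F_true = 0.7
def device_samples(m):
    ideal = rng.choice(dim, size=m, p=p_ideal)
    uniform = rng.integers(0, dim, size=m)
    use_ideal = rng.random(m) < F_true
    return np.where(use_ideal, ideal, uniform)

# Linear cross-entropy style estimator: requires the *ideal* probabilities of
# the observed samples, which is the part that is hard to compute classically.
samples = device_samples(20000)
F_est = dim * np.mean(p_ideal[samples]) - 1.0
print(f"true fidelity {F_true}, estimated {F_est:.3f}")
```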
So I said before that you can encode these IQP circuits in terms of a binary matrix C, which basically describes where the gates are in the circuit. It turns out that for IQP circuits, the output probability of a given sample depends on the weight enumerator polynomial of this code — sorry, when I say code, I mean this matrix: you can think of this matrix as the generator matrix of a code, so the columns of the matrix form a basis for that code. If you choose this matrix C carefully, you can determine certain properties of p(x) via the weight enumerator polynomial of that code. Typically, you won't be able to compute p(x), for all the reasons I've been talking about through this talk, but if you have a carefully chosen family of codes then, while you may not be able to compute p(x), you might be able to determine certain things about it — things you can actually test or probe experimentally. So, importantly, while sampling x might be hard, it isn't always hard to detect a bias in a set of samples. That is, you can imagine fixing some bit string s, and you can sometimes determine with what probability the output samples from your circuit are orthogonal to that bit string. That problem can be dramatically easier than the problem of determining the output probability of a circuit, or whether you get particular strings with a particular frequency.

Okay, now we can use such properties to create a cryptographic protocol, hiding such biases by creating a private/public key pair. You can imagine a scenario where Alice is a classical player: she defines a circuit she wants Google or IBM or whoever else to run, and she shares with them the public key, which is paired with her private key. Then she can analyze the samples coming back from the hardware vendor to test whether the correct biases occur with the correct probability.

We considered this problem about ten years ago. It was a simpler time in 2008 — we weren't thinking about anyone actually doing this, we were just playing around a bit and trying to see what we could do. Things were different then: we had keys on phones, BlackBerry still made phones — do they still make phones? — and Dave Bacon was at quantum computing conferences. It was weird. So we based the actual protocol around a specific family of codes. I'll run through the protocol, but the reason I want to bring it up is because it can probably be generalized and improved on. This is something we came up with ten years ago, basically no progress has happened since then, but I thought it should be pointed out so that people understand it.

We based the problem around quadratic residue codes. The reason we considered quadratic residue codes is that they are basically a parity bit short of being self-dual, and self-dual codes have a nice property which basically says you can define a codeword, a direction, which is always orthogonal to the sample space of that code. Specifically, what we did was choose a large prime q, which is 7 mod 8, to define a quadratic residue code of length q and rank (q+1)/2. So that's the q over here; the column space is the thing which generates the code, and the rank of that code, r, is in this case (q+1)/2. It's an example of a singly punctured doubly even code.
Because quadratic residue codes are a parity bit short of being self-dual, that feature allows us to define a private key for this problem. In this case, one way of seeing the private key is that you add a string of ones down the left-hand side, as the leftmost column. Then you randomly choose another binary matrix R down here — it has no special properties, you just choose it randomly — and you pad the rest of the matrix with zeros. What this does is define a direction distinguishing this set from that set; as I've written here, it defines a string s for which, it turns out, for an IQP circuit, the probability that x dot s equals zero is 85%. That's something you can work out by analyzing the properties of this code. Oh, and I should say that finally you randomize: you set this thing up and then apply a random linear transformation. The overall rank only goes up by one, so you can do something like a column echelon reduction, which randomizes the presentation, delete a column, and then present that to your experimentalists and say: hey, run that circuit.

Now, the question is how well that bias is hidden. If you do a second-order cryptanalysis of the Hamiltonian — i.e., of the row space of this generator C — it suggests that classically generated samples can only be made orthogonal to s with probability around 75%, not the 85% you get out of your quantum circuit. So if you don't have any knowledge of s, the best thing you can do by analyzing the Hamiltonian, as far as we know, is to output strings which hit that bias with probability 75%. This is assuming you don't just straight-up have a way of classically simulating the circuit.

Now, this was before anyone was talking about these randomized sampling arguments. So conjecture one in that paper is that sampling from randomized IQP circuits is classically difficult. Well, we kind of have that now — it's for a different family of circuits, but that's the sort of argument we've since been able to make. Conjecture two is that, with high probability under this procedure, C cannot be distinguished from a random binary matrix: if you run through this procedure, the result should in principle have no properties that distinguish it from a random binary matrix. And conjecture three, if that doesn't hold, is that s cannot be found in polynomial time given C. Conjecture three would actually imply conjecture two.

So we also generated an example, and it had r equal to 244 — a 244-qubit example. That's a lot of qubits, and it's about the smallest really non-trivial example we were able to generate with this family of codes. The other problem is that the gate count is quite high: it's definitely more than 1,000 — it depends on your architecture and how you do it, but it's well over 1,000.
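Alice's verification step itself is then cheap; here's a sketch of what that check might look like. The 0.85 and 0.75 figures are the ones just quoted, but the sample count, the accept/reject threshold, and the "device" and "spoofer" below are my own illustrative stand-ins (biased random generators), not the statistical test or circuits from the paper.

```python
import numpy as np

def orthogonal_fraction(samples, s):
    """Fraction of output strings x with x . s = 0 (mod 2)."""
    return np.mean((samples @ s) % 2 == 0)

rng = np.random.default_rng(4)
n_bits, n_samples = 24, 2000
s = rng.integers(0, 2, size=n_bits)          # Alice's secret string

# Stand-ins for the two cases: a "quantum" device whose samples satisfy
# x.s = 0 with probability ~0.85, and a classical spoofer achieving only ~0.75.
def biased_samples(bias):
    out = []
    while len(out) < n_samples:
        x = rng.integers(0, 2, size=n_bits)
        keep_prob = bias if (x @ s) % 2 == 0 else 1 - bias
        if rng.random() < keep_prob:
            out.append(x)
    return np.array(out)

for label, bias in [("honest device", 0.85), ("classical spoof", 0.75)]:
    f = orthogonal_fraction(biased_samples(bias), s)
    verdict = "accept" if f > 0.80 else "reject"
    print(f"{label}: fraction orthogonal to s = {f:.3f} -> {verdict}")
```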
So this poses some obvious challenges. Firstly, prove the above conjectures, in some sense. Second, make this easier. We chose a specific family of codes because we needed this property of being a parity bit short of being self-dual, and quadratic residue codes have that property, but you could consider other families of error-correcting codes for this class of problem. That's one obvious thing people can actually look at — there are a lot of different self-dual codes out there — so, like I said, find new examples and make this easier. If you want the details, go back and look at the paper on the arXiv from 2008, which is kind of a million years ago.

All right, so this brings me back to the beginning, which is at the end. Hopefully you now have a bit of a sense of how we can go from this regime — the crossover point between classical and quantum computation, where we can begin to define problems that are really hard for classical computers at around 50 qubits — and move up into this regime where things get much less ambiguous, by introducing things like error mitigation and validation of the problems we're considering. In both cases, this requires taking randomized problems and beginning to add more structure to them, via the use of, say, error-correcting codes or, in that final example, via a cryptographic procedure which also happens to be related to error-correcting codes.

This leaves a bunch of open questions. For me, one of the biggest open questions in this whole space is: can we use these techniques for getting bounds on the complexity of quantum circuits to prove an advantage over classical computers for more structured problems? I gave a rough example in terms of a cryptographic problem, but you could imagine trying to leverage these lower-bound techniques for problems such as optimization, or for more general quantum simulations. For instance, the constant-depth versions of the sampling arguments, where you basically use measurement-based quantum computing, are an example of an argument showing, for a deterministically chosen quantum simulation — that is, a simulation that produces cluster states — that these are hard to classically simulate. Can you do this for other models? That's definitely worth considering. The other question is: how much noise is too much noise? When do we need error correction, and when can we make do without it? The next big question is, more generally, how many qubits are required to outperform classical computers at other tasks? We've made the argument that around 50 is hard for these randomized circuit sampling problems — maybe a bit higher if better classical algorithms turn up — but what about other problems? It's worthwhile examining this kind of question across many families of algorithms. The final hard problem: how do we attack these open average-case complexity conjectures? Can we say anything more about them than we currently have? One of the rationales — I didn't talk about it in this talk — for relating these randomized circuits to the Jones polynomial was that via that construction we can introduce a notion of depth which doesn't exist, say, for these IQP cases, where the maximum depth of the circuit is like n squared. Okay, I think I might end there. So thank you very much.

Okay, some questions for Michael? I've got a question for you: are you using anti-concentration in the sense of chaos and scrambling, or do you mean something distinct? I didn't catch your definition.

Yeah, it's the same thing, yeah. Basically, yeah.

So I was trying to understand the classical complexity that you were describing for the — hi, yeah, okay — for the Google proposal, in terms of the analysis of the tensor network.
And so, was the classical complexity you were quoting the worst-case complexity? The reason I bring this up is that if you use random circuits, you create a random tensor network, and random tensor networks are, in a certain sense, simpler than specially structured tensor networks because they have an injectivity property, which means that whether or not they're contractible depends only on the spectral gap of their parent Hamiltonian. So they're in some ways more easily characterized than worst-case tensor networks. I'm wondering if you've thought about the special things which happen with random tensor networks there.

I mean, yeah, there are some things you can compute, right, to bound it. But generally it'll still be exponentially hard to contract randomized tensor networks — we think, anyway.

Yeah, I mean the worst case — but are you then making a statement about the average properties of the spectral gap of the parent Hamiltonian?

Well, I wouldn't put it that way, but what I would say is we're making a statement about the average-case properties, and we're conjecturing that; that's the point which is being conjectured. So the conjecture is that for these classes of circuits, the average-case complexity of the relative error approximation — so, a good heuristic for those sorts of problems — is still going to be #P-hard asymptotically.

Okay, so the classical cost that you're putting down is a conjectured cost?

Yeah, totally — oh, yeah, yeah, yeah, absolutely. So that lower bound is definitely conjectured.

I see, okay.

In that case — what's interesting about that case is that if you go to higher depth for those classes of problems, it's then in some sense a conjecture about, well, you could also make it a conjecture about, say, the average-case complexity of computing the output probabilities of typical quantum circuits. And that's pretty unresolved, right? We actually don't have a good answer to that problem.

Any more questions? If not, let's thank Michael.