Before the main part of the talk, I want to briefly mention something called the Quantum Software Manifesto. I've been told that in the US the only famous manifesto is the Communist Manifesto, and I can't promise this will be as influential as the Communist Manifesto, but nevertheless it may be of interest to some of you. Basically this is a document which tries to promote the idea that we should have more research and more investment in quantum software, as distinct from quantum hardware: quantum algorithms, understanding architectures, verification of quantum experiments, this kind of thing. It has a particular European focus, but the idea is that anyone who agrees with it can endorse it, which is maybe helpful for those of us who would like to see more activity in this area in the future. I invite you to take a look at the page and see whether you'd like to endorse the Quantum Software Manifesto; you can just search for it if you're interested.

What do I want to talk about today? Something we've heard a lot about this week is the phrase "quantum supremacy", or quantum computational supremacy. Supremacy is obviously a terrible word, but I want to use it in this talk because I want to distinguish it from "advantage", in the sense that we want to look at quantum experiments which do something we couldn't simulate on a classical computer in any kind of reasonable time. We might imagine a quantum experiment which we can do in one day where, if we had a supercomputer, it might take us a year to simulate that experiment. So I want to talk about proposed experiments which are extremely hard to simulate classically, rather than ones which perhaps just achieve a fairly small advantage over classical computation.

In fact, what I really want to talk about is some classical simulation algorithms for some of these proposed quantum computational supremacy experiments: classical algorithms which show that we're not going to achieve quantum supremacy with these particular experiments, with these particular choices of parameters, let's say. I want to talk about two examples of this in two different models. One is the model called IQP, instantaneous quantum polynomial time, which we heard a lot about earlier this week in Michael Bremner's talk and also in Jens Eisert's talk yesterday. In that model, what the algorithm gives you is a polynomial-time classical simulation of sampling from quantum circuits in this model,
provided the circuits experience some small amount of noise at the end of the computation. There are also some technical constraints which I'll talk about later, but the rough idea is that if we have some relatively realistic noise occurring in these circuits, then we get a relatively efficient classical simulation, which contrasts with the hardness-of-classical-simulation results we heard about earlier this week. The second result I want to talk about is in the somewhat different setting of boson sampling, which again we heard about earlier in the week. This is a numerical, experimental result showing that classical algorithms for simulating these boson sampling experiments can be surprisingly efficient, in the sense that we can quite accurately simulate experiments in this boson sampling picture which are maybe a bit larger than we thought we could simulate before. So this pushes the threshold of quantum advantage, or quantum supremacy, in this setting a bit further away than we previously thought it was.

So what I want to do is start out by introducing these two models. We've already heard about them, but just to remind you, and to introduce them in the way I want to talk about them later. Then I'll describe what the results are in each of these two settings, and finally I'll say a little bit about the proofs, or the algorithms, that go into these results.

First, what's IQP? We've heard about this already. An IQP circuit, instantaneous quantum polynomial time, on n qubits looks like this: we start out with a Hadamard gate on every qubit, then we have some diagonal gates. This thing D in the middle of the overall circuit is made up of gates like any other quantum circuit, but the key restriction is that these gates are all diagonal in the computational basis, so they might be Z, controlled-Z, T gates, this kind of thing. Then we have Hadamard gates at the end, and then we measure each qubit in the computational basis. The result is some string of n bits, and depending on what the circuit in the D box is, we get a different distribution on measurement outcomes, a distribution on n-bit strings.

One nice thing about IQP circuits, apart from being perhaps easy to implement experimentally, is that they have a nice mathematical description, because IQP corresponds to sampling from the Fourier transform of a particular function. If we take the function f(x) to be the x-th diagonal entry of the matrix D, so f(x) is just the entry of D on the diagonal corresponding to the bit string x, then the probability that we see a particular outcome s, where s is an n-bit string, is p(s) = | E_x [ (-1)^{x·s} f(x) ] |^2: an average over x of minus one to the inner product of x and s over F_2, times f(x), all absolute value squared. This is just the Fourier transform of the function f over Z_2^n. You can work out this expression from the circuit because of the Hadamards on either side, and it's nice because it means we can try to understand these IQP circuits using Fourier analysis over Z_2^n, so they're mathematically quite tractable.
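As a concrete illustration of this formula, here is a minimal brute-force sketch (added for illustration, with arbitrary random diagonal phases standing in for a real circuit); it simply evaluates p(s) = |E_x[(-1)^{x·s} f(x)]|^2 for every s, which is of course exponential in n and only feasible for toy sizes.

```python
import numpy as np
from itertools import product

# Brute-force sketch of the IQP output distribution
# p(s) = | E_x [ (-1)^{x.s} f(x) ] |^2, where f(x) is the x-th diagonal entry of D.

def iqp_distribution(f):
    """f: length-2^n array of unit-modulus diagonal entries of D."""
    n = int(np.log2(len(f)))
    xs = np.array(list(product([0, 1], repeat=n)))      # all n-bit strings
    p = np.empty(len(f))
    for i, s in enumerate(xs):
        signs = (-1) ** (xs @ s % 2)                    # (-1)^{x.s} for every x
        p[i] = abs(np.mean(signs * f)) ** 2             # squared Fourier transform
    return p

# Example: random diagonal phases on 4 qubits; the p(s) sum to 1.
rng = np.random.default_rng(0)
f = np.exp(1j * rng.uniform(0, 2 * np.pi, size=2 ** 4))
print(iqp_distribution(f).sum())
```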
This model was introduced by Michael Bremner and Dan Shepherd back in 2008, and already back then they had some cryptographic-style arguments in their paper that it should be hard to simulate this model classically, by which I mean it should be hard to simulate sampling from this output distribution on n-bit strings. So they had some arguments for why this should be the case, but these were somewhat heuristic. Then a couple of years later, in joint work with Richard Jozsa, who is also here, they showed that these IQP circuits are actually hard to simulate classically if one is prepared to assume a computational complexity conjecture, namely that the polynomial hierarchy doesn't collapse. For those of you who are not complexity theorists and not familiar with this, it's like a version of the conjecture that P is not equal to NP, in some sense a somewhat weaker version of that conjecture. The result is that IQP circuits are hard to exactly simulate, in the sense that if you believe the polynomial hierarchy doesn't collapse, then it's hard for a classical algorithm to exactly sample from this output distribution on n-bit strings.

This is a very nice result, but it doesn't say so much about real experimental implementations, because even the real quantum experiment isn't going to exactly sample from this distribution on n-bit strings; it's going to do it up to some level of approximation. A few years later, Michael, Dan and I were able to show that IQP circuits are still hard to simulate even if we consider approximate simulators: ones that approximately sample from the output distribution, in the sense that the distribution they sample from is within small total variation distance of the true distribution. So what this result rules out is classical simulators that sample from some distribution within total variation distance epsilon of the true distribution, where epsilon is some small constant, about 1% or so, assuming you believe certain other computational complexity conjectures. I don't want to go into the details, but they're average-case hardness conjectures which are maybe plausible, but probably very hard to prove. While the non-collapse of the polynomial hierarchy is a long-standing conjecture, these ones are less well studied, maybe less easy to believe, but perhaps still true. So this result says that these circuits are probably hard to simulate classically, even approximately.

But somehow this is still not a very realistic result, because the regime we're able to prove hardness for is that the total variation distance between the simulator and the real distribution should be some small constant. If you have M gates in the circuit, that means that if the quantum circuit itself wants to achieve total variation distance epsilon, each of the M gates needs to have error about epsilon divided by M, because, as we heard earlier, these errors add up. So for a realistic experiment, like the kind of thing we're seeing today or in a few years' time, it's not so clear whether these IQP circuits should still be hard to simulate classically.
One reason I think this is interesting is not just because we think people might implement these IQP circuits as a nice experiment, but also because the same sorts of arguments might translate across to other families of quantum computational supremacy experiments. So IQP maybe allows us to play around with this question and see whether it provides some intuition for other types of models. So this was an interesting open question, and that's the first model I want to introduce.

The second one is boson sampling. This, as we've heard, is the problem of simulating the behaviour of n non-interacting photons in a linear optical network on m modes. We've got these photons, say just two of them here, being injected into some network of linear optical components, like beam splitters, phase shifters, this kind of thing. At the end we measure each of the modes and see whether we have a photon, or perhaps more than one photon, in each mode. This again gives us a distribution: depending on what goes on in the linear optical network, we get different distributions on the output modes, that is, on the positions of the photons in the output.

This also has a really nice mathematical description. Roughly speaking, it corresponds to sampling from a particular distribution on subsets of the integers between 1 and m. The distribution P is defined by saying that the probability of seeing a particular subset S of size n is the absolute value squared of the permanent of a submatrix of A, where the submatrix corresponds to the choice of the subset S. So S specifies a subset of the rows of a matrix A, where A is an m x n matrix which should be a submatrix of an m x m unitary matrix U, and A_S is the n x n submatrix of A consisting of the rows indexed by S. To find the probability of seeing that particular subset, we take the absolute value squared of the permanent of that submatrix, where the permanent, for those of you who haven't seen it, is like the determinant but without any signs: a sum over all permutations of the integers between 1 and n of the product of the entries of the matrix given by a row index i and the column index given by the image of i under the permutation.

The "roughly" here corresponds to the fact that this only considers collision-free outcomes, where we see at most one photon in each mode. This is often a fairly good approximation; in particular, if we have a random unitary where m is much bigger than n, say m bigger than n squared or so, then we generally don't see collisions, so we don't have to worry too much about the other outcomes. So we can think of this as a rough approximation of the problem we're trying to solve: sample from this distribution on subsets of size n of the integers between 1 and m. That's the mathematical task: we're given this matrix A, a submatrix of some unitary matrix U, and we're asked to sample from the distribution P. And it's clear that there's a nice quantum experiment to do this.
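To make the sampling task concrete, the following rough sketch (an illustration only; the subset S and the mode counts are arbitrary choices, and the collision-free regime with one photon per input mode is assumed) builds A from the first n columns of a Haar-random m x m unitary and evaluates |Per(A_S)|^2 for one outcome S, using the naive sum over permutations from the definition above.

```python
import numpy as np
from itertools import permutations

def permanent_naive(M):
    """Permanent as a sum over all permutations -- only feasible for tiny n."""
    n = M.shape[0]
    return sum(np.prod([M[i, sigma[i]] for i in range(n)])
               for sigma in permutations(range(n)))

def haar_unitary(m, rng):
    z = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))        # phase fix for Haar measure

rng = np.random.default_rng(1)
n, m = 3, 9                                             # m ~ n^2 modes
A = haar_unitary(m, rng)[:, :n]                         # m x n submatrix of a unitary
S = [0, 2, 5]                                           # one (hypothetical) collision-free outcome
print(abs(permanent_naive(A[S, :])) ** 2)               # probability of seeing S
```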
You just implement the linear optical network, run it, and it automatically samples from this distribution. But then there's the interesting question of how well we can do this classically. Scott Aaronson and Alex Arkhipov introduced this task in a computational complexity sense back in 2010, and they had an extremely nice result which says that boson sampling is hard to approximately simulate classically if you assume various other conjectures in computational complexity theory. They have an average-case hardness conjecture, which I don't want to go into, and a technical conjecture about anti-concentration, which I also don't want to go into. But these are plausible conjectures one might try to believe, and their result is that this task is hard to solve for random unitary matrices U, assuming those conjectures, where "approximately simulate" again means up to small total variation distance: your classical simulator is supposed to be accurate up to total variation distance epsilon for some small constant epsilon.

But, as you can maybe already see from how vague this statement is, it doesn't necessarily tell us much about how hard the problem is in practice. It's an asymptotic statement saying that, if you assume these conjectures, the complexity of the best classical algorithm for this problem should be exponential. But how big do n, and perhaps also the number of modes m, need to be for this actually to be hard in practice? That's an important question, because people are trying to implement these experiments now, and it would be nice to know how big they need to be before we really see a significant quantum advantage. There have been a number of different predictions about this, and interestingly the predictions have been going down in terms of the believed complexity of the problem. The original prediction of Aaronson and Arkhipov was that with 20 to 30 photons you should be doing something very hard to do classically, really challenging behaviour to simulate. More recently, some have predicted that you might be able to get away with as few as seven photons and this still being hard to do classically; you would probably need a lot of modes for this to be the case, but nevertheless, the prediction is that as few as seven photons might suffice. So this number has been evolving, but it wasn't so clear what the right answer should be, I would say, in terms of experimental difficulty.

Okay, so those are the two questions I want to talk about today, for IQP and boson sampling, and now I'll say what our results are in these two settings. The first is a theoretical result and the second is an empirical, numerical result. The first result is about IQP, and we're going to assume that we have some IQP circuit of the standard form we saw before, but now we introduce some noise at the end of the circuit. This is an incredibly simple noise model: we just have depolarising noise on each qubit with noise rate epsilon. So we do the IQP circuit as normal and, just before we measure, we apply this depolarising noise.
This isn't supposed to imply that this is really what we think realistic noise will look like in these kinds of circuits, but it's a nice model to play around with. It's very simple, which makes it easier to analyse, and it's nice in a way because it's quite a classical-looking notion of noise: since the depolarising noise happens right before measurement, it's equivalent to flipping each output bit with probability epsilon over 2. So it's equivalent to taking the standard IQP circuit, measuring, and then flipping each of the output bits with some small probability. This maybe isn't a realistic noise model, but it's a noise model that our circuit should be able to cope with, because a real circuit is probably going to be much noisier than this. If the circuit wants to be hard under really realistic notions of noise, it should probably be hard under this notion of noise too.

So here is our result for this setting. If p is the output distribution, the distribution we want to sample from, then we need the technical constraint that the sum over x of p(x)^2, the squared L2 norm of this distribution, is at most alpha times 2^{-n}, for some alpha which we think of as a small constant. The smallest this quantity can possibly be is 2^{-n}, so this says that it's not much bigger than the minimum. We could call this an anti-concentration requirement: it says the distribution needs to be quite spread out, not too concentrated in one place. The result is that, given this condition, we can sample classically from the noisy output distribution, up to L1 distance delta, in time n^{O(log(alpha/delta)/epsilon)}. That's a horrible-looking complexity, but the point is that if alpha, delta and epsilon are all order one — small, perhaps, but constants — then this algorithm runs in polynomial time. The polynomial is going to be absolutely outrageous, it might be n^100 or n^1000, something like this, but still, it's not exponential time as n grows. So this says that, under the constraint that all these parameters are constants, you're not going to see an exponential quantum speedup from this kind of experiment, because we have this polynomial-time classical simulation.

The parameter alpha is, I guess, the most interesting or controversial one, and the reason it's interesting to think about the regime where alpha is order one is that the hardness proofs of our previous paper actually needed this to be the case. In general, for the cases of IQP circuits where people have proven hardness of simulation under some conjectures, this anti-concentration condition generally holds. So in many cases alpha is indeed quite small, and in particular you can show quite easily that almost all IQP circuits have alpha of order one: random circuits have the property that alpha is small.
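The two ingredients of this statement can be sketched in a few lines (an illustration, assuming the ideal output distribution p is available as an explicit array, which of course it is not for large n): the anti-concentration parameter alpha, and the bit-flip view of the end-of-circuit depolarising noise.

```python
import numpy as np

def alpha_parameter(p):
    """alpha defined by sum_x p(x)^2 = alpha * 2^{-n}."""
    return len(p) * np.sum(p ** 2)

def sample_noisy_output(p, eps, rng):
    """Sample an ideal outcome from p, then flip each bit with probability eps/2."""
    n = int(np.log2(len(p)))
    s = rng.choice(len(p), p=p)                        # ideal measurement outcome
    flips = rng.random(n) < eps / 2
    bits = np.array([(s >> i) & 1 for i in range(n)])
    return bits ^ flips.astype(int)
```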
Okay, so this is a negative result if you want to use IQP to achieve quantum computational supremacy, but one interesting thing is that it turns out you can get around this classical simulability result very easily by using a classical error-correcting code. Mick already mentioned this in his talk earlier in the week, and I'll say a little more about it later, but the basic idea, or the intuition, is that the noise at the end of the circuit is essentially classical, so maybe it's not surprising that we can just use classical error correction to deal with it; we don't need full quantum fault tolerance. In case you're wondering whether there's a contradiction here — if we encode with this classical error-correcting code, why can't we just classically simulate the new thing we've got, which is still an IQP circuit? — the point is that the alpha parameter blows up when you do this, so this isn't in conflict with the simulation result.

Okay, so those are the IQP results, and next are the boson sampling results. Here we don't have anything theoretical in our paper, but we do have something empirical to say, which is that a simple classical algorithm called Metropolised independence sampling, which probably goes back to the 1950s at least, can sample from this boson sampling distribution with, quote, "good" accuracy by computing O(1) n x n matrix permanents per sample. What does "good" mean? Well, firstly, if you look at the results, they look pretty good; and also if you run some statistical tests, they pass, and it looks like we're achieving good accuracy. This is an empirical result — we don't have a proof — but it looks pretty good. And what does this O(1) mean? It means that in practice it seems you can get away with computing something like 100 matrix permanents per sample, and this gives you fairly good accuracy at approximately simulating the boson sampling distribution.

Why is it interesting to cut down the number of matrix permanents you need to compute to this relatively small number? The reason is that each permanent of an n x n matrix can be computed in roughly n times 2^n time by an algorithm called Ryser's algorithm. This is still exponential — obviously 2^n is exponential — but it's not that bad an exponential. If we look at the entire boson sampling distribution, it's a distribution on m-choose-n outcomes, and if m is something like n^2, that's something like n^{2n} in size, so computing the whole distribution would take substantially longer than 2^n time just to write down all of the elements. So this is much less than that. And then, perhaps more importantly, forgetting these somewhat vague big-O's, the actual practical results are that you can achieve quite good accuracy for n = 20 easily on a laptop, by which I mean that in perhaps a second you can get quite a few samples, something like 10 or 100 samples.
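For reference, here is a minimal sketch of the inclusion-exclusion form of Ryser's formula mentioned above; this simple version runs in roughly 2^n * n^2 time, and the Gray-code variant improves it to about 2^n * n.

```python
import numpy as np
from itertools import combinations

# Ryser's formula:
# Per(M) = (-1)^n * sum over nonempty column subsets T of
#          (-1)^{|T|} * prod_i sum_{j in T} M[i, j].

def permanent_ryser(M):
    n = M.shape[0]
    total = 0
    for k in range(1, n + 1):
        for T in combinations(range(n), k):
            row_sums = M[:, list(T)].sum(axis=1)
            total += (-1) ** k * np.prod(row_sums)
    return (-1) ** n * total
```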
You can get good accuracy for n = 30 on a fast server, like a fast computer you might have in your office, and we project that you might achieve good accuracy for n = 50 on a supercomputer — though we didn't run it on a supercomputer; this is based on other people's results about how long it takes to compute permanents on a supercomputer. What I mean by this, just to reiterate, is that you can achieve pretty good accuracy and fairly frequent samples, so you might get each sample in a second or so. As well as this, there's a paper by Peter Clifford and Raphael Clifford from about the same time which showed that there's even a provably correct algorithm for exact boson sampling which achieves similar performance. So if you don't believe our numerical results, there's a proven theoretical result that you can solve the boson sampling problem in time of order n times 2^n. This suggests that this really is the true complexity of the boson sampling problem. And I guess the conclusion is that demonstrating quantum supremacy using boson sampling might be a bit more challenging than we previously thought; in particular, you're not going to do it with n = 7. I'll show a plot later of where we think you might need to be to achieve a significant advantage over classical computers, but you can definitely do n = 20 or 30 on your own computer without too much pain.

Okay, so those are the results. For the last few minutes I want to say a bit about the proofs, or the algorithms — how these things work. I'm going to start with the IQP result, which, remember, was about approximately sampling from the noisy output probability distribution of these circuits. As I said, IQP has the nice property that you can understand it in terms of Fourier analysis, and indeed the proof is based on Fourier analysis over Z_2^n applied to the noisy probability distribution on outputs, which I'll call p-tilde on this slide. The basic idea is to use a beautiful feature of this depolarising noise. Look at the Fourier transform of p-tilde over the group Z_2^n — remember, p-tilde(s) is the probability of seeing a particular n-bit string s. The depolarising noise behaves very nicely with respect to this Fourier transform, in that it shrinks the high-order Fourier coefficients of p, the noiseless distribution: if we have a Fourier coefficient corresponding to s, and noise at rate epsilon, then this Fourier coefficient shrinks by a factor (1 - epsilon)^{|s|}, where |s| is the Hamming weight of s. So when the Hamming weight is large, this goes down very quickly. The intuition is that the high-order Fourier coefficients, the ones with large Hamming weight, correspond to the spiky, quickly oscillating parts of the function p, and those are the parts that really get suppressed by noise. I don't know how good that intuition is, but at least it's my intuition.
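Here is a small numerical check of this shrinkage property (an illustration on an arbitrary distribution, using the Fourier normalisation hat{p}(s) = 2^{-n} sum_x p(x) (-1)^{x·s}): applying independent bit flips at rate epsilon/2 multiplies hat{p}(s) by exactly (1 - epsilon)^{|s|}.

```python
import numpy as np
from itertools import product

def fourier_transform(p, xs):
    """hat{p}(s) = 2^{-n} sum_x p(x) (-1)^{x.s}, for every s."""
    return np.array([np.mean(p * (-1) ** (xs @ s % 2)) for s in xs])

def bitflip_noise(p, n, eps):
    """Mix p(x) with p(x xor e_i) for each bit i, with flip probability eps/2."""
    idx = np.arange(len(p))
    for i in range(n):
        p = (1 - eps / 2) * p + (eps / 2) * p[idx ^ (1 << i)]
    return p

n, eps = 4, 0.2
rng = np.random.default_rng(2)
p = rng.random(2 ** n)
p /= p.sum()                                           # an arbitrary distribution on n-bit strings
xs = np.array(list(product([0, 1], repeat=n)))         # xs[k] is k written in binary
lhs = fourier_transform(bitflip_noise(p, n, eps), xs)
rhs = (1 - eps) ** xs.sum(axis=1) * fourier_transform(p, xs)
print(np.allclose(lhs, rhs))                           # True
```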
So the point is that if we look at the Fourier expansion of the function p, we only need to understand its low-order parts in order to approximate the noisy distribution p-tilde well, because the high-order parts just get killed off. It turns out that it's sufficient to approximate the Fourier coefficients p-hat(s) for all s whose Hamming weight is at most about log(alpha/delta)/epsilon. I won't go through the proof, but you can turn through some equations and this is what you get: if you have a good approximation of p-hat(s) for all s with Hamming weight at most this bound, then you get a good overall approximation of the probability distribution. There are definitely some things swept under the rug there, but that's the basic idea.

Then the question is how we actually approximate these coefficients. I should say that up to this point, nothing about the argument depends on the circuit being an IQP circuit: for any quantum circuit, if the output probability distribution experiences this noise, the high-order parts get suppressed. But can we actually compute these Fourier coefficients of the output distribution? In the case of IQP circuits it turns out that we can. Using a Fourier inversion argument, each coefficient p-hat(s) has a nice interpretation in terms of a convolution of the function f — which, remember, gives the diagonal entries of the matrix D in the middle of the circuit — with itself. And this we can approximate efficiently just by sampling different values of f: if we want to approximate the coefficient up to a suitable level of error, we can sample a few choices of y at random, take the average, and this gives a fairly good approximation of p-hat(s). We do this for all coefficients corresponding to strings s of Hamming weight at most log(alpha/delta)/epsilon, and there are about n^{O(log(alpha/delta)/epsilon)} of these, so that's the time we need. As I said, this is kind of horrible, but it's still polynomial time when these parameters are all constants.

I should also mention some subsequent work. There's a very nice paper by Yung and Gao which showed that you can apply this algorithm, and some extensions of it, to simulate noisy random quantum circuits — general quantum circuits, rather than just IQP ones. This is a very nice result, with some caveats: their analysis doesn't give you something that works for all circuits, it gives you something that works for most circuits. So what you get from their result is an algorithm which you can run and which will be good for most circuits, but for some circuits it won't necessarily give you a good approximation.
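Going back to the estimator at the heart of the algorithm, here is a rough sketch (with the same Fourier normalisation as in the check above; the term matrix C and angles theta used to build f are illustrative assumptions): each low-weight coefficient p-hat(s) is approximated by averaging f(y) · conj(f(y XOR s)) over a few random y.

```python
import numpy as np

# For an IQP circuit with D = exp(i*H), H = sum_j theta_j * (Z-product on the
# qubits marked by row c_j of a 0/1 matrix C), the diagonal entry is
# f(y) = exp(i * sum_j theta_j * (-1)^{c_j . y}), which is cheap to evaluate.

def make_f(C, thetas):
    n = C.shape[1]
    def f(y):
        bits = np.array([(y >> i) & 1 for i in range(n)])
        signs = (-1) ** (C @ bits % 2)
        return np.exp(1j * np.dot(thetas, signs))
    return f

def estimate_fourier_coefficient(f, s, n, num_samples, rng):
    """hat{p}(s) ~ 2^{-n} * average over random y of f(y) * conj(f(y xor s))."""
    ys = rng.integers(0, 2 ** n, size=num_samples)
    vals = [f(y) * np.conj(f(y ^ s)) for y in ys]
    return np.real(np.mean(vals)) / 2 ** n
```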
But it was shown by Sergio Boixo and co-authors a little later that, in fact, for random quantum circuits the Fourier coefficients of the output probability distribution decay so fast that even just sampling from the uniform distribution gives a better classical simulation than this algorithm, than the Yung and Gao use of our algorithm. You can also apply this sort of analysis to random IQP circuits. One way of interpreting this is as good news for random quantum circuits and random IQP circuits, because you can't use this algorithm against them; the other interpretation is that it's bad news, because it shows that even the uniform distribution is a kind of good simulation in this regime. So it depends on your perspective.

Another interesting point, as I mentioned before, is that you can deal with this notion of noise quite easily using classical error correction, which lets us recover a hardness result even for IQP circuits with this kind of noise at the end. I'll just say very briefly — Mick already said this — how the classical error correction idea works. If we have an IQP circuit, then we can write its diagonal part as D = e^{iH} for some Hamiltonian H, where H is a sum over L terms, each some weight theta_j times a tensor product of Z operators acting on different qubits. A way of thinking of this is that the circuit is specified by a matrix of zeros and ones that tells you, for each term, which qubits have a Z on them. Maybe we can assume the theta_j are all the same constant; it doesn't really matter. The key part specifying the circuit is just this matrix C of zeros and ones telling you which qubits carry a Z in each term. If we want to correct this sort of noise classically, we can do the perhaps obvious thing of applying a classical error-correcting code to this matrix C: we replace C with CM, where M is a generator matrix of some classical error-correcting code. So we just multiply the matrix by the generator matrix of a code, and you can show that the output bit strings of the resulting encoded IQP circuit have a nice relation to the original ones in terms of this code: each diagonal entry of the new matrix D_M is equal to what we had before with the matrix D, but with x replaced by Mx. So each of these bit strings gets replaced by its encoded version.
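As a toy illustration of this encoding step (using a 3-fold repetition code purely for concreteness; the actual construction uses codes with far lower overhead and involves more detail than shown here), one can form the encoded term matrix CM over F_2 and, anticipating the decoding step described next, undo independent end-of-circuit bit flips by majority vote on each block.

```python
import numpy as np

# Rows of C mark which qubits carry a Z in each term of H.

def repetition_generator(n, r):
    """Generator matrix mapping n logical bits to n*r physical bits."""
    M = np.zeros((n, n * r), dtype=int)
    for i in range(n):
        M[i, i * r:(i + 1) * r] = 1
    return M

C = np.array([[1, 1, 0],          # a Z (x) Z term on qubits 0 and 1
              [0, 1, 1]])         # a Z (x) Z term on qubits 1 and 2
M = repetition_generator(3, 3)
C_encoded = (C @ M) % 2           # the encoded circuit now acts on 9 qubits

def decode_majority(output_bits, r):
    """Undo independent end-of-circuit bit flips by majority vote per block."""
    return (np.asarray(output_bits).reshape(-1, r).sum(axis=1) > r // 2).astype(int)
```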
You can then show that when you look at the output distribution of this encoded IQP circuit, if you didn't have any noise, you would get a distribution on encoded bit strings: rather than a distribution on the original bit strings, you get a distribution on their encodings. When noise happens, it just corresponds to bits of these encodings being flipped, which means you can use the classical error correction algorithm corresponding to the code to correct these errors and get back to the original bit strings. So you can undo this noise at the end of the circuit, and the nice thing is that we have very good classical error-correcting codes with very low overheads, so this kind of noise at the end can be dealt with quite easily without big overheads. Okay, so that's the rough idea.

Finally, let me mention the algorithm in the boson sampling setting. As I said, this uses a technique called Metropolised independence sampling, MIS. To introduce it briefly, the idea is that we have a distribution which we can sample from efficiently, and we want to get from it to a distribution that we can't sample from efficiently. The distribution we can sample from efficiently is the distribution on so-called distinguishable bosons. The nice trick is the following: start with this matrix A — which, remember, is a submatrix of a unitary matrix — and form a new matrix by taking the absolute value squared of each entry (that's what this notation is supposed to mean). Then take the same kind of distribution, where the probability of seeing a particular subset of the rows is given by the permanent of the corresponding submatrix, only now with the absolute value squared on the entries rather than on the outside. Sampling from this distribution can be done efficiently classically. It's a nice exercise to see how to do this — it's also described explicitly in an appendix of a paper by Aaronson and Arkhipov — but the intuition is that nothing really quantum is going on any more, because these things behave like classical particles, so you can treat each one of them separately. So this distribution, on distinguishable bosons, we can sample from efficiently, and what we want to sample from is something that looks similar: the boson sampling distribution, which we can call the distribution on indistinguishable bosons, where the absolute value squared is on the outside, as I showed before. We have access to the distribution D and we want to get to the distribution I.

I can describe the algorithm for doing this in basically one line. We start by taking a sample from the distribution D, the one on distinguishable bosons, and then we repeat the following process: we take a new sample from the distinguishable-boson distribution D and accept it with a certain probability. Ignoring the min, this acceptance probability is basically a ratio of probabilities under the indistinguishable-
boson distribution, multiplied by a ratio of probabilities under the distinguishable-boson distribution. You can show that if you repeat this procedure again and again — take samples from the distribution D, accept them with this probability, keep an accepted sample as your new state, and repeat — it will eventually converge to sampling from the distribution I, the true boson sampling distribution. This is a really simple example of a Markov chain Monte Carlo method, probably the simplest one you could come up with.

Then obviously the interesting question is how long it takes to converge, and empirically it doesn't seem to take very long. There are a few things to consider to figure this out. First, how big is the acceptance probability — how large are these ratios? In our experiments it seems to be quite large; the acceptance probability is pretty big. The next question is how many steps you need before you seem to have converged well, and the answer seems to be about 100 or so. Again, this is an empirical result, we don't have a proof, but you can even look at pictures to get a sense of how well this algorithm works. There are some statistics as well, which I'll spare you — they're in the paper. For example, in the case n = 7, just seven photons, which is a nice case because we can compare various different sampling methods, including brute force sampling — this is about as large as brute force sampling can go — one way to compare the methods is to look at some particular statistic of the output. Here we're just looking at the actual probabilities, the absolute values of the permanents squared, with minus the log taken to make it look nice, and plotting histograms of these under the different methods. You can see that the distinguishable-boson distribution looks quite different from the others, but the other methods — Metropolised independence sampling, brute force sampling, and also rejection sampling, which I won't talk about today — all seem to do very well; they all seem to sample from the right thing. And for larger n you can see similar plots, and it again does quite well.

Based on this algorithm we can make some predictions of where we need to be in order to see quantum supremacy using boson sampling. There's no rigorous definition of what quantum supremacy means, but we can take a couple of criteria: for example, that the classical runtime T_c on a supercomputer should be at least 10^10 times the quantum runtime — that's one possible criterion.
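For concreteness, here is a schematic version of the Metropolised independence sampling loop just described; the three callables are placeholders (assumptions for illustration, not code from the paper): sample_D draws a subset S from the distinguishable-boson distribution D, prob_I(S) = |Per(A_S)|^2, and prob_D(S) is the permanent of the entrywise |A_S|^2.

```python
def mis_boson_sample(sample_D, prob_I, prob_D, num_steps, rng):
    """rng is e.g. numpy.random.default_rng(); ~100 steps sufficed empirically."""
    S = sample_D(rng)                                   # start from a sample of D
    for _ in range(num_steps):
        S_new = sample_D(rng)                           # independent proposal from D
        accept = min(1.0, (prob_I(S_new) / prob_I(S)) *
                          (prob_D(S) / prob_D(S_new)))
        if rng.random() < accept:
            S = S_new                                   # accept: move to the new subset
    return S                                            # approximately a sample from I
```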
Or we might say that the quantum runtime should be at most a week while the classical runtime should be at least 100 years, this kind of thing. We can then make educated guesses for the parameters of the quantum experiments, and for how long it would take to get these samples on a supercomputer, to try to figure out where the boundary of quantum supremacy lies. What this plot shows is the number of photons along one axis and the transmission probability of the photons along the other — that's the probability that a photon survives to the other end of the circuit, because any real circuit is going to have loss and so forth. What you're able to infer is this curve of QA > 0: the region where there's any quantum advantage over the algorithm we presented. The dotted lines are if we allow a certain amount of loss in the experiment, which I won't go into, and these points are the parameters achieved by current experiments. This point A is interesting — it's a proposed experiment from Jian-Wei Pan's group, not an experiment we have now. The experiments we have now are down in this region, and in order to achieve quantum supremacy according to these criteria, based on our educated guesses, you need to be up here: you probably need around 50 photons and a transmission probability well over a half for each photon. Depending on your point of view, this could either be very challenging or maybe not so challenging, I don't know.

So the overall message of the talk is that so-called quantum supremacy might be harder to achieve than we thought, once we have to deal with realistic errors and with nontrivial classical algorithms: if we think a bit harder about what classical algorithms can do, they might be able to outperform the quantum experiments. But a nice thing about the errors is that we might be able to deal with them using simpler error correction techniques than full quantum fault tolerance. We heard a lot about error mitigation earlier in the week, which is perhaps the physics perspective on this; this is perhaps the computer science perspective — a classically tractable error is maybe easier to deal with than a fully quantum one. I'd like to finish with some references. The work on IQP is joint work with Michael Bremner and Dan Shepherd and is in Quantum; the work on boson sampling is joint work with many people and is in Nature Physics; and there's also a survey on quantum computational supremacy with Aram Harrow that you might find interesting. That's everything I wanted to say, so thank you all very much.

All right, we have time for a few questions.

Hi, thanks Ashley for the great talk. I thought the algorithm for IQP with a small amount of errors was pretty cool, but do you mind just showing the result for the scaling — the theorem you had there? Yes, that one. I just wanted to make a comment about the notion of simulation used there. It makes sense that when you fix this delta, the L1 distance, you get something that scales polynomially, but maybe a better notion of simulation
would be one where you're polynomial in both n and in 1/delta. We've got a paper coming out next week where we show that this is what you need in order for a referee not to be able to distinguish between the simulator and the actual system being simulated. So potentially there's a gap there, where you've got a simulator but the system can still do computationally more interesting tasks than the simulator can. I just wanted to point that out, and I wonder whether this move, where you impose a classical code on top and recover the hardness, only works for this notion of simulation and not the stronger one. I don't know if you have any thoughts on that.

Okay, that sounds great. I guess I don't have too many insights into what these parameters should really be, or into other notions of simulation. Obviously, as you say, L1 distance is maybe not the perfect one; it's just one which is tractable. So yes, I think that sounds very interesting.

Ashley, on the same slide, sorry — do you know what the constant in the exponent is?

This one? You could work it out; it's not outrageous, I don't think, but it's not very good. I think the results I mentioned from this other work of Sergio Boixo and co-authors suggest that if you want to do a random circuit on this sort of architecture, then you're probably better off just simulating it with the uniform distribution. Maybe Sergio can correct me, but I think they just looked at arbitrary constant epsilon, and it's backed up by some theoretical results as well as numerical ones. So I don't know precisely what the constant is — I might guess 10 or something, but I don't really know. And as for what epsilon really is: if it's 0.01 or so, this is already not looking so good as a classical simulation — you've already got something like n^100 before you even worry about the other parameters. So I'm not sure this really gives you a realistic simulation.

Okay, let's take it afterwards, because we need to get to the break. So let's thank Ashley.