Good night's rest, and we were able to enjoy some of Trieste and see some things. Anyway, today we're going to be up here all day, and we're going to have lunch outside the buffet again like we did yesterday, so I think we're getting the hang of how things work, which is good. Today we're going to have Glen opening: he's going to continue his discussion on adiabatic quantum computing. Then we're going to start with error mitigation in the morning; we're going to have some Zoom talks, because unfortunately those speakers couldn't come. And in the afternoon we're going to have error correction. So it's going to be a very exciting program again today, and you're going to learn a lot. Please ask questions. If you have a question, we do have a microphone, and we'll try to use it, because people online can't hear you otherwise. Without any further ado, we're looking forward to part number two.

OK, thank you very much for the introduction. Today I'm going to continue the discussion I started yesterday on adiabatic quantum computation. The main thing to remember, which would be the conclusion of the last lesson, is this. Suppose we have a classical optimization problem: given a function h that maps a set of classical bits to the reals, find the minimum,

z* = argmin_z h(z)

over all bit strings z. Here z = (z_1, ..., z_n) is a collection of n binary variables (let me just change notation compared to yesterday and use small n for this), and each z_j can take the value 0 or 1. As we said, there are interesting problems that can be written like that. What we discussed is that this can be translated into a ground-state problem: finding the ground state of the following Hamiltonian, which we called the final Hamiltonian,

H_F = Σ_z h(z) |z⟩⟨z|,

whose ground state is exactly |z*⟩, the solution we are looking for. So it is equivalent to solve the ground-state problem of this Hamiltonian or to solve the optimization problem.

To solve the ground-state problem we used adiabatic quantum computation. We supplement the Hamiltonian with another Hamiltonian, H_I, and we initialize the system at time 0 in |ψ(0)⟩ = |c0⟩, where |c0⟩ is the ground state of H_I; let me just pin that here. If we do that, we evolve the system with the interpolating Hamiltonian

H(s) = f(s) H_F + (1 − f(s)) H_I.

Here the subscript F just stands for "final", to distinguish it from the schedule function f(s); the usual simple choice is f(s) = s. The important thing is that f(0) = 0 and f(1) = 1. Under this condition, if I take the initial state and evolve with the Hamiltonian H(t/T) over the total time T,
using s = t/T, then after a time T I end up in a final state which is approximately proportional to the final ground state, which here is |z*⟩. For this to happen, for the algorithm to solve the problem, the evolution has to be slow enough, and the adiabatic theorem gives us a runtime. It tells us that the time I need is roughly

T ≳ (C/ε) max_{s ∈ [0,1]} ‖dH/ds‖ / Δ(s)²,

where ε is the error I am willing to accept, so it is related to the accuracy, and Δ(s) is the gap. So if we have a problem and we find these Hamiltonians, we can then compute the time required to solve the problem with this algorithm by using the adiabatic theorem. Let me quickly repeat what the adiabatic theorem says: the system remains in the instantaneous ground state of the Hamiltonian as long as this condition holds. Let me just draw it here: these are the energies as a function of time, E0(s) for the ground state and E1(s) for the first excited state, and the gap Δ(s) is the difference between these two energies, Δ(s) = E1(s) − E0(s).

So this is what we discussed, and today I want to discuss an application of this algorithm, the simplest one: applying it to the Grover problem. We are going to look at the problem of unstructured search. We are given a database containing N = 2^n elements, which you can write as bit strings of length n, and we are given a marked item, which we will again call z*. We have a function q such that q(z) = 1 if z is the marked item and 0 otherwise; this is the function that tells us which string we are looking for. The Grover problem is an oracle problem: what matters is the number of queries, the number of times I have to call this function, in order to find the item. Classically, the number of queries is of the order of N = 2^n. On the other hand, you have seen that in gate-based quantum computing you can use the Grover algorithm, which solves the same problem but with a number of queries that scales as √N, and this is actually the best you can do, while classically the best you can do scales as N. So you get a quadratic speedup if you go to the gate-based quantum computing model.

Now I want to look at the same problem and tackle it with the adiabatic quantum optimization algorithm I've described. The first thing we have to do is transform the problem into a minimization problem: we have to find an h(z) to minimize. For Grover this is quite easy to get: we want the marked item to be the minimum.
So we take h(z) = −q(z), which is −1 if z = z* and 0 otherwise. Now z* is exactly the minimum of h(z), because it's the only point where the value is −1 and we have 0 everywhere else. OK, so now that we have turned this into an optimization problem, we can start the program of using the adiabatic algorithm.

The first step is to write the final Hamiltonian, the target Hamiltonian whose ground state we need. In this case, as we said (I'll write the formula again), since h(z) is 0 for all non-marked items, it is simply

H_F = −|z*⟩⟨z*|.

So we want to find the ground state of this Hamiltonian. By the way, it's written here as a projector, but you may want to see it in terms of Pauli spin operators on each site. This can also be done: if you write it like that, you get

H_F = −∏_{j=1}^{n} (1 + z*_j σ^z_j)/2,

where z*_j = ±1 encodes the j-th bit of z*. I'm not going to discuss this in detail; it is simply the projection, for each individual qubit, onto the value that qubit should have according to z*, and doing this for all qubits gives the Hamiltonian. It's actually quite a complicated Hamiltonian: if you expand the product explicitly, it contains products of σ^z operators that can extend over the whole system.

The second thing we need is an initial Hamiltonian. As we said, the initial Hamiltonian can be chosen by us: once we have the final Hamiltonian, we have to make a choice and pick an appropriate initial Hamiltonian. In this problem the final Hamiltonian is, for us, the oracle: it plays the role of q(z) in the classical algorithm, because it contains the information about which state is marked, and we want to use it within the evolution to find the marked state. So calling the oracle becomes using this Hamiltonian. Consequently the initial Hamiltonian, which is part of our algorithm, cannot use the marked state: we should find the marked state without making use of the fact that we know what it is, so it cannot contain z*. The choice for the initial Hamiltonian is simply

H_I = −|c0⟩⟨c0|,

again a projector, where |c0⟩ is the ground state of H_I, and we choose |c0⟩ to be the normalized sum over all computational-basis states, the equal superposition. This of course does not assume that we know the marked state, because we take all states with the same weight. And this state is actually not hard to prepare, because it can be written as a product: it is the tensor product over j of (|0⟩_j + |1⟩_j)/√2, a state defined qubit by qubit. It's a product state, and such product states are usually quite easy to prepare. The initial Hamiltonian can also similarly be written in terms of Pauli operators.
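Just to make these two Hamiltonians concrete, here is a minimal numerical sketch (only an illustration, assuming NumPy; the system size n = 3 and the marked index z* = 5 are arbitrary choices) that builds H_F and H_I as matrices and checks that the projector form of H_F agrees with the product-of-Paulis form:

```python
import numpy as np
from functools import reduce

n = 3                       # number of qubits, so N = 2**n items
N = 2 ** n
z_star = 5                  # arbitrary marked bit string (binary 101)

# H_F = -|z*><z*| : projector onto the marked computational-basis state
ket_zstar = np.zeros(N); ket_zstar[z_star] = 1.0
H_F = -np.outer(ket_zstar, ket_zstar)

# H_I = -|c0><c0| with |c0> the equal superposition (product of |+> states)
ket_c0 = np.ones(N) / np.sqrt(N)
H_I = -np.outer(ket_c0, ket_c0)

# Product-of-Paulis form: -prod_j (1 + z*_j sigma^z_j)/2, with z*_j = +1 for bit 0
# and -1 for bit 1 (qubit 0 taken as the most significant bit here).
I2 = np.eye(2); sz = np.diag([1.0, -1.0])
bits = [(z_star >> (n - 1 - j)) & 1 for j in range(n)]
factors = [(I2 + (1 - 2 * b) * sz) / 2 for b in bits]
H_F_pauli = -reduce(np.kron, factors)

print(np.allclose(H_F, H_F_pauli))            # True: the two forms agree
evals, evecs = np.linalg.eigh(H_F)
print(np.argmax(np.abs(evecs[:, 0])))         # 5: the ground state of H_F is |z*>
```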
In that case it involves σ^x operators, and you get a similar expression to the previous one but with σ^x. I will not write it now, because in the following we are mainly going to use the expressions in terms of projection operators.

After this, the evolution of the system is controlled by the Hamiltonian obtained by substituting these expressions into the interpolation formula:

H(s) = −f(s) |z*⟩⟨z*| − (1 − f(s)) |c0⟩⟨c0|.

If we evolve the system with this Hamiltonian (again, f should have the properties we described before, and one easy choice is f(s) = s), then, using the evolution I described, we can solve the problem with the adiabatic optimization algorithm. But the issue is: what is the time required to solve the problem? Do I need a time that scales worse than, or comparable to, the classical time, or can I do as well as in the gate-based quantum computing model? To answer this we have to use the formula for the time, which involves the derivative Ḣ(s) and the gap Δ(s). So the next step is computing these objects in order to evaluate the time needed.

Let me keep on the blackboard just the results we need: H_F = −|z*⟩⟨z*|, H_I = −|c0⟩⟨c0|, and the interpolation H(s) = −f |z*⟩⟨z*| − (1 − f) |c0⟩⟨c0|, which is what we discussed previously. Now we just have to evaluate the terms that appear in the expression for the time.

The first term is the norm of Ḣ. We mentioned yesterday that you can take various norms here; I'll consider the two-norm, whose square is ‖Ḣ‖₂² = Tr(Ḣ²). In this case Ḣ = ḟ (|c0⟩⟨c0| − |z*⟩⟨z*|), and we will assume that ḟ is positive, so we can take it out. Using the simple relation Tr(|x⟩⟨y|) = ⟨y|x⟩ we can compute the trace, and let me just give you the result directly: up to a constant,

‖Ḣ‖₂² ∝ ḟ² (1 − 1/N),

where we used that the overlap between |c0⟩ and |z*⟩ is ⟨c0|z*⟩ = 1/√N. Fine, so this is the first term we need.

Similarly, for this problem we can also compute the second term we need, which is the gap. The computation of the gap is a bit more involved, so I will just describe quickly a method to get it without writing everything out. Why can we compute this object at all, given that this is a many-body problem?
Normally, to compute a gap, we need to diagonalize the problem, and we usually have a hard time diagonalizing many-body Hamiltonians. The reason we can do it here is that the Hamiltonian only involves projections onto two states. So it can be written in a basis containing |z*⟩ and the state orthogonal to it within this two-dimensional subspace, which is simply |c0⟩ − (1/√N)|z*⟩ divided by its norm (actually I won't bother normalizing it, that's not so important). The point is that (1/√N)|z*⟩ is the projection of |c0⟩ onto |z*⟩, and we remove that projection. If we do this, we have a basis with just two relevant elements, because all the other basis elements can stay the same and don't contribute to the problem, and we get a 2-by-2 Hamiltonian. And for a 2-by-2 Hamiltonian we can find the gap.

Let me copy the expression of the gap directly from the notes. If you diagonalize the 2-by-2 Hamiltonian, compute the eigenvalues and take the difference, you get (writing it for the gap squared; take the square root of this to get the gap, but let's keep the square, since that's what appears in the formula for the time):

Δ(s)² = Δ_min² + v² (2 f(s) − 1)².

I have to tell you what these terms I've introduced are: Δ_min and v are respectively equal to 1/√N and √((N − 1)/N).

Now we can use these expressions to evaluate the time. As we said, the time is proportional to the maximum over s of the ratio between ‖Ḣ‖ and Δ². Taking this, we get an expression which we can write as

T ∝ max_s [ ḟ √(1 − 1/N) / Δ(s)² ] = (1/Δ_min²) max_s [ ḟ √(1 − 1/N) (Δ_min/Δ(s))² ],

where I've separated out the factor 1/Δ_min². So how do we see the scaling of this object when we take the maximum over s? First: I call this Δ_min because, as you can see from the expression of the gap, it is the minimum. If I plot the gap as a function of s, it dips: around s = 1/2 we have the minimal gap, and substituting s = 1/2 into the formula gives exactly Δ_min. The second thing is: what is ḟ? We said that usually we can choose f(s) = s, and in that case ḟ = 1. (Before going on, a rough numerical check of this gap formula for small n is sketched below.)
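Here is that small numerical check, just an illustration assuming NumPy; n = 8 and the marked index are arbitrary choices. It builds H(s) for the Grover problem, computes the gap by exact diagonalization, and compares it with the formula Δ(s)² = Δ_min² + v²(2s − 1)², confirming that the minimum gap at s = 1/2 is 1/√N:

```python
import numpy as np

n = 8
N = 2 ** n
z_star = 3                                   # arbitrary marked item
ket_zstar = np.zeros(N); ket_zstar[z_star] = 1.0
ket_c0 = np.ones(N) / np.sqrt(N)             # equal superposition
H_F = -np.outer(ket_zstar, ket_zstar)
H_I = -np.outer(ket_c0, ket_c0)

def gap(s):
    # first excited energy minus ground energy of H(s) = s*H_F + (1-s)*H_I
    evals = np.linalg.eigvalsh(s * H_F + (1 - s) * H_I)
    return evals[1] - evals[0]

s_grid = np.linspace(0.01, 0.99, 99)
numeric = np.array([gap(s) for s in s_grid])
analytic = np.sqrt(1.0 / N + (1.0 - 1.0 / N) * (2 * s_grid - 1) ** 2)

print(np.max(np.abs(numeric - analytic)))    # agreement up to numerical precision
print(numeric.min(), 1 / np.sqrt(N))         # minimum gap at s = 1/2 is 1/sqrt(N)
```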
This directly implies that, when taking the maximum with the linear schedule, we can replace ḟ by 1, and we get

T ∝ N √(1 − 1/N) max_s (Δ_min/Δ(s))²,

where I have substituted the expression for Δ_min. The ratio (Δ_min/Δ(s))² is always smaller than or equal to 1, because Δ_min is the minimum gap, and the factor √(1 − 1/N) is essentially 1 in the limit of large N. So we get this unsatisfying result: using the linear schedule on the Grover problem, the time needed, the time for which I need to use my oracle H_F, scales as N, the number of items, just like the classical algorithm.

So how can we correct this? The issue is closely related to our choice of schedule, and the idea is that we can find a better schedule that does a better job. We want to look for an f(s) which, when substituted into the formula, gives a maximum that behaves better, so that the total time scales better than N. The way to do it was proposed in 2002 by Roland and Cerf. They suggested keeping the object we are maximizing constant during the whole evolution. What is this object saying? As we already described yesterday, it tells us that when the gap is small this quantity grows, so if I want the total time to be reasonable and not too big, I have to slow down: Ḣ has to be smaller whenever I see a closing gap. In the linear schedule this is not done. Since in this case I know the gap, I can look for an f(s) such that this object is constant: the "error" I make at each s becomes s-independent, and taking the maximum is then trivial. Setting the quantity equal to a constant c gives

ḟ(s) √(1 − 1/N) / Δ(f(s))² = c,   that is,   ḟ = c √(N/(N − 1)) Δ(f)².

This is a differential equation for the schedule f, and let me just mention that for this particular problem it can be solved exactly. I'll write the solution directly:

f(s) = 1/2 + tan[(2s − 1) arctan(√(N − 1))] / (2√(N − 1)),

where the objects I've introduced are functions of √(N − 1). Once you have this, you can insert ḟ into the expression for the time, and you get that the total time is more or less proportional to √N times an arctangent, roughly

T ∝ (1/ε) √N arctan(√(N − 1)),

where of course you still have the factor related to the error ε. So after this substitution, instead of a scaling with N you get a scaling with √N for the total time.
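To see what this optimized schedule looks like in practice, here is a small sketch, again just an illustration assuming NumPy, that evaluates the Roland-Cerf schedule written above and the √N behaviour of the total time:

```python
import numpy as np

def roland_cerf_schedule(s, N):
    # f(s) = 1/2 + tan((2s - 1) * arctan(sqrt(N - 1))) / (2 * sqrt(N - 1))
    b = np.sqrt(N - 1)
    return 0.5 + np.tan((2 * s - 1) * np.arctan(b)) / (2 * b)

N = 2 ** 10
s = np.linspace(0.0, 1.0, 11)
print(np.round(roland_cerf_schedule(s, N), 4))
# f(s) moves quickly near s = 0 and s = 1 but is nearly flat around s = 1/2,
# where the gap closes: the evolution slows down exactly where it has to.

# total-time scaling ~ sqrt(N) * arctan(sqrt(N - 1)) ~ (pi/2) * sqrt(N)
for n in (8, 12, 16):
    N = 2 ** n
    print(n, np.sqrt(N) * np.arctan(np.sqrt(N - 1)))
```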
To give you an idea, let me draw the kind of schedule that you get; let me try to make it as accurate as possible. OK: this axis is f, and this is s. This is the kind of schedule, and indeed it slows down at s = 1/2, which is where we know the gap is closing. So by taking a schedule that does this and computing it properly, you get a total time which in the end is of the order of √N. So in this case, with the adiabatic quantum computing algorithm, we recover the speedup that was obtained with gate-based quantum computing.

So what is the lesson from this problem? First of all, we can recover the speedup obtained with the usual quantum computing model using adiabatic quantum computation. This is actually a quite general result: there is a theorem stating that the two models of computation are equivalent. I didn't write it down, but the two models are equivalent: I can equivalently use gate-based computation or adiabatic quantum computation and obtain, at the end, the same scaling, provided I do it properly and use both of them at their best performance. I shouldn't, for example, choose the wrong Hamiltonians or choose the schedule in a non-smart way, because that will make the algorithm underperform.

So, since intrinsically they are the same, why should I prefer one or the other? Let me describe the differences (let me add "adiabatic" here on the board). The difference is that while the gate-based result is obtained using gates on a digital device, this one is obtained using an analog evolution on an analog device, at least in the way I've introduced it. So it can run quite easily on analog devices; this would be the first advantage. The second advantage is that, since this computation, as described by the adiabatic theorem, occurs entirely within the ground state (the system is, by the adiabatic theorem, forced to stay in the ground state), we would expect that, as long as the temperature is low enough, it has some intrinsic resilience to thermal noise. So it may have this kind of advantage.

The other thing I want to mention are some difficulties you would encounter when using this kind of approach. We saw before that this H_F, written up there, is actually quite a complicated object in the Grover problem. To implement it, you must have a device where the interactions are such that the Hamiltonian H(s) is this one, and let me just remind you briefly that H_F was

H_F = −∏_{j=1}^{n} (1 + z*_j σ^z_j)/2.

If you expand it, it contains all kinds of terms you can imagine, and there is basically no device that can implement such a Hamiltonian. So you would have to redefine the problem in such a way that this H_F can fit onto your device. This is actually one of the problems that arise in this kind of approach: it's hard to expect that you can do every computation on a single device without a large overhead from trying to implement these complicated Hamiltonians on your quantum device. I'll not discuss how this is done, but there are techniques for it, which go under the name of embedding.
With embedding you can implement these complicated Hamiltonians, but the price you pay is that you have to increase the number of qubits. So it actually requires a lot of qubits: for Grover you would not need just n qubits, but many more, if you want to implement this complicated Hamiltonian on a quantum device. This is one issue you would encounter.

The last thing I want to mention is that the time we get here is related to the gap, and the gap was becoming smaller and smaller. It's not written here, but the minimum gap squared was going as 1/N, so the minimum gap is closing exponentially: it becomes exponentially small in the number of qubits, because N = 2^n is exponential in the size of the system. And this feature, which is what makes the computation slow, is quite a natural feature of many quantum systems. In many Hamiltonians that encode problems you are going to have these closing gaps, and you will have to use similar techniques, slowing down at the closing gap with the best possible schedule, if you want to obtain some speedup. This is usually not so easy to do, because computing the gap is hard in the many-body case. We had an advantage here because the Grover problem is very simple and we could compute everything exactly. Usually you will not know the runtime of the algorithm, because to know it you need to compute the gap, and you don't know the gap.

So in most problems you will not know the runtime, and what you do instead usually goes under the name of quantum annealing. In these cases you know that eventually you get convergence if you go to very long times, but you stop somewhere; you cannot use the adiabatic theorem to tell you when to stop, because the gap is unknown. You just stop somewhere, read out some values, and take those as an approximate solution of your problem. This is a heuristic algorithm, which is actually used and implemented in current devices. There are companies that commercialize these quantum annealers, mainly based on superconducting qubits, but there are now also versions doing annealing with Rydberg atoms. So this is another way to try to solve these optimization problems, and at the moment there is quite a lot of research trying to see whether we can find a problem where this performs well: since we don't know the runtime, because we don't know the gap, we simply have to try it on a given problem and see how it performs, and various companies and research groups are trying to find problems where this algorithm does well. OK, so that will be all. Thank you very much. Are there any questions?

I have a question. What kind of devices would you use to implement this today? Because you said it's not gate-based, what you were talking about. So which devices are used to implement this algorithm?

Yeah, so it's not gate-based. As I said, you can use superconducting qubits or atoms. In many cases, actually, even in gate-based devices, what they do to realize the gates is to use pulses, some analog control.
What you can do on these kinds of devices is go one step back and try to use directly the analog knobs and controls you have in the device. So it would work for most of the devices that are around. The main problem is that you're not going to have control over all of the terms you need in order to implement the driving Hamiltonian written there. So you need a device which is sufficiently flexible that you can implement these complicated driving Hamiltonians, and which has a sufficiently long coherence time, so that the runtime of the algorithm is smaller than the coherence time. At the moment, the annealing experiments on large systems are done mainly with Rydberg atoms and with superconducting qubits.

Thank you very much. In case there are no more questions, let's thank our speaker again. Thank you. We're going to continue at 10:10, so you've got seven minutes to stretch your legs.

[Break. Informal conversation during the pause: the speaker mentions that today is his last day here; the organizers discuss whether lecture notes or slides could be shared and agree to ask him; and the next announcer asks for two minutes before the following session, since it will be online, to mention the programme and its scholarships.]

OK, great. Can you speak one more time so we can check your audio in the room? Sure thing, how does it sound? Yeah, perfect. Do you have any presentation, anything you'll be sharing? Yes, I have a slide deck I can share. Daniel, we're going to have a two-minute announcement right before you start, is that OK? Just so you know. OK, before we go to the next session, we're just going to have a very quick announcement.

Hi guys, I wanted to take this opportunity to make a little advertisement. My name is Irina, I'm from the Master in High Performance Computing. Outside I've written up basically all the information, you can find it there, so I'll be very brief. What we do is teach scientific programming: basically how to write high-quality, professional code for science, code that can run on supercomputers or on GPUs. We also have some machine learning in our program, though it's not the focus, and from next year we are trying to introduce quantum computing as well.
You can see the full list of our courses there. I would say we are a really unique program, in the sense that there are not that many places that teach you how to do scientific programming. We have scholarships for ICTP-supported countries, and the deadline is in about 10 days, so if you're interested, don't be too slow. We also have OJ scholarships this year for people who are local, or who can support themselves: at least they don't have to pay the fee, so they can get the education for free if they carry out a project on a given topic. If you have any questions about this or you're interested, please just find me during the break or at lunchtime and I can answer all your questions individually, and I won't take any more time from the lecture. OK, thank you.

OK, thank you very much for this very nice announcement and great opportunity, so take advantage of it. Now we're going to start our next session for today, an introduction to quantum error mitigation. We're going to have two speakers. Our first speaker is Daniel Mills, he's already online, he's one of our research scientists, and then we're going to have Silas Dilkes, who is one of our software engineers. In any case, without further ado, Daniel, it's all yours.

OK, great, thank you very much. So I'm going to talk, well, Silas and I are going to talk, a bit about quantum error mitigation. The first hour of this will be some theoretical background on noise, errors and error mitigation; then Silas will take over for the second hour to talk a bit about Qermit, which is a software library we've developed to perform error mitigation. Error mitigation is a near-term way to prevent and manage errors in quantum devices. Later on you'll also hear about error correction from my colleagues, which is a longer-term approach to dealing with errors in quantum computers, but for smaller devices error mitigation is the approach people have taken up, and I'll talk a bit about that now.

There are a few levels at which you might want to deal with noise in quantum devices. The first thing you want to try is to reduce the noise in the components of the device as much as possible in the first instance. This includes things like improving the quality of the pulse sequences you make use of, for example; this falls under things like optimal control and pulse engineering, and you might hear about things like dynamical decoupling, which is adding pulse sequences into your quantum computation to undo some errors you might have inadvertently added. These are very low-level approaches to managing errors, dealing quite directly with the components of the device.

Then you can try what I'm going to call circuit-level approaches. These are approaches which change the circuit that you're running but don't do any processing of the results: for example, noise-aware compilation and routing, where you pick the best qubits to use on the architecture, or things like frame randomization and randomized compiling.
These add additional gates into your circuits in order to undo rotations that you may have accidentally added, or to randomize some of the errors to make them more favourable to you. So these are approaches where you take the circuit you want to run and change it directly. This is a little higher level: you're not dealing with the components of the device itself, but with the circuit you're trying to run.

Above that there's what I'll call application-level approaches. These are approaches where you might change the circuit, but you also do some clever processing of the measurement results you get from your circuits, combining them in some clever way, or perhaps removing some that you know to be erroneous. We'll spend a lot of time talking about this kind of approach; again it's a fairly high-level way of tackling errors. And at the very highest level you have error correction, which you'll hear about later. So during this talk I'm going to focus on these two central levels, the ones where you're not dealing directly with the device, but not abstracting away so far that you're doing error correction. These are often called digital error mitigation techniques, because you're dealing with the circuit rather than the device itself, and they are the techniques we've targeted in our implementations in Qermit.

Just to prepare you in advance and to introduce some of the terminology I'll use later in the talk: Qermit is an open-source Python package that implements these digital error mitigation techniques. It has a compositional architecture based on graph composition. I'll talk about this a bit more directly later, but I'll also use this graph-based approach to describe some of the schemes that I'll introduce. What I mean by that, roughly, just to give some intuition at the beginning, is that you can imagine your error mitigation protocol described as a graph: the nodes in the graph are sub-processes that you use, and might reuse and combine in different ways, and the edges of the graph describe how the inputs and outputs of those sub-processes move between them. This is the architecture we use within Qermit, and I'll use this notation to describe some of the schemes later on.
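Just to give a feel for that idea, here is a toy illustration in plain Python of "nodes are sub-processes, edges carry data". This is not the actual Qermit API, and the numbers in it are made up; it only sketches the shape of such a composition:

```python
# Toy sketch of a mitigation protocol as a composition of sub-processes.
def characterise(backend):
    # pretend to learn something about the device's readout noise
    return {"readout_flip_prob": 0.02}

def run(circuit, backend):
    # pretend to execute the circuit and return raw counts
    return {"0": 480, "1": 520}

def correct(raw_counts, noise_data):
    # undo a symmetric readout flip: p_ideal = (p_noisy - e) / (1 - 2e)
    e = noise_data["readout_flip_prob"]
    shots = sum(raw_counts.values())
    return {k: (v / shots - e) / (1 - 2 * e) for k, v in raw_counts.items()}

# The "graph": the outputs of characterise and run both feed into correct.
def mitigated_run(circuit, backend):
    return correct(run(circuit, backend), characterise(backend))

print(mitigated_run(circuit="my_circuit", backend="my_backend"))
```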
Qermit is built on top of pytket, so if you're familiar with pytket, you're ready to go with Qermit basically. You can install it quickly through pip with the usual command, and there's some documentation available at qermit.it; I've stolen the Italian domain name, which I hope doesn't offend anybody. It's open source, and there's a manual available at the GitHub repository. OK, so that's Qermit.

Let me talk a bit about what I mean by noise and some of the terminology I'll use for different noise sources. First of all, I'll distinguish between coherent and incoherent noise. Coherent noise is noise that can be represented by unitary operations. For example, imagine you want to implement some gate in your circuit and the gate is poorly calibrated: you want to rotate the Bloch sphere by some angle, but actually you over-rotate or under-rotate it. These types of errors are called coherent errors. You also have incoherent errors, which cannot be described as unitaries. An example would be depolarization: the way to describe depolarization is to say that with some probability the state you are manipulating is replaced by the maximally mixed state, so you just lose a little bit of information under the effect of a depolarizing noise channel. You can picture coherent errors as your Bloch sphere rotating in a way you wouldn't like, and depolarizing noise as your Bloch sphere contracting, so you're losing some information, perhaps to the environment. A final example I'll give here is a bit-flip error: in this case your zero states are being flipped to one and your one states are being flipped to zero, and you can picture this as your Bloch sphere being contracted along one of the axes only, so information along one axis is preserved but along the others you're losing some. So these are a couple of very common noise channels, and I'll use some of these terms later on as well.

In practice, if you want to simulate noise, which you might be interested in doing, there are some tools to do so in Qiskit, for example. There is such a thing as a noise model provided through Qiskit, which you can use to describe some simple noise channels: you can put in some depolarizing noise on single- and two-qubit gates, for example, or some bit-flip errors on your readouts, on your measurements. What these simulators do behind the scenes is add some kind of error after each gate, so it's quite a simple noise model, but these kinds of tools are useful for testing your code or testing the robustness of your circuits, for example. I just bring this up in case you want to have a look at it a bit later.
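As a concrete example of what that looks like, here is a rough sketch using qiskit-aer. The exact module paths, gate names and error rates are illustrative and may differ between Qiskit versions, so take this as a starting point rather than a definitive recipe:

```python
from qiskit_aer import AerSimulator
from qiskit_aer.noise import NoiseModel, ReadoutError, depolarizing_error

noise_model = NoiseModel()
# depolarizing noise after every one- and two-qubit gate of the listed types
noise_model.add_all_qubit_quantum_error(depolarizing_error(0.001, 1), ["x", "h"])
noise_model.add_all_qubit_quantum_error(depolarizing_error(0.01, 2), ["cx"])
# simple bit-flip (readout) error on every measurement
noise_model.add_all_qubit_readout_error(ReadoutError([[0.98, 0.02], [0.03, 0.97]]))

noisy_backend = AerSimulator(noise_model=noise_model)
# circuits run on noisy_backend now see these simple error channels
```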
OK, so let me describe a simple noise channel and a simple error mitigation technique for it. I'm going to talk about state preparation and measurement errors, or what you'll often hear referred to as SPAM errors. The idea is that you perform your circuit, you perform a measurement, and with some probability the measurement outcome 0 flips to 1, and with some probability the outcome 1 flips to 0; that's this simple channel on the left here. You can describe this with a matrix S: a 0 state is measured as 0 with probability 1 − e and as 1 with probability e, and a 1 state is measured as 1 with probability 1 − r and as 0 with probability r. With this matrix describing your noise channel to hand, you can recover the ideal probability distribution just by inverting the matrix. So there's a simple technique: once you've characterized these values e and r, you invert the noise and improve the quality of your measurement outcomes. To say the same thing a bit more formally, the matrix entries are the probabilities P(z | z') of measuring some outcome z given that the true, correct measurement outcome was z'. This generalizes to more than just one qubit: in the example I just gave there was a single qubit being measured, but it generalizes to many qubits, and in the end you have very large matrices, whose size grows exponentially with the number of qubits. You perform the error mitigation by applying the inverse of this matrix S to the noisy probability distribution to recover the error-mitigated probability distribution.

Just to return to the graph notation I mentioned before, to describe this error mitigation technique: you start with some unitary you want to implement; the first step is to perform some tomography to establish the matrix S, and these first two boxes at the top right correspond to that procedure; then you have the tasks of compiling and executing the original circuit and gathering its results; and the last step is to use the matrix you've learned and the results from the device to invert the noise. So this is how we would describe it within Qermit, and it's also a convenient way of explaining it.
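Here is a minimal sketch of that inversion for a single qubit, assuming NumPy. The flip probabilities below are invented; in practice you would estimate them from calibration circuits that prepare |0⟩ and |1⟩ and record how often each is misread:

```python
import numpy as np

# characterised flip probabilities (hypothetical values from calibration runs)
e = 0.02   # P(read 1 | prepared 0)
r = 0.05   # P(read 0 | prepared 1)

# columns are the true outcome, rows the recorded one: S[z, z'] = P(z | z')
S = np.array([[1 - e, r],
              [e, 1 - r]])

# noisy distribution observed on the device (counts normalised to probabilities)
noisy = np.array([0.46, 0.54])

# SPAM mitigation: apply the inverse of S to the noisy distribution
mitigated = np.linalg.solve(S, noisy)
print(mitigated)   # estimate of the ideal distribution (may need clipping to [0, 1])
```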
So what we've talked about here is correcting errors in probability distributions; this is one situation we'd be concerned with. In Qermit we refer to these techniques as result mitigators, or MitRes. But there is a second type, which we call expectation mitigators, or MitEx. In this case we're not concerned with mitigating the complete probability distribution, but instead with some statistics of the distribution, in particular the expectation values of some observable acting on the state produced by our circuit. In such experiments, the inputs you're concerned with are the circuit unitary and the observable whose expectation value you want to calculate; the goal is the expectation value of that observable, defined by this trace quantity here. These types of experiments are quite common, for example in quantum chemistry, which will be covered later in the week. As an example, calculating the expectation value of the X observable corresponds to just calculating the difference between the probabilities of measuring 0 and 1 after you apply a Hadamard gate before the measurement. So you perform some rotation on the output state, measure it, and from that you can recover the expectation value of the X observable. In general the procedure is as described on the right here: you have as input an observable and a unitary, and you use the observable to calculate some measurement circuits. In this case the measurement circuit is simply this application of the Hadamard gate, but in general your operators might be much more complicated, so you might have more complicated measurement circuits. Then you repeat the same kind of process as before: a compilation step, you execute the circuits, appending the measurement circuits you built in the first step, and then, as the very final step, you recombine those measurement outcomes to calculate the expectation value of the observable you began with. So you can see a bit of a difference from the first type of error mitigation technique: instead of calculating probability distributions, we're just calculating some statistics of the output. There are other error mitigation techniques developed for this setting as well, which I'll discuss. This slide just says in words what I said before: the different steps of appending measurement circuits, combining the circuits, performing the measurement, and then recombining the measurement outcomes to calculate the particular observable.

This procedure gives you a noisy approximation for the expectation value of the observable you wanted. You can picture it as if the expectation value is noisy, as if its distribution has been shifted slightly; the goal is to derive a better approximation for the ideal value, and this is the target of performing error mitigation on these types of experiments. Typically what happens is that you improve the accuracy of the approximation but you increase the variance in the approximation of these quantities; that's the trade-off you inevitably have. So you can have this picture in mind: the distribution gets spread out by performing error mitigation, but drawn closer to the ideal.

Just to outline some general steps for error mitigation of observable expectation values. Typically the first step is to take the circuit you want to run and perform some operation on it, represented here by these symbols; this basically allows you to build up some data that characterizes the noise of the device in some way. It's also taken that you have at your disposal some relationship the data has with itself, so that combining the data in some clever way should be able to reduce the errors; this is the functional-model step of an error mitigation scheme. And then finally you use the data generated in the first step, together with the knowledge you have about how this data should be related to itself, in order to recover an error-mitigated approximation for the observable expectation value. So there are these two kinds of things you need to learn during an error mitigation scheme, and then finally you recombine them to produce a better approximation.

OK, so that's a general outline; let me give a particular example of an error mitigation scheme. First I'm going to talk about what's called zero-noise extrapolation. The intuition is in this graph at the bottom right: you can take your circuit, run it on the device, and you get some value for the expectation value.
What you can't do is reduce the noise, but you might be able to increase it. For example, if you increase the noise, run the circuit again, increase the noise again, and so on, you build up these green points, which might, for example, show some decay: as you lose information to the noise, the expectation values might decay towards zero. Then what you can do is choose some fitting function, fit it to those decaying points, and extrapolate backwards to the zero-noise value. That's roughly the intuition behind zero-noise extrapolation.

There are a couple of things you need for this to make sense: you need an approach to increase the noise, and you need to agree on a function to fit to the decaying value of the expectation. One approach you might use to increase the noise is to replace every gate in the circuit with itself, followed by its inverse, followed by itself. The gate and its inverse cancel out in the unitary you're implementing, so the circuit you're implementing is unchanged, but the result is that, if you replace every gate in this way, you've increased the noise by a factor of three. You can repeat this process to increase it by a factor of five, or any odd integer, so you can increase the noise in a controlled way. There are other techniques for increasing the noise as well: if you have lower-level access to the pulses, for example, you can consider stretching the pulses, or you can replace the gates in some other way. But this is one example you can keep in mind. So that's the first step, the first ingredient: a way to build up the data you need to perform the error mitigation. A toy sketch of this folding trick, and of the extrapolation it feeds into, is given below.
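Here is that toy sketch of both ingredients: folding a circuit of unitaries to amplify the noise without changing the ideal circuit, and extrapolating a fitted curve back to zero noise. The circuit, the measured values and the exponential fit are all made-up illustrations, assuming only NumPy:

```python
import numpy as np

# --- noise amplification by "folding": replace each gate U with U U^dagger U ---
def fold(circuit):
    folded = []
    for U in circuit:
        folded += [U, U.conj().T, U]
    return folded

rng = np.random.default_rng(0)
def random_unitary():
    A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    Q, _ = np.linalg.qr(A)
    return Q

circuit = [random_unitary() for _ in range(4)]
def total(circ):
    out = np.eye(2)
    for U in circ:
        out = U @ out
    return out

print(np.allclose(total(circuit), total(fold(circuit))))  # True: same ideal unitary,
print(len(circuit), len(fold(circuit)))                    # but three times as many noisy gates

# --- extrapolation: fit a decay to expectation values at noise scales 1, 3, 5 ---
scales = np.array([1, 3, 5])
measured = np.array([0.82, 0.55, 0.37])      # made-up noisy expectation values
# assume <O>(scale) ~ A * exp(b * scale): fit a line to the log, evaluate at scale 0
b, logA = np.polyfit(scales, np.log(measured), 1)
print(np.exp(logA))                          # zero-noise estimate of the expectation value
```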
So this is the Function that you're fitting to the data And there are again a few different different functions that you might try and fit Some common ones would be to fit some exponential decay to these values or some polynomial function They each of these have their own justification for doing that and It'll depend a bit about on the device you're working with or the circuit you're working with as to which you might choose And then the final step in this case would be just to So you have your data you have your function that relates the data and now you can use that just to extrapolate back to the zero noise value So this is the zero noise extrapolation technique in our abstraction described So just to say that the function And noise scaling technique the EU's will depend a bit on the device So I have a couple of devices here Credit here to the team out of mythic for performing these experiments In this case the ideal outcome was would be one And on the two different devices you can see how as the noises scales the decay is Different and indeed the extrapolation technique which corresponds to the the color of the points at the zero noise scaling value The accuracy of those scaling of those extrapolation techniques depends on the device so for example on the IBM London device Exponential extrapolation is closest to one On the regetti device is Richardson extrapolation, which is a form of polynomial a kind of polynomial that you'd use for extrapolation So it's a bit unpredictable, which is the best one that's That's something to keep in mind when you use these things So what I said when I was describing zero noise extrapolation is that you can increase the noise But you can't access this area in red here that corresponds to reducing the noise So there is one case where you can reduce the noise and this is in particular is when you are considering Clifford circuits So Clifford these Clifford circuits are a subset of all circuits that happen to be classically simulable And The approach of Clifford data regression and which is another air mitigation technique is to make use of this fact In particular the approach is To take the original circuits as given and to Look look into the look into the circuits and for every gate in the circuit, which is not a Clifford gates to replace that With a random gate in the from from the Clifford group So the result is that your circuit all the non Clifford gates in your circuit have been replaced by Clifford gates And as a result you can classically simulate the circuit you Run those classically simulable circuits both on Classical devices and on quantum devices which allows you to build up a relationship between the noisy and exact values for those Clifford circuits, which are similar to the original circuit and You can use that relationship to relate the noisy expectation value from the original circuit to what it's You can conjecture it would be in the exact case So that's the intuition by this Clifford a data regression technique So you can see sort of to the techniques are kind of similar So on the left here, I just do a little pictorial representation of Clifford data regression You take the original circuit that you want to run you replace the non Clifford gates with Clifford gates so that it's classically simulable You can use the classically simulable Clifford gates Clifford circuits to calculate these green crosses at these noise value of the device and at the noise level and at zero noise level and build this Relationship between the two pairs to 
Or, in the case of zero-noise extrapolation, you increase the noise using these identity insertions, the unitaries and their inverses, and use that instead to extrapolate backwards. So the two techniques are similar in that regard. More generally, the process I've described is to start with some unitary U, in red here at the top, do some transformation to generate new circuits, run those new circuits and the original circuit on the device, use that data to build up some relationship between the training circuits and the original circuit, and then use some classical post-processing in order to recover an error-mitigated value for the expectation. Again, I can write this down in my favourite graph notation: it's the same thing I've described to you. You build up some circuits that produce your expectation value, generate some modified circuits, execute those on the backend, do some classical post-processing on the results, and use that in combination with the original circuits to generate an error-mitigated value.

OK, so now we have an idea about result mitigation and expectation-value mitigation, and a few examples of such schemes. The next thing you might try to do is to combine these schemes, and in some cases this is a very sensible thing to do. In particular, there are a lot of assumptions you have to make when developing these error mitigation techniques. For example, the extrapolation function you use depends on the particular noise characteristics of your device, and if the assumptions you've made about the noise characteristics of the device were incorrect, it might affect the quality of the outcome of your experiment. But sometimes you can enforce these assumptions by using other error mitigation techniques, perhaps some of the lower-level, circuit-level techniques I described before. So, for example, you can imagine using zero-noise extrapolation together with SPAM correction of the measurement outcomes, and that results in some improvement; and you can also use something like frame randomization, which together with Clifford data regression performs fairly well, but not as well. In both cases it's not performing as well as it did in the classical simulations, and this is because the noise models used for the classical simulations are of course much simpler than the noise models on the real devices, and probably lower in amplitude. On the real devices you have much more complicated noise sources, and the assumptions that go into developing the mitigation techniques are not necessarily met by these real devices, whereas for the classical simulators you're sure that they actually are. So you see that the real devices don't perform as well, but error mitigation still seems to help you out.

So now I'm going to talk about the chemistry circuits run on the real devices. I flashed this up a bit earlier, but these chemistry circuits have this kind of ladder structure: you have CZ gates acting on two qubits in a ladder, with a single-qubit rotation at the bottom, followed by this increasing ladder of CZ gates.
This structure repeats for many layers. Running these circuits on the real device, you see the same thing as you saw with the classical simulations; in particular, roughly speaking, CDR, the Clifford data regression, does better than zero-noise extrapolation. Most of the time error mitigation helps; in the worst case it seems not to, and there are quite a few squares here which are red, where error mitigation was not shown to result in any improvement. So you can see that on real devices the performance of these schemes becomes a bit less predictable. This could be for many reasons: we don't understand the noise channels particularly well, the noise profiles change very quickly, and lots of other things. This was run on the IBM Lagos device.

Just to emphasize that it depends heavily on the device characteristics as well: these experiments were conducted on the IBM Casablanca device, and you can see that in this case zero-noise extrapolation and Clifford data regression really just don't perform; there is no benefit to running Clifford data regression on these circuits. So, just to emphasize: it really depends on the device, it depends on the circuits, and it can depend on the device characteristics at the moment you take them. These results show that it's a bit hard to predict the performance of error mitigation schemes in practice, but there are some things you can take away. For example, this plot in particular tells you that maybe for IBM Casablanca, which I think is not running any more, you should not worry about using any error mitigation; that for chemistry-type experiments it's beneficial to use Clifford data regression; and that for other types of experiments it's probably beneficial to use zero-noise extrapolation. So there are a few things you can take away from these experiments.

That's just about all I have to say on this. Just to prepare you for Silas's talk: there are several schemes currently implemented in Qermit that you might like to play around with. Zero-noise extrapolation and Clifford data regression are there, for example. There's also probabilistic error cancellation, which I haven't mentioned; this technique basically relies on the idea that you can modify your circuit in such a way that combining the results of the modified circuits has the effect of inverting the noise channel on the device, after doing some characterization of the device. It has SPAM mitigation, which I've discussed, and frame randomization, which I mentioned briefly. And it allows you to combine existing methods quickly. It's built on top of pytket, so you should be able to use it quite quickly, and hopefully some of the techniques implemented there already can be easily reused and developed into new schemes, if you wish to try that, thanks to the modular structure we've tried to make use of. Similarly, combining mitigation schemes seems to be quite straightforward. So Silas will follow up next with some code examples for using Qermit and develop a bit more background on error mitigation through practical use cases; I hope you enjoy that. I'll stop there and take any questions, if there are any.

OK, let's have some questions from the audience. You're going to need a microphone in order to communicate. If I speak here, can you hear me? Yeah, OK, good. Questions?

Hi,
Thank you for the nice talk. Could you just spend a few minutes explaining again the notation with the colored blocks? I'm just not familiar with it - if you could explain again how that works. These ones here? Yes, thank you - just what the inner and outer colors represent. - Yeah, sure. The outermost color, so the largest square, indicates the average performance of the error mitigation scheme, and the innermost square indicates the worst-case performance. This was conducted over five random circuits. So, for example, take the bottom-left square on this leftmost plot: it has a blue background, which means that on average the error mitigation scheme results in an improvement - anything that's blue or green means the scheme is performing well - and the innermost square is red, which means that in the worst case the error mitigation scheme doesn't do anything. So there was a circuit run during these experiments where error mitigation performed no better than just running the noisy experiment. - Thank you. Thanks. - OK, more questions. I'm going to get a workout now. - Is there a fundamental limit to error mitigation complexity? How much error mitigation complexity, for example, do we need for different problems? - So you'd be referring to the resource consumption of the error mitigation schemes? Yeah, there are limits: the resource consumption needed to maintain the same accuracy in the predicted expectation values grows exponentially, so these techniques will not work once circuits become really very large. The idea is roughly that these error mitigation schemes will carry us along until we have enough qubits to perform error correction, and then error correction will sort of take over. - Thank you. - OK, great, a lot of questions. - Hey, I was just wondering why the expectation value decays when the noise is increased, for the zero noise extrapolation technique. - Yeah, so you can imagine that if you keep adding noise, you're progressing towards a circuit which just generates uniformly random bit strings, and the expectation value you'll get from uniformly random bit strings is zero. So as you progressively go towards generating that uniform distribution, you get progressively closer to zero. - Yeah, thank you. - Hi Dan, thanks for the nice talk. I was wondering if you could comment on the sampling overhead required for some of these error mitigation schemes. - Yeah, as I mentioned, you would typically expect the sampling overhead to increase exponentially for these experiments. Here the sampling - the number of shots you take - is increased, but the factor by which it's increased is constant. Just to demonstrate that in practice you can still get something out of it without going too hard on the number of shots you take: I think in this case there was a factor of maybe three or four. - Thanks. - OK, great, more questions. - Hey, I have a question for you. If I do an experiment and I want to figure out whether I should use error mitigation, what would be the steps I should take to determine whether it's going to help me? Because obviously, from your plots, sometimes it works and sometimes it doesn't, so there doesn't seem to be a feature I can latch onto for a given application. What would you do?
I think in practice the results from these experiments suggest that you probably need to think very carefully - there are not too many broad-strokes things I can say. I can say that if your circuit has a large number of Clifford gates, then Clifford data regression seems to reliably result in an improvement; otherwise you can use zero noise extrapolation - that's something you can take directly out of these experiments. But in practice you probably really need to experiment a bit with smaller instances of your problem, with some schemes that you think might work, and run with that. A lot of schemes will work quite reliably - SPAM error mitigation, for instance, will reliably work in pretty much any setting, so you can make use of that - and you can also try some of these more general techniques. There are also techniques that are specialized to particular problems: for example, in chemistry there are a lot of cases where you have some symmetries in your problem which you can exploit to do some error mitigation. So basically: there might be bespoke techniques which you should definitely use, there are techniques which perform reliably in general, such as SPAM correction, there are techniques that work well if you have a lot of Clifford gates, and apart from that you'll probably have to play around with some small instances. - OK, thank you. We have one more question. - Hi, sorry, I am a little bit lost. This application of the denoising - is it applied at the beginning of the program, at the end, for each qubit, or after some operation? - Yeah, so these were zero noise extrapolation and Clifford data regression, so there were the steps that I described: there's the data-gathering step, where you change the circuits and run the altered circuits on the device, and this builds up some data; the correction is then some classical post-processing on that data. So the improvement in the result comes about because of the classical post-processing on the data that you've gathered. - OK, thank you. - OK, thank you, more questions? Yes, we've got one more over there. - Hello, thank you for the talk. I was wondering if there is any problem when trying to combine different error mitigation techniques - can we have a situation where, for example, one error mitigation technique gives the opposite effect when combined, where they counteract each other? - Yeah, I'm sure that could arise. There are definitely problems: for example, two techniques might be tackling the same noise source, in which case it's a bit redundant to combine the error mitigation schemes. One example where the combination might not be beneficial: there are some error mitigation techniques which rely on post-selection - for example, they might detect that some error has occurred in your circuit and then delete that particular shot from your results. That in principle is good, but it has the effect of changing the description of the noise model, and that description of the noise model might be relied on by an error mitigation scheme that you use later. So by removing that noise source, you're changing the assumptions that another scheme might make. So there are examples where, if you combine them,
you should think about it carefully. - OK, thank you. - OK, thank you. So I guess if there are any more questions you can always put them in Slack, and Daniel will check Slack, right, if there are some more questions? Great. I didn't say earlier that Daniel Mills joined us from the Quantinuum office in the UK. We'll have a ten-minute break now - let's thank Daniel again for a very nice talk - and at 11:20 we're going to continue with part two. OK, thanks. OK, bye. - Can you hear me? - Yeah, I can. OK, great. Do you mind if we test your connection real quick? - Go for it. - OK, can we see your slides, please? - Yep, can do. - Great. Do you have any audio or video in your presentation? - Not in my presentation, no. - OK, and we're good: you can hear us, we can hear you, so I think we're good. - OK, do you want me to start in five minutes, or whenever? - Actually, it's up to the organizers; I think they have a coffee break right now. - In which case I'll just hold tight and get going when I'm told to. - OK, yeah, it won't be long. - Cool, sounds good. - OK. So please take a seat, everyone, come in. We'll continue with the next session before lunch. OK, please take a seat. OK, so for the next hour we have our next speaker, Silas Dilkes. Silas is a senior software engineer at Quantinuum, and he's also joining us remotely from the UK. He will tell us more about error mitigation. So without any more delay, it's all yours, Silas. - Thank you very much for the introduction, and thank you everybody for having me remotely. So yeah, as was said, Dan has just, I assume, introduced you all to the basics of error mitigation, or at least the kind of error mitigation we've been interested in, and he's also talked a bit about Qermit, which is the software package we developed for actually implementing these schemes of interest. Well, I'm going to do the follow-up act, in which I'm going to show you all how to actually use Qermit, or at least give a basic introduction. Now, this is going to be a demonstration as opposed to a hands-on tutorial, but you'll see here I've got a link, which I will also put in the appropriate Slack channel on the Slack you've got for this conference, which has the resources that Dan showed you this morning and that I'm about to show you. So you'll be able to access these notebooks after the talk, to get a better, more hands-on idea of what I'm about to talk about. There will also be an element of pytket to this, so hopefully Yoshi and Catherine and Callum set you all up suitably yesterday.
So all of that should make some more sense. OK, so technically this is part two of our error mitigation morning: Dan has already introduced you all to some noisy-device error mitigation and to what Qermit is, as an overview of the paper we worked on and also as a software package. As I said, we'll be going through an introduction and getting started with Qermit, and the kind of out-of-the-box error mitigation methods that it supports and that you should be able to use really easily. We will cover, at the end, a slightly more advanced use of Qermit, which is the ability to combine schemes in quite a straightforward way - which was really a selling point when we were designing the whole software kit. I'm thinking we will not make it to the development of new error mitigation schemes today, but I decided to leave it in this notebook so that, if you do want to have a look afterwards, you'll be able to see how we actually create schemes in the first place. Also, I should mention now that if you have any questions, do just interrupt me and I'll try to answer them as well as I can. Now, I'm sure Dan has just talked all about this, so you don't need me to remind you, but people often classify the era we exist in now, in terms of quantum computing, as the NISQ era, which for our purposes, looking at error mitigation, means a couple of things. The first is that the devices we're interested in have a small number of qubits. Even though there are really great programs on the hardware side - devices are getting larger every year and the noise rates are going down, which is really brilliant - we're still not in a position where, in the near term, we're expecting to be able to do some kind of error correction, as the second bullet point here says, or to implement some of the really good use-case algorithms that motivate a lot of the work in quantum computing, such as those aimed at breaking current encryption protocols, or Grover's search algorithm. So error mitigation is posed as a near-term technique for getting essentially better results when we run things on hardware - running better experiments - and we tend to define it as attempting a trade-off: running more circuits, or more shots, on the hardware in exchange for being able to reduce the noise. So it's a more expensive experiment
you're running, but if you've quantified the experiment properly and you've quantified the error mitigation properly, then even though we're spending a bit more time on the hardware, which might be a bit more expensive, the payoff is that it reduces the noise in a way that's really beneficial to the experiment we're running. The other thing to mention is that the increased number of circuits and shots is typically the trade-off these schemes make; the changes in terms of the circuit size itself - say, whether I want to add more qubits to be able to characterize some aspect of the noise - tend to be very modest, especially in comparison to a quantum error correction technique, where you tend to need many, many physical qubits available to encode a single logical qubit. That's something I'm sure my colleagues Ben Criger and Ciarán Ryan-Anderson will introduce you to in far greater detail this afternoon, so I can leave it there. OK then, so that's our background: devices that are noisy - so noisy that we can't do quantum error correction - but good enough that we can maybe do some things. And so we swoop in and say: here's our open-source package, Qermit, which stands for quantum error mitigation - any similarity with the frog is coincidental - our open-source Python package for designing and executing digital error mitigation schemes, like the type Dan has introduced you to. And when we say executing, we mean kind of automatically: if it's all set up right, it should be really easy to just hit run and then go get a coffee or do something else, and when you come back, not only has your experiment run, but it's been run with a bunch of error mitigation. There are a few selling points to Qermit. One is that it's implemented using pytket, which you should all be somewhat familiar with now, so it builds on top of a lot of the great work that's been done in pytket. That also gives it a couple of generic features. It's platform agnostic: you'll see later that I'll be using pytket backend objects to run various experiments with Qermit, and the use of those backends is essentially interchangeable. I'm going to be using some backends built on top of the Qiskit software platform, just so the notebook is easy for anyone to run, because they're all open source and really easy to access, but you could easily exchange them for a backend which runs on our Quantinuum hardware, and it should just work out of the box. So it's platform agnostic with respect to the hardware you're interested in, as long as the error mitigation method is suitable for that hardware. It also means it's easier to work with other software development kits: say, for some reason, your preferred software development kit is Qiskit and not pytket.
That's fine, we don't mind: you can develop your circuits in Qiskit and convert them to pytket with the converters the extension provides, so that doesn't have to be a barrier to running better experiments. And then the final thing is that Qermit has a common interface for generating these kinds of schemes and running them. In terms of the methods we have out of the box: I believe Dan has just introduced you all to zero noise extrapolation, Clifford data regression and probabilistic error cancellation, which I'm hoping he has also characterized as schemes which mitigate errors in expectation value calculations - the kind of quantity we typically think of as experimentally related to Hamiltonian simulation, the kind of thing a quantum chemist might be interested in. We also have, out of the box, error mitigation called frame randomization, or randomized compiling, and also a corrective technique that works for state preparation and measurement errors. The point being that there's a bunch of out-of-the-box things here: you don't have to be an error mitigation expert to be able to run experiments, apply some error mitigation, and see what happens. We're going to get a bit more hands-on with this, but here is a very quick introduction to the design of Qermit. Essentially, what it does is represent experiments on quantum computers - and experiments on quantum computers with additional error mitigation - as dataflow graphs. The vertices of the dataflow graphs are the kinds of things you might do in a typical experiment. So, for example, there'll be a vertex in the graph which takes a quantum circuit and gives it to a backend; and that backend, if you're running on actual hardware, interfaces with some API and sends your circuit over the cloud to the hardware, where it gets set up to run for you - there'll be a vertex that does that. The edges between these vertices define the flow of information from the start of the experiment to the end of the experiment. Maybe the other interesting thing to mention is that we don't store these actual dataflow graphs in memory when you use the Qermit package; we store essentially generator functions, which hold blueprints to create them. That's what allows us to be really flexible in how you run the experiments, that's what allows us to use different backends when we run experiments, and that's what allows us to combine schemes. And then finally, before we actually start looking at some code: it is open source, it is available on PyPI - pip install qermit - and we managed to snag the URL qerm.it, so if you go there it should just redirect to the documentation. Similarly, you can find the GitHub repository for the code itself in the CQCL organization. OK then, on to hopefully some actual code. First of all, we divide the kinds of error mitigation methods we're interested in, and that people are able to use, into one of two types: one we call MitRes, and the other is called MitEx.
MitRes experiments refer to any error mitigation method we've implemented that is designed to modify the distribution of shots retrieved from the backend. As we've been saying, but to make this even clearer: a typical workflow for a scientist running something on a quantum computer is that you have some Python package like pytket, you use its circuit generation to create a quantum circuit you want to run on some hardware, and then - with us you use our pytket backends for the different hardware, but abstractly - there is some API which you give your circuit to, which sends it over the wire to the hardware provider. On the hardware provider's side, you have given them a quantum circuit, and this instructs their actual quantum computer to initialize all the qubits in the zero state, run a transpiled version of your quantum circuit compiled down to the specific hardware instructions - so, say, the shape of the pulses being run, or maybe the physical transport of some of the physical qubits themselves - and then at the end they measure them all, typically in the Z basis, projecting them into a set of Z eigenstates which correspond to 0/1 eigenvalues, which you get back as a shot. So MitRes captures error mitigation methods that work within this kind of environment: you have a circuit and you get shots back. A MitRes will run that for you automatically, but it might also run it while applying error mitigation; from the outside perspective, as a programmer, you send circuits and you get shots, just better shots, and that's great. MitEx then refers to the other type, which we'll be looking at later: experiments where you're typically interested in the expectation value as the estimator of some observable of interest, and in mitigations applied to that. In terms of how we'll go through this: we'll look at how we might implement what Qermit does in raw pytket code, we'll then see the equivalent of how it's done in Qermit, we'll have a look at how it performs with and without errors, and then we'll apply an error mitigation technique out of the box to try to improve the results we're getting. OK then, if we're going to do any of this, we need a solid candidate circuit to try to show an improvement in results. Now, I wasn't watching yesterday, but I would be astonished if at some point nobody showed you a Bell pair circuit, or ran a Bell pair circuit on a platform somewhere, so I'm going to assume that you know what a Bell pair circuit looks like and have probably also seen how to generate it in pytket, as that's what we've got here. You'll know pytket by now: we can import Circuit from pytket, create a circuit with two qubits and two bits, and use the indexing to apply the gates required to make the circuit - a Hadamard and a controlled-X gate and a measure - and you've probably all seen this kind of circuit diagram now, so we can have a look at what it looks like. This is the circuit we're going to consider for the purposes of whether error mitigation can be helpful or not. The really important bit to remember is that if we've generated a Bell pair and there's been no noise, we expect the state we construct to give an equal distribution over the 00 state and the 11 state, and if we're not getting that, something is wrong.
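For reference, a Bell-pair circuit like the one described can be built in pytket along these lines - a minimal sketch of the kind of cell in the notebook, not the notebook cell itself:

```python
from pytket import Circuit

# Two qubits, two classical bits; Hadamard then CX prepares the Bell state,
# and the measurements write the outcomes into the classical bits.
bell_circuit = Circuit(2, 2)
bell_circuit.H(0)
bell_circuit.CX(0, 1)
bell_circuit.measure_all()
```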
OK, so how would we run a Bell pair just using pytket? Hopefully you were also introduced yesterday to the selection of backends we provide in pytket, which are a standardized set of objects for running experiments with different hardware and simulators. So, from the Qiskit extension for pytket, I can import an AerBackend object. This is just going to be a noiseless simulator: when I pass it circuits, it's going to do a noiseless simulation of the gates in the circuit it's given, and then use that process to generate a bunch of shots, however many I've asked for. So I create my backend object here and then I use the run_circuit method: I pass it a circuit - this is the Bell pair circuit we defined - and the number of shots I want to take, which is a hundred thousand here. Then I use some behind-the-scenes code to do the plotting of the counts for me, so it's a bit easier for everyone to see. And so I get my counts object back here, the distribution over the shots that I got from the simulator running the quantum circuit of interest, and I get something like this. From our understanding of what a Bell circuit looks like and what a Bell pair is, this is about right: we can fairly well trust that this has been a noiseless process, because we've got approximately an even distribution of 00 and 11 states, and that's what we expect to see. And this is fairly simple: if we take a step back and look at the Bell pair generating code in pytket, and the code to actually run it through a backend and get some results back, it's all fairly straightforward. MitRes would also do this for us, as we're about to see, but if this is all you're interested in, it's not doing much for you. We'll have a look anyway, though. As I said, MitRes is the set of experiments Qermit defines where you have a circuit, you want shots, and it runs that process for you. We can import it from qermit - unsurprisingly it's a top-level import, MitRes - and we can define it via a backend object. This ideal backend is the AerBackend we used just a moment ago for the simulation with regular pytket, and this produces our new object, an ideal MitRes, which is hopefully going to do some helpful things for us. How do we run it to do the exact same experiment we just did? Well, we have to define our experiment as an input. For MitRes, each experiment is defined as essentially a pair, which we wrap into a named tuple, a CircuitShots object, defined by two things: the quantum circuit we want to run, which here is the Bell pair circuit we just looked at, and the number of shots we would like to receive of that circuit. And because a MitRes can run a bunch of experiments for you in parallel, we have to wrap this into a list, even though here it's a single thing. So that's it: our whole experiment is defined.
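Putting the two versions side by side, a sketch of the plain pytket run and the equivalent Qermit MitRes run might look like this; it uses the bell_circuit from the previous sketch, and the CircuitShots field names follow the Qermit documentation as I remember it, so check the docs for your installed version:

```python
from pytket.extensions.qiskit import AerBackend
from qermit import MitRes, CircuitShots

backend = AerBackend()  # noiseless Qiskit Aer simulator behind a pytket interface

# Plain pytket: compile, run, and read the counts directly from the backend.
compiled = backend.get_compiled_circuit(bell_circuit)
counts = backend.run_circuit(compiled, n_shots=100_000).get_counts()

# The same experiment through Qermit: wrap the circuit and shot count in a
# CircuitShots named tuple, and pass a list of such experiments to MitRes.run.
ideal_mitres = MitRes(backend=backend)
results = ideal_mitres.run([CircuitShots(Circuit=bell_circuit, Shots=100_000)])
mitres_counts = results[0].get_counts()
```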
We've got a list containing a single experiment. And the MitRes object we've just generated, which runs through the noiseless backend we just used, has a .run method with a local runtime for retrieving the results we want - when we look at the graph representation in a second, I'll explain a bit more about how that runtime works. We run this and we get a results list. Now, if you use a normal pytket backend, you pass it a circuit and the number of shots and you get a BackendResult object back, which I hope you were introduced to yesterday. Maybe unsurprisingly, then, the result we get back here is a list of these BackendResult objects, so it's the same kind of data. And if we get our counts back and plot them, we see that for the noiseless simulator - OK, we had to run the experiment in a slightly different way and use a whole different package to do it, just to run the same thing, so maybe that's not that appealing as a baseline - but if we plot the results, it's about an equal distribution of 00 and 11, and that's good, that's what we want to see. So Qermit isn't doing anything odd under the hood at this point: it's just taking the circuit, running it through the backend, and giving you some shots back. Then, to show you what this looks like - I realized, sadly, just before this talk that my HTML hadn't rendered properly, so we're going to have to switch to the notebook itself to see it - as I mentioned, Qermit stores the experiments it runs, and eventually the error mitigated experiments it runs, as dataflow graphs which are generated on demand. We just talked about running our circuit through a backend via a Qermit MitRes object, so what was that MitRes object doing? There is a helper method on the class, get_task_graph, which gives you a visualization of what happens, and the process we ran looks something like this. Let's take it bit by bit. It's a dataflow graph whose vertices define generic functions you might want when running something on a quantum computer, and that's what these big green boxes represent. One of them says 'circuits to handles': this stage takes the quantum circuits and passes them to a backend to get an object called a handle, which we can later use to retrieve the results - hopefully you were introduced to this kind of process yesterday. And 'handles to results' uses these handles to go back to the same backend and say, hey, I want my result for this handle, and it gets that result back. So the whole process we've done when we call .run is: we pass our circuits and the number of shots into the inputs; these get passed to a task which passes them to a backend to be run; the unique identifiers for the experiments get passed back to the backend to ask where the results are; and then the results get passed back. So this is a really basic experiment, but it has been done for us automatically. And this is another quick description of what I just said: these dataflow graphs live in a Python class we wrote called TaskGraph.
I don't think that's too important. I mentioned that the vertices are functions; in practice they're not just functions, they're actually a Python class we wrote called a MitTask. This is only so we can add additional attributes to the functions: the class knows the function that's run on the input data, it knows the number of in edges it needs and the number of out edges, and it's got a name - just additional information for internal use, essentially. The edges of the graph move the data between the MitTask objects. Something I should mention, which I hope is fairly clear, is that data moves from top to bottom on these graphs, so we go from inputs to outputs. And then the final thing is that, at the moment, Qermit unfortunately only has a fairly basic local runtime. We define these graphs because it gives additional granularity to the processes we're running, and it means that if you've got a really good runtime you can run the various things in parallel, which is really great when you're doing stuff on quantum computers - because, when you run things on a quantum computer, I don't know if anyone has told you this, but you often spend a lot of time waiting, because lots of people want to use them, so being able to parallelize that kind of process is really helpful. Currently the runtime is just local: it does a topological sort on the dataflow graph and then runs the tasks sequentially, so you're not saving as much time there, but it is still doing it automatically. OK, quick aside into Qermit fundamentals over; back to the actual code. We've just seen some noiseless simulations of a Bell pair, we've listed the results and said, yeah, that makes sense, it's about an even distribution over the two states it's meant to produce. And, you know, if all our computations were noiseless, we wouldn't need error mitigation; but ultimately Qermit exists to do error mitigation, so let's acknowledge that sometimes quantum computers do things we don't like and we need to try to account for it or correct for it. So let's create a scenario in which we do have some noise, and then we can see how we can improve on the results. We're going to do this by creating another AerBackend object, like the one we had before, but we're going to pass it a noise model. I'm passing it a depolarizing noise model: this is going to introduce errors every time I run a single-qubit gate or a two-qubit gate - with some probability it's going to slightly change the unitary that is implemented - and it's also going to add something called readout noise, so when I measure my qubits it's going to probabilistically decide to give the wrong result.
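The hidden cell that builds the noise model is along these lines - a minimal sketch using qiskit-aer, with made-up error probabilities rather than the ones used in the actual notebook (import paths also vary a little between Qiskit versions):

```python
from qiskit_aer.noise import NoiseModel, ReadoutError, depolarizing_error
from pytket.extensions.qiskit import AerBackend

noise_model = NoiseModel()
# Depolarizing error on single- and two-qubit gates (probabilities are illustrative).
noise_model.add_all_qubit_quantum_error(depolarizing_error(0.01, 1), ["h", "rz", "sx", "x"])
noise_model.add_all_qubit_quantum_error(depolarizing_error(0.05, 2), ["cx"])
# Symmetric readout error: each measured bit is flipped with 3% probability.
noise_model.add_all_qubit_readout_error(ReadoutError([[0.97, 0.03], [0.03, 0.97]]))

# A pytket backend wrapping the Aer simulator with this noise model attached.
noisy_backend = AerBackend(noise_model)
```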
This is an artificial noise model - it's something I'm asking it to do - but it is a handy tool if you're trying to work out the performance of techniques you're working on, or maybe the performance of some circuit compilation you're doing. The code for this noise model I'm not going to show here, but if you access the notebook you'll see there are some hidden cells, and one of those hidden cells has it, so if you're really interested you can go and have a look through the notebook yourself afterwards and see how we create that noise model. For a top-level explanation, the point is that instead of our backend being noiseless, with everything perfect, now when we run stuff through it we're going to get some noise. So we can run a very similar experiment to before. We construct a new MitRes object, one where the backend is the noisy simulator rather than the noiseless simulator, and we do this with a generator function, gen_compiled_MitRes. So we're starting to get into the world of slightly more complicated MitRes dataflow graphs. This one, in fact, is very similar to the one we looked at before, where it sends circuits to a backend to get results, which it then returns to you; the only difference is that now, before it sends a circuit to the backend, it performs a basic pytket compilation on the circuit, just to make sure the backend is able to run the circuit we provide. The reason we need to do that in this case is that when we define a noise model, we say something like: for the Hadamard gate, with some probability, run a slightly different process. So our circuit needs to be defined in terms of the gates we've defined the noise model for. This is also true in general for actually running on hardware: you will need to make sure your circuit is rebased to the primitives that the hardware actually supports, which is something pytket automates for you if you ask it very kindly, and something Callum, I'm sure, told you all about yesterday. OK, so the main point is we've now got a new MitRes object which, when we run stuff through it, is going to do a noisy simulation, and the results aren't that good. Even for this new object, though, the interface for running it is the same: we've got our list of CircuitShots that we generated earlier, which runs a Bell pair circuit 100,000 times, and we pass this to the run method of our noisy MitRes to get some noisy results back. Then we have a look at what the results are saying. Well, in a way it's a good thing, because it means we've defined our noisy simulator correctly. Before, we got only 00 and 11 states - our Bell simulation was done perfectly, and the difference in the distribution was just due to sampling noise, as opposed to an actual issue with the quantum computer or the simulation we were running - whereas here we've got some 01 and 10 states, which we know are not a valid result for the kind of quantum state we've tried to produce. So this is now the kind of area where we're thinking: well, how do I improve on this? Maybe my device is too small and too noisy for me to run quantum error correction; maybe there's another technique - maybe quantum error mitigation will help.
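Before moving on, here is a sketch of that noisy, unmitigated MitRes run; the generator function name and import location are my best recollection of the Qermit docs and should be checked against the current API (it continues from the previous sketches, reusing noisy_backend, bell_circuit and CircuitShots):

```python
from qermit.taskgraph import gen_compiled_MitRes  # name/location as I recall from the Qermit docs

# A MitRes that compiles each circuit for the backend before running it,
# here wrapping the noisy simulator defined above.
noisy_mitres = gen_compiled_MitRes(noisy_backend)
noisy_results = noisy_mitres.run([CircuitShots(Circuit=bell_circuit, Shots=100_000)])
print(noisy_results[0].get_counts())  # now includes some invalid 01 / 10 outcomes
```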
So let's use an out-of-the-box error mitigation method to see if it works for this noisy simulation we've done. As I mentioned earlier, for the MitRes type we have two kinds of error mitigation supported out of the box: one is called frame randomization, which we won't touch on today, and the other is SPAM error mitigation. SPAM is a nice way of referring to state preparation and measurement errors. As I mentioned earlier, the way you run experiments is that you define a quantum circuit and send it off to the hardware people, and what they do is initialize their set of physical qubits in the zero state, run the operations you define, and then measure them all in the Z basis. State preparation refers to errors in the initialization: you're supposed to construct an all-zeros state, but there is a chance there is noise in that process, so you don't quite do that. The measurement, obviously, refers to the part where you measure in the Z basis, where maybe there's a bit of noise and it doesn't quite give the result you expect. This Qermit SPAM module has a few options; we're going to use the uncorrelated option for SPAM correction when we run these experiments. Uncorrelated, in this sense, essentially means that you make the assumption, about the noise profile of your device, that each individual physical qubit's error in state preparation and measurement is independent of all the qubits around it, so we can model them all separately. This isn't necessarily true in practice: it might be that if you measure a single physical qubit, it gives the wrong result because of some over-excitation, say, and this then affects the results around it, so you might have to gather more information. But if you assume it's all uncorrelated, then you only have to gather quite a small - and, in particular, scalable - amount of information to be able to correct the results you get. So we use this generator function, which follows a blueprint to create a dataflow graph that will implement SPAM mitigation automatically. We pass it our noisy backend, so it's running everything through the noisy backend, which is the one we're interested in now, and we also pass it a number of shots. This number of shots refers to how many times we run the calibration circuits for the SPAM error mitigation, to get the characterization information we need; the more shots you pass it, in general, the more precise it's going to be. There is a point beyond which passing more shots makes little difference, but we won't be talking about that kind of analysis today. Once we have the object, running it is the same as any MitRes: you just call .run and pass it your list of circuits and shots. And we can see here that, in the artificial example I've created to try to convince everyone that error mitigation can be useful, lo and behold, we're finding that it is: we've got the noisy results we saw a second ago here, and we've now got some results where we applied SPAM mitigation on top of the noise model. So yes, it is fairly artificial, but we can see it's doing something: we can visually see that we've got fewer 01 and 10 states, and those are the states we don't want to see, so it's probably doing something better.
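A sketch of the uncorrelated SPAM-corrected MitRes just described; the generator name and keyword are as I recall them from the Qermit docs, so they may differ slightly in your installed version:

```python
from qermit.spam import gen_UnCorrelated_SPAM_MitRes  # name as I recall from the Qermit docs

# SPAM-correcting MitRes: runs calibration circuits on the noisy backend to
# characterize state-preparation and measurement errors, then corrects the
# distributions returned for the experiment circuits.
spam_mitres = gen_UnCorrelated_SPAM_MitRes(
    backend=noisy_backend,
    calibration_shots=500,  # shots used for each calibration circuit
)
spam_results = spam_mitres.run([CircuitShots(Circuit=bell_circuit, Shots=100_000)])
print(spam_results[0].get_counts())  # fewer invalid 01 / 10 outcomes than the raw noisy run
```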
The important takeaway from this, given that with Qermit we hope we're doing something helpful, is that I've not had to really explain how SPAM error mitigation works - in particular, how the SPAM error mitigation we've implemented works. In terms of the interface, it's just creating a different MitRes object and running it as before, and now our results are better. In an industry you need specialization in different areas: you don't need to know about error mitigation to get good error mitigated results. Then finally, before we move on to MitEx, which is the other type of experiment, we'll have a very quick look at the kind of task graph it generates. As I said, when we call the generator function, it automatically creates a dataflow graph representing the error mitigation experiment we want to run, and we can look at it here with the same get_task_graph method - you can call this on any of them. This is a definition of the overall SPAM-correcting process we've just run automatically, so these are all the tasks that are actually happening under the hood. We can very quickly talk through it. I think the most important bit is to notice these two tasks down here: these are equivalent to the MitRes tasks we saw earlier, so we can see that we're still running stuff through hardware. In fact, the way this graph is generated is that it takes a MitRes object like this, and what the blueprint does is define how we add the other tasks around it - which tasks we append, which we prepend, and which tasks we add in parallel. At a very high level we can see that our experiment's CircuitShots come in through this input; this task here works out all the characterization circuits we need to be able to do our SPAM correction later, and generates them; then the actual experiment circuits are shot down this left-hand side of the graph, which, as I said, just runs them all through the device as in a normal experiment; and this right-hand side of the graph runs the characterization experiments for my SPAM error mitigation through the same device, to generate data. So we've got two points where we're running through a device. Then I've got a final task which takes my experiment data - from just running my circuits through the backend as normal, essentially - and my characterization data from the SPAM mitigation, combines them together, and returns a distribution which has been corrected. And it does all this automatically for you, which is nice. OK, that is approximately part one of this talk. We've looked broadly at what Qermit does, and we've built up from how we might run an experiment in pytket, to how we might run an experiment with MitRes, to how we can add a noisy simulation to our experiments if we're testing things locally, and finally to how we might use an error mitigation scheme provided by Qermit to get better results. But, as I mentioned, when we were first looking at error mitigation schemes in the literature, we found that people try to apply error mitigation wherever they can when they run experiments, because in general it can improve things, and one place where people tend to define error mitigation experiments is the experiments where they're trying to get expectation values.
They were defining error mitigation schemes which affect the resulting expectation value - that means not the actual shots that come back; they get the shots as normal and process them, but the quantity returned from that processing is what they then error mitigate. And so we have a few schemes in Qermit that automatically apply this kind of experiment. To show this off - we've got about 20 minutes, so it should be fine - we're going to see how you would do this in normal pytket and then how you do it in Qermit; we'll then look at it with and without noise, like the very simple case we just did, and then apply error mitigation to see if it improves things. And, spoiler, it is going to improve things, because we wrote the demonstration to show that. OK, let's start off by defining an experiment we can try to show improvements on. We're just going to create a random circuit comprised of a selection of gates defined by two-qubit unitary matrices. If you're interested in how we define this random circuit, once again the code is in a hidden cell in the notebook, so you can go and work it out, and we can see a visualization. For our purposes, though, the point is: what is the ideal expectation value this random circuit I've generated is going to give me? Now, this is how you might run such an experiment with plain pytket - I'm hoping this kind of idea was introduced yesterday, but we'll go through the types very quickly. We start off by copying our ideal circuit, so there's no funny business going on. Then, for our ideal backend, which is the AerBackend we created earlier without any noise, we call the get_compiled_circuit method to produce a circuit which is compiled. This is hopefully not a surprising step, because general quantum simulators don't accept a two-qubit unitary box as an input gate - that's not something they know about - so we have to turn it into a sequence of gates the simulator does understand. That's what we're doing in the first stage. Then we need to define the observable we want to measure. In this case we're saying we want the all-Z observable, which essentially corresponds to projecting my output quantum state onto the Z eigenbasis for all of my qubits, and that is what this thing is defining here. There is an object in pytket under the Pauli class - if you're not sure how to make these, this afternoon in the quantum error correction section you are going to become intimately aware of what they are, so hold on to that. We can create a QubitPauliString, and this essentially says: for all of my qubits - that's what this list is generating here - I want a Z term in my measurement, and that's what these Pauli Zs are saying. So if you know about quantum chemistry and you've run these kinds of experiments before, I hope this is making some sense. Then pytket has a helper to get the Pauli expectation value: I give it my random circuit we just looked at and my QubitPauliString, which says I need the all-Z term, please, and it gets me a result. If you don't know much about experiments that estimate expectation values, that's totally fine; the thing to take away is that the ideal expectation value is 0.55496, et cetera. This is our target value.
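A sketch of that plain-pytket expectation value calculation; the circuit below is a small stand-in for the notebook's hidden random circuit, and get_pauli_expectation_value is a pytket utility function (my recollection of its exact behaviour, e.g. how it handles compilation, should be checked against the pytket docs):

```python
from pytket import Circuit
from pytket.circuit import Qubit
from pytket.pauli import Pauli, QubitPauliString
from pytket.utils import get_pauli_expectation_value
from pytket.extensions.qiskit import AerBackend

# Stand-in for the notebook's randomly generated circuit (which is in a hidden cell).
state_circuit = Circuit(2).H(0).CX(0, 1).Ry(0.3, 1)

# All-Z observable: one Z term on every qubit in the circuit.
all_z = QubitPauliString(
    [Qubit(i) for i in range(state_circuit.n_qubits)],
    [Pauli.Z] * state_circuit.n_qubits,
)

# Estimate <Z...Z> by sampling the circuit on a (noiseless) backend.
ideal_value = get_pauli_expectation_value(state_circuit, all_z, AerBackend(), n_shots=10_000)
print(ideal_value)
```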
This is what we get with no noise for our small experiment; this is what we should be getting, so once we start adding noise and we don't get this, we need to try to recover this value. OK, so that's how we did it with normal pytket - does it get any easier with Qermit? The honest answer is, at face value, maybe not, but the point is that Qermit can do more things: once you get used to this interface, suddenly a whole lot more is open to you. But let's go through the basics and build it up. The first thing: similarly to how we were generating MitRes objects for running error mitigation schemes that affect our distribution of shots, we can import a MitEx object. We can define it for the same noiseless backend, the Aer simulator, using the MitEx class initialization, to get our MitEx object. So we've now got this thing here, an ideal MitEx, and if we can pass it the right thing, it's going to automatically run our experiment for us, which is hopefully really helpful. However, the definition of a MitEx experiment is a bit more complex, because it has to hold on to more information. Something I should mention is that these experiments aren't quite the same as before: in a typical quantum chemistry experiment, say, you would want to measure multiple qubit Pauli strings, not just one, and that's what this QubitPauliOperator object here holds for us. It holds a dictionary between the Pauli terms we want to measure, i.e. QubitPauliStrings, and their corresponding coefficients; the coefficients are normally something physically motivated by, say, the Hamiltonian of choice for your experiment. For what we're doing here we don't need to worry about the coefficients; we just need to know that there's an object that can hold them. So this is saying, as before: I want to find the all-Z term's expectation value for the experiment I'm running. Then, for MitEx, we wrap each of our individual experiments into this class called ObservableExperiment. If you remember, before we had the CircuitShots thing, which was just a named tuple designed to hold everything defining that kind of experiment; this is analogously the same, a named tuple designed to hold everything I need for this kind of experiment. So I've got my ansatz circuit, which is my random circuit, and the number of shots I want to take of it for the experiment I'm running; also, since pytket supports symbolic circuits, you might often want this to have a set of symbols in it that you want to substitute, and we pass that through this object here - we don't need to worry about that right now. So we've got a circuit that we're taking our expectation values over and a number of shots we want to take. Then we've got another level of abstraction to hold our qubit Pauli strings: we put the QubitPauliStrings into our operator, and now we put this operator into a new object called an ObservableTracker, which essentially tracks all the computations we do and works out how to knit the results back together at the end. For now, just think of it as holding my QubitPauliOperator: it holds the terms I want to measure.
This whole thing defines an individual experiment, and we can pass this experiment, in a list, to our ideal MitEx, the one for the noiseless simulation. If you can get your head around what the input arguments look like, then you're simply just calling run and getting your ideal expectation value back, and the results come back a bit like this. It's a dictionary between the terms we wanted to measure and their values: in that QubitPauliString we said that for each of the qubits we want to measure the Z term, and that's what each of these things is saying - Z for qubit zero, Z for qubit one - and it says this term gets a value of 0.55656 and a bunch of zeros. We're estimating from shots, so there's some sampling noise, which is why the expectation values differ slightly, but we can see they are approximately the same, so they're doing basically the same thing, and that's good. So our baseline is: doing an experiment through normal pytket or doing an experiment through our Qermit MitEx, we get pretty much the same result, and that's a good thing. Now we need to work out how we can make our result worse, so we can then show how to recover it. But first of all we're going to have a very quick look at the task graph for a MitEx object, as we did for the MitRes and the SPAM MitRes. What does this one do, then? We'll stay really abstract here, because we're slightly short on time. The most important thing to notice is this little word MitRes here: all these experiments are at some point being run through hardware, which means they're being run through a MitRes object, because we build our graphs around it. There are other things happening as well. Essentially, what the tasks beforehand do is, for the terms you want to measure and your ansatz circuit, work out all the measurement circuits I actually need to run on my hardware to receive the results I want; and the tasks afterwards do the post-processing of the results for you. So if you're an experienced quantum chemist who works on these kinds of problems, you'll know how to process shots to get an expectation value - well, this just does that automatically for you. And so we can pass in our noisy simulator backend from earlier - the one with the depolarizing noise model I showed for the MitRes case - we can just pass that through to our MitEx to get a noisy version of it. We've already defined our experiment, so we can just pass that through the run function here. So it's doing the same experiment as before, with the artificial noise model we've created to then try to improve on, and we're finding that the value we get is 0.2692000, et cetera. This is not the 0.55-whatever it was; it's quite a bit away from that. So our metric of interest here is: how do we make this number closer to the number we want it to be?
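A sketch of the MitEx experiment definition and run just described, continuing from the earlier sketches (state_circuit, all_z, noisy_backend); the wrapper class names - AnsatzCircuit, ObservableExperiment, ObservableTracker, SymbolsDict - follow the Qermit documentation as I remember it:

```python
from qermit import (
    MitEx,
    AnsatzCircuit,
    ObservableExperiment,
    ObservableTracker,
    SymbolsDict,
)
from pytket.utils import QubitPauliOperator
from pytket.extensions.qiskit import AerBackend

backend = AerBackend()  # noiseless reference backend


def make_experiment() -> ObservableExperiment:
    # Build a fresh experiment each time, since the tracker accumulates
    # measurement information as it is run.
    return ObservableExperiment(
        AnsatzCircuit=AnsatzCircuit(
            Circuit=state_circuit.copy(), Shots=5_000, SymbolsDict=SymbolsDict()
        ),
        ObservableTracker=ObservableTracker(QubitPauliOperator({all_z: 1.0})),
    )


ideal_value = MitEx(backend=backend).run([make_experiment()])        # ~0.55
noisy_value = MitEx(backend=noisy_backend).run([make_experiment()])  # ~0.27 with the noise model
print(ideal_value, noisy_value)
```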
And we are going to try to do that with an out-of-the-box error mitigation method from Qermit, and this one is going to be zero noise extrapolation, which I believe Dan has just introduced you to. To give a very quick refresher on what it is: as we're seeing here, for a certain amount of noise, which is whatever the device's natural noise is, we get a certain value, which is 0.26 or so - essentially a single data point. How zero noise extrapolation works is that it tries to artificially increase the noise experienced by the device. So this is noise level one, just whatever the device naturally has; if we can artificially inflate the amount of noise experienced to double or triple, we can get new data points - 0.26 down to 0.15, et cetera - and then we can extrapolate backwards, with some kind of fit, to the case where we have zero noise. And this generator function is going to generate a MitEx object that does all of this for you, so you don't really have to know what's going on: you just say, I want better results, maybe this will do it, and hopefully it does. Maybe unsurprisingly, then, there's a module in Qermit called zero_noise_extrapolation, from which we can import a MitEx-generating function following the same kind of blueprint scheme I talked about before, which will generate a dataflow graph that will run this for us. It also has a couple of extra keyword arguments, because we can tweak the parameterization of how we are going to use the error mitigation method; let's talk about those in a second. So we have our generating function call. We pass it our backend of choice - the noisy backend we're trying to improve results on. We pass our noise scaling list: these are the values by which we want to artificially inflate our noise, so we're saying, when we run our experiment normally it's noise level one, let's also get points for noise levels three, five, seven and nine, and then do our backwards extrapolation. The folding type - we'll do this one first - is essentially the manner in which we choose to artificially inflate this noise. With circuit folding, essentially what happens is this: our whole circuit is a unitary process, so, for noise level three, if we run that unitary once, that's noise level one; if we then run the dagger of that unitary, that's noise level two, but the actual process we've implemented is just the identity, which isn't helpful on its own; so if we then run our whole unitary once more, we've run three times as many gates and implemented the exact same unitary on our qubits. We can say, oh, that's a bit like having three times as much noise when running my process, so that can be my data point. And for five, seven, nine we just run the circuit and its dagger more and more times. Then the fit type is the way we fit these data points and extrapolate back to the point where there's zero noise, which we'll see in a second. Generating the MitEx object itself is very straightforward, and now we can run it with the same experiment we defined before - defining the experiment seemed maybe a little complex, but once we have it, we can just pass it through all these different MitEx objects and they do the work for us. We can see we've got our result for noise level one here.
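A sketch of the ZNE generator call being described, reusing noisy_backend and make_experiment from the earlier sketches; the module name and argument order are my best recollection of the Qermit zero-noise-extrapolation API, and the folding and fit types are selected through further keyword arguments I have not reproduced here, so check the documentation:

```python
from qermit.zero_noise_extrapolation import gen_ZNE_MitEx  # module name as I recall it

# Backend plus the list of artificial noise-scaling factors (level 1 is implicit);
# additional keyword arguments choose the folding type and the extrapolation fit.
zne_mitex = gen_ZNE_MitEx(noisy_backend, [3, 5, 7, 9])
zne_value = zne_mitex.run([make_experiment()])
print(zne_value)  # extrapolated estimate, closer to the ideal ~0.55 than the raw ~0.27
```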
Our expectation value at noise level one is about 0.27, so approximately what we saw a second ago, and then as we increase the noise we can see the expectation values getting further and further away from the 0.55 value we wanted. But when we do this fit backwards, we're predicting that in the zero noise case we'd have got a value of 0.365. Now, is that accurate? No - 0.55 was our target value in this contrived example I'm showing everybody - but it's closer, and in a second we're going to see a trick that gets us just over the line to essentially the right value. Before we do that, though, we can have a very quick look at the task graph for this object - hopefully you're not too bored of these green graphs yet. What I'm really trying to show is that the error mitigation methods I'm showing you are getting more and more complex to implement yourself, so it can be handy to have something that will automatically generate and run them. This one, I think, is actually quite easy to understand. Essentially, the circuits come in as normal and we compile them for our backend, and then we have this duplicate box: what it does is define new experiments for each of the artificial noise levels we're working with. So we have artificial noise level one here, three here, five here, seven here, nine here; each of these is run through its own separate experiment, which is what each of these in and out edges going off is showing. Then at the end there's a task that collects all these different experiments, works out the results, and passes things out at the end again. The point being, it's doing a lot of stuff in a dataflow graph under the hood, but for us, the user, we just create the object. We've run it and got some results back. This is just quickly showing that you can use different types of folding and different fits for your results, so depending on what experiment you're doing, you might find that a different fit type or folding type performs better for the hardware you're working on. OK, so we've seen what Qermit is, we've looked at running experiments where I just have a circuit and want some shots back, and ways to error mitigate that, and we've looked at experiments where I want an expectation value, and ways of mitigating that with Qermit. What we're going to do to finish off is put these two ingredients together, run them together, and hopefully get a nice expectation value for the noisy case we just looked at. This is going to be a two-minute job, because it's really straightforward in Qermit, and it was one of the design principles when I originally wrote the code. We just looked at the gen_ZNE_MitEx call and talked through its arguments: we pass it our noisy backend object,
So, okay: we just looked at the generating function for the ZNE MitEx, and we were talking through the arguments. We pass it our noisy backend object, and this is the one we were just talking about: it's the depolarizing noise model with the Qiskit simulator. We've got our noise scaling list, and we're saying artificially increase the noise to 3, 5, 7 and 9, and then we're saying use an exponential fit to estimate what the zero noise case would be. We've also got this other keyword argument that's going to show us the graph representation of that exponential fit. But finally we're going to use a new keyword argument to improve the computation we're doing, and that is experiment_mitres. What this does is define the MitRes object that each of our instances of zero noise extrapolation is going to go through.

Or, to make it a bit clearer: this is the ZNE graph we just looked at, and we said that it splits into five different experiments, one for each noise level, and runs them independently. Well, we can see over here that each one of them has to go through some kind of MitRes object when it does this; at some point we've got to run our circuits on hardware and get some results back. So when we pass this SPAM MitRes object here, which is the one we defined earlier to get better results when we were working with MitRes, essentially what happens is that the default MitRes object here gets replaced with the SPAM-mitigated one. I probably have a graph of it somewhere... here, I do. We saw the plain MitRes graph a second ago, around here; well, now you can see from the task names that it says SPAM, because we're using the SPAM MitRes. The point being that, for you the user, getting this new data flow graph that does both zero noise extrapolation and SPAM mitigation essentially requires adding one new keyword argument to the definition. And so now this is going to do both for us, and it defines the data flow graph that does that. We just had a quick look at that, so we'll skip past it.

And so now, if we run our experiment: this is the noisy simulator. When we first ran this experiment in the ideal case, we got an expectation value of 0.55, and that was our target value. When we ran it on this noisy simulator without any error mitigation, we got 0.25, so we were really far away. Well, now, by applying SPAM mitigation to all the individual circuits being run, and then applying zero noise extrapolation to the expectation values retrieved from all of them, we find that we're able to get a value of 0.5690-something, i.e. a value that is a lot closer to the 0.55-ish that we were looking for. It has arguably overshot slightly, and in error mitigation you're often going to end up suffering in this way, where things are really close but not quite there. But in the grand scheme of things it has arguably done a good job of accounting for the depolarizing noise model we just looked at.
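For reference, the combination just described might look roughly like this in code. This is a sketch under assumptions: it supposes a SPAM MitRes generator named gen_UnCorrelated_SPAM_MitRes and an experiment_mitres keyword on the ZNE generator, matching how they were described in the talk; the calibration shot count and backend are placeholders, and the exact names should be checked against the Qermit documentation.

```python
# Sketch only: names and signatures are assumptions based on the talk.
from pytket.extensions.qiskit import AerBackend
from qermit.spam import gen_UnCorrelated_SPAM_MitRes
from qermit.zero_noise_extrapolation import Fit, Folding, gen_ZNE_MitEx

noisy_backend = AerBackend()  # placeholder for the noisy simulator

# The SPAM-mitigated MitRes defined earlier in the session (placeholder shots).
spam_mitres = gen_UnCorrelated_SPAM_MitRes(noisy_backend, calibration_shots=500)

# A ZNE MitEx whose five internal experiments each run through the
# SPAM-mitigated MitRes rather than a default one.
zne_spam_mitex = gen_ZNE_MitEx(
    backend=noisy_backend,
    noise_scaling_list=[3, 5, 7, 9],
    folding_type=Folding.circuit,
    fit_type=Fit.exponential,
    experiment_mitres=spam_mitres,
)
```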
And I think I'm going to have to leave us all there for today, because we're coming up to the top of the hour and this would certainly take longer than ten minutes. But what I will say is that we've looked at how to use this MitRes for circuits I just want to run and get shots back from. We've looked at how to use an error mitigated MitRes. We've looked at MitEx, we've looked at error mitigated MitEx, and we've had one look at how to combine the schemes. I should maybe mention that the documentation tries to make it quite clear how you can pass these kinds of keyword arguments between all of the different error mitigation schemes we define, so you can really combine them in any way you want. Some ways are more sensible than others, so be careful.

Sorry, I do have to have the conclusions here. In short: you can use Qermit for out-of-the-box error mitigation, and more advanced use of Qermit lets you combine schemes to get better results. And if you're interested, the notebook in this repository here talks a little bit about how you can define your own experiments and shows some results for a custom mitigation method, so it goes through how to make this kind of data flow graph yourself. Over the last few days my colleagues have talked to you about using pytket and about noise on devices, which I've touched on a bit here. And I think that's all I'm going to talk about now, so I'm happy to take any questions people might have. Thanks a lot for listening.

Okay, thank you for a very nice talk, Silas. Questions? We've got some hands up, at least three, so let's see.

Hi, thanks for the nice talk. At the beginning you showed how you initialize the MitRes object: when calling the MitRes class you passed it the Aer backend. I was just wondering about the compatibility of different backends, like what you can use, and if new hardware comes out, how you make everything compatible.

Great question. I say great question partly because it falls into my hands nicely, so I really appreciate that. So in pytket, and hopefully this was discussed yesterday, the backends we define are essentially standardized, so they have the same set of functions to do things with them. If you want to run a circuit, it's always process_circuit; if you want to get results, it's always get_results; if you want to compile, it's always get_compiled_circuit, et cetera.
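To illustrate that standardized interface, here is a minimal pytket example; the circuit and shot count are placeholders, and the same three calls would look identical for a hardware backend.

```python
from pytket import Circuit
from pytket.extensions.qiskit import AerBackend

# A tiny placeholder circuit: prepare a Bell pair and measure it.
circ = Circuit(2)
circ.H(0).CX(0, 1)
circ.measure_all()

backend = AerBackend()

# The same interface is used by every pytket backend, simulator or hardware.
compiled = backend.get_compiled_circuit(circ)
handle = backend.process_circuit(compiled, n_shots=1000)
result = backend.get_result(handle)
print(result.get_counts())
```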
And so this means that when we're defining a MitRes object like so... well, if I wanted to... actually, normally I can't edit this. Yeah, I could probably edit it. Can I edit it here, a live demo? Should we find out? No, I can't. But if I wanted to, here I could, instead of importing this Aer backend from Qiskit... as an example, if I wanted to run this on actual IBM hardware, there is another pytket backend object called an IBMQBackend. I could create an IBMQBackend object instead, to replace this one, and then simply pass that object through to the MitRes generating function, and now if I ran it, it would run all my experiments through the hardware and not through my simulator. The point being that changing the hardware you want to run on is as simple as changing the backend object you pass.

Now, in terms of what ramifications that has for the actual error mitigation method, that is quite a sensitive topic, in the sense that each individual piece of hardware tends to have a very specific noise model that is also very, very hard to characterize. So when people work in error mitigation, they tend to say something like: as all my bits of physical hardware move my qubits around and do measurements, there will be noise coming from all kinds of sources that perturb my computation. But approximately, I can probably say that the noise is going to behave like a depolarizing noise channel, i.e. all my errors will essentially manifest as if additional Pauli gates were added to my computation, drawn from some kind of distribution. And then they'll work off that principle to define the schemes.
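As an aside, the noisy simulator used throughout this demo can be set up with exactly this kind of approximate model. A sketch with qiskit-aer and pytket, where the error probabilities and gate names are arbitrary placeholders (older Qiskit versions expose the noise classes under qiskit.providers.aer.noise instead):

```python
from pytket.extensions.qiskit import AerBackend
from qiskit_aer.noise import NoiseModel, depolarizing_error

# Toy noise model: depolarizing noise on one- and two-qubit gates.
# Probabilities and gate names are placeholders, not a calibrated model.
noise_model = NoiseModel()
noise_model.add_all_qubit_quantum_error(depolarizing_error(0.001, 1), ["rz", "sx", "x"])
noise_model.add_all_qubit_quantum_error(depolarizing_error(0.01, 2), ["cx"])

# Wrap the model in a pytket backend, which can then be handed to the
# Qermit generators exactly like any other backend object.
noisy_backend = AerBackend(noise_model=noise_model)
```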
Okay, cool, thanks. Maybe on that second point: are there a lot of automated methods to discover the noise model?

Okay, well, I could maybe use that to explain why we decided to make this composition possible in the first place. And this is because there's a method I haven't talked about here called randomized compiling. Randomized compiling is a method which attempts to tailor the form of the noise the circuit shows when you get results back, i.e. the form of the errors. And I'm sure Dan introduced this: we often characterize errors as being one of two types, either coherent errors or incoherent errors. Coherent errors are errors where it's the exact same error every time. So if I say to my quantum computer, I want to do a rotation in the Z basis by angle 0.3, a coherent error might be that every time it actually runs 0.32 instead, and that's just a calibration fault. But that means that if I run a lot of RZ rotations on the same qubit, by the time I've done a few in a row, suddenly the quantum state I actually receive is quite far away from the one I wanted. On the other hand, we talk about incoherent errors, which are probabilistic errors. So it might be that, with some probability, when I run an RZ gate I actually run the RZ gate followed by a Pauli Y gate; in practice, one run might apply the extra Pauli Y and another might not, et cetera. And so by the time I've run this gate many times, even though there's still some kind of error occurring, the error, in terms of distance in my Hilbert space, stays closer to the quantum state I actually wanted to implement in the first place.

So randomized compiling is a tool people are aware of that attempts to account for the noise not in the way you say, where we build a well-calibrated noise model of the actual device that we can then account for; instead we actually change the form of the noise the device has, so that we can then correct for it. So when we have these compositional schemes, the idea would be: well, in zero noise extrapolation I need to pick the fit and my noise scaling levels. How do I know which fit to pick? Well, maybe if I add randomized compiling underneath, then I know that I've got approximately a stochastic Pauli noise model when running, and maybe that can inform the kind of fit I pick. So at the software level there's less focus on defining really good noise models, and more of a focus on defining schemes that can either change the form of the noise or correct for it. But yes, when it gets to the hardware level, the engineers actually making quantum computers are very interested in that, and that's a lot of what they're doing, just not at the level we care about here.

Cool, thank you very much.
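To give a rough feel for why this coherent/incoherent distinction matters, here is a small self-contained numpy toy (not Qermit or pytket code) comparing a fixed over-rotation with a stochastic Pauli Y error of comparable single-gate strength on one qubit; the gate count, angles and probability are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

def rz(theta):
    # Single-qubit Z rotation as a 2x2 matrix.
    return np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])

Y = np.array([[0, -1j], [1j, 0]])
plus = np.array([1, 1]) / np.sqrt(2)
n_gates = 50

ideal = np.linalg.matrix_power(rz(0.3), n_gates) @ plus

# Coherent error: every RZ(0.3) is miscalibrated to RZ(0.32).
coherent = np.linalg.matrix_power(rz(0.32), n_gates) @ plus
coherent_fid = abs(np.vdot(ideal, coherent)) ** 2

# Incoherent error: each RZ(0.3) is followed by a Pauli Y with probability p,
# chosen so a single gate is roughly as faithful as in the coherent case.
p = 1e-4
fidelities = []
for _ in range(2000):
    state = plus
    for _ in range(n_gates):
        state = rz(0.3) @ state
        if rng.random() < p:
            state = Y @ state
    fidelities.append(abs(np.vdot(ideal, state)) ** 2)

print(f"coherent fidelity after {n_gates} gates:   {coherent_fid:.4f}")
print(f"incoherent fidelity after {n_gates} gates: {np.mean(fidelities):.4f}")
```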
Thank you. So maybe one more question.

Hi, thank you for the nice talk. How do we select the number of shots that it's optimal to use?

Great question, thank you. That is in general quite a hard thing to quantify, and not something I would say we have considered systematically. If you take a chance to look at the Qermit paper and the results in it, you'll see that we fixed the number of shots across the experiments, because we knew it was an important factor. In terms of the near term experiments you might be running now, maybe just an approximate rule of thumb, which I appreciate isn't a very helpful answer: the more qubits your circuit has, the more basis states could be returned by the measurement, with exponential scaling, obviously. So if you're running a larger circuit with more qubits, you're going to need to run more shots of that circuit to get accurate results, and going up to the order of 100,000 shots, which we've shown here, may be viable for a quantum circuit with many, many qubits. And I have to note, similarly, that for something like Clifford data regression, the number of circuits you need to run, and the precision you need from them, to get accurate data for actually doing the correction might be very high as well, so you might also be going up to an order of 100,000 shots. I appreciate that isn't a very helpful answer, but the truth is, the precise relationship between the number of shots you run and the quality of the results you get is something we haven't explored massively.

Thank you. But if we are going to increase the number of qubits, we are anyhow going to increase the computational cost of the problem, so...

Yeah, well, exactly. So as we move forward in the world of quantum computing and devices get larger and larger, we would expect to have to run more shots to get results, and that is a problem we need to consider.

Okay, thank you. So I'm sure Silas is going to be online on Slack to answer any more questions you may have.

I can be online on Slack, yes.

On Slack, then, direct more questions to him if you have some. I think there was one more question, but due to time we should probably now thank our speaker.

Okay, thank you.

And now we have lunch, so that's good; get our energy sources filled back up. And you have gotten an email about swag, right? So all of you got an email about picking up a t-shirt. Let me see... yes, you can go to the front office, but they're not opening up till one o'clock, and we'll meet back here at 1:30. Okay, 1:30 p.m. Thank you.