A very quick announcement. Hi everyone, I wanted to take this opportunity to make a little advertisement. My name is Irina, I'm from the Master in High Performance Computing. All the information is written on the poster outside. To be very brief: we teach scientific programming, that is, how to write high-quality professional code for science, code that can run on supercomputers or on GPUs. We also have some machine learning in our program, though it's not the focus, and from next year we're trying to introduce quantum computing courses. You can see the full list of our courses there. I would say we are a really unique program, in the sense that there are not many places that teach you how to do scientific programming. We have scholarships for ICTP-supported countries, and the deadline is in about ten days, so if you're interested, don't be too slow. We also have scholarships this year for people who are local, or who can otherwise support themselves: they don't have to pay the fee, so they can get the education for free if they carry out a project on a given topic. If you have any questions about this or you're interested, please find me during the break or at lunchtime, and I can answer all your questions individually. I won't take any more time from the lecture. Okay, thank you. Okay, thank you very much for this very nice announcement and great opportunity, so take advantage of it. Now we're going to start our next session for today, Introduction to Quantum Error Mitigation. We're going to have two speakers. Our first speaker is Daniel Mills, who is already online; he's one of our research scientists. Then we're going to have Silas Dilkes, who is one of our software engineers. In any case, without further ado, Daniel, it's all yours. Okay, great. Thank you very much. Yeah, so I'm going to talk...
Well, Silas and I are going to talk a bit about quantum error mitigation. The first hour of this is going to be some theoretical background on noise and error mitigation, and Silas will take over in the second hour to talk about Qermit, which is a software library we've developed to perform error mitigation. So error mitigation is a near-term way to prevent and manage errors in quantum devices. Later on you'll also hear about error correction from my colleagues, which is a longer-term approach to dealing with errors in quantum computers. But for smaller devices, error mitigation is the approach people have tended to use, and I'll talk a bit about that now. So there are a few levels at which you might want to deal with noise in quantum devices. The first thing to try is to reduce the noise in the components of the device as much as possible in the first instance. This would include things like improving the quality of the pulse sequences you make use of, for example; this falls under things like optimal control and compensating pulses. You might also hear about things like dynamical decoupling, which is a set of pulse sequences you can add into your quantum computation to undo some errors that you might have inadvertently introduced. So these are very low-level approaches to managing errors, dealing quite directly with the components of the device. Then you can try what I'm going to call circuit-level approaches. These are approaches which change the circuit that you're running, but don't manipulate any of the results. So things like noise-aware compilation and routing, where you're picking the best qubits to use on the architecture, for example, or things like frame randomization and randomized compiling.
These add extra gates into your circuits in order to undo rotations that may have crept in, or to randomize the errors so that they take a form more favorable to you. So these are approaches where you take the circuit that you want to run and change it directly. This is a little higher level: you're not dealing with the components of the device itself, but with the circuit you're trying to run. Above that, there are what I'll call application-level approaches. These are approaches where you might change the circuit, but you also do some clever processing of the measurement results you get from your circuits: combining them in some clever way, or perhaps removing some that you know to be erroneous. We'll spend a lot of time talking about these, which again are fairly high-level techniques for tackling errors. And then at the very highest level you have error correction, which you'll hear about later. During this talk I'm going to focus on the two central levels: approaches where you don't deal directly with the device, but don't abstract away so much that you're doing error correction. These are also often called digital error mitigation techniques, since you're dealing with the circuit rather than the device itself. So I'll talk a bit about these, and these are the techniques we've targeted in our implementations in Qermit. Just to prepare you in advance, and to introduce some of the terminology I'll use later in the talk: Qermit is an open-source Python package that implements these digital error mitigation techniques. It has a compositional architecture based on graph composition. I'll talk about this more directly later, but I'll also use this graph-based approach to describe some of the schemes that I'll introduce.
What I mean by that, roughly, just to give some intuition at the beginning, is that you can imagine your error mitigation protocol described as a graph. The nodes in the graph are sub-processes that you use, and might reuse and combine in different ways, and the edges of the graph describe how the inputs and outputs of those sub-processes are passed between them. This is the architecture we use within Qermit, and I'll use this notation to describe some of the schemes later on. It's built on top of pytket, so if you're familiar with pytket, you're ready to go with Qermit, basically. You can quickly install it through pip with the usual commands, and there's some documentation available at qermit.it; I've stolen the Italian domain name, which I hope doesn't offend anybody. It's open source, and there's a manual available at the GitHub repository there. Okay, so that's Qermit. Let me now talk a bit about what I mean by noise, and some of the terminology I'll use for different noise sources. First of all, I'll distinguish between coherent and incoherent noise. Coherent noise is noise that can be represented by unitary operations. For example, imagine you want to implement some gate in your circuit, and the gate is poorly calibrated: you want to rotate the Bloch sphere by some angle, but you actually over-rotate or under-rotate it. These types of errors are called coherent errors. You also have incoherent errors, which cannot be described as unitaries; an example of this would be depolarization. The way depolarization is usually described is to say that, with some probability, the state you're manipulating is replaced by the maximally mixed state. So you lose a little bit of information under the effect of a depolarizing noise channel.
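The depolarizing channel just described acts on a single-qubit density matrix as ρ → (1 − p)ρ + p·I/2. Here is a quick numpy sketch of that (this is just an illustration, not part of Qermit):

```python
import numpy as np

def depolarize(rho, p):
    """Single-qubit depolarizing channel: with probability p the state is
    replaced by the maximally mixed state I/2."""
    return (1 - p) * rho + p * np.eye(2) / 2

# Start in the pure state |0><0| and apply the channel with p = 0.2.
rho = np.array([[1.0, 0.0],
                [0.0, 0.0]])
rho_noisy = depolarize(rho, 0.2)
# The Bloch vector shrinks towards the centre of the sphere, but the
# trace (total probability) is preserved.
```
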
So you can imagine coherent errors as your Bloch sphere rotating in a way that you wouldn't like, and depolarizing noise as your Bloch sphere contracting: you're losing some information, perhaps to the environment. A final example I'll give here is the bit-flip error. In this case your zero states are being flipped to one and your one states are being flipped to zero, and you can imagine this as your Bloch sphere being contracted along one of the axes only: information along one axis is preserved, but along the others you're losing some information. So these are a couple of very common noise channels, and I'll use some of these terms later on as well. In practice, if you want to simulate noise, which you might be interested in doing, there are some tools to do so in Qiskit, for example. There is such a thing as a noise model provided through Qiskit, which you can use to describe some simple noise channels. For example, you can specify some depolarizing noise on single- and two-qubit gates, or some bit-flip errors on your readouts, that is, on your measurements. What these simulators do behind the scenes is add some kind of error after each gate. So it's quite a simple noise model, but these kinds of tools are useful for testing your code, or testing the robustness of your circuits, for example. I just bring this up in case you want to look it up a bit later. Okay, let me now describe a simple noise channel and a simple error mitigation technique for it. I'm going to talk about state preparation and measurement errors, or what you'll often hear referred to as SPAM errors.
Here the idea is that you run your circuit and perform a measurement, and with some probability the measurement outcome flips: an outcome that should be zero is read as one with probability e, and an outcome that should be one is read as zero with probability r. So you have the simple channel on the left here. You can describe this using a matrix N: the zero state is measured as zero with probability 1 − e and as one with probability e, while the one state is measured as zero with probability r and as one with probability 1 − r, so N has columns (1 − e, e) and (r, 1 − r). With this matrix describing your noise channel to hand, you can recover the ideal probability distribution just by inverting the matrix. So it's a simple technique: you characterize the values e and r, then invert the noise to improve the quality of your measurement outcomes. To say that a little more formally, you have this probability of measuring some outcome z given that the true, correct measurement outcome was z′. This generalizes beyond one qubit: in the example I just gave only a single qubit was measured, but the same construction works for many qubits, and you end up with very large matrices, whose size grows exponentially with the number of qubits. You then perform the error mitigation by applying the inverse of this matrix S to the noisy probability distribution, which recovers the error-mitigated probability distribution. Just to return to the graph notation I mentioned before: to describe this error mitigation technique, you start with some unitary you want to implement. The first step is to perform some tomography to establish the matrix S; the first two boxes at the top right here correspond to that procedure. Then you have two tasks to compile and execute the original circuit, to gather the results from that circuit.
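As a minimal sketch of this inversion (plain numpy, not the Qermit implementation), assume the flip probabilities e and r have already been characterized:

```python
import numpy as np

# Single-qubit readout-error model: a true 0 is read as 1 with probability e,
# a true 1 is read as 0 with probability r.  Columns index the true state,
# rows the measured outcome.
e, r = 0.05, 0.10
N = np.array([[1 - e, r],
              [e, 1 - r]])

p_ideal = np.array([0.7, 0.3])      # the distribution we would like to see
p_noisy = N @ p_ideal               # what the device reports

# Mitigation: apply the inverse of the characterized noise matrix.
p_mitigated = np.linalg.inv(N) @ p_noisy
```

In this noiseless toy example the inversion is exact; with finite shot counts it only holds approximately, and the inverted distribution can even contain small negative entries that have to be handled separately.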
And the last step is to use the matrix that you've learned, together with the results from the device, to invert the noise. So this is how we would describe it within Qermit, and it's also a convenient way of explaining the protocol. What we've talked about here is correcting errors in probability distributions, which is one situation you might be concerned with; in Qermit we refer to these techniques as result mitigators, or MitRes objects. But there is a second type, which we call expectation mitigators, or MitEx objects. In this case we're not concerned with mitigating the complete probability distribution, but instead with some statistic of the distribution: in particular, the expectation value of some observable acting on the state produced by our circuit. In such experiments the inputs you're concerned with are the circuit unitary and the observable whose expectation you want to calculate; the goal is the expectation value of this observable, defined by the trace quantity here. These kinds of experiments are quite common, for example in quantum chemistry, which will be covered a bit later in the week. As an example, calculating the expectation value of the X observable corresponds to calculating the difference between the probabilities of measuring zero and one, provided you apply a Hadamard gate before the measurement. So you perform some rotation on the output state and measure that, and from the outcomes you can recover the expectation value of the X observable. In general the procedure is described on the right here: you have as input an observable and a unitary, and you use the observable to derive some measurement circuits. In this case the measurement circuit is simply the Hadamard gate, but in general your operators might require much more complicated measurement circuits.
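The X-observable example above can be sketched like this, with hypothetical shot counts taken after the Hadamard has been applied:

```python
# Estimating <X> from measurement counts: apply a Hadamard before measuring,
# then <X> is the difference of the probabilities of outcomes 0 and 1.
counts = {"0": 820, "1": 180}       # hypothetical shot counts after the H gate
shots = sum(counts.values())
expectation_x = counts.get("0", 0) / shots - counts.get("1", 0) / shots
```
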
And you repeat the same kind of process as before: you have a compilation step, you execute the circuits with the measurement circuits you built in the first step appended, and then, as the very final step, you recombine those measurement outcomes to calculate the expectation value of the observable you began with. So you can see a bit of a difference from the first type of error mitigation technique: instead of calculating probability distributions, we are calculating some statistic of the output. There are other error mitigation techniques developed for this setting as well, which I'll discuss. Yeah, so this slide just says in words what I said before: the steps of appending measurement circuits, compiling the circuits, performing the measurements, and then recombining the measurement outcomes to calculate the particular observable you're concerned with. This procedure gives you a noisy approximation of the expectation value of the observable that you wanted. You can picture it as if the expectation value were a distribution that has been shifted slightly away from the ideal. The goal is to derive a better approximation of the ideal value, and this is the target of performing error mitigation in these types of experiments. Typically what happens is that you improve the accuracy of the approximation, but you increase its variance; this is the trade-off that you inevitably make. So you can keep in mind this picture of the distribution being spread out by error mitigation, but drawn closer to the ideal. Just to outline some general steps for error mitigation of observable expectation values: typically the first step is to take the circuit that you want to run and perform some operation on it, which I'm representing here by these U's.
This allows you to build up some data that characterizes the noise of the device in some way. It's also assumed that you have at your disposal some relationship that the data should satisfy, so that combining the data in some clever way should be able to reduce the errors; this is the functional-model step of an error mitigation scheme. And then finally you use the data generated in the first step, together with the knowledge you have about how this data should be related to itself, to recover an error-mitigated approximation of the observable expectation value. So there are these two things that you need to learn during an error mitigation scheme, and then finally you combine them to produce a better approximation. Okay, so that's the general outline; let me give a particular example of an error mitigation scheme. First off, I'm going to talk about what's called zero-noise extrapolation. The intuition is in the graph on the bottom right here. You can take your circuit, run it on the device, and get some value for the expectation value. What you can't do is reduce the noise, but you might be able to increase it. So if you increase the noise, run the circuit again, increase the noise again, and so on, you build up these green points, which exhibit some decay: as you lose information to the noise, the expectation values might decay towards zero, for example. Then you can choose a fitting function, fit it to those decaying points, and extrapolate backwards to the zero-noise value. This is roughly the intuition behind zero-noise extrapolation. There are a couple of things you need in order for this to make sense: you need an approach to increasing the noise, and you need to agree on a function to fit to the decaying expectation values.
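Those two ingredients, a way to scale the noise and a fitting function, can be sketched as follows. This is a toy illustration, not Qermit's implementation: the gates are just labels, and the "measured" expectation values are generated from an exact exponential decay so that the fit comes out cleanly.

```python
import numpy as np

def fold(circuit, factor):
    """Replace each gate G with G, G-dagger, G, ... so the overall unitary
    is unchanged but each gate fires `factor` times (factor must be odd)."""
    assert factor % 2 == 1
    folded = []
    for gate in circuit:
        folded.append(gate)
        for _ in range((factor - 1) // 2):
            folded.extend([("dagger", gate), gate])
    return folded

circuit = ["H", "CX", "T"]
# Folding with factors 1, 3, 5 scales the gate count (and roughly the noise)
# by those odd factors without changing the unitary.
scaled_circuits = {k: fold(circuit, k) for k in (1, 3, 5)}

# Hypothetical expectation values at noise scale factors 1, 3, 5, following
# an exact exponential decay y = a * exp(-b * x) with a = 0.8, b = 0.2.
scales = np.array([1.0, 3.0, 5.0])
values = 0.8 * np.exp(-0.2 * scales)

# Fit log(y) = log(a) - b * x, then extrapolate to x = 0 (zero noise).
slope, intercept = np.polyfit(scales, np.log(values), 1)
zero_noise_estimate = np.exp(intercept)
```
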
One approach to increasing the noise is to replace every gate in the circuit with itself, followed by its inverse, followed by itself. The gate and its inverse cancel out in the unitary that you're implementing, so the circuit you're implementing is unchanged, but if you replace every gate in this way you've increased the noise by a factor of roughly three. You can repeat this process to increase it by a factor of five, or any odd integer, and in this way you can increase the noise in a controlled way. There are other techniques for increasing the noise as well: if you have lower-level access to the pulses, for example, you can consider stretching the pulses, or you can replace the gates in some other way. But this is one example you can keep in mind. So that's the first step, building up the data that you need to perform the error mitigation. The second step is to consider the functional model, that is, the function that you fit to the data. Again, there are a few different functions you might try to fit; common choices would be an exponential decay or some polynomial function. Each of these has its own justification, and which you choose will depend a bit on the device or the circuit you're working with. The final step is then just to take your data and the function relating the data, and use them to extrapolate back to the zero-noise value. So this is the zero-noise extrapolation technique, described in our abstraction. Just to emphasise that the fitting function and noise-scaling technique you use will depend on the device, I have results from a couple of devices here, credit to the team behind Mitiq for performing these experiments. In this case, the ideal outcome would be one.
On the two different devices, you can see how the decay differs as the noise is scaled, and the accuracy of the extrapolation techniques, which correspond to the colors of the points at the zero noise-scaling value, indeed depends on the device. For example, on the IBM London device exponential extrapolation is closest to one, while on the Rigetti device it's Richardson extrapolation, which is a kind of polynomial extrapolation. So it's a bit unpredictable which is the best one, and that's something to keep in mind when you use these techniques. Now, what I said when describing zero-noise extrapolation is that you can increase the noise, but you can't access this area in red here, which corresponds to reducing the noise. There is, however, one case where you can effectively reach the zero-noise point, and that is when you're considering Clifford circuits. Clifford circuits are a subset of quantum circuits that happen to be classically simulable, and the approach of Clifford data regression, which is another error mitigation technique, is to make use of this fact. In particular, the approach is to take the original circuit as given, look at every gate in it, and replace each gate which is not a Clifford gate with a random gate from the Clifford group. The result is that all the non-Clifford gates in your circuit have been replaced by Clifford gates, and as a result you can classically simulate the circuit. You run those classically simulable circuits both on classical simulators and on quantum devices, which allows you to build up a relationship between the noisy and exact expectation values for those Clifford circuits, which are similar to the original circuit. And you can use that relationship to map the noisy expectation value of the original circuit to a conjecture of what it would be in the exact case.
So that's the intuition behind the Clifford data regression technique, and you can see that the two techniques are somewhat similar. On the left here is a little pictorial representation of Clifford data regression: you take the original circuit that you want to run, replace the non-Clifford gates with Clifford gates so that it's classically simulable, use the classically simulable Clifford circuits to calculate these green crosses at the device's noise level and at zero noise, and build a relationship between those two sets of values to perform the error mitigation. In the case of zero-noise extrapolation, you instead increase the noise using these identity insertions, the unitaries followed by their inverses, and use that to extrapolate backwards. So the two techniques are similar in that regard. More generally, the process I've described is to start with some unitary U, in red here at the top, perform some transformation to generate new circuits, run those new circuits and the original circuit on the device, use that data to build up a relationship between the training circuits and the original circuit, and apply some classical post-processing to recover a noise-free value for the expectation value. So again, I can write this down in my favorite graph notation, which is just the same thing I've described to you: you build up the circuits that produce your expectation value, generate some modified circuits, execute those on the backend, do some classical post-processing on the results, and use that in combination with the original circuits to generate an error-mitigated value. Okay, so now we have an idea about result mitigation and expectation-value mitigation, and a few examples of such schemes. The next thing you might try to do is to combine these schemes, and in some cases this is a very sensible thing to do.
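The regression step of Clifford data regression can be sketched in a few lines of numpy. The training numbers below are made up, constructed to lie exactly on a line so the illustration is clean; in practice the noisy values come from the device and the exact values from a classical simulator.

```python
import numpy as np

# Hypothetical training data: for each near-Clifford training circuit we
# have a noisy expectation value (device) and an exact one (simulator).
# Here the exact values are constructed as exactly 1.25 * noisy + 0.02.
noisy_train = np.array([0.40, 0.50, 0.30, 0.60])
exact_train = 1.25 * noisy_train + 0.02

# CDR commonly assumes a linear map exact ~ a * noisy + b and learns
# a, b by least squares over the training pairs.
a, b = np.polyfit(noisy_train, exact_train, 1)

# Apply the learned map to the noisy expectation value of the circuit of
# actual interest.
noisy_value = 0.45
mitigated_value = a * noisy_value + b
```
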
In particular, there are a lot of assumptions that you have to make when developing these error mitigation techniques. This relates, for example, to the extrapolation function that you might use: the extrapolation function depends on the particular noise characteristics of your device, and if the assumptions you've made about those noise characteristics are incorrect, it might affect the quality of the outcome of your experiment. But sometimes you can enforce these assumptions by using other error mitigation techniques, perhaps some of the lower-level circuit-level techniques that I described before. For example, you can imagine combining zero-noise extrapolation with SPAM correction of the measurement outcomes, which results in some improvement. You can also use something like frame randomization, which makes the noise in the device look more like a depolarizing noise channel, and that is favorable to some of the extrapolation functions that zero-noise extrapolation uses. So you can think a bit cleverly about the different ways to combine these techniques, where one of them improves the performance of the others. These kinds of combinations are at the core of the design philosophy of Qermit, using this graph structure, because you can just swap different nodes in and out of the graph to combine error mitigation schemes. So this is often beneficial and something that we try to make straightforward with Qermit. Now, a final example of combining error mitigation schemes beneficially. You might have come across these variational quantum circuits.
The idea in a variational quantum experiment, roughly, is that you have some circuit with parameterized gates, and you change the values of these parameters, searching across the parameter space until you find, for example, some minimum which corresponds to a quantity of interest. If you have a noisy device, then the landscape you're searching over is no longer the ideal one; it becomes a bit bumpier. What you might want to do is smooth out this parameter landscape so that it mirrors the ideal one. One technique for this is to explore the entire parameter landscape and then do some filtering: say, apply some threshold to remove very low values, for example. This has the effect of smoothing out some of the noise, and is another error mitigation technique you might consider. The image here imagines that you have two parameters you're searching over: you can build up this entire bumpy landscape and then remove the bumps in it. The further insight is that some of the points in this parameter landscape will correspond to Clifford circuits: some angles will make the parameterized gates Clifford gates, so that the whole circuit is a Clifford circuit, and you can simulate it classically. There are several points in the parameter landscape where that is true. So the way you combine Clifford data regression with this filtering technique is to calculate, ideally, all of the points in the landscape that correspond to Clifford circuits; this gives you points in the landscape that you know perfectly. On top of the filtering technique, this again improves the accuracy of the landscape that you're exploring.
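A toy numpy sketch of combining the two ideas above: filter a noisy cost landscape, then pin the grid points whose angles are multiples of π/2 (where the parameterized gates would become Clifford) to their exactly simulable values. All numbers here are synthetic; the "ideal" surface stands in for a classical simulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2D parameter landscape: an ideal cost surface plus noise "bumps".
theta = np.linspace(0.0, 2.0 * np.pi, 9)          # grid step of pi/4
t1, t2 = np.meshgrid(theta, theta)
ideal = np.cos(t1) * np.cos(t2)
noisy = ideal + rng.normal(0.0, 0.15, ideal.shape)

# Filtering step: the true cost lies in [-1, 1], so clip the noisy values.
filtered = np.clip(noisy, -1.0, 1.0)

# Clifford points: angles that are multiples of pi/2 make the parameterized
# gates Clifford, so those grid points can be simulated exactly and pinned.
m = (np.isclose(np.mod(theta, np.pi / 2), 0.0)
     | np.isclose(np.mod(theta, np.pi / 2), np.pi / 2))
clifford_mask = m[:, None] & m[None, :]
filtered[clifford_mask] = ideal[clifford_mask]
```
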
You can see an example here, conducted on IBM Q Sydney, where combining both thresholding and Clifford data regression gives you something much closer to the exact classical simulation than the original landscape. Okay, so what about the performance of these error mitigation schemes in practice? Returning to my example of zero-noise extrapolation, there are a few things that might go wrong when you implement error mitigation schemes. The first is that, as in this picture, you might pick the wrong extrapolation technique: your extrapolation function doesn't match the one best suited to your device. It might also be that the variance in the noisy expectation values you're calculating is quite large, so that it's very hard to fit an extrapolation function to them, for example. These are a couple of things that might go wrong and result in you picking an incorrect extrapolated value. So I'm going to show some plots from experiments that were conducted to measure the performance of error mitigation techniques; let me quickly prepare you for how they're going to look. We're using what we call the relative error of mitigation, displayed in the formula at the top here. This quantity is the ratio of the error in the error-mitigated value to the error in the noisy value. So if this quantity is below one, the error mitigation has been beneficial; if it's above one, it has not. And we'll display these as the colors described at the bottom here: basically, the bluer the outer square, the better the error mitigation performed; the inner square shows the worst case over the experiments, so that would be higher than the average; and towards the red means that error mitigation is not really doing much.
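The metric reads, in code (a hypothetical helper, just to make the convention concrete):

```python
def relative_error(ideal, noisy, mitigated):
    """|mitigated - ideal| / |noisy - ideal|: below 1 means the mitigation
    helped, above 1 means it made the estimate worse."""
    return abs(mitigated - ideal) / abs(noisy - ideal)

# Mitigation moved the estimate from 0.6 towards the ideal 1.0: ratio < 1.
helped = relative_error(ideal=1.0, noisy=0.6, mitigated=0.9)
# Mitigation moved the estimate further away: ratio > 1.
hurt = relative_error(ideal=1.0, noisy=0.6, mitigated=0.4)
```
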
These experiments in particular were conducted on noisy simulators, not real devices. Along the x-axis I have the depth of the circuits, and on the y-axis the number of qubits that the circuit covers. The circuits used in these experiments are so-called random circuits, which are used in quantum volume experiments, which you may come across: basically, gates applied between random pairs of qubits in the circuit, with nothing to do with any application in particular. You can see that as the depth of the circuits increases, the error mitigation schemes perform worse, and as the number of qubits increases, they perform worse as well; recall that the blue values indicate that the error mitigation schemes are performing well, and the red ones indicate that there was no real point in performing error mitigation. This is to be expected. You can imagine that as the circuits grow in size, the noise just takes over: for the very largest circuits you're basically getting random values from your circuits, which means the data-collection phase of the error mitigation experiment isn't really able to extract any information from the device. You can't learn anything about the noise, because the output is completely random. So for the very largest circuits you expect that error mitigation schemes won't be able to do anything, and indeed that's what you observe.
Okay, so what if, instead of random circuits, I use more application-motivated circuits? Here I'm using circuits inspired by those used in quantum chemistry experiments, which you'll come across a bit later in the week. I didn't mention it before, but the difference between the left and right plots is that the leftmost one uses Clifford data regression and the one on the right uses zero-noise extrapolation. You'll recall that when using random circuits, zero-noise extrapolation performs very well on smaller circuits, and Clifford data regression performs well, but not as well as zero-noise extrapolation. This time, with the chemistry-inspired circuits, on the left we again have Clifford data regression, and you can see that for smaller circuits it performs very well, while for larger circuits it again doesn't perform as well. For zero-noise extrapolation we now have the opposite relationship: in this case it doesn't perform as well as Clifford data regression. This is to be expected, because the chemistry circuits that inspired the circuits used in these experiments contain a great number of Clifford gates. This means that when you do Clifford data regression, the circuits you generate during the data-gathering phase of the experiment are very similar to the original, unmodified circuits, so you can expect the data-gathering phase to be quite productive in that case.
An interesting feature that emerges is that on this rightmost plot there's a sort of valley of success that cuts down the diagonal here: for smaller circuits, zero-noise extrapolation in the worst case is not doing terribly well, and likewise for larger circuits. This might also be expected, because for very small circuits the devices themselves are probably performing pretty well, since the circuits are quite small, so there's not much improvement to be gained from using error mitigation. Okay, so let's switch to experiments conducted on real devices. Again, on the left I have Clifford data regression and on the right zero-noise extrapolation, using random circuits. You see the same kind of phenomenon emerging: for random circuits, zero-noise extrapolation performs pretty well on the devices, and Clifford data regression performs fairly well but not as well. In both cases the performance is worse than on the classical simulations, and this is because the noise models used for the classical simulations are of course much simpler than the noise on the real devices, and probably lower in magnitude. On the real devices you have much more complicated noise sources, and the assumptions that go into developing the error mitigation techniques are not necessarily met by these real devices, whereas for the classical simulators you're sure that they actually are. So you see that the real devices don't perform as well, but error mitigation still seems to help you out. Next I'm going to talk about the chemistry circuits run on the real devices. I flashed this up a bit earlier, but the chemistry circuits have this kind of ladder structure: CZ gates acting on pairs of qubits in a ladder, with a small single-qubit rotation at the bottom, followed by an increasing ladder of CZ gates, repeated for many layers. Running these
circuits on the real device, you see the same thing as you saw with the classical simulations. Roughly speaking, CDR, the Clifford data regression, does better than zero noise extrapolation most of the time; error mitigation usually helps, though in the worst case it sometimes does not, and there are quite a few squares here which are red, where error mitigation was not shown to result in any improvement. So you can see that on real devices the performance of these schemes becomes a bit less predictable. This could be for many reasons: we don't understand the noise channels particularly well, the noise profiles change very quickly, and many other things. This was run on the IBM Lagos device. Just to emphasise that it depends heavily on the device characteristics as well, these experiments were conducted on the IBM Casablanca device, and you can see that in this case running Clifford data regression on these circuits adds no benefit at all. So, just to emphasise: it depends on the device, it depends on the circuits, and it can depend on the device characteristics at any particular moment. These results show that it is a bit hard to predict the performance of error mitigation schemes in practice, but there are some things you can take away. For example, this plot in particular tells you that on IBM Casablanca, which I think is not running anymore, you should not bother with Clifford data regression; more generally, for chemistry-type experiments it's beneficial to use Clifford data regression, and for other types of experiments it's probably beneficial to use zero noise extrapolation. So there are a few things you can take away from these experiments. That's just about all I have to say on this. To prepare you for Silas's talk: there are several schemes currently implemented in Qermit that you might like to play around with, so zero noise extrapolation and
Clifford data regression are there, for example. There is also probabilistic error cancellation, which I haven't mentioned: this technique relies on the idea that, after doing some characterisation of the device, you can modify your circuit in such a way that combining the results of the modified circuits has the effect of inverting the noise channel on the device. It has SPAM mitigation, which I've discussed, and frame randomisation, which I mentioned briefly, and it allows you to combine existing methods quickly. It's built on top of pytket, so you should be able to get going with it quite quickly, and hopefully some of the techniques already implemented there can be easily reused and developed into new schemes if you wish to try that, thanks to the modular structure we tried to use. Similarly, combining error mitigation schemes should be quite straightforward. Silas will follow up next with some code examples for using Qermit, and will develop a bit more background on error mitigation through practical use cases. I hope you enjoy that; I'll stop there and take any questions if there are any.

Okay, let's have some questions from the audience. We're going to need a microphone to communicate with our speaker. Dan, can you hear me? Yeah, yeah. Okay, good. Questions?

Hi, thank you for the nice talk. Could you just spend a few minutes explaining again the notation with the coloured blocks? I'm not familiar with that; if you could just explain again how it works. These ones here? Yes, thank you, just what the inner and outer colours represent. Yeah, sure. The outermost colour, the largest square, indicates the average performance of the error mitigation scheme, and the innermost square indicates the worst-case performance. This was conducted over five random circuits. So, for example, take the bottom-left square on this leftmost plot: we have a blue background, which means that on average the error mitigation
scheme results in an improvement. Anything blue or green means the error mitigation scheme is performing well, and the innermost square is red, which means that in the worst case the error mitigation scheme doesn't do anything: there was a circuit run during these experiments where error mitigation performed no better than simply running the noisy experiment. Thank you, thanks.

Okay, more questions. I'm going to get a workout now. Is there a fundamental limit for error mitigation complexity? How much error mitigation complexity, for example, do we need for different problems? Are you referring to the resource consumption of the error mitigation schemes? Yeah. So there are limits: the resource consumption needed to maintain the same accuracy in the prediction of the expectation values grows exponentially, so these techniques will not work when circuits become really very large. The idea is roughly that these error mitigation schemes will carry us along until we have enough qubits to perform error correction, and then error correction will take over. Yeah, thank you.

Okay, great, other questions? I'm just wondering why the expectation value decays when the noise is increased, for the zero noise extrapolation technique. Yeah, so you can imagine that if you keep adding noise, you progress towards a circuit which just generates uniformly random bit strings, and in that case the expectation value you get from uniformly random bit strings is zero. So you can imagine that as you progressively move towards generating that uniform distribution, the expectation value gets progressively closer to zero. Thank you.

Another question. Thanks for the nice talk; I was wondering if you could comment on the sampling overhead required for some of these error mitigation schemes. Yeah, as I mentioned, you would typically expect the sampling overhead to increase exponentially. For these experiments, the number of samples
you take is increased, but the factor by which it is increased is constant, just to demonstrate that in practice you can squeeze something out of these techniques without going too hard on the number of shots you take. I think in this case it was a factor of maybe three or four. Yeah, thanks.

Great, more questions? I have a question for you. If I do an experiment and I want to figure out whether I should use error mitigation, what steps should I take to determine if it's going to help? Obviously, from your plots, sometimes it works and sometimes it doesn't, so there doesn't seem to be a feature I can latch onto for a given application. What would you do? I think in practice the results from these experiments suggest that you need to think quite carefully; there are not many broad-strokes things I can say. I can say that if your circuit has a large number of Clifford gates, then Clifford data regression seems to reliably result in an improvement; otherwise you can use zero noise extrapolation. That is something you can take directly out of these experiments. In practice, though, you probably need to experiment a bit with smaller instances of your problem, with some schemes that you think might work, and run with that. A lot of schemes will work quite reliably: SPAM error mitigation works reliably quite consistently, so you can make use of that, and you can experiment a bit with some of these more general techniques. There are also techniques specialised to particular problems; for example, in chemistry there are many cases where you have symmetries in your problem which you can exploit for error mitigation. So basically: there might be bespoke techniques which you should definitely use; there are techniques which perform reliably in general, such as SPAM mitigation; there are techniques that work well if you have a lot of Clifford gates; and apart from that you'll have to play
around with some smaller instances, probably. Yeah, okay, thank you.

We have one more question. Hi, sorry, I am a little bit lost about where this denoising is applied: is it at the beginning of the program, at the end, for each qubit, or after some operation? So these were zero noise extrapolation and Clifford data regression, and the steps are as I described: there is a data-gathering step, where you change the circuits and run the altered circuits on the device, which builds up some data, and then the correction is some classical post-processing on that data. So the improvement in the result comes about because of the classical post-processing on the data that you've gathered. Okay, thank you.

Okay, thank you. More questions? Yes, one more over there. Hello, thank you for the talk. I was wondering if there is any problem when trying to combine different error mitigation techniques: can we have a situation where, for example, one error mitigation technique gives the opposite effect when combined with another, where they counteract each other? Sure, that could arise; there are definitely problems. For example, two schemes might be tackling the same noise source, in which case it's a bit redundant to combine them. One example where the combination might not be beneficial: some error mitigation techniques rely on post-selection. They might detect that some error has occurred in your circuit and then delete that particular shot from your results. In principle that's good, but it has the effect of changing the description of the noise model, and that description might be relied on by an error mitigation scheme that you apply later. So by removing that noise source you're changing the assumptions that another scheme might make, for
example. So there are combinations you should think a bit carefully about. Okay, thank you.

Okay, thank you. I guess if there are any more questions you can always put them in Slack, and Daniel will check Slack for them. Great. I didn't say that Daniel Mills joined us from the Quantinuum office in the UK. We'll have a ten-minute break now; let's thank Daniel again for a very nice talk, and at 11.20 we will continue with part two. Okay, thanks. Thank you.