Thank you for the introduction, and thank you to the organizers for the invitation to this nice conference. I'm from the University of Oxford. In fact, the organizers invited Simon Benjamin, my supervisor, but he is very busy right now, so I have come instead. To be honest, I only started three months ago, so I don't have many new and interesting results of my own to share, but today I will talk about some recent results by my colleagues and also some recent findings of my own, so I hope you will find it interesting. As you can see, I will talk about error mitigation for shallow circuits, and its application in quantum chemistry simulation. Here is the outline: first I will talk about the motivation, then discuss error mitigation, then quickly review chemistry simulation, and finally combine error mitigation and chemistry simulation. Okay, here is a roadmap for fault-tolerant quantum computing. Here you can see that the threshold for the two-qubit gate fidelity is 99 percent, and you can see that all three systems shown can already achieve this threshold. In particular, for the laser-controlled ion trap system we can already get 99.9% fidelity, and the other two systems, superconducting qubits and the microwave-controlled ion trap, also have quite good fidelities. This graph is from the NQIT website, where I think you can also find some more interesting material. Speaking of NQIT, Simon asked me to add these slides to introduce the NQIT program, so I will just go through them. NQIT stands for Networked Quantum Information Technologies. It is the largest of the four UK national quantum technology hubs; it started at the end of 2014, it is well funded, and it involves Oxford, many other good universities, and also companies. The goal of NQIT is to create a modular quantum computer.
That means we make small quantum processors with very high fidelity, build many such modules, and then connect the different modules using quantum optics; here you can see a schematic of how to do this. We also have many experimental groups in NQIT. Here I just mention some recent progress of the Oxford team, and you can find other progress on the NQIT website. We now have very high fidelity quantum gates: for two-qubit gates we have 99.9%, and for single-qubit gates we get essentially perfect gates. We can also apply gates between two different species of atoms in one trap, we can use very cheap microwave control, and quite recently they also achieved very fast operations, so the operation time is now about eight orders of magnitude shorter than the decoherence time. Please don't ask questions about these slides, though; I don't know the details. If you want to know more, please visit NQIT or ask Simon. Okay, so from the experimental side it seems very promising: we have very accurate quantum hardware. But in practice, to realize a fault-tolerant quantum computer, there is still a huge overhead in implementing logical qubits with physical qubits. Let's take a very quick example with Shor's algorithm: if we want to factor a 1,000-bit number, we need at least 6 million physical qubits. This graph is from this paper; I think one of the authors is also in the audience, so if you are interested you can also ask questions about the graph. So, 6 million qubits: I think that is plausible in maybe 5, 10, or 20 years. But before that time, can we do something without fault tolerance? One solution is called the hybrid algorithm, which is a combination of classical and quantum computation.
The algorithm is divided into two parts: the easy part is solved by the classical computer, and the hard part is solved by the quantum computer. Let's take an example. Here we have quite a small quantum processor running a quite shallow circuit, where each gate is controlled by a few parameters. In each round of the algorithm, we run the circuit with some parameter setting and perform some measurements; then, based on the measurement outcomes, we update the parameters according to, say, gradient descent or some other algorithm. With the updated parameters we run the hardware again, and we iterate to achieve some goal. This is a quick example of a hybrid algorithm. I took these two graphs from this paper, where you can also find a brief summary of hybrid algorithms; of course, there have recently been many works on hybrid algorithms. Now, in a hybrid algorithm we could of course use quantum error correction to correct errors, but remember that implementing logical qubits carries a huge overhead. So instead, suppose we do not implement quantum error correction: can we still get accurate results? This talk is about how to mitigate errors for this kind of shallow circuit, using a method called error mitigation. I will mostly talk about digital quantum simulation, whereas the previous talk was about analog quantum simulation. For example, let's consider a quite general process where we have an input state, we apply several unitary gates sequentially, and then we perform some measurements. Note that the information is encoded in the measurement probabilities, rather than in any single measurement outcome we obtain. There are many examples of such cases, like phase estimation, the swap test, and hybrid algorithms.
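The measure-and-update loop just described can be sketched in a few lines. Everything below is a toy stand-in: the "hardware" is a simulated single-qubit rotation with shot noise, and all names and constants are made up for illustration, not taken from any specific experiment.

```python
import numpy as np

# Toy stand-in for the quantum hardware: returns the estimated
# expectation value <Z> after preparing Ry(theta)|0>, with shot noise.
def measure_expectation(theta, shots=2000, rng=np.random.default_rng(0)):
    p0 = np.cos(theta / 2) ** 2           # probability of measuring 0
    samples = rng.random(shots) < p0      # simulated measurement shots
    return 2 * samples.mean() - 1         # shot-noise estimate of <Z>

# Classical outer loop: finite-difference gradient descent on theta.
theta, lr, delta = 2.0, 0.5, 0.1
for _ in range(200):
    grad = (measure_expectation(theta + delta) -
            measure_expectation(theta - delta)) / (2 * delta)
    theta -= lr * grad                    # classical parameter update

# the loop drives <Z> = cos(theta) toward its minimum at theta = pi
```

The quantum device only ever runs a short circuit and returns expectation values; all the optimization logic stays on the classical side, which is the division of labor the hybrid algorithm relies on.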
There are also other algorithms, such as Shor's algorithm and Grover's algorithm, where the result is encoded in the measurement outcome itself rather than in the measurement probabilities. Here we just focus on the first type of computation. For the output state, with no noise we would get a noiseless state, but in practice what we get is a noisy state. The error mitigation problem is then to infer the noiseless measurement statistics from the noisy ones. There are a few error mitigation methods. The first is called error extrapolation. This method works, quite interestingly, by deliberately making the noise worse. Let's see how it works, starting with linear extrapolation. Consider, for simplicity, that the noise channel is stochastic; that is, an error only happens with a small probability ε. Then we can expand the noisy expectation value, ignoring higher orders of the noise rate. Suppose that in experiments we can make the noise worse, i.e., we can increase the noise rate; then we get another noisy measurement outcome. Note that, to first order, the noisy expectation value depends linearly on the noise rate, so it is quite natural to estimate the noiseless value using linear extrapolation. Linear extrapolation thus works by assuming that the total noise rate is not high, so that we can ignore higher-order terms in the noise rate. In practice, we can also convert an arbitrary noise model into a stochastic noise model by using the twirling technique: we just add extra Pauli gates before and after each gate. And if we cannot proportionally increase the noise rate in experiments, we can randomly insert extra Pauli gates to increase the effective error rate. Furthermore, we can use more extrapolation points to get an even more accurate estimate of the noiseless value.
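As a minimal numerical sketch of two-point linear extrapolation: the decay curve and constants below are made up (a real experiment would supply the two noisy expectation values directly).

```python
import numpy as np

# Stand-in for a noisy expectation value whose true (noiseless)
# value is 0.5; the decay constant c is arbitrary.
def noisy_expectation(r, true_value=0.5, c=3.0):
    return true_value * np.exp(-c * r)

r = 0.02                          # native (small) total noise rate
e1 = noisy_expectation(r)         # measured at the native noise level
e2 = noisy_expectation(2 * r)     # measured with deliberately doubled noise
linear_estimate = 2 * e1 - e2     # first-order extrapolation back to r = 0
```

With these numbers the raw value e1 is off by about 0.03, while the extrapolated value is off by under 0.002, illustrating why the method relies on the total noise rate being small.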
This method was proposed in both of these two papers: the first is by my colleague, while the second is by the team here at IBM. The first paper also proposes another simulation method, while the second proposes two error mitigation methods, the other being the quasi-probability method, which we will discuss very soon. But before that, I will discuss something more, which is the exponential extrapolation method. Whenever you see this "New" badge on a slide, it means the result is recent. Remember that the noisy expectation value can be expanded in this way: here P_k is the probability that k errors happen in the circuit (we still assume the stochastic noise model), and O_k is the average measurement outcome given that k errors occurred. N is the number of gates and r is the per-gate error rate; for simplicity we assume the error rate is the same for every gate. When N is large and the expected total number of errors Nr is constant, the binomial distribution converges to the Poisson distribution, like this one. You can see that the probability P_k decays exponentially with the noise rate, so it is natural to guess that the noisy expectation value also decays exponentially with the noise rate. Then, instead of linear extrapolation, we can try exponential extrapolation. Later I will show simulation results for both the linear and the exponential extrapolation methods. But before that, I will talk about the other method introduced by the IBM team, the quasi-probability method. The method is essentially to synthesize the unphysical inverse of the noise channel. For simplicity, let's consider one gate and one noise process, and suppose the noise is a depolarizing channel. Then we can easily find the inverse of the channel by just solving the equation.
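Under the exponential-decay guess above, measurements at noise settings r and 2r determine the r = 0 value in closed form, since e1 = E0·e^(−cr) and e2 = E0·e^(−2cr) give E0 = e1²/e2. A toy comparison with linear extrapolation, using a made-up decay curve:

```python
import numpy as np

# Toy model: the expectation decays exponentially with noise rate r,
# as the Poisson argument suggests (an assumption, not guaranteed).
def noisy_expectation(r, true_value=0.5, c=3.0):
    return true_value * np.exp(-c * r)

r = 0.1                           # now a fairly large total noise rate
e1 = noisy_expectation(r)
e2 = noisy_expectation(2 * r)

linear_estimate = 2 * e1 - e2     # linear extrapolation, biased here
exp_estimate = e1 ** 2 / e2       # exponential extrapolation: E0 = e1^2 / e2
# (this form assumes e2 is nonzero and has the same sign as e1)
```

When the decay really is exponential, the exponential formula is exact even at large noise, whereas the linear formula keeps a second-order bias; this mirrors the simulation comparison shown later.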
We can also write it in a more elegant way: here P1 is the probability that we do not affect the state, while with probability P2 we apply some random Pauli matrices to the state. But unfortunately there is a minus sign, and this minus sign generally prevents the channel from being physically realizable, so we cannot implement it directly in practice. But remember that instead of realizing the inverse channel, what we actually need is the noiseless expectation value. Denote the noiseless value by O and the noisy value by O′. Then the quasi-probability method works as follows: with probability P1 we add nothing, i.e., we apply the identity operation to the process, and with probability P2 we randomly insert X, Y, or Z operations into the process. In total we get four measurement outcomes, and from the form of the inverse channel it is natural to obtain a relation between the noiseless value and the four noisy values. So in practice what we do is rerun the circuit many times, each time randomly inserting some extra gates, and in total we recover the noiseless value from the noisy values of the different runs. This method is introduced in the IBM paper, where they also consider some other noise channels. Recently, my colleagues considered extending the IBM work. First, they generalized the result to general Markovian noise models: in general we can use a basis to decompose the channel, and of course we can also decompose the inverse channel, and then we can use this basis to cancel any local Markovian noise. Furthermore, we can cancel noise on multi-qubit gates by simply taking tensor products of the basis. Another issue is that in the quasi-probability method we need to perform process tomography to obtain the probabilities, but in practice the process tomography itself may be imperfect.
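For a single-qubit depolarizing channel, the signed (quasi-probability) decomposition of the inverse can be checked directly with density matrices. The parameterization E(ρ) = (1−p)·ρ + p·I/2 below is one common convention, chosen here purely for illustration:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0 + 0j, -1.0])

def depolarize(rho, p):
    """Single-qubit depolarizing channel: mix with the maximally mixed state."""
    return (1 - p) * rho + p * np.eye(2) / 2

def inverse_depolarize(rho, p):
    """Unphysical inverse, written as a signed mixture of Pauli conjugations.
    c_pauli is negative -- the minus sign that makes this not a channel."""
    c_id = (4 - p) / (4 * (1 - p))
    c_pauli = -p / (4 * (1 - p))
    return c_id * rho + c_pauli * sum(P @ rho @ P for P in (X, Y, Z))

rho = np.array([[0.75, 0.25], [0.25, 0.25]], dtype=complex)  # a test state
recovered = inverse_depolarize(depolarize(rho, 0.1), 0.1)
# recovered equals rho: the signed mixture exactly cancels the noise
```

In an experiment one cannot apply the negative coefficient directly; instead the circuit variants are sampled with |c| as probabilities and the sign is absorbed into the estimator, which is exactly the resampling procedure described above.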
So we may also introduce errors from the process tomography. Fortunately, since we only care about the average value of the measurement outcome, we can apply the gate set tomography method to eliminate these errors in the process tomography. Next, I will show some numerical simulations comparing these three error mitigation methods. The simulation is a swap test using 16 qubits. The swap test, in general, evaluates the fidelity between two states; here the two states are a GHZ state and the all-zero state. The realization of the swap test is shown here, where the first qubit is the ancilla, initialized in the plus state. We then apply a controlled-swap operation on the whole system and measure the ancilla in the X basis; the fidelity between the two states can be obtained from this measurement. We chose a very naive decomposition of the controlled-swap operation; in practice there may be more efficient decompositions. In the simulation we assume that noise accompanies every operation, that the recovery operation is applied after each gate, and that noise also occurs in the recovery operations. Here is the simulation result. We consider two different noise models: one is inhomogeneous Pauli error, the other is leakage error. In the inhomogeneous error model, the X, Y, and Z errors occur with different probabilities (0.1 and 0.006 here). Why did we choose these probabilities? Because these error rates are what we can currently achieve in experiments. In fact we assume the same error rate for every gate; but remember that the fidelity of single-qubit gates is very, very high, so this assumption is even more pessimistic. In the simulation results there are three different traces: the orange one is the result without error mitigation, and the blue one is the result with linear extrapolation.
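The ancilla statistics of the swap test can be checked with a small statevector simulation. The sketch below uses single-qubit registers (three qubits in total) rather than the 16-qubit circuit from the slides:

```python
import numpy as np

def controlled_swap():
    """8x8 controlled-SWAP: qubit 0 (ancilla) controls swapping qubits 1 and 2."""
    cs = np.eye(8)
    cs[[5, 6]] = cs[[6, 5]]       # |101> <-> |110>, i.e. |1ab> -> |1ba>
    return cs

def swap_test_prob_plus(psi, phi):
    """P(ancilla = |+>) after the swap test; equals (1 + |<psi|phi>|^2) / 2."""
    plus = np.array([1.0, 1.0]) / np.sqrt(2)
    state = np.kron(plus, np.kron(psi, phi))    # |+> (x) |psi> (x) |phi>
    state = controlled_swap() @ state
    proj = np.kron(plus, np.eye(4))             # <+| acting on the ancilla
    return float(np.linalg.norm(proj @ state) ** 2)

psi = np.array([1.0, 0.0])                  # |0>
phi = np.array([1.0, 1.0]) / np.sqrt(2)     # |+>
# overlap |<0|+>|^2 = 1/2, so P(+) should be (1 + 1/2)/2 = 0.75
```

Reading the fidelity off the ancilla in this way is what lets the noise on the ancilla measurement bias the result, which is exactly what the error mitigation methods are asked to repair in the 16-qubit simulation.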
The green one is the result with the quasi-probability method. Remember that the fidelity between the GHZ state and the all-zero state is 0.5, so the exact value of the measurement outcome should be 0.5. You can see that the result without error mitigation is very far from the true value; it deviates a lot. Linear extrapolation works somewhat better: it does improve the result a bit, but its prediction is still not that good. For the quasi-probability method, you can see that the mean of the histogram is right at 0.5, so it works quite well. Here we used 1,000 samples for each experiment, repeated 100 times to build the histogram; in practice you can use more samples, and then the variance of the estimate will be much smaller. We also compared linear and exponential extrapolation under the same experimental settings, with the result shown here: although linear extrapolation does not work well, quite surprisingly, exponential extrapolation works very well. Finally, we combine the results into this one graph, where the red bars are the results from exponential extrapolation and the green line is the result from the quasi-probability method. You can see that for the inhomogeneous error case the quasi-probability method works slightly better, because its variance is slightly smaller, while for the leakage error case the result is different: exponential extrapolation works better and the quasi-probability method slightly worse. But be careful that the ranges of the x-axes are different, so in fact, for the leakage error case, the quasi-probability method does not perform that badly.
From the simulation results, we can see that error mitigation works quite well, at least for this medium-sized circuit. Now we move to the next topic, chemistry simulation, starting with a quick review. We focus on the ground state energy of molecules; here is the molecular Hamiltonian. In general, solving for the ground state energy of this Hamiltonian is hard on a classical machine. With a quantum computer, we can consider the first quantization method, where we discretize space and directly represent the spatial wave function; but this method needs a lot of qubits even for a very small molecule, so it is not very efficient. A more efficient method, at least for small molecules, is second quantization: we choose a basis of orbitals for each atom and convert the Hamiltonian from first quantization to second quantization. With the second-quantized Hamiltonian, we can use a fermion encoding such as the Jordan-Wigner or Bravyi-Kitaev transformation to convert the fermionic Hamiltonian into a qubit Hamiltonian. Here you can find a recent review of chemistry simulation. So essentially, what we need to do in chemistry simulation is find the lowest energy of a qubit Hamiltonian. There are two main methods. One is phase estimation: we first prepare a state φ that is close to the ground state, and then, using an ancilla, we apply phase estimation to project onto the true ground state. This method was first proposed in this paper. Another proposal, quite popular recently, is the variational quantum eigensolver (VQE): we prepare a trial ansatz state by applying some operations that are controlled by a set of parameters.
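As a tiny illustration of the Jordan-Wigner encoding just mentioned, one can build the qubit-space annihilation operators explicitly and check that they obey the fermionic anticommutation relations (three modes here, purely for illustration):

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0 + 0j, -1.0])

def kron_all(ops):
    out = ops[0]
    for op in ops[1:]:
        out = np.kron(out, op)
    return out

def annihilation(j, n):
    """Jordan-Wigner a_j on n modes: a Z string on modes 0..j-1,
    then the lowering operator (X + iY)/2 on mode j."""
    return kron_all([Z] * j + [(X + 1j * Y) / 2] + [I2] * (n - j - 1))

n = 3
a = [annihilation(j, n) for j in range(n)]
# check the canonical anticommutation relations {a_i, a_j^dag} = delta_ij I
for i in range(n):
    for j in range(n):
        anti = a[i] @ a[j].conj().T + a[j].conj().T @ a[i]
        target = np.eye(2 ** n) if i == j else np.zeros((2 ** n, 2 ** n))
        assert np.allclose(anti, target)
```

The Z strings are what enforce the fermionic minus signs between distant modes; they are also the reason a local fermionic term can become a long Pauli string on the qubits, which the Bravyi-Kitaev encoding mitigates.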
This is quite similar to the circuit we discussed at the beginning of the talk for the hybrid algorithm: we apply unitary gates, each controlled by some parameters. There are several ansätze for the variational eigensolver, and the target is to minimize the average value of the energy using, for example, gradient descent. The first realization of the variational quantum eigensolver is in this paper. Okay, next let's look at the combination of error mitigation and chemistry simulation, starting with the simulation of the hydrogen molecule. In the minimal basis, the hydrogen molecule maps to a qubit Hamiltonian whose coefficients depend on the distance between the two atoms, and the trial state for this Hamiltonian is shown here. I copied these results from the recent PRX paper by the Aspuru-Guzik and Martinis teams. Here is their realization of the trial state, and on the right is their result for the ground state energy of hydrogen. They compare the two methods, the variational quantum eigensolver and phase estimation, and you can see that the variational eigensolver works much better than phase estimation. But still, there are some errors in the variational eigensolver result, which means that experimental noise in the circuit still affects the estimate of the ground state energy. So we considered applying the error mitigation method to the simulation of the hydrogen molecule. Suppose the noise in the simulation is a depolarizing channel (in general the method works for arbitrary channels), and suppose all single-qubit gates have the same error rate, and likewise all two-qubit gates. Here is the simulation result, where we assume the single-qubit error rate is 0.0005 and the two-qubit error rate is 0.005.
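To make the VQE idea concrete, here is a deliberately tiny classical emulation: a single-qubit stand-in Hamiltonian with made-up coefficients (not the actual hydrogen coefficients from the paper) and a one-parameter Ry ansatz, minimized by brute-force search over the parameter:

```python
import numpy as np

X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])

H = 0.4 * Z + 0.2 * X          # toy Hamiltonian; coefficients are arbitrary

def energy(theta):
    """<psi(theta)| H |psi(theta)> for the ansatz |psi> = Ry(theta)|0>."""
    psi = np.array([np.cos(theta / 2), np.sin(theta / 2)])
    return psi @ H @ psi

# classical outer loop: here just a dense grid search over the parameter
thetas = np.linspace(0.0, 2 * np.pi, 20001)
vqe_energy = min(energy(t) for t in thetas)
exact = np.linalg.eigvalsh(H)[0]     # exact ground energy, for comparison
# vqe_energy matches exact (= -sqrt(0.4^2 + 0.2^2)) to high precision
```

Here the one-parameter ansatz happens to reach the exact ground state; for real molecules the ansatz only spans part of Hilbert space, and the quality of the ansatz, the optimizer, and the noise all limit the final energy.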
Maybe you cannot see this very clearly, because both results look quite good at this scale, so here is a zoomed-in version. You can see that without error mitigation there is a visible error in the ground state energy estimate, while with error mitigation we can reduce the error quite a lot. We also simulated a larger-error case, with a single-qubit error rate of 0.005 and a two-qubit error rate of 0.005, and you can see that the result with error mitigation improves a lot compared to the one without. It is worth mentioning that we used the same sample size for both methods, i.e., one moderate sample size for both the error-mitigated and the conventional method, and only two extrapolation points; in practice we could use more extrapolation points to improve the result. We also considered the simulation of the helium hydride molecule. Here is the Hamiltonian, and here are the circuit and the result from this paper; you can see that the noise in the circuit indeed affects the simulation results. For our application of the error mitigation method we consider the same depolarizing channel, but now there are six parameters to optimize. Here is the simulation result, where we assume a single-qubit error rate of 0.001 and a two-qubit error rate of 0.01, and you can see that the result with error mitigation is quite accurate. From these two results, we can see that error mitigation indeed works for quite shallow, small circuits. But what makes quantum chemistry attractive is that we can, in principle, achieve very accurate estimates of the ground state energy, and to achieve that accuracy we need to consider a more complicated basis. The previous works only considered the minimal basis, which is the red curve in the plot.
In practice we need to consider a more complicated basis for the simulation, and in that case you can see the energy curves are still quite different. So even if we get the most accurate energy estimate in the minimal basis, there is still a huge gap to the true energy curve. In quantum chemistry simulation, the required number of qubits is related to the number of basis functions, and furthermore, if we consider the unitary coupled cluster ansatz, the number of parameters also grows with the number of qubits and basis functions. Here I quickly estimated the number of qubits and the number of parameters needed in the optimization. For example, for the minimal basis without any reduction we need 4 qubits and 272 parameters, while for the most complicated basis tried here we need 200 qubits and two billion parameters to optimize. I also calculated the number of terms in the Hamiltonian using the OpenFermion package; I only did this for the first four bases, because for the larger bases the number of terms is larger and the computation takes a very long time. For the cc-pVTZ basis you can see we need more than 100,000 terms in the Hamiltonian. So it seems that to get the true ground-state energy of the Hamiltonian, even for hydrogen, we need many more qubits and we need to optimize a lot of parameters. To summarize: error mitigation works quite well for shallow circuits and chemistry simulation; on the other hand, to get very accurate chemistry results we indeed need to consider a large basis, and this still seems quite challenging. There are a few potential solutions to this problem.
The first is that we can try other encoding methods, for example first quantization, where the number of qubits does not increase with the accuracy because we directly represent the spatial wave function; this was recently discussed in this IBM paper. Another method, also introduced in an IBM paper, is Hamiltonian reduction: there may be some redundancy in the Hamiltonian. For example, for the minimal basis of hydrogen, we can reduce four qubits to just one qubit using such a reduction. I tried the algorithm, but the reduction does not seem to work very well for larger bases; it seems to remove at most two qubits. Another possible solution is to try a hardware-efficient ansatz, which was recently realized in the IBM experimental paper: instead of the unitary coupled cluster ansatz, we may try other ansätze that are more experimentally friendly. And the final one is that we can try other optimization methods. Before the end of the talk, I want to show some results I obtained just last week, so they may not be correct, but I want to show them and ask for your advice and comments. Here we consider the helium hydride Hamiltonian. Remember that originally we need to solve for the ground state energy of this Hamiltonian; instead, here we consider a tunable Hamiltonian with a parameter t that we can control in experiments. When t equals zero, the Hamiltonian is very simple: it has only local terms, so we can solve it analytically. Then, by increasing t from zero to one, we recover the original Hamiltonian.
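The t-sweep just described can be prototyped classically by diagonalizing the interpolated Hamiltonian. In this sketch, H0 is a trivially solvable sum of local Z terms and H1 is a random symmetric matrix standing in for the hard target; both are made-up placeholders, not the helium hydride Hamiltonian:

```python
import numpy as np

Z = np.diag([1.0, -1.0])
I2 = np.eye(2)

def kron_all(ops):
    out = ops[0]
    for op in ops[1:]:
        out = np.kron(out, op)
    return out

n = 3
# H0: sum of single-qubit Z terms -- ground state and energy are analytic
H0 = sum(kron_all([Z if k == j else I2 for k in range(n)]) for j in range(n))
# H1: random symmetric stand-in for the "hard" target Hamiltonian
rng = np.random.default_rng(1)
A = rng.normal(size=(2 ** n, 2 ** n))
H1 = (A + A.T) / 2

gaps = []
for t in np.linspace(0.0, 1.0, 21):
    evals = np.linalg.eigvalsh((1 - t) * H0 + t * H1)   # sorted ascending
    gaps.append(evals[1] - evals[0])
# if min(gaps) stays bounded away from zero along the sweep, changing t
# slowly keeps the system near the instantaneous ground state
```

Tracking the minimum gap along the sweep is the key diagnostic: if it closes (or shrinks with system size), the sweep must slow down correspondingly, which is the careful analysis the talk says is still needed.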
This idea is similar to adiabatic quantum computing: we solve the ground state energy at t = 0 analytically, then gradually change t, and thereby gradually obtain the correct energy curve for the original Hamiltonian. I also estimated the energy gap between the ground state and the first excited state for a certain distance, and you can see that the gap remains roughly constant, at least for this simple example. In general, this algorithm should be analyzed more carefully to study its energy gap, the scaling with the number of qubits, and so on. Okay, this is a group photo, and thank you very much. So, thank you; questions? Yeah, I'm just wondering, with this active error minimization: I have this error parameter, and I want to tune it to zero but I can't, so I tune it to some cutoff and then extrapolate back to zero. Now if I'm in the lab, well, not me, but somebody else doing a real experiment, they don't necessarily know what this error parameter really looks like, and usually they have multiple knobs they can tune to try to improve their circuits, so you're not entirely sure that you're going to go back to zero at the same time with the same knobs. How can you extract a single epsilon parameter, so that I can take my energies from any method I have, extrapolate them back to zero, and have a good guarantee that this epsilon is a faithful representation of what I have?
Okay, so in fact there are two realizations of the linear extrapolation method. In one, you proportionally increase the error rate: in that case you need to know, for example, that the two-qubit gates have a certain error, and then you can increase it, for example by running the two-qubit gates for a longer time, to proportionally increase the error rate. If that is not possible in practice, there is another way: you first perform process tomography on the gates, and then, using the tomography results, you apply some extra Pauli gates at the end of the circuit to effectively increase the error rate. Okay, so I actually tend to agree with your arguments about the scaling of unitary coupled cluster and why it is not well suited to near-term devices, but I just want to make a comment, because I've noticed several talks, and actually papers, recently that seem to share a common misconception. The proposal has always been to run classical coupled cluster first, in order to understand, say, which transitions are symmetry-forbidden and so forth. In several recent papers I've seen people say things like: for molecular hydrogen there are several thousand parameters, and at 10 qubits it would be 10,000 parameters or something. For context, LiH has 12 qubits, and there are seven parameters when you run classical coupled cluster. That's an important point: in fact it always has fewer parameters than the number of terms in the Hamiltonian, and not the other way around. Okay, I'll pass to Garnet, who probably has a real question now.
Yeah, sorry, I also wanted to make, I guess, not a question but a comment, which is on the use of larger basis sets to approach accuracy from that perspective. While that's true in general, if you follow the traditional path of classical quantum chemistry that many of us follow, I think on early quantum devices you're going to want to use qubits a little more parsimoniously than that, and reserve them only for the parts of the space where there is strong correlation. So even if you use, say, a cc-pV5Z basis with 220 basis functions, the strong entanglement may only be happening in, say, eight of those orbitals, and the rest can be treated perturbatively to high accuracy, which is the standard approach for things like DMRG and others right now. I think that's probably the approach to use on a quantum computer as well, rather than attempting to treat all of these nearly classical orbitals with your quantum qubits. And, adding to this, making it slightly disappointing: there is an unfortunate error in the counting of parameters you are using for coupled cluster theory. The exact parameterization for H2 scales only quadratically with the number of qubits, because you only have two particles; the number of parameters is equal to the number of particles multiplied by the number of qubits. So the counts shown are much larger than the Hilbert space size; classically that's a simulation which runs in a second or so, precisely because you don't have so many parameters in the exact solution. In the simulation I considered the depolarizing channel just for simplicity, but in practice we can have an arbitrary quantum channel, and our mitigation method works for general quantum channels; so that's the answer to your question. You mean the symmetry of the... sorry, maybe we can discuss in private.
I have a comment to make about that last question. I have actually looked at the lithium hydride curve, and in that case it's not actually a mixture of spin states: at the equilibrium distance it is an ionic state, and at the separated distance it is two neutral radicals, so there is a region in the middle where two singlet states interact, and you don't have one cleanly dominant state. And because you can't run long enough, you don't have enough depth on the current systems to deal with those correlations properly. Okay.