Now it works. So, yes, I will just start my stopwatch so that we can get to coffee on time. Hello, welcome. I guess you can already guess where we are, and some of you can guess who this guy is. His name was Heraclitus, and what he said is that no man can step into the same river twice. During this talk I will tell you that, while this is true, sometimes you can step for the second, or even for the first time, into almost the same river. And since we are talking about a river, let's use a sailboat, with somebody sitting on it, and let's treat Einstein there as a qubit in our quantum simulator, which we have to control somehow. Naturally, life is a little bit cruel, because we have to interact with stuff, and for this talk what will be important is low-frequency noise. But if we have low-frequency noise, then we can do a little bit of measurement and maybe perform something like sailing, perform coherent operations, which brings me very nicely to my talk, and the talk will indeed be about reinforcement learning aiding Bayesian estimation. At the end of the day, we will see a qubit that is coherently rotated by something we don't control. My name is Jan Anderengschift, I came from the Netherlands, from Leiden, where I'm a postdoc in Aver's group; Aver gave a great talk a few hours ago. There are also PhD students from our group somewhere in the audience, but don't worry, there will be pictures of them later. About me coming here from the Netherlands: ironically, I almost didn't get here because of low-frequency fluctuations of air velocity, called wind. Good. I'm a part of AQUA as well, and my task in AQUA is to understand hardware limitations and basically turn the bug into a feature, in fact. Good. So in particular, this talk will be about spin qubits.
So, just to make sure we are on the same page, I will share with you all of my knowledge about spin qubits in three minutes. Well, in order to understand spin qubits and why they are relevant, first of all I have to explain that spin qubits are in their infancy. Namely, we are working on two-qubit gates; this is all we have. Maybe some people have six or 16 qubits, but this is very preliminary work. And then, what is a spin qubit? Well, you have to walk a little bit in Leiden. This is in Leiden: we are looking at the wall of one of the buildings, and it commemorates the master's thesis of George Uhlenbeck, who basically postulated that electrons have spin. The thesis was all about it, and there are two types of electrons, spin up and spin down. So naturally, whenever there is a magnetic field, the two levels of the electron split, and they split by omega. So there is a nice qubit, addressable with energy splitting omega. The problem is that if there is any source of random magnetic field, there is also noise, delta omega, and this will be a crucial actor in my talk. And in fact, you can also model this delta omega. The nuclear spins, because this is usually the source of magnetic noise, evolve in time diffusively, so something like a random walk. This is another wall from the Netherlands, but this time we walk from Leiden to Utrecht. And in Utrecht you can find a prescription for how to simulate something which we call the Ornstein–Uhlenbeck process, also known as brown noise. Okay, but to be honest, we don't really have to bother that much about the brown noise, or the Ornstein–Uhlenbeck process, because we know some chemistry people, and they have a way of getting rid of those nuclear spins and basically replacing them with something spinless. And if we do so, we have again a very nice, healthy spin qubit.
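The Ornstein–Uhlenbeck prescription mentioned above can be sketched in a few lines. This is a minimal illustration, not the code used in the talk; the parameter names `theta` (mean-reversion rate) and `sigma` (diffusion strength) are hypothetical labels for the usual OU coefficients:

```python
import numpy as np

def simulate_ou(n_steps, dt, theta, sigma, x0=0.0, rng=None):
    """Euler-Maruyama simulation of an Ornstein-Uhlenbeck process:
    dx = -theta * x * dt + sigma * dW.
    For theta -> 0 this reduces to a pure random walk (brown noise)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.empty(n_steps)
    x[0] = x0
    for i in range(1, n_steps):
        x[i] = x[i - 1] - theta * x[i - 1] * dt + sigma * np.sqrt(dt) * rng.normal()
    return x
```

Setting `sigma = 0` makes the trajectory relax deterministically toward zero, which is a quick sanity check on the mean-reversion term.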
The problem is that the moment we try to control it, with an electric field this time, we unfortunately get this random field again. Delta omega this time is charge noise, a little bit different noise, but again a temporally fluctuating spin splitting. So altogether there is a message: somehow the control and the coherence of the qubit are interconnected with each other. Okay, hence the idea: why not use the noise to control the qubit? This is a little bit of a provocative question, but I will try to do exactly that. Having said that, there is another part where I show off what I know, and I will tell you everything I know about stochastic processes in five minutes. First of all, whenever someone says noise, you usually think of white noise. Like a vacuum cleaner, like a fan going "ooh", a noise that is uncorrelated in time, which is why its spectrum is flat. The same uncorrelated-in-time noise is used in many approaches to modeling open quantum systems, which are called Markovian because they are uncorrelated in time. Quantum optimal control, quantum error correction, and quantum error mitigation usually all assume uncorrelated noise. Good, so why do I have a problem with that? Well, because in this approach we treat the quantum device as a static old TV going "shh", and the agent can come and interact with the quantum computer and get some experience. He can do it again: a second experience, and a third one, and all of those pieces are different and uncorrelated. You can, for example, shuffle them around and nothing will change, and you can do millions of interactions and at the end of the day figure out some effective model of your noise; but this model doesn't really care in which order you interacted with your system.
And if you do this, you get something which we call the Gorini–Kossakowski–Sudarshan–Lindblad master equation, which is used throughout the literature. Good, okay, so we are at this point in time. And obviously this is not all the processes that happen in nature. For example, okay, this was one citation which I forgot to mention: I read a little bit about quantum optimal control, and from one of the papers I learned that there are open challenges, and a specific one is to what extent optimal control can be reached in non-Markovian systems, so with noise that is not temporally uncorrelated; this is still an open challenge. And in some sense we will attack this open challenge now. So sorry for this. But again, what other sources of noise can you meet in nature? One of them is pink noise, 1/f noise, and you can hear it, for example, in the Netherlands very often when it's raining outside; you can just listen to the sound. Pink noise has the feature that it has low amplitude at high frequencies and high amplitude at low frequencies, so it is generally slow. An example relevant for me is charge noise. We also have brown noise, and you can hear that in the Netherlands a bit too, when it's heavily raining like yesterday. It is an even more low-frequency noise, the relative amplitude at high frequency is even lower, and this again is related to magnetic noise. So as you can see, spin qubits are rich in different colors of noise. But the difference between them is even more striking when you look at the trajectories. White noise has no structure whatsoever. Pink noise has a little bit of a trend, and brown noise is really a very slow variation of something. So for charge noise, very recently people have been trying to counterattack and basically correct for the very slow drift.
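The three noise colors can be generated numerically by shaping the spectrum of white noise. This is a minimal sketch of the standard spectral-shaping trick, not necessarily how the trajectories shown in the talk were produced; `alpha` is the spectral exponent (0 for white, 1 for pink, 2 for brown):

```python
import numpy as np

def colored_noise(n, alpha, rng=None):
    """Generate a trajectory with power spectrum S(f) ~ 1/f**alpha
    by filtering white noise in the frequency domain:
    alpha = 0 -> white, 1 -> pink, 2 -> brown(ish)."""
    rng = np.random.default_rng() if rng is None else rng
    white = rng.normal(size=n)
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n)
    freqs[0] = freqs[1]                    # avoid division by zero at DC
    spectrum *= freqs ** (-alpha / 2.0)    # amplitude ~ 1/f^(alpha/2)
    x = np.fft.irfft(spectrum, n)
    return x / x.std()                     # normalize to unit variance
```

With the same normalization, the brown trajectory varies much more slowly than the white one, which is exactly the "no structure vs. slow drift" contrast described above.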
You see the drift there, and people have been trying to correct for this drift, but I would like to highlight the time scale on the x-axis: you can almost call this experiment calibration. So we call it calibration. Why am I calling it calibration? Well, because our friends are a little bit better, so the scale is different here. This is the experiment our friends in Copenhagen did: they tracked this brown noise in time with much higher resolution, and maybe that gives us the opportunity to react. This was the old publication, and yesterday we shared an improved version of it. Good. So let's say we have this trajectory; what can we do with it? We can collect a single shot every 50 microseconds or so, and as you can see, on this time scale the field is barely changing. So why not forget about the field changing during a single run of the experiment, and just update it between consecutive single shots of the algorithm? This is called the quasi-static approximation, and because we have a lot of those single shots, we can sacrifice some of them to learn about the noise. These will be the red dots, and in the green ones we can do something else, for example your favorite quantum algorithm. Another strategy would be to bunch them a little bit, so spend 10 shots on estimation and 20 shots running the algorithm; basically, pick your poison. And one caveat: the green ones are indeed any algorithm you would like to perform, this would be the operation of our quantum device, let's say, but the red ones are the estimation algorithm. So it should be a simple algorithm, so simple that we can calculate the probability of a given outcome and basically infer, in a Bayesian way, the value of the parameter. So the red one should be simple; the green one, you name it.
Okay, so the time has come to show you a qubit being coherently rotated, and it will be rotating in gallium arsenide spin qubits, though that is less relevant. We concentrated on singlet-triplet qubits, so basically there is a way of turning two electrons into one qubit: you have two dots and a little bit of exchange interaction, and in fact this is the effective Hamiltonian of the system. We have a control knob, and we have this completely random noise which we are not controlling, with zero average. And because we have this control knob, we will now turn it to zero. So basically we now have a Hamiltonian which is completely stochastic; it has a term with zero average, and this is why I sometimes call it a quantum glider, because it doesn't have an engine. That's its thing. In this particular example, the estimation and the coherent rotation will basically be the same experiment, but you will see the difference soon. So let's start with a coherent rotation around the x-axis by theta. It's rather simple. Yes, I have a gadget somewhere; I should have it with me. We start by initializing along the north pole, let's say, okay, perfect. And now we let it evolve for a time t, and this is a very nice control, right? We have a Hamiltonian and we are just waiting some time for the qubit to rotate. You can think of the whole experiment as estimating the bias of a coin, and this tau is more or less the strength of the toss, like how high the coin goes. If the time is too long, then I don't learn anything, and likewise if it is too short. So basically this is the strength of my toss.
In principle, if I'm not sure about this delta omega, so about the rotation generator, then from one realization to another I might make mistakes in the estimation, and those estimation errors translate directly into something which I call gate infidelity. And now you can average over all possible realizations of your errors, and by doing so you quickly realize that the gate infidelity is proportional to your uncertainty, so to the width of the prior distribution, your knowledge of the system. So the task would be to decrease this uncertainty, to extract knowledge from the system and become more certain that the value of the field is given by some constant. And in order to do that, we just need to use Bayesian estimation; this is why you see a train going through. Okay, so we do Bayesian estimation with more or less the same algorithm: you just let the qubit rotate. The thing is, the algorithm is so simple that we can very easily find the probability of measuring zero or one as a function of delta omega and the time we pick. Good. So we can invert the probabilities and update our knowledge. And now the task is to pick the best tau: how long we have to wait for the qubit to rotate so that our knowledge about the uncertain parameter is improved in the optimal way. And again, here we have this notion of waiting too long, because if we wait too long, the posterior probability distribution might become multimodal and we lose the Gaussian structure of the prior. So the posterior becomes very complex, and this is the equivalent of tossing the coin too strongly.
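The update described above can be sketched on a grid. This is a toy illustration, assuming a cosine likelihood P(0 | omega, tau) = (1 + cos(omega·tau))/2 for free evolution; the actual experimental likelihood may differ, and `pick_tau` is just the inverse-uncertainty heuristic mentioned later in the talk:

```python
import numpy as np

def bayes_update(prior, omega_grid, tau, outcome):
    """One Bayesian update after a free-evolution shot of length tau.
    Assumed likelihood: P(outcome=0 | omega, tau) = (1 + cos(omega*tau)) / 2."""
    p0 = 0.5 * (1.0 + np.cos(omega_grid * tau))
    likelihood = p0 if outcome == 0 else 1.0 - p0
    posterior = prior * likelihood
    return posterior / posterior.sum()   # renormalize

def pick_tau(sigma):
    """Heuristic evolution time inversely proportional to the current
    uncertainty, so the fringe spacing matches the prior width."""
    return 1.0 / sigma
```

Starting from a flat prior, an outcome of 0 shifts weight toward frequencies where cos(omega·tau) is close to +1, which is the inversion of probabilities described above.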
And finally, once we pick tau, we just measure, and depending on whether we got plus or minus, we have a new posterior distribution, which can be fed back and treated as the prior for the next estimation. Perfect, so we did it. This is the experiment; I will quickly go through it. This was done in collaboration with QDev, so with Copenhagen, and those are the two PhD students working on this subject. They are in black because they are PhD students, so they did the hard work and I was just helping. This is the experimental data. What they did is let the qubit evolve for time tau, in constant increments; this is the x-axis. Black points are ones and white points are zeros. You clearly see, for example here, that for some time the frequency of those oscillations was faster, and then it was relatively constant. Because the frequency was changing a little while collecting the data, if you average them you get a decay, and we call it inhomogeneous broadening, or a T2*-like decay, it doesn't matter. But we were clever, so what we did was use each of those rows to estimate the frequency. This is how the frequency was evolving in this experiment, and after each row we take the row, estimate the frequency, and use this estimate to perform a coherent oscillation. Once we know what omega is, we can wait a certain number of nanoseconds such that the qubit rotates by a given angle, and in this way, from the bad oscillations, we got almost stable oscillations, especially here; the initial decay comes from outliers, so from some bad guesses. And for some reason this guy is here; you can think about it. It's a little bit like Maxwell's demon, right? We are looking at the random particles and opening and closing the gate between the two systems. Perfect, and we are getting better at it. This is our new addition, on the arXiv yesterday.
So now we are physical: we know the model of the noise, so we can propagate the distribution while we are not estimating. So we have a better way of using the previous posterior as the next prior. We are also now adaptive, so the evolution time is basically inversely proportional to the uncertainty; this is based on some NV-center experiments. And this trajectory is generated using only 10 shots for estimation, which is really nice in my opinion. And finally Jacob, in a separate paper with our help, thought about limiting the memory of the whole thing, and the memory can be limited by saying we work only with Gaussians, so it's enough to update the mean of our knowledge and the uncertainty. And yes, there are now a few things we can investigate. First of all, how many of those things can we do online? Maybe some learning, maybe some agent; this will be the rest of the talk. But there are also separate topics which we are trying to investigate within AQUA. With Felix, one of our PhD students, we are thinking about using physics-inspired methods to propagate the distribution while we are waiting, so, you know, learn what the future distribution of the process will be just by looking at the trajectory, a bit like predicting the Dow Jones. Also, with Ilse we just started to think about speeding up the readout by doing a little bit of classification, and maybe even using not a discrete outcome but a continuous one. But for the rest of the talk, which is supposed to be like 10 minutes, I think, I will concentrate on this part. So now let's define a game, and define some kind of heuristics. What is the game and how should it be played? First of all, why would I like to phrase it as a game? Because in the literature, people have already been thinking about using reinforcement learning, in this case to improve initialization, and the key element was a field-programmable gate array.
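The Gaussian-memory idea combined with propagating the belief through the noise model between estimations can be sketched as follows. This assumes the Ornstein–Uhlenbeck model from earlier; the symbols `theta` (reversion rate) and `d` (diffusion strength) are illustrative, not the values from the papers:

```python
import numpy as np

def propagate_gaussian(mu, var, dt, theta, d):
    """Propagate a Gaussian belief N(mu, var) about the field through dead
    time dt, assuming the field follows an Ornstein-Uhlenbeck process
    dx = -theta * x * dt + d * dW. The mean relaxes toward zero and the
    variance grows toward the stationary value d**2 / (2 * theta)."""
    decay = np.exp(-theta * dt)
    mu_new = mu * decay
    var_new = var * decay**2 + (d**2 / (2.0 * theta)) * (1.0 - decay**2)
    return mu_new, var_new
```

Because only a mean and a variance are stored, the memory cost stays constant no matter how long the protocol runs, which is the point of the Gaussian restriction.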
So, a thing which you can program and put on top of your qubit, and it does a little bit of logic for you. They used it for initialization: they put it inside the reinforcement learning loop, and the agent was either terminating the initialization, flipping the qubit, or doing nothing. It improved initialization, but it was based on this Markovian approach, so one step was independent of another step, vaguely speaking. But our game is different: we have this slowly varying environment to which we can adjust. So we define the actions as: do an algorithm, which in the simple case I'm showing now will be a spin flip, so whenever there is a green dot we try to flip the spin from the north pole to the south pole, or, for the red one, just estimate. And for the time being, as a heuristic, I will use this NV-center-motivated waiting time, which is inversely proportional to the uncertainty. Okay, so now results. First of all, I will present results on this plot. It has two axes, and both axes are bad: going vertically up, our algorithm gets worse, so more of the time we try to flip without flipping, and going right, we waste more shots on estimation. Also don't do that too much; we would like to flip the spin from time to time. So ideally, we would like to be down there. The stupidest strategy of all would be to always flip. The parameters used here try to mimic nuclear spins in silicon, more or less. Okay, so this is the always-flip strategy. It's a reference; you should think of it as the infidelity in the limit of zero estimation probability. And there are two strategies which we thought of. One is periodic, so just estimate every N-th shot, and the other is probabilistic, so you estimate with probability p_e. Something happens here; I guess this is because the periodic one needs to have discrete steps.
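The two baseline strategies just described amount to choosing a boolean schedule over the shots. A minimal sketch, with hypothetical names (`"periodic"` takes the period N, `"probabilistic"` takes the estimation probability p_e):

```python
import numpy as np

def make_schedule(n_shots, strategy, param, rng=None):
    """Return a boolean array: True = estimation shot (red dot),
    False = algorithm shot (green dot)."""
    rng = np.random.default_rng() if rng is None else rng
    if strategy == "periodic":
        # estimate every param-th shot
        return np.arange(n_shots) % int(param) == 0
    if strategy == "probabilistic":
        # estimate each shot independently with probability param
        return rng.random(n_shots) < param
    raise ValueError(f"unknown strategy: {strategy}")
```

The periodic schedule can only realize estimation fractions 1/N for integer N, which is one plausible reason it behaves awkwardly at some points on the plot while the probabilistic one varies smoothly.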
So sometimes it just doesn't work that well. But probabilistic is a good heuristic to beat. And my question is, can we get there by doing something? You know what, we will bring an agent into the game and try to teach this guy to play it. This is a simple setup; this is why I left only three minutes for it. We bring in the agent and feed it with the observation of mu and sigma. The actions will be either to run the algorithm or to estimate the field. We give a reward every time it successfully flips, and penalize it, with a negative reward, if the spin stays as it is. And of course, if it estimates, then we update the knowledge as I was describing. So this is the result, what the agent does in the case when it is not controlling the time: it only decides "let's flip" or "let's estimate", and if it estimates, it uses this heuristic for what time to use. And you clearly see that as we increase the penalty for a bad flip or no flip, the agent slowly moves towards more shots wasted, so higher estimation probability, but also lower infidelity. So, as we expect, we go along this line, and it's on par with, let's say, the heuristics. Okay, but now we let the agent control the time, and unfortunately this is a negative result: it didn't beat the heuristic. But this is not the point here; I'm still happy about the result, and I will show you why. Let's look at this guy. How clever is this guy? What this guy is doing is, first of all, tracking trajectories. Good, so you don't see two lines, because the trajectories lie on top of the real values of the noise. And it applies the one-over-sigma strategy. Just to say what that means: controlling the time means picking how long to estimate for, so this tau is what the agent picks. And it discovered the one-over-sigma strategy: you see this is how long it waits, and if sigma is small, it just waits longer.
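The game the agent plays can be sketched as a toy environment in the usual reset/step style. Everything here is a simplified stand-in, assumed for illustration only: the hidden field does a plain random walk, the measurement update and flip-success probability are caricatures, and all parameter values are invented, not those of the actual simulation:

```python
import numpy as np

class FlipOrEstimateEnv:
    """Toy version of the game: a hidden field drifts slowly; the agent
    either spends a shot estimating it (shrinking its Gaussian belief)
    or attempts a spin flip timed with the current estimate."""

    def __init__(self, walk_step=0.02, meas_gain=0.5, penalty=1.0, seed=0):
        self.walk_step, self.meas_gain, self.penalty = walk_step, meas_gain, penalty
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        self.field = self.rng.normal()       # hidden true field
        self.mu, self.sigma = 0.0, 1.0       # Gaussian belief (mu, sigma)
        return np.array([self.mu, self.sigma])

    def step(self, action):
        # hidden field drifts a little every shot (quasi-static picture)
        self.field += self.walk_step * self.rng.normal()
        if action == 0:                       # estimate: refine the belief
            self.mu += self.meas_gain * (self.field - self.mu)
            self.sigma *= np.sqrt(1.0 - self.meas_gain)
            reward = 0.0
        else:                                 # flip: succeeds if estimate is close
            p_success = np.exp(-((self.field - self.mu) / self.sigma) ** 2)
            success = self.rng.random() < p_success
            reward = 1.0 if success else -self.penalty
        # drift inflates the belief uncertainty between estimations
        self.sigma = min(self.sigma + self.walk_step, 1.0)
        return np.array([self.mu, self.sigma]), reward
```

The observation is exactly the (mu, sigma) pair fed to the agent in the talk, and the reward structure (+1 for a successful flip, a negative reward otherwise, nothing for estimating) mirrors the game definition above.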
And if sigma gets higher, it waits for a shorter time. Finally, you can look at how the agent plays the game. It flips, but sometimes, in order to keep the uncertainty roughly constant, it has to estimate. And basically in those six games we have two errors. Good, it plays somehow. But my favorite one is this guy. Unfortunately, it's not the best agent that is my favorite, but the most clever one. Why is it clever? Well, because it learned to flip only when the field was sufficiently large; the black one is estimation, the red one a successful flip. Also, it learned to recognize when it is lost: it tries to flip and then, okay, something went wrong here, and you see that the real noise, the orange one, drifts away from the estimate, the blue one. Also, it correlates its uncertainty, the sigma, with the error it is making. This is an example of trajectories: the orange is the real noise, and in this part it was not flipping enough, and it is basically off the real value. It thinks the field is one thing, but in reality the field is far away; but it has this information, because the shaded region is the sigma it assumes. And finally, it uses longer evolution times for small frequencies, and this makes sense, right? If my qubit is rotating very slowly, then I have to wait longer to see something, to see a contrast between zero and one. So it is doing all of those things, and I am happy, yes. The question, and this is what we will do next, is to look at the whole strategy of this clever agent, or maybe even breed a more clever one, and try to learn from it: interpret the strategy and apply it to maybe create a better heuristic, or apply it to a more complex algorithm like two-qubit gates, or maybe even use two agents. And I know Jan is using two agents, so I will try to ask him for help. Good. And this brings us back to the coffee break, but first my summary.
So, we developed a method for fast and resource-efficient estimation of the field. Now we are wondering: can we generalize it beyond the brown noise and also deal with charge noise, for example? We used an agent to select whether we are doing the algorithm or the estimation, and the agent was adapting to the current knowledge, to the current state of the device, or of the simulation in this case. The question is, can we do it on a real device, and can we extend it to more complicated algorithms? The next one in the queue would be a two-qubit gate. And finally, we had an agent which was clever, we had an agent which was on par with the heuristics, but can we beat the heuristics with some super clever agent? That would be exactly it. So thank you very much for your attention. No, this was the slide. Thank you very much. Yeah, thank you for this nice talk; so, questions. Okay, so let me ask one myself, as I'm not an expert: what is the problem with generalizing this to other types of noise? Okay, so the problem is that those noises are just fast. For example, there was this trajectory of the noise, which I can go back to. The brown one is the slowest of them, and as I said, you can get rid of it by doing isotopic purification; it's an expensive process, but still. What you are left with is the middle one, the pink noise. It has a little bit of drift, but it also has a tail of high-frequency noise. So even if I estimate that my field is something, in reality, a moment later it can be off my estimate, because it has a non-negligible amplitude of high-frequency noise. You cannot use the control then. Well, you can, but it's not going to be as efficient as here, and the proof of principle is this paper: what they achieved with relatively slow estimation is a two-fold improvement of T2*. So basically the decay improved from one microsecond to two.
What about other kinds of noise? Because obviously the noise is quite often not white, but also not brown, no? Yeah, so the thing we are discussing now is combining these methods: once you get rid of the low-frequency noise, you effectively have uncorrelated noise, so we can apply all of the methods for Markovian noise. So this would just be a way of creating a high-pass filter, getting rid of the drift. Now the coffee break, which is going to be 50 minutes, and we start at four o'clock again here. Thank you very much. Thank you, dear speakers.