So if we can stop sharing, and Estelle can start sharing her slides. Can you see my slides? Yes, we can. Okay, I can start whenever you want. Okay, the next talk will be from Estelle Inack on variational neural annealing. So you can start whenever you are ready. All right. So thank you very much to the organizers for giving me the opportunity to present a very recent work that hopefully will be out soon, entitled "Variational Neural Annealing". So what we are trying to achieve here is, okay, one second, is that we want to solve optimization problems, and optimization problems are ubiquitous in different areas, be it in scheduling, in science and industry, in finance. So here I give a couple of examples. Here you have the so-called traveling salesman problem, which is a problem where you want to know the shortest path for, let's say, a businessman who is traveling to a certain number of cities. Right. This is another problem, in quantum chemistry, where you want to know the minimum-energy configuration of atoms interacting via some potential. In protein folding, we want to know what is the native state of the protein. And, yes, portfolio optimization, where you want to know what is the best way to invest the assets you have in a portfolio, and there are many more. So why are we physicists interested in these kinds of problems, especially the ones that do not seem to be directly physics related, like portfolio optimization? The reason is that most of these problems can be encoded in a form that we are more familiar with, which is the form of a classical Ising model, where the coupling constants actually encode the specific kind of problem Hamiltonian that you are interested in. And solving this optimization problem is then simply finding the ground state of this classical Hamiltonian.
But the catch is that for hard optimization problems, like for example glassy systems such as Ising spin glasses or Ising models in a random field, finding the lowest-energy state is actually NP-hard. And so basically people give up on the idea of finding the exact lowest state, or, if it is degenerate, the lowest states, and instead look for approximate states. So they use heuristic methods to do that; I explain in the next slide what that means. It is sometimes advantageous to view this kind of classical energy as some sort of energy functional over some space of configurations, or maybe conformations if you are in a protein folding problem. For hard problems, this kind of energy landscape is very rugged, with exponentially many local minima, saddle points, and so on. So what heuristic methods do is that they actually perform a search in that landscape, and instead of finding the deepest valley in this landscape, they find valleys that are not so far from the deepest one. One of the methods that has been used over the years to do this kind of search is thermal annealing, where you use thermally activated processes to basically overcome the barriers while you are doing the search. This has been inspired by a very old metallurgical technique called annealing, whereby, to make a metal more malleable or more robust, you heat it up to a very high temperature, to give some sort of kinetic energy to the atoms of the system, and then you slowly cool it down, so that you find yourself in a configuration that minimizes the free energy of the material. And this has been implemented on dedicated hardware. Another paradigm to perform the search here is by using quantum principles, the so-called quantum tunneling process, whereby you basically tunnel through the barriers in the search of this energy landscape. And this has also been implemented on dedicated hardware.
So say you don't have access to those hardware platforms, and you want to simulate either classical annealing or quantum annealing on your laptop. You can still use heuristic methods, mostly based on Monte Carlo methods, to simulate the annealing paradigm. But what I want to highlight in this slide is the fact that Monte Carlo methods were originally not designed to simulate the annealing paradigm; they were designed to simulate the properties of classical systems or quantum systems. And so we thought of methods based in machine learning that also have the purpose of simulating the properties of systems. We got inspired by these two papers: there is this very famous paper by Carleo and Troyer, where they were actually using neural networks to find ground-state properties of quantum many-body systems, and this very recent paper by Wu, Wang, and Zhang, where they use neural networks to find properties of classical systems. In both of those papers, they use a variational principle to estimate the equilibrium properties, and they use neural networks as variational ansatzes. So we thought of using this kind of framework, especially the variational framework, to actually simulate the annealing paradigm. And to do that, we use as a backbone, for the quantum case, the variational Monte Carlo (VMC) method, which I explain in the next slide. So this is going to be the most technical slide of my talk; I apologize about it already. So VMC is actually a quantum Monte Carlo method that is used to simulate ground-state properties of quantum systems at zero temperature. The way it does that is by considering the so-called variational energy, which is the expectation value of a quantum Hamiltonian over some variational state, right? And you can prove that basically this quantity is an upper bound to the exact ground-state energy, whatever ansatz you consider. So you just have to minimize it.
So in practice, what you do is that you replace this quantum expectation value by a statistical expectation value, where you basically take the average of some kind of local quantity over samples that have been generated according to this probability distribution. By construction, this probability distribution is positive, and so there is no sign problem in variational Monte Carlo, which is different from other quantum Monte Carlo methods, right? And within this framework, you need to optimize the ansatz. This is an exact formula that you can use to find the gradient with respect to each parameter that you have in your ansatz; you just have to replace the quantum mechanical expectation values by statistical expectation values. And then you use your favorite gradient-based optimizer to basically update the parameters that you have in your ansatz, right? The next question you can ask is which kind of ansatz you are going to choose inside this kind of variational Monte Carlo. Here we got inspired by natural language models, very powerful models that are actually used to capture the distribution of sequences of words, the main characteristics of this distribution of word sequences in natural language. And when they do that, they can actually predict what is the next word, given some input. I give an example here of text generation: basically, I take a neural network that has been trained, right? I put as input the sentence "African Physical Society International Conference" and I ask the network to generate some text. And you see basically that it generates pretty coherent text. So this is machine-generated text, not human-generated text. I don't know whether you guys talked about climate change today, but that's what the neural network is saying. And this was basically generated using this website.
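As a concrete, deliberately tiny illustration of the VMC estimator just described, here is a sketch in Python. It uses a hypothetical two-spin transverse-field Ising Hamiltonian and an arbitrary, made-up positive trial state (not the ansatz from the talk), and checks that the sampled average of the local energy reproduces the variational energy, which in turn upper-bounds the true ground-state energy:

```python
import numpy as np

# Toy VMC energy estimate for a 2-spin transverse-field Ising model:
# H = -J s1 s2 (diagonal part)  - Gamma * (spin flips) (off-diagonal part).
# The trial state psi below is an arbitrary illustrative choice, and we check
# that the sampled mean of the local energy
#   E_loc(s) = sum_s' H[s, s'] psi(s') / psi(s)
# equals the exact variational energy <psi|H|psi> / <psi|psi>.

rng = np.random.default_rng(0)
J, Gamma = 1.0, 0.7

def spins(b):
    """Map basis index b in {0..3} to two spins s_i in {+1, -1}."""
    return 1 - 2 * ((b >> np.arange(2)) & 1)

H = np.zeros((4, 4))
for b in range(4):
    s = spins(b)
    H[b, b] = -J * s[0] * s[1]            # classical (diagonal) coupling
    for i in range(2):
        H[b, b ^ (1 << i)] = -Gamma       # transverse-field spin flips

psi = np.array([1.0, 0.5, 0.5, 0.8])      # unnormalized trial amplitudes (assumption)
p = psi**2 / np.sum(psi**2)               # Born probabilities: positive, no sign problem

e_loc = (H @ psi) / psi                   # local energies for all basis states
samples = rng.choice(4, size=200_000, p=p)
e_vmc = e_loc[samples].mean()             # Monte Carlo estimate of the variational energy
e_exact = psi @ H @ psi / (psi @ psi)     # exact variational energy for comparison
```

The key structural points from the slide are visible here: the sampling distribution is positive by construction, and the quantum expectation value is replaced by a plain statistical average of a local quantity.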
What you see is that the neural network has learned context, right, on how to generate coherent sequences, and that it has learned the correlations in the different inputs that are given to it. So, in a more concise, let's say pictorial, way of understanding how these sentences or these words are generated probabilistically, I give here an example of a recurrent neural network. Let's say that I want to see how this sentence I have highlighted in yellow is being generated by this kind of probabilistic model. You give as input, what I chose, "APS International Conference", you pass it through the RNN, and it generates a word according to some probability; in this case, it generates "also". So basically, you have the conditional probability of the word "also" given this input, right? Then, in the next RNN step, you give this word as an input and it generates the next word, and so on and so forth, until it generates the last word of the sentence. And you see that if you take the product of these conditional probability distributions, you actually get the probability of having all the words in your sentence given the input. Now, I have been talking about sentences; this is not related to physics. But imagine that I want to generate actually a spin configuration, right? So instead of having words, I have spins that could be spin up or spin down. You can still use the same kind of autoregressive sampling, whereby you start with some input, pass it through your RNN, and it generates a new spin, and so on and so forth, until you basically generate your whole spin configuration, right? This h I have written here is the hidden state of the RNN; it encloses information about the previous spins, so it somehow captures the different correlations that you have in your spin configuration.
What is interesting is that, using the chain rule of probability, you can actually obtain the joint probability distribution of generating this spin configuration. And we can use that to model the amplitude of our wave function; in this case, for example, for a so-called stoquastic Hamiltonian, you can posit the wave function as the square root of this probability distribution. This kind of autoregressive sampling actually does not have any autocorrelation time, compared to Markov chain Monte Carlo, for example, where, if you have to generate spin configurations, you have autocorrelation times, and for the glassy systems we are interested in, they can be very, very long. It is also directly parallelizable; this is another advantage of it. And by construction, the wave function is normalized to unity, which is different from other kinds of neural networks like convolutional networks or restricted Boltzmann machines. So now that we have the ansatz that we are going to use inside our variational Monte Carlo scheme, I can move on to explain to you how we perform our variational annealing scheme in the quantum case. What we do is that we have these energy levels with respect to some sort of transverse field, right, that you are going to tune somehow to perform the annealing evolution. Usually, when you want to implement quantum annealing, you want to prepare your system in the ground state of a Hamiltonian that is very easy. So what we do with RNNs is that we randomly initialize the RNN wave function with random weights and biases, and, to prepare the system in this ground state, we simply do variational Monte Carlo, that is, we apply some gradient descent steps and we land on the ground-state energy of that Hamiltonian. Next we need to do the annealing procedure, so we time evolve the system.
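The autoregressive sampling and chain-rule normalization just described can be sketched as follows. The "RNN cell" here is a tiny hand-initialized tanh recurrence, an illustrative assumption standing in for the trained networks in the talk; the structural point is that spins are drawn one at a time from conditionals p(s_i | s_<i), and the product of those conditionals gives an exactly normalized joint distribution, so psi(s) = sqrt(p(s)) is normalized too:

```python
import numpy as np
from itertools import product

# Minimal autoregressive sampler over N binary spins with a toy recurrent cell.
# All weights are arbitrary (an assumption standing in for a trained RNN).

rng = np.random.default_rng(0)
N, H = 4, 3                                  # number of spins, hidden units
Wh = rng.normal(0, 0.5, (H, H))              # hidden-to-hidden weights
Ws = rng.normal(0, 0.5, (H,))                # previous-spin input weights
Wo = rng.normal(0, 0.5, (H,))                # hidden-to-output weights

def cond_prob_up(h):
    """Conditional p(s_i = +1 | hidden state h), via a sigmoid readout."""
    return 1.0 / (1.0 + np.exp(-(Wo @ h)))

def step(h, s_prev):
    """One recurrent update: new hidden state from old state and last spin."""
    return np.tanh(Wh @ h + Ws * s_prev)

def joint_prob(s):
    """Chain rule: p(s) = prod_i p(s_i | s_<i)."""
    h, s_prev, p = np.zeros(H), 0.0, 1.0
    for si in s:
        h = step(h, s_prev)
        pu = cond_prob_up(h)
        p *= pu if si == 1 else 1.0 - pu
        s_prev = si
    return p

def sample():
    """Draw one configuration autoregressively: no Markov chain, no burn-in."""
    h, s_prev, s = np.zeros(H), 0.0, []
    for _ in range(N):
        h = step(h, s_prev)
        si = 1 if rng.random() < cond_prob_up(h) else -1
        s.append(si)
        s_prev = si
    return s

# Normalization check: joint probabilities over all 2^N configurations sum to 1.
total = sum(joint_prob(s) for s in product([-1, 1], repeat=N))
```

Because each configuration is drawn independently, there is no autocorrelation between samples, which is the contrast with Markov chain Monte Carlo made in the talk.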
What happens is that when we change the parameter in the Hamiltonian, we actually leave the instantaneous ground state of the Hamiltonian, and we need to fall back there, so we perform gradient descent steps to fall back there. Then we perform another annealing step and then some gradient descent steps, and so on and so forth. And then, hopefully, when we have removed all the quantum fluctuations in the system (this is supposed to be Gamma equal to zero), we should find ourselves in the ground state of the problem Hamiltonian that we are interested in solving. So this is how we perform variational quantum annealing, and we came up with a variational adiabatic theorem that basically bounds the number of gradient descent steps that we are supposed to implement if we want to remain adiabatic. This delta_min is basically the minimum gap during annealing, and epsilon is basically the overlap between the instantaneous ground state and all the excited states. To test that this is actually working, we consider the quantum Ising chain, right, where we have a time-dependent transverse field. As a metric, we look at the instantaneous expectation value of the Hamiltonian at a given annealing time. So we plot this instantaneous energy with respect to the transverse field as we perform the annealing, and we find that the green curve and the black curve fall exactly on top of each other, where the black curve is the exact energy of the quantum Ising chain at each value of gamma. What we call transfer learning here is that, between two adiabatic steps, we use as input the weights and the biases of the previous adiabatic step, to somehow carry along the annealing progress. When we don't do that, that is, when between two subsequent adiabatic steps we randomly re-initialize our RNN and use the same number of gradient descent steps per annealing step, we find this no-transfer-learning curve, where you basically go out of the adiabatic regime.
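To make the annealing loop with transfer learning concrete, here is a deliberately minimal single-spin sketch. A one-parameter state psi(theta) = (cos theta, sin theta) stands in for the RNN wave function, gradients are exact rather than sampled, and "transfer learning" is simply reusing theta between annealing steps; all specifics (h, gamma0, the learning rate, the step counts) are illustrative assumptions:

```python
import numpy as np

# Toy variational quantum annealing on a single spin with
# H(Gamma) = -Gamma * sigma_x - h * sigma_z.
# Ansatz: psi(theta) = (cos theta, sin theta), so
#   E(theta, Gamma) = <psi|H|psi> = -h cos(2 theta) - Gamma sin(2 theta),
# whose minimum over theta is the exact ground energy -sqrt(h^2 + Gamma^2).

h, lr = 1.0, 0.1

def energy(theta, gamma):
    return -h * np.cos(2 * theta) - gamma * np.sin(2 * theta)

def grad(theta, gamma):
    return 2 * h * np.sin(2 * theta) - 2 * gamma * np.cos(2 * theta)

rng = np.random.default_rng(0)
theta = rng.uniform(0, np.pi)        # random initialization of the "weights"

# Warm-up: gradient descent to the ground state of the easy (large-Gamma) Hamiltonian.
gamma0 = 2.0
for _ in range(200):
    theta -= lr * grad(theta, gamma0)

# Annealing: shrink Gamma linearly; "transfer learning" = theta is carried over
# from one annealing step to the next, with a few descent steps to fall back
# onto the instantaneous ground state each time.
for gamma in np.linspace(gamma0, 0.0, 51):
    for _ in range(5):
        theta -= lr * grad(theta, gamma)

e_final = energy(theta, 0.0)         # should approach the exact ground energy -h
```

Re-randomizing theta at each annealing step instead of carrying it over (the "no transfer learning" scheme) would lose the adiabatic tracking, which is the contrast drawn in the talk.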
So in this sense, the adiabaticity, or the annealing, is well captured by this kind of transfer learning procedure. We see also that this quantity, which is basically the variational energy of the RNN minus the exact energy, that is, the error in finding exactly the ground-state energy during annealing, is much lower than the gap. So the instantaneous variational principle is respected at every step, right? This slide tells us that quantum annealing dynamics can be well captured within this kind of variational protocol. Next I will move on to explain the classical part of variational annealing, which is basically this. Here now we have a new cost function, a variational free energy, which is basically made up of two terms: the expectation value of the problem Hamiltonian over the probability distribution encoded by the RNN, minus a temperature, following some annealing schedule, multiplied by the Shannon entropy of that distribution. This entropy is easily... (You have five more minutes, sorry. Oh, fine. Thanks.) ...is easily estimated when using an RNN, which is another advantage of using RNNs compared to other variational ansatzes. And this variational free energy has a nice property, like what we had in quantum annealing, in the sense that it is also an upper bound to the true free energy. So we just have to minimize it at each annealing step. We checked that also; as a proof of principle, we use the Ising model and we look at the instantaneous free energy as we change the temperature, and again we find that we are able to capture the classical annealing procedure here. All right. What I would like to mention is that a similar procedure was used in the article I mentioned before, but they were using it to avoid mode collapse as they were trying to simulate the dynamics of classical systems; here we use it to solve optimization problems. Next, I will move to results.
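As a small sanity check of the variational free energy F = ⟨H⟩ − T·S described above, here is a sketch on a four-spin classical Ising chain, with a product (Bernoulli) variational distribution standing in for the RNN (an illustrative assumption); by exact enumeration, the best variational F upper-bounds the exact free energy −T log Z, as stated in the talk:

```python
import numpy as np
from itertools import product

# Classical Ising chain of N spins (open boundaries), a hypothetical toy instance.
N, J, T = 4, 1.0, 0.5

def ising_energy(s):
    """H(s) = -J * sum_i s_i s_{i+1} for s in {-1, +1}^N."""
    return -J * sum(s[i] * s[i + 1] for i in range(N - 1))

states = list(product([-1, 1], repeat=N))
E = np.array([ising_energy(s) for s in states], dtype=float)

# Exact free energy F_exact = -T log Z by brute-force enumeration.
Z = np.sum(np.exp(-E / T))
F_exact = -T * np.log(Z)

# Variational distribution: independent spins with p(up) = theta, a product
# ansatz standing in for the RNN's autoregressive distribution.
def var_free_energy(theta):
    probs = np.array([np.prod([theta if si == 1 else 1 - theta for si in s])
                      for s in states])
    S = -np.sum(probs * np.log(probs))     # Shannon entropy of the ansatz
    return probs @ E - T * S               # F = <H> - T S

# Minimize the variational free energy over the single parameter theta.
F_best = min(var_free_energy(th) for th in np.linspace(0.01, 0.99, 99))
```

The gap between `F_best` and `F_exact` reflects the limited expressiveness of the product ansatz; a richer (e.g. autoregressive) distribution can close it, which is the motivation for using RNNs here.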
So we have the first results on the random Ising chain, which is given by this problem Hamiltonian; we are trying to find the minimum of this problem Hamiltonian. We have basically random couplings between nearest-neighbor spins, and we consider two kinds of random couplings: one is a uniform disorder between minus one and one, and the other one is a discrete disorder. As a metric, we use what we call the residual energy, which is the expectation value of the problem Hamiltonian at the end of annealing, with respect to the probability distribution of the RNN, minus the exact ground-state energy that we know for this random Ising chain. We consider three different system sizes, and at the end of the annealing, since the autoregressive sampling is not that expensive, we just generate a million samples. We consider 25 random realizations of the disorder, and the RNN that we use is a tensorized RNN; I don't have time to explain what it is. So here are the results: we have the residual energy per spin with respect to the number of annealing steps. I should mention here that for both variational classical annealing and variational quantum annealing we use the same annealing speed, so that it should be a fair comparison that we do here. What we notice is that when the annealing is fast, when we are doing some kind of quenching of the system, variational quantum annealing is superior to variational classical annealing. But when the annealing is slow enough, that is, when the number of annealing steps is large, we see that variational classical annealing is superior to variational quantum annealing. And this is the same thing that we observe for the discrete disorder. Actually, these are very interesting results, because they are different from what was found by Zanca and Santoro, where they used other formulations of classical and quantum annealing, and they found quantum annealing superior to classical annealing.
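The residual-energy metric can be illustrated on the random Ising chain itself, using the fact that on an open chain every bond can be satisfied independently, so the exact ground energy is −Σ|J_i|. The "samples" below are just uniform random configurations standing in for RNN samples (an assumption for illustration), and the specific sizes and seed are arbitrary:

```python
import numpy as np

# Residual-energy sketch for the random Ising chain H = -sum_i J_i s_i s_{i+1}.
# On an open chain each bond can be satisfied independently (set the sign of
# each successive spin), so the exact ground-state energy is -sum_i |J_i|.

rng = np.random.default_rng(1)
N = 20
J = rng.uniform(-1, 1, N - 1)            # uniform disorder, one realization
E0 = -np.sum(np.abs(J))                  # exact ground-state energy

def chain_energy(s):
    """Energy of a configuration s of +-1 spins."""
    return -np.sum(J * s[:-1] * s[1:])

# Stand-in for samples at the end of annealing: uniform random configurations.
samples = rng.choice([-1, 1], size=(10_000, N))
e_res = np.mean([chain_energy(s) for s in samples]) - E0   # residual energy
```

In the talk this quantity is computed from the RNN's samples at the end of annealing and averaged over disorder realizations; a good annealer drives `e_res` toward zero, while unstructured samples leave it large.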
And more importantly, they found that for quantum annealing the residual energy was going down logarithmically, whereas here we see some sort of power-law scaling. So here we get variational annealing superior to its simulated counterparts, both classical and quantum. Next we move to a harder model in 2D, the Edwards-Anderson spin glass, which is characterized by this kind of problem Hamiltonian. We consider first a 10 by 10 spin system, and this is the kind of architecture that we use for the RNN. We see again that the blue data, VCA, is superior to the green data, so here again VCA is superior to VQA. We tried what we call regularized VQA, whereby we inject fictitious thermal fluctuations inside the quantum annealing search; it is a little bit better, but still inferior to classical annealing. And for this green data, the optimization of this problem Hamiltonian is done on our ansatz, so it depends on the number of gradient descent steps, and we see that as we train the models the residual energy goes down, but it is still orders of magnitude away from VCA. Next we compare VCA with simulated annealing and SQA, simulated quantum annealing, performed with path-integral Monte Carlo, on the 40 by 40 spin system. And we see that for long annealing times, VCA is basically three orders of magnitude better than both simulated annealing and simulated quantum annealing. And so with that, I conclude. We found that there are actually two different paradigms here: one of doing a search in a configuration landscape and one of doing a search in a variational-parameter landscape, and it looks like the search in the variational landscape is better than the one in the configuration landscape. We found that variational classical annealing is better than simulated annealing, variational quantum annealing, and simulated quantum annealing, so we advocate that it is a good candidate for real-world optimization problems.
And we find that autoregressive neural networks are actually very powerful and can drive this variational emulation of both classical and quantum annealing. Right. And with that, I would like to thank my collaborators, especially Mohamed, who was the main driver behind the project, did most of the simulations, and is especially the expert on neural networks. And for the purpose of this conference, I will mention that Mohamed is actually [unclear]. All right, with that, I thank you very much for your attention, and I think I have time for questions. Okay, thank you, Estelle, for this very nice talk, and thank you for sticking to the time that was allocated. There was actually already a set of questions from a participant on the chat, so I will ask him to unmute himself and ask maybe a couple of those questions; the rest will come in the discussion session later. Okay, Estelle, can you hear me? Sure. Yeah. Okay. So I have three questions that I really need to understand. The first question is, I am not really able to cope with the fact that classical annealing is superior to quantum annealing; to what extent, and how do you perform the search? Yeah. Okay, I mean, usually, for most of the problems that have been tested using similar kinds of methods, people have found quantum annealing to be superior to classical annealing, though there are some problems where classical annealing was superior to quantum annealing. But the main idea is like what I mentioned. So wait one second; if I do this, and I come here. So basically the idea is about what these barriers look like. If you imagine that you have thin barriers, probably quantum annealing is going to be better. If you have very large barriers that are not too high, thermal annealing is probably going to be better. But typically you don't know what is the shape of this landscape.
For the results that we have obtained, for example, if I look at this 2D model, it is not clear to us why VCA is better. But one intuition that we have is that, when we include some kind of fictitious thermal fluctuations in the quantum annealing simulation, we find that quantum annealing gets a bit better. But this is not the true quantum entropy that we have introduced. So maybe, if we were able to use the true free energy of the quantum simulation, it is possible for variational quantum annealing to be better than variational classical annealing. But we don't have very strong arguments for that. Yeah. Yeah, okay, but what about the annealing time in classical and quantum annealing, compared? I mean, it is the same; we use the same number of annealing steps for both of them, so the speed of the annealing is the same for both quantum and classical annealing. In this case, we use a linear schedule for both of them. And the same initial conditions. Okay, so the second question is, I wish to know, is there any possibility to consider the contribution of the environment, that is, a stochastic contribution, in classical annealing? As you know, practical implementations of this may necessarily involve some stochastic contribution from the environment. Yeah, I mean, this is something we thought about; we would indeed be interested to include the environment contribution and see how it is going to change things. There has been work in quantum annealing showing that the environment actually hampers the quantum annealing schedule: you basically have some sort of minimum in the residual energy with respect to the annealing time, and as you increase the strength of the noise, the annealing basically starts losing its advantage. But here, yeah, it would be interesting to include it. So I think we can stop for now and we can go to the next speaker.