I mean to introduce the speaker, Giuseppe Carleo. So Giuseppe actually finished his PhD here in Trieste, at SISSA, some time ago. After that he did postdoctoral stays at the Institut d'Optique in France and also at ETH Zurich in Switzerland. Then he joined the Flatiron Institute, I think, in New York, and since recently he's a professor at EPFL, where he has his research group. So we're happy to have you here, Giuseppe, and please go ahead with your lecture. Thank you, Asya, for the introduction, and of course thanks to all the organizers for having me here today. I'm always very happy to be at ICTP, even though this time only virtually, and I will have to skip the nice view of the sea, but I'm really happy that this school could still be done online. So today I will tell you mostly about applications of what Filippo started introducing in his lecture. As you've understood, these are essentially applications of machine-learning ideas to many-body, interacting quantum systems. And as you are seeing during this school, this is part of something much broader that is happening in physics, namely the application of machine learning to several realms of physics, from particle physics to chemistry, statistical physics, and also what we're going to talk about today, which is mostly quantum physics; you will see more, I guess, from Juan Carrasquilla in the next lecture. So this is a review where you can find an overview of what's going on in the field and all the explosive developments that have happened in the past couple of years. Now, just as a short, one-slide summary of what you've seen during the last talk: you've seen this idea of neural-network quantum states. Again, these are a parametrization of your variational wave function, of the quantum state describing a complex quantum system. What you do is that you have a non-linear function that, given an arbitrary set of quantum numbers (for example spin quantum numbers, or electronic positions, whatever your quantum system has), will output the amplitude psi(s), where s is the ensemble of these quantum numbers. And these amplitudes of the wave function, which in general are complex as you know, will depend parametrically (this is why this is called a variational approach) on some parameters, for example those of a deep neural network. So you've seen this form, maybe not written this way, but you've seen it during the initial lecture. This is a deep network, and you read it from right to left. The first variable that you see here is a vector s, which is the ensemble of your quantum numbers, for example plus or minus one for a spin system. Then you apply a linear transformation, a matrix W; these are your parameters, the things that you can change. Then you apply, component-wise to all the entries of this vector, a non-linear function g, for example the ReLU or any other non-linear function that you've seen during the other lectures. You repeat this operation several times until you reach the final layer of the network, where in this specific case you have only one output. This single output is, for a given choice of the input quantum numbers, the amplitude of the wave function, that is, the braket value ⟨s|psi⟩. So this is the main idea of these neural-network quantum states.
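To make this parametrization concrete, here is a minimal sketch (my own illustration, not the exact architecture on the slides) of a feed-forward network that takes a spin configuration s and returns a single complex amplitude ⟨s|psi⟩; the complex output is obtained simply by letting the weights and biases be complex.

```python
import numpy as np

def nqs_amplitude(s, weights, biases):
    """Feed-forward neural-network quantum state: maps a spin configuration s
    (entries +1/-1) to a single complex amplitude <s|psi>.  weights/biases are
    lists of complex matrices/vectors: the variational parameters w."""
    x = np.asarray(s, dtype=complex)
    for W, b in zip(weights[:-1], biases[:-1]):
        x = np.tanh(W @ x + b)          # linear map followed by a nonlinearity g
    # last layer: a single output unit; we exponentiate so that amplitudes
    # can span many orders of magnitude
    return np.exp(weights[-1] @ x + biases[-1])[0]

# toy example: 4 spins, one hidden layer of 8 units, random complex parameters
rng = np.random.default_rng(0)
N, M = 4, 8
weights = [rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N)),
           rng.normal(size=(1, M)) + 1j * rng.normal(size=(1, M))]
biases = [rng.normal(size=M) + 1j * rng.normal(size=M),
          rng.normal(size=1) + 1j * rng.normal(size=1)]

s = np.array([1, -1, 1, 1])
print(nqs_amplitude(s, weights, biases))
```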
And then, as Filippo was also anticipating, and as you've maybe seen in the previous lectures, one of the reasons why we want to use this kind of approximations is that they are very powerful: non-linear functions are very good at describing high-dimensional objects, high-dimensional functions. This is based on some theorems, which are in a sense modern reformulations of a famous representation theorem by Kolmogorov and Arnold, rewritten in terms of neural networks. Essentially, what these theorems say is that if you have a sufficiently large neural network, with sufficiently many neurons, then you can represent an arbitrarily high-dimensional function, provided it is sufficiently regular. Regular means, roughly, not diverging anywhere and continuous. These conditions are typically met by wave functions, and that's why we can also use these neural networks to describe wave functions; but as you've seen in other applications, they're used in many other cases, for example to classify images and so on. Now, from a physical point of view, you might also have heard about entanglement. Entanglement is this property of the wave functions of quantum systems: if I make a measurement on some part of the system, let's call it part A, that is far away from another part, part B, then the outcome of the measurement on A will directly influence successive measurements that I do on B, the other part. And it is possible to show that if you have a parametrization of your state in terms of a neural network, you can actually find states that are entangled even when these two parties A and B are very far apart. This is what is sometimes called volume-law entanglement: the entanglement scales like the volume of the system and not like the surface, as in some other cases. For example, if you have a deep network like a convolutional network, which I'm sure you've seen in the previous lectures, it has been shown that if you want to encode these long-range correlations, this long-range entanglement, the depth of the network should scale at most polynomially with the number of spins, the number of degrees of freedom of the system. This is a very important property, because it tells you that you don't need exponentially large neural networks, for example, to describe this fundamental property of quantum systems which is entanglement. Now, as Filippo was also mentioning, there are mainly two kinds of applications. One is about simulating quantum systems, and it is what both Filippo and I will focus on today. But there's also another part of the story, which is about characterizing quantum hardware, or learning wave functions from experiments: if I have an experiment that contains, so to speak, a certain wave function, I can try to represent that wave function on my computer using these representations. This is not what I'm going to talk about today, because we don't have enough time; I will concentrate on applications in the first realm, simulating quantum systems. It should be stressed, and this is very important, that these kinds of applications are rather different from the applications you've already seen, I guess, during this school, where you have data sets.
So in standard applications of machine learning, you have a lot of data, generated for example by taking images of cats or dogs, and then you try, for example, to classify these images using these very large databases. However, in the applications that we do here, we don't use databases; in a sense, we self-generate them. This is the sampling step that Filippo was discussing for a long part of his talk, which is essential to compute, for example, expectation values of quantities in quantum systems. So in this sense, this kind of application is self-learning: we don't have an external, pre-solved solution of our problem, but we try to find it on the fly. It is somehow similar to learning by yourself how to walk, without having somebody showing you how to walk. Okay. Now, concerning the simulation of quantum systems, again there are several applications. One is finding an approximation of the ground state of a given Hamiltonian H, or of some excited states, or simulating unitary dynamics, so solving, if you want, the time-dependent Schrödinger equation; and there are even cases where we approximately solve for the dynamics of the density matrix of the system, so for open systems, or for finite temperature; in some cases this can be done too, so ask me later if you're interested. Let's focus first on this part of the story, which is the ground-state search. You've seen from Filippo, again a one-slide reminder, that we consider the variational principle for the expectation value of the Hamiltonian, the object that describes the interactions in my system. We know that this expectation value is always larger than (or equal to) the exact ground-state energy. So what we do is to minimize this quantity, E(w), as a function of the parameters w that are in the neural network. A very important point is that you can rephrase this as an expectation-minimization problem: I have a probability distribution, which is |psi|² as Filippo was outlining, and we minimize the expectation value of a quantity, the local energy defined by Filippo, over this probability distribution (a minimal numerical illustration of this rewriting is included just below). This is the main step that we do during this optimization, this variational learning of the network. Now, I won't go too much into the details of the theory, because you've already seen some of it from Filippo; I will give you an idea of the applications and the flavor of what we can do. So the first kind of application that we do, for example in condensed matter, is about studying interacting spin models. You know that in some cases you can have an effective Hamiltonian, even for an electronic system, for a system of interacting electrons, that reduces to spin degrees of freedom only. So we freeze the translational degrees of freedom, the fact that the electrons can move around, and imagine that they sit on a square lattice, for example, like in this case; what is left are only spin-up or spin-down degrees of freedom on each site.
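As a toy illustration of this rewriting (my own example, by exact enumeration on a tiny system rather than Monte Carlo sampling): the variational energy E(w) = ⟨psi|H|psi⟩/⟨psi|psi⟩ is exactly the average of the local energy E_loc(s) = ⟨s|H|psi⟩/⟨s|psi⟩ over the distribution |psi(s)|², which in a real calculation is estimated by sampling configurations s from |psi|².

```python
import numpy as np
from itertools import product

def variational_energy(psi_fn, H):
    """E(w) = sum_s |psi(s)|^2 E_loc(s) / sum_s |psi(s)|^2, with the local
    energy E_loc(s) = <s|H|psi>/<s|psi>.  Exact enumeration over all 2^N
    configurations of a small spin system; in practice the sum over s is
    replaced by Monte Carlo samples drawn from |psi|^2."""
    N = int(np.log2(H.shape[0]))
    configs = list(product([1, -1], repeat=N))
    psi = np.array([psi_fn(np.array(s)) for s in configs])
    e_loc = (H @ psi) / psi          # local energies
    p = np.abs(psi) ** 2             # unnormalized probabilities |psi(s)|^2
    return np.real(np.sum(p * e_loc) / np.sum(p))

# toy check on 3 spins: a random Hermitian H and a random "wave function"
rng = np.random.default_rng(0)
A = rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8))
H = A + A.conj().T
amps = rng.normal(size=8) + 1j * rng.normal(size=8)
psi_fn = lambda s: amps[int("".join("0" if x == 1 else "1" for x in s), 2)]
print(variational_energy(psi_fn, H))
print(np.real(amps.conj() @ H @ amps / (amps.conj() @ amps)))   # identical value
```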
So one famous model, the one I will discuss today, belongs to this family of models with what is called an exchange, or Heisenberg, interaction. Essentially you have an interaction between spins, S_i · S_j, where these are vectors of Pauli matrices sitting on two sites of a two-dimensional square lattice, for example only between nearest neighbors of the lattice; this is the first term here, with coupling J1. Then you can also have interactions between second-nearest neighbors, and this is the J2 term, on the diagonals of the square lattice (a small sketch that builds this Hamiltonian explicitly for a tiny lattice is included at the end of this paragraph). One reason why we use this model as a benchmark is, first of all, that it's easy to write down, and also that we don't know its phases exactly, because it's very hard to solve: we don't have any other technique that can solve it, either analytically or computationally, in a controlled way. For example, one question we would like to understand in this model is whether, when J2 becomes comparable to this other coupling, J1, you can have a phase of... Sorry, I think we lost sound. We lost it, okay. Sorry, what? We lost the sound for a while. Okay, can you hear me now? Yes. Okay, so I was saying that this spin-liquid phase would be a disordered phase of these spins, which contrasts with the ordered cases: for example, when J1 dominates, so when this part dominates, we have this kind of ordering, and when J2 dominates you have this other, striped kind of ordering. So what we'd like to do is essentially find good approximations of the ground state of this very challenging model. One way to do this, which we started doing in these works, is to take a neural network that is a convolutional neural network, a very successful architecture that people use in image-recognition problems to recognize cats and dogs, and use it as a wave function, a two-dimensional wave function. So we take this kind of architecture, which I guess you've already seen during these lectures, and we use this kind of representation; here the weights are essentially the entries of these small square matrices, which are generally called filters. Then what we can do is see what accuracy we get on the energy of the ground state for some values of J2 and J1, so for some values of the problem. If we take, first of all, J2 equal to zero, so that in this Hamiltonian we don't include the diagonal interactions, only these nearest-neighbor interactions here, this is the standard Heisenberg model. What I plot here is the error that you make on the energy, which in this case can be computed exactly with other techniques, as a function of the size of the network. Here we use an RBM, which Filippo was also talking about, and alpha is essentially the width of the neural network: the larger alpha, the wider the network. So this is a shallow, very shallow, essentially one-layer neural network, but you can make it more expressive by enlarging it horizontally. And you see that already at that time we were able to improve on some of the then-best variational results that people had obtained on this model with other wave-function ansätze. And if you start using networks that are more expressive because they are deep (here we are not using deep networks, just these simple one-layer neural networks), if you play a bit more, increase the depth, and use networks closer to the state of the art, you can also systematically improve on this.
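To make the model concrete, here is a small sketch (my own, for a tiny lattice with open boundaries, not the production code) that builds the J1-J2 Hamiltonian H = J1 Σ_⟨i,j⟩ S_i·S_j + J2 Σ_⟨⟨i,j⟩⟩ S_i·S_j as an explicit matrix from Pauli operators; for real system sizes one would of course never build the dense matrix, but work with local operators and sampling, as in a dedicated library such as NetKet.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def two_site(op, i, j, N):
    """Tensor product placing Pauli `op` on sites i and j, identity elsewhere."""
    mats = [op if k in (i, j) else np.eye(2) for k in range(N)]
    out = mats[0]
    for m in mats[1:]:
        out = np.kron(out, m)
    return out

def j1j2_hamiltonian(Lx, Ly, J1=1.0, J2=0.5):
    """H = J1 sum_<ij> S_i.S_j + J2 sum_<<ij>> S_i.S_j on an Lx-by-Ly square
    lattice with open boundaries; the S_i here are vectors of Pauli matrices."""
    N = Lx * Ly
    idx = lambda x, y: x + Lx * y
    bonds = []
    for x in range(Lx):
        for y in range(Ly):
            if x + 1 < Lx:
                bonds.append((J1, idx(x, y), idx(x + 1, y)))        # horizontal J1 bond
            if y + 1 < Ly:
                bonds.append((J1, idx(x, y), idx(x, y + 1)))        # vertical J1 bond
            if x + 1 < Lx and y + 1 < Ly:
                bonds.append((J2, idx(x, y), idx(x + 1, y + 1)))    # diagonal J2 bond
                bonds.append((J2, idx(x + 1, y), idx(x, y + 1)))    # anti-diagonal J2 bond
    H = np.zeros((2**N, 2**N), dtype=complex)
    for J, i, j in bonds:
        for op in (sx, sy, sz):
            H += J * two_site(op, i, j, N)
    return H

H = j1j2_hamiltonian(2, 2)                 # 4 spins, a 16x16 matrix
print(np.linalg.eigvalsh(H)[0])            # exact ground-state energy, for reference
```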
And I will show you later how you can go even beyond these results. You see that here, for example, the error that you make on the energy becomes very small. Now, things become challenging when you start turning up this J2 interaction, this second interaction on the diagonals of the square lattice. So J2, again, is the interaction on the diagonals, while J1 is the interaction between nearest neighbors on the lattice: I have a spin here, a spin here, and J1 is the interaction between these two spins; J2 is instead the interaction along the diagonal of the square plaquette, okay? Things become challenging, and essentially unsolved, when you turn on this J2: there is no way to solve the problem exactly when J2 is different from zero. So in this case, the only thing that we can do (and this is typical of variational methods) is to compare the energy that we get, for example, with other energies that people have obtained in the past, or are obtaining as we speak, on this model using other approaches. For example, people have used DMRG, matrix product states, quantum Monte Carlo (which can be used only for the specific point J2 = 0, where you don't have the sign problem), and there were also other applications based on other variational wave functions, et cetera. What you see here is the difference between the energies obtained with all these techniques and our energies. Essentially, when these points are up, in this upper half-plane, it means that we have lower energies, so in principle a better approximation for the ground state. Otherwise, when you see points like these, it means that in that specific case that technique gives a slightly better energy than our best approximation as of the end of 2019; this is the state of the art in 2019 for this problem. And you can see that, apart from a small part of the phase diagram, this neural network can already really help you in finding better approximations for the ground state and possibly also settle some open points. I have to say that nowadays we also know how to improve things around here, and hopefully we will soon see some new works where, over the whole phase diagram, the neural network gives essentially the best approximation for the ground state. Now, there are of course challenges and room for improvement, as in all techniques, all approaches, and this is something that I think is very important to discuss and understand in this context too. So one reason for the challenge, for example if you go to this point at 0.5 (maybe somebody has already noticed it), is: why would this point be more challenging than others? One of the reasons, as pointed out in this paper, is essentially the number of samples that you need to generate: how many times you have to run this Markov chain that Filippo was talking about during his lecture in order to get a good estimate, a good idea of what the ground-state properties look like. For example, in this paper what they did is to study the overlap, essentially how good the approximation of your ground state with the neural network is, as a function of the number of samples, or rather of a quantity which is proportional to the number of samples.
And what you see is actually a very nice phase transition: you need a certain number of samples in order to get a good accuracy. What they found is that when you have strong frustration, essentially when J2 is comparable to J1 (for a similar problem, not exactly the same one but closely related), it can happen that the number of samples you need is pretty large, because the kind of states you are trying to learn are very disordered. So it might be that to learn this kind of properties you also need to see a lot of different configurations. This is one of the limitations, if you want, of these learning-based approaches, related to the number of samples. However, and this is quite important, the critical number of samples needed to approximate these wave functions to a reasonable accuracy does not seem to scale too badly, even though we only have small systems; these studies are very hard to do for larger systems. In the sense that if you have 20 spins, you need on the order of 10 to the 3 samples even for very challenging models, and if you have 36 spins it seems you need maybe 10 times more, but not a million or 20 million times more. So this hints at the picture that Filippo was also showing: given this huge Hilbert space of quantum systems, we only want to describe a small portion of it, parametrized by a neural network, or a series of neural networks. And indeed this seems to be a correct picture, in the sense that we can hope that, for ground states of physical systems, this portion can be addressed using a number of parameters which is not exponentially large. This is a hope; it is hard to prove analytically, except in somewhat artificial examples, but in practical systems we see that this typically works. Now, there is an improvement over the older things that I was discussing until now, and that Filippo was also telling you about: this idea of doing Markov chains to generate these many samples from the wave function can be made even faster. I will not go too much into the details, but I will just tell you that there is a family of neural networks called autoregressive neural networks, which can be generalized, as we've done in this work, to quantum systems. This family of neural networks has the property that you can sample from these quantum states without doing Markov chain Monte Carlo, in a completely efficient way. So these are really the incarnation of the definition of computationally tractable quantum states that Van den Nest introduced and that Filippo mentioned in his presentation. Just to give you a flavor of how these things work: essentially you have to make sure that the filters in your neural network are such that each conditional depends only on the previous spins and not on the successive ones. You do a sort of decomposition of the wave function in terms of conditionals, even though these are not probabilities but complex objects, and then you can efficiently impose the normalization condition, so that everything is normalized to one. Again, I will not go too much into the details, but this allows you to do exact sampling, and you don't need Markov chain Metropolis sampling.
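To make the idea concrete, here is a rough toy sketch (my own, not the architecture of the paper): write the wave function as a product of normalized conditionals, psi(s_1, ..., s_N) = Π_i psi_i(s_i | s_1, ..., s_{i-1}), where each conditional depends only on the previous spins; then a configuration can be drawn exactly, spin by spin, from the |psi_i|², with no Markov chain and no rejection.

```python
import numpy as np

rng = np.random.default_rng(1)

def conditional_amplitudes(prev, params):
    """Return the two normalized complex amplitudes psi_i(s_i=+1 | s_<i) and
    psi_i(s_i=-1 | s_<i).  Here the 'network' is just a complex linear map of
    the previous spins; any autoregressive architecture could be plugged in."""
    W, b = params
    h = W @ prev + b if prev.size else b       # depends only on earlier spins
    probs = np.exp(h.real - h.real.max())
    probs /= probs.sum()                       # |psi_i(+1)|^2 + |psi_i(-1)|^2 = 1
    return np.sqrt(probs) * np.exp(1j * h.imag)

def sample(params_list, N):
    """Draw one configuration s and its amplitude psi(s), with no Markov chain."""
    s, amp = [], 1.0 + 0j
    for i in range(N):
        a = conditional_amplitudes(np.array(s, dtype=float), params_list[i])
        choice = rng.choice(2, p=np.abs(a) ** 2)   # exact sampling of spin i
        s.append(+1 if choice == 0 else -1)
        amp *= a[choice]
    return np.array(s), amp

# toy model: 6 spins, each conditional parametrized by a complex linear layer
N = 6
params_list = [(rng.normal(size=(2, i)) + 1j * rng.normal(size=(2, i)),
                rng.normal(size=2) + 1j * rng.normal(size=2)) for i in range(N)]
print(sample(params_list, N))
```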
So just to give you an idea of how good these things are: if you take again the case of the Heisenberg model, which is an important benchmark, you see that if you use this exact-sampling approach, together with a much deeper network, you can bring down your inaccuracy, improve your accuracy, by more than a factor of 10 or so compared to standard deep neural networks, down to a point where, for all practical purposes, this problem is solved exactly. Okay, so these kinds of new ideas, which are really influenced by machine learning, are also making an impact on the techniques that are used in variational calculations. Now, maybe I will take just one or two questions at this point before moving to the next topic. There is a question from Giancarlo, who is asking: what about spin glasses, and what about finite-temperature results? I'm not sure what you mean by spin glasses; do you mean classical spin glasses? Yes, so what about that? I'm not sure I got the question; maybe you can unmute yourself? Giancarlo, you can talk now. Hello, can you hear me? Yes. Yes, thank you. So I was asking about spin glasses, when you not only have frustration but also disorder. Ah, okay, yes. That's a case we have not studied, but it can be addressed. This family of models, for example, has been used classically to study spin glasses, where instead of psi you try to approximate the Boltzmann distribution with a family of autoregressive classical models. I don't want to go too much into it, but yes, you can do that too; there's no problem in doing that. Again, the issue will be the number of samples that you need to learn a spin glass. That was my question: are there any results about that, did anybody do something about that? No, not in the quantum case. Okay, and are you only considering zero temperature? You are not considering any finite-temperature case, right? In this talk I am only discussing ground states, but I will show you an example later of excited states as well. Thank you. Thank you. Okay, so there is this other question, which is: is there a conceptual reason why NQS doesn't work when J2 is, I wouldn't say larger than 0.5, but actually around 0.5, as you can see on this plot here. Because for larger values of J2 you see that our approximation is good; I remind you that if these points are up, it means that we are doing better. You see that it's around 0.5 where, in this paper from two years ago, we were not performing as well on the energy. The reason, as I was telling you, is this sampling issue: essentially the number of samples that you need to learn the state in this frustrated region. Okay, so now I will move to the second part of my talk and discuss something which is quite important, I guess, for those of you who will be interested, in their future research activity, in studying electronic properties or other things related to interacting electrons. What I will discuss now is how we address one of the most fundamental symmetries of nature, the exchange symmetry: the fact that when you exchange two fermions, the wave function changes sign.
So this is one of the first things that you learn when you take a course in quantum mechanics. Now, one of the things you can do (not the only one, but the one I will discuss today) is the following: if you have fermions on a lattice, there are ways to map these fermions onto a spin problem. There is, for example, a very famous mapping devised by Jordan and Wigner that allows you to map a generic fermionic Hamiltonian, a Hamiltonian where you have electrons, onto a Hamiltonian of spins. I will tell you in a second about this mapping, but there are also other mappings, for example one that is maybe less known in condensed matter, devised by Bravyi and Kitaev, which is another way to map fermions to spins and which is in some cases better suited, especially for simulations with quantum computers, but in some cases also for classical ones. Let me give you a rough idea of what the Jordan-Wigner mapping is. You know that if you have fermions, you can describe them with the so-called raising and lowering fermionic operators c and c-dagger; these are anticommuting operators. The idea of the mapping is that you can turn, for example, the destruction operator c_j for a fermion on site j into a spin operator: a Pauli lowering operator, sigma-minus, on the same site, times a so-called string operator, a product of sigma-z on the previous sites. This string, if you want, encodes the sign of the configuration. The same thing can be done for c-dagger. Using this very simple rule, you can transform any Hamiltonian containing fermionic degrees of freedom into a spin Hamiltonian: you turn the generic discrete problem of fermions into a problem of spins, and then you can use all the machinery we've developed so far for spins, exactly (a small numerical illustration of this rule is sketched below). There is an alternative mapping, by Bravyi and Kitaev; unfortunately I won't have time to go into how it works, because it's a bit involved. The main reason why this mapping is interesting is that, instead of mapping these operators onto objects that, like in this case, have N-body interactions (you see that here, through this string product, I'm essentially making up to N spins interact, which is rather unphysical in a sense), the interactions that arise in the spin Hamiltonian with the Bravyi-Kitaev mapping are quasi-local: not exactly local, but quasi-local, in that they involve at most of the order of log N spins. So this is something that can also be used in practical classical simulations.
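Here is a small numerical check of the Jordan-Wigner rule, c_j → (Π_{k<j} sigma^z_k) sigma^-_j (my own toy illustration, with one particular sign convention, not the code we actually use): building the operators this way on a few sites, one can verify that they satisfy the fermionic anticommutation relations.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
sz = np.diag([1.0, -1.0])
sminus = np.array([[0.0, 0.0], [1.0, 0.0]])   # lowers s=+1 to s=-1 in this convention

def kron_all(ops):
    return reduce(np.kron, ops)

def annihilation(j, N):
    """Jordan-Wigner transformed fermionic annihilation operator on site j:
    a string of sigma^z on sites k < j, sigma^- on site j, identity elsewhere."""
    ops = [sz] * j + [sminus] + [I2] * (N - j - 1)
    return kron_all(ops)

# check the canonical anticommutation relations {c_i, c_j^dagger} = delta_ij
N = 4
c = [annihilation(j, N) for j in range(N)]
for i in range(N):
    for j in range(N):
        anti = c[i] @ c[j].conj().T + c[j].conj().T @ c[i]
        expected = np.eye(2**N) if i == j else np.zeros((2**N, 2**N))
        assert np.allclose(anti, expected)
print("Jordan-Wigner operators satisfy {c_i, c_j^+} = delta_ij")
```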
Now, just to give you one example, we've applied this approach to some small molecules, treating the electrons through these fermion-to-spin mappings, to benchmark against other approaches that people use, for example, in quantum chemistry. This is the case of two molecules, C2 and N2, two dimers of carbon and nitrogen. What I'm showing here is the energy of the ground state of these molecules as a function of the nuclear separation: I have two atoms, for example one carbon and another carbon, I can separate them by a certain distance, and I can predict the ground-state energy at that separation. The red line is the exact solution, which you can still obtain for these small molecules, and the green points are the neural-network results. You can see that these are pretty close to the exact ones, for example in this region; they predict the correct dissociation energy, as it's called, and in some cases they even get better results than existing approaches that people have used for years in quantum chemistry, for example the coupled-cluster approach. This is true especially for more correlated molecules like N2, where you can see from this plot that these green points lie below the curves that you obtain with those other methods. Of course, these are small molecules and there's a lot to be done in the future, but this gives you a flavor, again, of how you can apply these techniques to realistic, or close to realistic, systems that are relevant in some cases also for chemistry. I will skip the large table, if you want; you can find it in the paper. And here, as for the other case, the main bottleneck that we encounter in improving the accuracy is, again, the number of samples that you need to learn the wave function, even though here the challenge is different: it stems from how the correlations of the system are structured, and from the fact that there is a single configuration that dominates, the Hartree-Fock configuration, which somehow overshadows all the others. But again, also in this case we found that one bottleneck is the sample size, and there are ways to improve on this; the issue here is not the same as what I presented before for the J1-J2 model. Just as a matter of reference to other works: there have been other approaches that are not based on this Jordan-Wigner or Bravyi-Kitaev mapping on a lattice, but instead work directly with the continuous-space degrees of freedom of the electrons. Most notably, there has been work in the group of Frank Noé in Berlin, later published in Nature Chemistry, where they use this kind of neural-network architectures, and also this paper done by people at Google DeepMind, who are now also interested in fermions. This is an alternative approach, a bit different in spirit, in the sense that they don't work on a lattice but in real space. But the essence of this kind of approach is also that if you take networks that are relatively large, also for systems somewhat larger than what we studied with our approach, you can get competitive energies and start obtaining results for more challenging systems that cannot be addressed with our lattice-based techniques. Now, I believe I have 10 minutes left; please correct me if I'm wrong. We can go until 2:45, so that includes questions. Okay, yeah. I wanted to reserve a few minutes for questions at the end, so let me actually already take a few questions before going into my final part. Okay, so: rather than just computing the energy, is it possible to characterize phase transitions, determine critical exponents? Yes. Essentially, once you have the wave function, you can compute, as Filippo was showing you, arbitrary expectation values of operators on these wave functions.
So if you want, for example, to characterize a phase transition, you would typically like to measure correlation functions of spins, or any other quantity that is important to characterize that phase transition. You can do that for different distances, and from that you can extract, with standard techniques, critical exponents and determine precisely where the phase transition is. Okay, next question: we need to know certain parameters to do a computation on a system; what are the parameters we approximate in machine-learning codes, and what are the inputs used in general for various systems? Yeah, I'm not sure I understand this question in detail. Could you please ask it again, clarifying what you mean? The part "what are the parameters we approximate for ML codes" is the part I don't understand. NetKet, yes, is a software that... it's not like VASP or codes of that kind; it's more focused, if you want, on discrete systems, lattice quantum systems. We typically deal with smaller systems, because we try to solve the Schrödinger equation, not within a DFT approximation: we try to solve for the correlated ground state. But the spirit of all these things is always to find the best approximation for the ground state. When you do DFT, you use a different approximation, which is in general not variational. Here instead we use an approximation which is variational and better suited to characterize quantum systems where you have strong interactions and correlations among the different degrees of freedom. Then there's another question: you showed the case where the coupling is between nearest or next-nearest neighbors, but what if we have long-range couplings? Do we then need to find new networks? Well, one of the main advantages of working with this kind of architectures is that, typically, if you modify the Hamiltonian a little, the wave-function architecture does not need to change too much. In this case of long-range couplings, if you allow your wave function to also have long-range correlations, and as I showed you at the beginning this can happen if you have a sufficiently deep neural network, then typically you can take the same kind of architecture; you don't need to change the neural network too much. Next: does using the Bravyi, I assume here you mean the Bravyi-Kitaev method, give the same results? Yeah, unfortunately I didn't show this, and in practice, yes; this was a bit disappointing, if you want, but if you use the Bravyi-Kitaev mapping for all these models, the variational results that you get with our techniques are more or less equivalent to what you get with Jordan-Wigner. But this is not general; I believe there are other cases where this mapping can be superior, where the state is easier to represent with it. Okay, so now let me move on to the final part of my presentation, and then I will take more questions; I'll be happy to answer as many as I can. In this final part, I will tell you something related, again, to going beyond ground-state properties. Until now I've told you that we were interested in finding approximations for this very important, famous eigenvalue equation, for the ground state psi_0 of my Hamiltonian H. But what if I want to do something that goes beyond the ground state?
So one example that I will discuss, which is not strictly speaking in the realm of condensed matter, or not yet, but is closely related, is what people call quantum circuits. The idea is that in that case you want to simulate, to generate, a state, let's call it psi_K, which is the result of the application of a sequence of unitaries: U_K, U_{K-1}, U_{K-2}, down to U_1, acting on some initial state. Sorry, this is not very clear because I don't have much space here, so let me rewrite it in a better way. A quantum circuit, for those of you who have never heard about quantum circuits, is this very simple statement: I want to generate a state |psi_K⟩ = U_K U_{K-1} ··· U_1 |psi_0⟩, the result of the action of a sequence of unitary operators U onto some initial, possibly trivial, state psi_0. The output of the circuit is then psi_K, the state that you generate at the end, after you've applied this sequence of unitary operators. Now, this is of course very relevant because, for example, when you do real-time dynamics with the Schrödinger equation, you know that the unitary you approximate in that case is just the exponential of minus i H t, assuming the Hamiltonian is time independent. So in the case of standard unitary dynamics with a time-independent Hamiltonian, this exponential would be your unitary U. You can generalize this to cases where you take other kinds of unitaries, and this is what a quantum circuit is, okay? Just to give you an idea, there is then the notion of gates: a gate is essentially one of the unitaries that you put in the circuit, and there are universal sets of gates, sets of local unitaries (local means, as Filippo was saying, that they act only on one or two spins, qubits in this case), such that you can generate an arbitrary quantum circuit by just using, for example, a set of three so-called universal gates. One that I will consider is the Hadamard gate, a local rotation that essentially amounts to putting you in the basis of the sigma-x operator; or you can do a rotation around the z axis, R_z, by some angle phi; or you can do a so-called controlled-Z rotation on two qubits. These are just the building blocks that you can use to build a more complicated quantum circuit. Now, what you can show is that if you have a wave function represented by a neural network, a very simple neural network like an RBM, again a one-layer, very simple-minded network, you can apply these gates to the neural network and typically get out another neural network. For example, if you apply this gate here on this qubit (this is now my set of qubits, and I imagine that I apply my gate, my unitary, only on this qubit here), you can show that the neural network resulting from this operation is another neural network with some of the weights modified, depending on the gate.
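Here is a small numerical check of this statement for the single-qubit Z rotation (my own toy version; the sign conventions, with s_k = +1 corresponding to |0⟩, are just a choice for illustration): the gate R_z(phi) = diag(e^{-i phi/2}, e^{+i phi/2}) is diagonal in the computational basis, so applying it to an RBM only shifts the visible bias of that qubit, a_k → a_k − i phi/2, and the output is again exactly an RBM.

```python
import numpy as np
from itertools import product

def rbm_amplitude(s, a, b, W):
    """RBM wave function psi(s) = exp(sum_i a_i s_i) * prod_j 2 cosh(b_j + sum_i W_ij s_i)."""
    theta = b + W.T @ s
    return np.exp(a @ s) * np.prod(2 * np.cosh(theta))

rng = np.random.default_rng(2)
N, M = 3, 5
a = rng.normal(size=N) + 1j * rng.normal(size=N)
b = rng.normal(size=M) + 1j * rng.normal(size=M)
W = rng.normal(size=(N, M)) + 1j * rng.normal(size=(N, M))

k, phi = 1, 0.7                      # apply R_z(phi) on qubit k
configs = [np.array(s) for s in product([1, -1], repeat=N)]

# state after applying the gate directly to the amplitudes: the gate is
# diagonal, multiplying psi(s) by exp(-i*phi*s_k/2)  (s_k=+1 <-> |0>, s_k=-1 <-> |1>)
psi_gate = np.array([np.exp(-1j * phi * s[k] / 2) * rbm_amplitude(s, a, b, W)
                     for s in configs])

# the same state written as an RBM with the shifted visible bias a_k -> a_k - i*phi/2
a_new = a.copy()
a_new[k] -= 1j * phi / 2
psi_rbm = np.array([rbm_amplitude(s, a_new, b, W) for s in configs])

assert np.allclose(psi_gate, psi_rbm)
print("R_z gate absorbed exactly into the RBM visible bias")
```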
The same thing is true if you apply the controlled-Z rotation, which in this case acts on two qubits, so it's a so-called two-qubit operator: you can show that also in this case the resulting neural network, which is just slightly larger than the previous one, can be written down exactly. So the application of these two gates can be done in an exact way. There is, however, the Hadamard gate, which is the only remaining one needed to implement all possible universal circuits, and which cannot be applied exactly. It is known, from this paper, that if you apply a Hadamard gate to a generic neural network, you cannot generate, strictly speaking, another neural network with a simple form like in this case. What you can do in this case, however, is to use another variational principle, one that is, if you want, more general than the one we've seen so far for ground states. The idea is the following. Imagine that I have my neural-network state psi_w, an arbitrary neural network that depends on some parameters w. Then I act with a unitary, which in this case is just this Hadamard gate (beware that now H is not the Hamiltonian anymore, but the Hadamard gate), a local unitary that acts on some qubit, say this one. In general, the output state will be some other quantum state phi. Now, we know that in general this quantum state phi is not another neural network; otherwise we would just solve the problem exactly. What we can try to do, however, is to approximate this state phi, which in general is an arbitrary quantum state, with another neural network that this time has some parameters w-prime. So you see the problem: I have a generic state phi and I want to approximate it with another neural network with parameters w-prime. This is now an approximation problem, because I want to match these two quantum states as closely as possible. So what you can do is, instead of minimizing the energy as you do for the ground state, define a cost function, the thing that you want to minimize as in a machine-learning approach, which is the overlap; in this case we use the negative logarithm of the overlap, but this detail is not very important. You can maximize the overlap, or minimize minus the log of the overlap. Essentially, you see that when psi of w-prime is equal to phi, this overlap here is equal to one, and the log of one is zero, so this loss function is zero; otherwise, if psi is not close enough to phi, the loss function is non-zero. So this is a sort of energy, if you want, for your system: this L, which depends on the parameters w-prime, is exactly zero only when the two quantum states are identical (a small sketch of this loss is included below). So we can play the same game that we played for the ground state, but this time minimizing not the energy but this loss function, which is related to the infidelity. Why do we care about this? Well, we care because we want to see, for example, how hard it is to simulate a quantum computer classically: whether we actually need to run a certain quantum algorithm on a quantum computer, or whether we can hope to approximate that quantum algorithm on a classical computer.
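Coming back to the loss just defined, here is a minimal sketch of it (my own, computed exactly by enumerating the amplitudes of a tiny system; in the actual method the overlap is estimated by sampling): L(w') = −log [ |⟨phi|psi_{w'}⟩|² / (⟨phi|phi⟩⟨psi_{w'}|psi_{w'}⟩) ], which vanishes exactly when the two states coincide up to normalization and a global phase.

```python
import numpy as np

def overlap_loss(psi_wprime, phi):
    """Negative log of the normalized overlap between the variational state
    psi_wprime and the target state phi (both given as complex vectors of
    amplitudes).  Zero iff the two states are equal up to a global phase."""
    num = np.abs(np.vdot(phi, psi_wprime)) ** 2
    den = np.vdot(phi, phi).real * np.vdot(psi_wprime, psi_wprime).real
    return -np.log(num / den)

# toy check: the loss vanishes for the same state (up to phase and norm)
# and is positive otherwise
rng = np.random.default_rng(3)
phi = rng.normal(size=8) + 1j * rng.normal(size=8)
print(overlap_loss(3.0 * np.exp(1j * 0.4) * phi, phi))                              # ~ 0
print(overlap_loss(phi + 0.3 * (rng.normal(size=8) + 1j * rng.normal(size=8)), phi))  # > 0
```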
Most of the hardness results that we know, most of the things that people will tell you, say that this is a desperate task, because the quantum computer is much more expressive than a classical computer: you can encode exponentially many amplitudes that you cannot handle classically. This is a valid argument, but in many practical cases it is not entirely decisive, because there are several quantum algorithms that can be efficiently approximated classically. So understanding where the boundary between quantum computing and classical computing lies is very important, also for the development of quantum computing itself. Let's take a very simple example. What we do is apply a sequence of Hadamard gates, one on each qubit, to an initial state which is the ground state of the transverse-field Ising model, one of these spin models I was telling you about before. This is the overlap that you get out of this variational approach, and you can see that you can get pretty high overlaps at the end of the circuit, the final state being the one at the end of this sequence of unitaries. You see that the final overlaps that you can get, for example on systems as large as 60 qubits, so comparable to what you can do these days experimentally, are around 98 percent or so in this kind of application, even in two dimensions. And then, for example, you can compare the error that you make in this approach (again, we are approximating this quantum state with a classical neural network, so we make an error in this approximation) with what happens if you run the same circuit on a quantum computer: the final state that you get on the quantum computer will not be exact either, because it is affected by noise. You will have decoherence, the fact that your qubits interact with the external environment or talk to one another; there are all sorts of noise that can disrupt your quantum computation. This is an exercise that we did in this paper, and that has also been repeated later by other people: you can compare this variational noise, the effective error that you make in the variational simulation, to the noise that you have on the quantum computer. You can see, for example, that this is the kind of accuracy you get in these simulations with the neural network, this is the overlap we get, this straight line here, and this is the error that you get on the quantum computer as you change the noise level; this is a simulation of the noise, not an actual device, but a realistic simulation. And you see that essentially, to achieve the same accuracy that you have with the neural network, you need a single-qubit noise level that is quite small, comparable to, actually below, what people can do these days even with state-of-the-art qubits. So again, this tells you that classical computing is very competitive, even for the simulation of complex quantum circuits. And one should always keep in mind that one source of error, which in the classical case is due to your limited approximation power, can be comparable to another source of error in the actual hardware, which is due to decoherence events. So this is the main message.
I will just flash one of our more recent results on another, more complicated quantum algorithm, called QAOA, the Quantum Approximate Optimization Algorithm. I won't go too much into the details, but it's described in this paper. This is what Google has also popularized recently with the work where they implement this algorithm on quantum hardware, on this famous quantum-supremacy architecture, if you want, that they used last year; they've also run these kinds of circuits on the hardware. We show in this work, and in more recent work, that again with an RBM, a very simple-minded RBM, you can do a very good job at describing the output of these kinds of circuits. And the values that you can achieve, the accuracy of these approximations in the regions of interest for the algorithm, are very good compared to what you can get on the actual hardware. The important point is that you can also scale this up to quite large numbers of qubits: this is our simulation for 54 qubits, and for this amount of gates, for this large circuit that we studied, we get an accuracy which we believe, at the moment, is not achievable in the actual experiment. Okay, there was also a comparison with tensor networks, which I will skip; in essence, you cannot easily simulate these circuits with tensor networks, and I can tell you why later if you want. Then there is our software, NetKet; you've already heard about it from Filippo. I reiterate that this software is completely open source: you can download it, work on it, contribute to it if you want. We have a GitHub repository where you can also submit issues if you have problems or if there's something you don't understand, and we do our best to reply to your questions. There's a release 3.0 coming soon, and you will see that it is based on JAX, which is a very nice framework developed by Google to handle neural networks (a rough sketch of what a NetKet calculation looks like is included below). Okay, so I will then leave you with my last slide, which is about challenges and open problems, mostly related to fermions or to optimizing the phases, for example, in these circuits. But the main message that I want to give you today is that if you are interested in studying a given system, there is a good chance (it's not a guarantee, of course) that if you represent its wave function with a neural network, you can typically find a good approximation for the kind of properties you are interested in. Of course, there is a lot of research going on, and there are problems where we don't manage to do this as well as we would like; one example was this infamous 0.5 point in the J1-J2 model, but there are other examples. This is why it is research, and that's why we are here, but it's only thanks to the new generations that we will extend these applications and find more interesting results. So thank you, and I will now try to answer some of your questions.
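Just to give a flavor of what a calculation with the library looks like in practice, here is a rough NetKet 3-style sketch for the ground state of a small Heisenberg model with an RBM; the API names below are written from memory and may differ between versions, so please treat this as a sketch and check the documentation and tutorials in the repository for the exact interface.

```python
import netket as nk

# 4x4 square lattice of spin-1/2, nearest-neighbour Heisenberg model
g = nk.graph.Hypercube(length=4, n_dim=2, pbc=True)
hi = nk.hilbert.Spin(s=1 / 2, N=g.n_nodes)
ha = nk.operator.Heisenberg(hilbert=hi, graph=g)

# RBM wave function, Metropolis sampling, stochastic-reconfiguration optimizer
ma = nk.models.RBM(alpha=2)
sa = nk.sampler.MetropolisExchange(hi, graph=g)
vs = nk.vqs.MCState(sa, ma, n_samples=1008)

opt = nk.optimizer.Sgd(learning_rate=0.01)
sr = nk.optimizer.SR(diag_shift=0.01)

gs = nk.driver.VMC(ha, opt, variational_state=vs, preconditioner=sr)
gs.run(n_iter=300)
print(vs.expect(ha))   # variational estimate of the ground-state energy
```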
Okay, thanks, Giuseppe, for this lecture. I'm not sure whether some new questions have arrived since last time, or whether you want to check the question box yourself. So, there is a general question on codes in condensed matter, which I guess is not directly related to what I was saying today. Another question is: can these ideas be used in high-temperature superconductivity? Potentially, yes. There are some simplified models of high-temperature superconductors, like the famous Hubbard model, which is not realistic but simplified, and in that kind of application what you deal with is a fermionic Hamiltonian, written in terms of the c and c-dagger operators; you can try to find the ground state of these Hamiltonians and see whether they support superconductivity, which in that case is something that can then be measured, if you want, on the resulting state. Has anybody done this yet? The answer is no, because it is a challenging calculation. We did something related to this, but not exactly the Hubbard model; I forecast that this year there will be a lot of papers on this topic, so you can stay tuned. Next: can I suggest some interesting problems to work on for a PhD? Yes, there are several lines of research. One that I was mentioning is related to understanding how many samples we need to learn, to approximate, a given wave function; this is something that should be explored more. Another one would be, for example, better understanding how to impose symmetries, or actually imposing them in a better way than we are doing now. Some families of symmetries are harder to encode; this is a bit more technical, but SU(2) symmetry, for example, is not so easy to enforce using neural networks. So, as a general flavor, one of the things people are working a lot on these days is really understanding how to enforce symmetries in the neural networks that are used for physics, including quantum physics; this is one of the main hot topics, symmetries in neural networks. Next question: when considering a specific Hamiltonian, how do you initialize the neural network, in terms of depth and other architectural aspects? This is the million-dollar question. You don't have a general recipe, because otherwise you would have solved essentially all the problems affecting humankind, but what you typically do is the same strategy that people adopt in machine learning. You know that a certain architecture, for example a convolutional neural network, has been shown to be very effective at identifying images in two dimensions, and it is very effective for a lot of applications on two-dimensional objects. And indeed, we started from this idea: we took these convolutional neural networks and used them also for two-dimensional quantum problems, and found out that, apart from a few issues here and there, for standard two-dimensional quantum problems this works. So the idea is that you typically start from something that somebody has already investigated, unless you are the first to do it, in which case you have to do a bit more work; but if you are studying something related, for example a two-dimensional spin model, you would know at this point that a good architecture is a convolutional neural network. So that's the idea: you look at a certain class of architectures that people have already worked on, and try to improve them. Then there's a question: NetKet is based on machine learning, but it requires a few steps to optimize the ground-state energy; how is this different from the standard DFT method? So: the DFT method is not based on the wave function, as you know.
It is based essentially on a density functional, on another quantity, which is the density and not the wave function; the wave function is a much more complicated object. So the two approaches are very different. Again, density functional theory is not, in its practical incarnations, a variational theory, so the approximation you get can be uncontrolled, in the sense that the energy you obtain can be lower than the exact energy. In our case that is not possible: the energy is always larger than, or equal to, the exact energy. Next: is it possible to apply these ideas to see the ground-state phase transition of skyrmions in a 2D magnet? Why not, you can try. If you have a Hamiltonian and you can write it in terms of local operators, sums of local Pauli operators for example, you can try to use NetKet and see how well you can approximate its ground state. Can it happen that the entanglement or coherence is destroyed by these neural networks? I'm not sure what you mean by destroyed by these neural networks. These are intrinsically classical objects: the wave function here is, if you want, a classical description of the quantum system, so there is no intrinsic collapse when you work with this classical wave function. The collapse happens when you actually measure the quantum system, and that is not what we do here; here we use a classical, parametrized representation of the quantum system, and that's what we work with. So I think we can now finish; there are still a lot of questions, but thank you so much, Giuseppe, for your presentation. I hope everyone enjoyed it, the participants and also the lecturers. Next week we'll have the last lecture on machine learning in condensed matter, by Juan Carrasquilla; the topics are more or less similar. So see you all then, and enjoy the week. Thanks, everyone. Bye. Thank you, Giuseppe. Next time in Trieste. Yeah, I hope so. Bye-bye. Bye.