...and let me welcome our speaker for today. Our speaker this time is Giuseppe Carleo, who is a computational quantum physicist. He obtained his PhD in 2011 from SISSA in Trieste. He then held postdoctoral positions at the Institut d'Optique in Paris and ETH in Zurich, and also worked as a research scientist at the Flatiron Institute in New York. He is now a professor at EPFL in Lausanne and the head of the Computational Quantum Science Lab, and you can see Lake Geneva there in the background. His work focuses on the development of numerical methods to study many-body quantum systems. I don't think I really need to tell you that this is a very difficult but also very rewarding problem to solve. So I leave it up to Giuseppe to tell us how machine learning turned out to be a game changer in this affair. So Giuseppe, please start whenever you're ready.

Thank you. So let's see. Okay, so thank you, Philip, for the introduction, and thank you for having me here today, virtually, in Oxford. It is a pleasure. Today I'm going to give you an overview of the applications that we, and others, have been developing in the context of studying many-body quantum systems and quantum computing with machine learning techniques. I will also give you a couple of ideas about the challenges in the field, and some of the latest applications we've done.

The central topic I will talk about today is many-body quantum science; this is the leitmotif of the talk. Before moving to the many-body scenario: you know that even if you have a single isolated spin, the simplest quantum system if you want, a spin one-half, or in another language a qubit, the properties of this qubit are fully described by a wave function which is a linear superposition of the two basis states, for example up and down; or, if you think of it as a Schrödinger cat, it can be in a superposition of being dead and alive. For a single qubit this is relatively simple theoretically, in the sense that the only thing you need to describe are essentially these two coefficients, c_up and c_down, whose squares give the probability of observing, respectively, spin up or down when you make a measurement of this object.

This situation, however, becomes rather intricate when you consider instead many particles, many electrons, many spins in a single system, because in that case, even conceptually, something inevitable happens, which follows from one of the postulates of quantum mechanics: the state psi, the wave function, your state vector, is now a superposition of exponentially many possible states. And if you want to describe the properties of the system, you will need in principle to know, to some extent to infinite precision, all of these 2^N coefficients.
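Before we get to the monster of 2^N coefficients, the single-qubit case just described fits in a few lines. A minimal sketch of mine (the equal superposition and the numbers are illustrative, not from the talk):

```python
import numpy as np

# Single qubit: the state is two complex amplitudes (c_up, c_down);
# measurement outcomes follow the Born rule p = |c|^2.
c = np.array([1.0, 1.0j]) / np.sqrt(2)   # an equal superposition
p = np.abs(c) ** 2                       # probabilities of up / down

rng = np.random.default_rng(0)
print(p)                                  # [0.5 0.5]
print(rng.choice(["up", "down"], size=10, p=p))
```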
To the point that people like Walter Kohn, the famous Nobel laureate in chemistry, at some point said that this state of N quantum particles is essentially a monster, something we should really not even look at; because if you consider the case of a typical material, and think in terms of the coefficients of the expansion of the wave function, the complexity of this object is really higher than the number of atoms we have in the universe. It's something that is not, to some extent, physical, at least in his words. I still believe that the wave function is a valid description of nature, but it's clear that this exponential complexity is reflected in a lot of issues we have in several fields, starting from chemistry to quantum physics, and even quantum computing.

For example, if you want to solve the Schrödinger equation for the ground state of a given Hamiltonian that describes the interactions in your system, you know that in general this is an exponentially hard problem, no matter how you solve it; even if you have a quantum computer, this is still an exponentially hard problem, as far as we know. And apart from these standard classical problems, there are other, more subtle problems that are still exponentially scaling, essentially because of this complexity of the wave function. For example, if you try to prepare a certain quantum state with a quantum circuit, so you have a quantum computer and your goal is to prepare a certain quantum state, then for the most general state we know that this task is exponentially complex, and this is again a consequence of this proliferation of coefficients in the wave function.

But in general, if you think about solving problems in condensed matter or in chemistry using classical computers, even just in terms of storage this is a hard problem: if you wanted to store the wave function of 54 qubits, you would need essentially all the memory of the largest supercomputer we have to date. And if you want to store the wave function of a material, or anything that contains more than, say, 200 qubits, then this is essentially impossible with classical storage based on hard disks, because this is really exponential growth.

However, the thing we have to ask ourselves, and this is in line, to some extent, but from a different perspective, with what Walter Kohn was saying, is whether this complexity, this nominal complexity, is really exponential for the systems we care about: the physical systems, the materials, the molecules. This brings me to the notion of the so-called corners of Hilbert space. The idea is that it's true that, in general, a quantum state is a generic vector in a huge Hilbert space of all possible quantum states, or all possible 2^N coefficients if you want. However, it's also true that the states we care about are not, in general, random states in this Hilbert space. They have very well-defined properties, because they are, for example, ground states of physical Hamiltonians, electronic Hamiltonians, or Hamiltonians describing other phenomena, which are not random.
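As for the storage numbers mentioned a moment ago, here is the back-of-the-envelope version. This is my illustration, assuming one complex128 amplitude (16 bytes) per basis state:

```python
# Memory needed to store a full n-qubit state vector: 2**n amplitudes.
for n in (20, 54, 200):
    size_bytes = 2 ** n * 16              # 16 bytes per complex128 amplitude
    print(f"{n:3d} qubits: 2^{n} amplitudes, ~{size_bytes / 1e15:.3e} petabytes")
```

At 54 qubits this is already hundreds of petabytes, which is the supercomputer-scale figure from the talk; at 200 qubits the number has no physical meaning as storage at all.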
Because these states are not random, there is a simplification we can leverage, in the sense that we can hope to describe only the much smaller portion of this extremely large space of possible quantum states that corresponds to physical ground states. And essentially, this is a problem that arises in many other fields, of course from a different perspective, not only in quantum physics. For example, think of the problem of classifying images in machine learning, which is one of the standard applications. If you consider the space of all possible two-dimensional images of a certain size, a certain number of pixels, this is also an exponentially large object, much larger than the universe, even for a standard JPEG image that you can show on your iPad. But it's also true that the images we care about are typically not random pixels, because there is no way we could classify those; they are very well-structured pictures of people, dogs, or cats. This dimensionality reduction is at the core, essentially, of the successful machine learning techniques that are leveraged every day in industry applications.

Now, this is essentially the heart of the idea that we introduced a few years ago, which is the idea of neural-network representations, low-dimensional representations if you want, of these otherwise complex quantum states. This is what I call neural-network quantum states, which you can see in this slide. The idea is that instead of storing all these exponentially many many-body amplitudes (again, for spin one-half this would be 2^N of them), we have a black box, in this case a neural network, that computes the many-body amplitudes on request. Essentially, I give this neural network as input a bunch of quantum numbers, which can be, for example, plus or minus one if you are working with spins one-half. Here we are working with discrete quantum numbers, but this can be generalized also to continuous ones. The idea is that I have this black box, so I never store the 2^N coefficients; I use a neural network to compute an approximation of these coefficients.

The typical approximations we use, even though these are not the only possibilities, are based on neural-network functions. A deep neural network, as I'm sure you've already seen during this series of seminars, is nothing but a composition of linear and nonlinear transformations, layer by layer. This is what I've written here in compact form: you start from your input vector s, where s is now a vector of these quantum numbers; you apply a linear transformation, parameterized by some parameters W; you apply a nonlinearity component-wise; then another linear transformation, another nonlinearity, et cetera, until you reach the output, the final layer of this network, which in this case is just one number.
It's actually, in general, a complex number, which is one of the funny aspects of applying neural networks to quantum systems. So in the end we have a compact parameterization, entirely specified by these parameters W and by the structure of the network.

Now, these kinds of parameterizations work very well in practice for industrial applications; they're very powerful at classifying cats and dogs. They also have some mathematical guarantees. I'm citing two here that are relatively well known. One is the standard result by Cybenko, saying essentially that if you take a single-layer network and make it sufficiently wide, a large shallow network, then you can represent essentially any reasonable high-dimensional function, provided you take the number of neurons to be sufficiently large. There is also a counterpart of this, a bit more recent, which tells you that if you take a fixed-width deep network, so now instead of growing it horizontally you grow it in depth, you can also approximate an arbitrary high-dimensional function. These results carry over, of course, to quantum states, in the sense that the wave function is nothing but a high-dimensional function. So it's clear that if we take a sufficiently large network, where sufficiently large might be, in the worst case, exponentially large, we can in principle approximate any quantum wave function.

However, there are more stringent results that concern, for example, the representability of states that are relevant for physics. I'm not giving you a full account of all the states that people have shown to be representable in terms of small neural networks, where small means polynomial-sized. But, for example, if you take famous states like Laughlin states or Jastrow states (these are typical wave functions carrying the names of the people who introduced them to describe a specific phenomenon; Laughlin states, for example, are very famous for their topological properties), you can show that there are efficient constructions in terms of neural networks that allow you to write these states exactly, using small artificial neural networks.

A slightly more general result concerns the fact that if you have a deep neural network, like the convolutional neural network you see here (a CNN, which I'm sure you know: essentially the case where you take the filters to be short-ranged instead of fully connected, and you stack them in a sequence of layers), then you can efficiently encode, for example, volume-law entanglement. Volume-law entanglement means that you can in principle describe, for example, typical high-energy states, states thermalized at finite temperature, or even critical states, or the long-time dynamics of some quantum system. So at least we know that there exist wave functions of polynomial size that can describe these kinds of traditionally hard states in a compact form.

Now, once we have an ansatz for the wave function, it's clear that we can do several things with it. And this is the learning part.
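Before moving to the learning part, here is the representation side in code. A minimal sketch of a neural quantum state with an RBM-like log-cosh form; the sizes, the initialization, and the architecture are my illustrative assumptions, not the specific networks used in the work discussed:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 10, 20                             # number of spins, hidden units

# Complex weights so that the network outputs a complex log-amplitude.
W = 0.1 * (rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N)))
b = 0.1 * (rng.normal(size=M) + 1j * rng.normal(size=M))

def log_psi(s):
    """Complex log-amplitude log psi(s) for one configuration s in {-1,+1}^N."""
    return np.sum(np.log(np.cosh(W @ s + b)))

s = rng.choice([-1.0, 1.0], size=N)       # amplitudes are computed on request,
print(log_psi(s))                         # never storing all 2^N of them
```

The log-cosh form mirrors the RBM mentioned later in the talk; any differentiable network returning one complex number per configuration would play the same role.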
So, if you want, the machine part is the description of the wave function in terms of neural networks; now we need to learn the parameters of the neural network. We need to find the best parameters that solve a certain task we are interested in, in quantum science. Essentially, in all the applications I'm going to describe in the following, the task is well described in terms of what we can call an expectation-minimization problem, which is very common in standard machine learning. The idea is that you have a loss function that depends, in general, on all the parameters of your neural network, and you want to minimize this loss function. Typically we write this loss function as the expectation value, over some probability distribution that in general also depends on the parameters, of some per-sample loss, the L I've written here. I will give you specific examples of this learning framework, but in general, all the learning performed in the following is of this form: we have some neural network that contains all the parameters, and we want to minimize an expectation value, where both the quantity we are averaging and the probability can depend on the parameters of the neural network.

Just to give you an overview at this point, there will be essentially two scenarios that I will focus on in the following. There is a first scenario in which I don't have external data. For example, imagine the problem of finding the ground state of a given Hamiltonian. In this case, the probability that appears here is nothing but the wave function squared, so you can see that it depends explicitly on the parameters, because the wave function is itself parameterized by a neural network. There are other cases where instead I have a stationary probability, in the sense that it does not depend on the neural-network parameters. This is the case which is much closer to the standard applications of machine learning, short of reinforcement learning; let's say supervised and unsupervised learning, in the sense that I have a lot of data. This is a data-driven approach, and the data is distributed according to this probability. In this case I have a stationary probability, and I will try to learn my wave function from this data.

Since this is the scenario closer to what you may have already heard during this series of seminars, I will start from this one, and then I will move to the other scenario, which is a bit more complicated from the machine learning perspective.

The data-driven scenario is essentially the following. I have a representation of my quantum state in terms of neural networks; how can I learn, how can I fit, this parameterization from, for example, results that I have from an experiment? What are the experiments I have in mind here? For example, quantum gas microscopy experiments that can be performed with ultracold atoms, or even quantum computers. In all these settings, you can perform a so-called projective measurement of your system, in the sense that every time you image the system, for example with a microscope, you will find that your atoms are sitting here, here, and there: really like individual atoms in your system.
And of course, since quantum mechanics is probabilistic, every time you repeat this measurement you will find a different outcome for the positions of your atoms. What we know from the postulates of quantum mechanics is that these positions will be distributed according to psi squared (or phi squared; I've changed notation here), where this is the wave function of the system.

Now, an important feature of quantum mechanics is that we can perform measurements in different bases. Instead of measuring only in the sigma-z basis, as we do for spins, or in the position basis, as we do for electrons or ultracold atoms, we could, for example, image the system in the momentum basis, measuring the velocities, or in some combination of the two bases. This is important because it gives access not only to the square of the wave function, but also to information about the phase of the wave function, which otherwise would be impossible to infer just from these images. This is the concept of tomography in, for example, quantum optics or quantum computing, and this is what we used in this paper to learn wave functions.

The idea is that I obtain measurements, images, snapshots of my system in the different bases B that I have access to, for example in my microscope. Then what we do is try to match the probability distributions in all the different bases I have access to in my experiment. Each wave function in a different basis is just a unitary rotation of the wave function in the main basis, the so-called computational basis, and these are objects that in most cases can be computed efficiently. So this is a typical unsupervised learning scenario: I have data sampled from an unknown probability distribution, and I am trying to learn to reproduce it. I fit this unknown probability distribution, of which I only have samples, with some model distribution parameterized by some parameters. Mathematically, this is done in practice, as in most machine learning settings, by minimizing the Kullback-Leibler divergence, which is essentially a measure of the relative entropy between the exact distribution and the approximate one. When this object attains its minimum, it means we are describing our data in the best possible way.

Now, just to give you an example of why we care about this: first of all, if you want to characterize quantum hardware, the standard technique, which is still often used in the field and is called state tomography, requires a number of measurements that scales essentially exponentially with the number of particles or spins in your system. Take, for example, a very simple state that contains eight spins, and suppose you want to infer the wave function from measurements. In this paper, in 2005, they did a really epic series of measurements: they took around one million measurements, over I don't know how many days, to reconstruct the wave function of only eight spins, which is not a very large system.
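To make the likelihood picture concrete: a toy sketch of the learning step in a single measurement basis, under simplifying assumptions of mine (a tiny product-state model, a random target distribution). Minimizing the KL divergence against the empirical data is the same as maximizing the log-likelihood of the snapshots:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 4
# All 2^N basis configurations as rows of +/-1.
states = np.array([[2 * int(b) - 1 for b in f"{k:0{N}b}"] for k in range(2 ** N)])

theta = np.zeros(N)                        # toy model: |psi(s)|^2 ~ exp(2 s.theta)

def log_p(theta):
    logits = states @ theta
    return 2 * logits - np.log(np.sum(np.exp(2 * logits)))  # normalized log|psi|^2

# Fake "snapshots": basis-state indices sampled from an unknown target.
target = rng.dirichlet(np.ones(2 ** N))
data = rng.choice(2 ** N, size=2000, p=target)

for _ in range(500):                       # gradient ascent on the log-likelihood
    p = np.exp(log_p(theta))
    theta += 0.05 * 2 * (states[data].mean(axis=0) - p @ states)

print("average negative log-likelihood:", -log_p(theta)[data].mean())
```

In the real setting the model is a neural network, there are several bases related by known unitary rotations, and the phase information comes from combining them; this toy only shows the single-basis likelihood mechanics.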
That experiment gives you a sense of how these standard techniques scale exponentially with system size, because they are essentially building histograms of the wave function, and this does not scale nicely. However, if we do the same with our technique based on neural networks, you can describe with high fidelity (this is the fidelity here) the same state we were discussing, with a number of measurements that is affordable even for much larger systems. For the eight spins that could not be handled with the older approach, it would take us something like 10^4 measurements to describe the same system. This is a simple state, by the way, which is also the reason why we only need a few measurements to describe it. But still, this is advantageous because it allows you to characterize quantum hardware that otherwise you wouldn't be able to characterize explicitly.

It also allows you to measure quantities that you were not able to measure directly in the experiment. The idea here is that if I have a parameterization of the wave function, obtained by learning from experimental data, I can now use this parameterization on my classical computer to measure other quantities, other functions, other observables that I couldn't measure directly on the quantum hardware. One typical example is the entanglement entropy, which is theoretically important for some applications but very hard to measure experimentally; the record for direct measurement is a few particles. Instead, with this approach, where you have a reconstruction of the state, you can infer the entanglement entropy for a much larger system, of order 20 spins in this case, than what you can do directly on the hardware. So this is also a way to extend the capabilities of the hardware, to some extent.

Now, the other application, which also goes in this direction and is newer, goes in the direction of improving the quantum algorithms that are run on specific hardware, for example on IBM quantum computers or on Google hardware, et cetera, which are still not that large. If we can help them run more smoothly, this would also be an advantage for them.

So what is the problem there? In a nutshell: consider one of the most studied variational quantum techniques implemented on quantum hardware, which is the VQE approach. The details are not too important, but the idea is that you prepare, with a quantum circuit, a certain final state that depends in general on some parameters, the knobs that you have in this quantum circuit. Then, at the end, in this kind of application, you are interested in measuring the expectation value, over this quantum state that you prepared in the hardware, of some observable. For example, imagine that you want to estimate the expectation value of a Hamiltonian, so an energy, on the quantum state that you've prepared with your quantum hardware. This can be done using, again, measurements on your quantum system, and you can typically do a few of them efficiently.
However, the problem, and this is actually a serious bottleneck for these applications, is obtaining very precise estimates of the energy in this digital quantum computing setting; that's the key of the problem. If you want to resolve the energy with an accuracy that is interesting for chemistry, typically the so-called chemical accuracy, then even for very simple systems (and I will give you an example in the following) this would take several millions of measurements. We are back to the millions of measurements that you had to do for tomography; even though this case is better, the issue at play here is that the Hamiltonian has a lot of terms and you have to estimate them essentially one by one.

So how can we help with this problem in this kind of application? The idea, which we developed in collaboration with IBM in this case, is that we can use this neural-network tomography to provide an approximate reconstruction of the quantum state, and then measure the energy on classical hardware. Essentially, we are trading variance for bias: we achieve a smaller variance by using this parameterized version of the quantum state, which is again not the most general state but a physical state, which is also what we are trying to prepare on the quantum hardware, at the price of some bias. For example, if we want to measure the energy, the value we measure with the neural network will have some bias with respect to the exact one we are trying to measure; this is represented by this arrow here. But the variance of the blue distribution will be much more peaked, so the statistical error we make will be much smaller. And if this bias is smaller than the accuracy we want to achieve, then this is an improvement also in terms of the number of measurements we have to perform on the quantum hardware.

This works; we've shown that it works for a few quantum chemistry problems. This shows, for example, what happens to the dissociation energy curves when you increase the number of samples. These points are the results obtained with neural networks, and these are instead the errors you would get with standard measurements. From this plot it's not too clear, but look, for example, at the probability of hitting chemical accuracy with a certain number of measurements using the standard approach, the continuous lines: even for relatively simple molecules, for example H2, if you take a sufficiently large basis set, it would take you a number of measurements which is almost 10^8 to achieve chemical accuracy, on a very simple molecule. So this is really a problem for these kinds of applications. Instead, if we help, maybe paying some bias but having a lower variance, with a much smaller number of measurements (in this case a reduction of three or almost four orders of magnitude) we can get essentially the same precision, but with fewer measurements. Again, we are paying a bias because we are approximating the state, but this is an example of where machine learning can help quantum computing, in reducing the budget of measurements that you have to pay in this kind of application.
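On the scaling behind those measurement counts: this is elementary statistics, and here is a small illustration of mine (made-up numbers, not the molecules from the talk) of the 1/sqrt(M) shot-noise law that makes chemical accuracy so expensive:

```python
import numpy as np

rng = np.random.default_rng(2)
true_mean, spread = -1.0, 1.0              # illustrative observable, unit variance
for M in (10 ** 2, 10 ** 4, 10 ** 6):
    est = rng.normal(true_mean, spread, size=M).mean()
    print(f"M = {M:>7d}: error {abs(est - true_mean):.1e}, "
          f"expected ~ {spread / np.sqrt(M):.1e}")
# Each extra digit of accuracy costs ~100x more shots, which is how
# estimates of order M ~ 10^8 arise for tight accuracy targets.
```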
Okay, now, this was the first part, essentially, of my learning paradigm: I have data from experiments, and I use this data to learn a representation of my wave function parameterized by a neural network. Now let's move to the second part, which is about simulating quantum systems. There are several applications, and I won't cover all of them because it would be impossible, but all of them are based on the variational principle, both to find ground states and to simulate dynamics; I'm just putting here some references. In the following I will concentrate mostly on the problem of ground-state search, because that's somehow the easiest to understand.

The idea here is that my loss function, quite naturally, is the expectation value of the Hamiltonian. Again, imagine that I want to find the ground state of a given Hamiltonian. I can rewrite the expectation value of the Hamiltonian as a loss function that depends on the parameters of the wave function. And, courtesy of Bill McMillan, who already discovered this wonderful connection in the 60s, I can convert this quantum expectation value into a statistical expectation value over psi squared, which is, if you want, a classical or effective probability distribution, the Born distribution, that in our case is parameterized by a neural network. The effective classical energy that you have to average is called the local energy. I'm not going too much into the detail of how this local energy looks, but essentially it is something that can be computed efficiently and depends on the matrix elements of the Hamiltonian; so if you know the Hamiltonian, you can typically compute it efficiently. (I give a minimal code sketch of this estimator below, right after this part of the discussion.)

Now, to give you example applications of this approach: one family of models we've been working on lately, as have other people trying these kinds of approaches, are frustrated spin models. These are two-dimensional spin models that are not solvable exactly with any numerical approach. If you try quantum Monte Carlo techniques, this model has a sign problem, so it isn't solvable exactly. If you try tensor networks, this is a two-dimensional problem, so there are entanglement issues, and contraction issues if you use PEPS. So there is a lot of debate around this problem, essentially because it cannot be solved exactly with any technique we know of. And one debate that exists is essentially whether a disordered phase arises between the two extremes, where the interaction among nearest-neighbor spins on the square lattice and the one among next-nearest-neighbor spins have the same order of magnitude; the issue is whether there is a so-called disordered spin-liquid phase.

Now, without giving you too many details, what we did was to use convolutional quantum states: wave functions described by deep networks where the filters are short-ranged. These are very popular and used a lot in image recognition. And what is nice in the physical setting is that these wave functions can very naturally be made, for example, translationally invariant; you can easily impose periodic boundary conditions.
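Here is the promised sketch of the local-energy estimator: a minimal variational Monte Carlo loop for a transverse-field Ising chain H = -J sum_i Z_i Z_{i+1} - h sum_i X_i. The toy wave function, the chain size, and the crude Metropolis sampler are my illustrative assumptions, not the deep networks discussed in the results:

```python
import numpy as np

rng = np.random.default_rng(3)
N, J, h = 8, 1.0, 0.5
theta = rng.normal(scale=0.1, size=N)

def log_psi(s):
    return s @ theta                        # toy real log-amplitude

def local_energy(s):
    """E_loc(s) = sum_s' H_{s,s'} psi(s')/psi(s): only N flips contribute."""
    diag = -J * np.sum(s * np.roll(s, -1))  # ZZ terms, periodic chain
    offdiag = 0.0
    for i in range(N):                      # X_i flips spin i
        s2 = s.copy(); s2[i] = -s2[i]
        offdiag += -h * np.exp(log_psi(s2) - log_psi(s))
    return diag + offdiag

# Estimate <H> by averaging E_loc over samples from |psi|^2
# (Metropolis single-spin flips; acceptance uses |psi|^2 ratios).
s = rng.choice([-1.0, 1.0], size=N)
energies = []
for step in range(5000):
    i = rng.integers(N)
    s2 = s.copy(); s2[i] = -s2[i]
    if rng.random() < np.exp(2 * (log_psi(s2) - log_psi(s))):
        s = s2
    if step > 1000:
        energies.append(local_energy(s))
print("<H> estimate:", np.mean(energies))
```

The McMillan trick is exactly this: the quantum expectation value becomes a classical average of E_loc over the Born distribution, so it never touches all 2^N amplitudes.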
Now, to give you an idea of how well these things perform in practice, I can give a comparison with the early results that we obtained in 2016 using shallow neural networks. These are the RBM results in our first paper, where we used just one layer, but with long-range connections: these are the restricted Boltzmann machines. This is the limit where we take J2 equal to zero, the unfrustrated Heisenberg model, and this is the error you make on the ground-state energy as a function of, essentially, the size of the network. At that time, we were able to achieve an accuracy of the order of 10^-3 or so. Now, if we go deeper, so if we use these convolutional networks that have four or five layers (these are not super deep networks), you get into the 10^-4 range, so almost one order of magnitude improvement. And if you go deeper still, as in this paper, a recent development, and use, for example, what we call autoregressive states, which allow us to exploit actual deep networks more efficiently, you get an accuracy well beyond what we've done before, in the 10^-5 range. At this point you get an almost, for all practical purposes, exact description of the wave function of this interacting two-dimensional system.

But coming back to the problem where we introduce J2, where we introduce this frustration, which breaks the possibility of solving this problem exactly, for example with quantum Monte Carlo: when we turn on J2, the only thing we can do is compare our energies to other variational techniques, for example the best available variational energies, or other approaches based on fermionic wave functions. What I'm plotting here is the difference in energy between our CNN results and these other approaches. This model has been studied for over 20 years, so it's really a very hard problem; there's no total consensus on the phase diagram. And you can see that on almost all the phase diagram we get energies that are better, typically in the sense of lower, than these other approaches. However, there is a region, typically in the middle of this range of J2, where our approach, NQS, neural quantum states, is still not as accurate as we would like. The main reason is that we have a few issues related mostly to symmetry. One issue known for these quantum states up to this point is that, for example, we don't know how to impose SU(2) symmetry efficiently in two-dimensional systems. Another reason is that in the presence of frustration the number of samples we need to describe these systems can become quite large. In this sense we are limited by these two factors, but both factors are something that can improve, and we believe that in the near future we should be able to be at the state of the art on the whole phase diagram.

Now, in the last few minutes, I wanted to give you a more recent application of this idea (even though this is still from 2019) which is about another kind of simulation we care about in this framework: I have this neural network, and now my goal would be to simulate classically a quantum circuit.
So, if you want, I would like to emulate as much as possible a quantum computer using a classical neural network, without ever building the quantum computer. The task is then to approximate the quantum circuit classically, and what we can ask is how far we can go: how big are the systems we can simulate, and which circuits can we simulate?

This is essentially what you need to do when you have a quantum circuit: you have to be able to approximate the action of a sequence of building blocks called quantum gates. In quantum computing, we know that if you are able to approximate the action of a small set of building blocks, for example single-qubit rotations, Hadamard gates, and two-qubit controlled rotations, then you are able to simulate an arbitrary quantum algorithm. And it can be shown (this is done, for example, in this paper of ours) that if you try to apply these gates to a simple neural network that has only one layer, for example an RBM, almost all of these gates can be applied exactly. Exactly means that you get out another network which is not too large compared to the initial one, even including entangling gates, which are less trivial. But the thing that happens is that if you try to apply the Hadamard gate, which is a superposition gate, then you won't be able to describe the final state, the output state, exactly as such a neural network. This is the origin of the issue in this problem.

What we can do, and this is also at the heart of the loss-function approach I've been discussing so far, is use a variational approach. Every time we want to apply this, for us, complicated Hadamard gate, instead of applying it exactly (since we cannot represent the final state phi exactly with a neural network, if my initial state is this psi_W, which is a neural network), we try to approximate this final state phi as well as possible with another neural network that has, in general, some other weights W prime. This is now my variational principle: I will try to match these two states, for example minimizing some distance between them, which can typically be the infidelity; equivalently, I will maximize the fidelity.

Okay. Just to give you an idea of what we can do, and this was the first paper that we did on this approach: take, for example, the Hadamard transform, which is a simple circuit where you perform a Hadamard on each qubit, and the initial state was taken to be a strongly, highly entangled initial state, for example the ground state of a correlated model at the critical point. We show that you can get fairly good fidelities as you progress in the circuit.
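Schematically, the matching step looks like the sketch below. Everything here is a toy assumption of mine: a product-state ansatz and finite-difference gradient ascent on the fidelity. Note that in this toy the Hadamard happens to stay inside the ansatz family, so the fidelity can reach one; in the RBM case just discussed it cannot, which is exactly why the variational step is needed:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 6

def ansatz(theta):
    """Toy normalized product state: qubit q is (cos t_q, sin t_q)."""
    single = np.stack([np.cos(theta), np.sin(theta)], axis=1)
    psi = single[0]
    for q in range(1, N):
        psi = np.kron(psi, single[q])
    return psi

H1 = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
G = np.kron(H1, np.eye(2 ** (N - 1)))     # Hadamard acting on qubit 0

theta = 0.3 * rng.normal(size=N)
phi = G @ ansatz(theta)                   # target state G|psi_W>

w = theta.copy()                          # new weights W' to optimize
for _ in range(300):                      # maximize F = |<phi|psi_W'>|^2
    grad = np.zeros(N)
    for k in range(N):
        e = np.zeros(N); e[k] = 1e-5
        grad[k] = (abs(phi @ ansatz(w + e)) ** 2
                   - abs(phi @ ansatz(w - e)) ** 2) / 2e-5
    w += 0.5 * grad
print("fidelity:", abs(phi @ ansatz(w)) ** 2)
```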
In that figure, this point is essentially where you start your circuit and this is the end of the circuit, and overall the fidelities you get are in the 95 to 96 percent range, which is a pretty good number. For example, you can compare this fidelity, the level of approximation that we can achieve with the variational state, to what you can achieve on actual quantum hardware, where you will also not be able to prepare the desired output state exactly, but for a different reason: noise. On quantum hardware you are not exactly executing the gates you expect to execute; you will also have decoherence because of coupling to the environment, so there is an intrinsic noise level on the hardware, which can be modeled, for example, with depolarizing noise. What we can do is compare the "noise", so to speak, that we have (it's not really noise, it's the variational error of our representation) to the noise that you have on the hardware. For the circuit that we studied, and this is not a general statement but one application that we did, you can see that to achieve the same accuracy that we have with the RBM, which is this vertical line (again, this is a very simple neural network; using better neural networks you can improve this figure), the noise level you would need on the hardware is of the order of 10^-3, which is a relatively low noise level, even compared with modern hardware. So essentially, with a neural network you can get a simulated quantum computer which, for these applications, has a noise level comparable to, or better than, what you get these days on actual quantum hardware.

This can be generalized, and we've applied it also to circuits more complicated than the simple Hadamard transform, namely the technique called QAOA. It's a complex circuit, and I won't go into the details because this is not a quantum computing meeting, but it's a very interesting circuit because it's supposed to be one of the earlier applications where you can try to solve a classical optimization problem using a quantum computer. So the question we asked was: can we simulate this classically, without building a quantum computer? The answer is that for some circuits, for some problems, we show that we can do this quite accurately.
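A quick aside on the noise comparison made a moment ago. Under a uniform depolarizing model, circuit fidelity decays roughly as (1 - p)^G for G noisy gates with per-gate error p; this back-of-the-envelope sketch uses illustrative numbers, not the talk's data:

```python
# Fidelity decay under uniform depolarizing noise: F ~ (1 - p)^G.
for p in (1e-3, 1e-2):
    for gates in (50, 500):
        print(f"p = {p:.0e}, {gates:4d} gates -> fidelity ~ {(1 - p) ** gates:.3f}")
```

With p around 10^-3 and a few dozen gates, the hardware fidelity lands in the same mid-90s range as the variational simulation above, which is the sense of the comparison being made.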
Coming back to QAOA: what I'm showing here is the cost function, which is essentially what you try to optimize with this quantum circuit, as a function of the parameters of the circuit. We can match what would be done on the exact circuit, which is this blue line, with our RBM, which is again a very simple neural network. Of course this is not a general statement: there are cases where our fidelity drops, for example when you go into regions of the circuits that are too random to be described by the neural network. But in general, I would argue that in the regimes that are meaningful for the applications of these quantum techniques, a classical neural network, until now, is enough to describe, or even do better than, most of the hardware implemented out there in the labs. For example, this is a simulation for 54 qubits, and as far as we know there is no way to simulate this efficiently with other approaches. If you compare this to tensor-network simulations, you can show that to simulate the same circuits with the same accuracy that you have with these simple neural networks, you would need a tensor network, an MPS, which is extremely large, with a bond dimension (for the experts in the audience) of 10^5, which is typically not manageable in a reasonable amount of time. This is just to say that we can exploit the fact that these states can encode a large amount of entanglement to describe states, for example the states that come out of these complex quantum circuits, that are typically not efficiently describable by other ansätze, for example tensor networks, which are typically limited by entanglement.

Okay, so with this I will conclude. I will just flash here a slide on the software that we use and develop now at EPFL. It's called NetKet; if you're interested, you can have a look at it, and it allows you to express your learning tasks in quantum physics using, typically, complex-valued neural networks.

And there are a lot of open challenges in the field. I think one of the most important challenges, from a theoretical point of view, is to better understand how complex the states are that we can learn using these wave functions. This is a wider problem, but I think that in the context of quantum physics it's really something we should try to understand from the quantum perspective. In this sense, we should try to find a measure of complexity, essentially to quantify how complex a quantum state is, that goes beyond, for example, the entanglement entropy, which is the measure usually used in the community; because entanglement entropy, as we've seen, is not the limiting factor, at least in theory, for describing a quantum state with a neural network. There must be some other measure of complexity that enters the game, but this is still largely unknown. I have to say that from the theoretical point of view this is a little bit unsatisfactory, so it will be very important to work on this theoretical problem.

Then, and this is a provocative statement if you want, something I wonder is whether we can entirely replace quantum computers with classical neural networks to simulate useful quantum algorithms, not only this QAOA but also others. To what extent this is true, and how far we can go, I would say is completely wide open, and there are only a few works on this.
It would be interesting to extend this well beyond what has already been done. And there are a lot of other open problems, mostly related to symmetries; I didn't even discuss fermions, but this is an exploding field where a lot of contributions have been made in the past years. So thank you; I would be happy to take questions.

Wonderful, thanks a lot, Giuseppe, very interesting. Questions? If people have questions, then please either raise your hand or write them in the chat window and I will read them out for you. Yes, Henrik, please go ahead.

Right, so I think all the systems you showed were finite, like a finite number of qubits, right? I was wondering if you can simulate something in the thermodynamic limit, where you can get some kind of momentum resolution, in some sense. You know what I'm saying?

Yeah, so this is an excellent question. There is a technique: people doing tensor networks do that all the time; they have a way to extrapolate their tensor networks to an infinite number of qubits, if you want, especially in 1D, and in 2D also. I think this is actually possible also for deep networks, but nobody has done it so far. If you want to work on that, that's a great topic, I think. But I believe it's possible; it just has not been done so far.

I also have a question, on when you spoke about state tomography. Is there a way to turn the game around and do tomography on the Hamiltonian instead? So, assuming that your Hamiltonian is supposed to be of a class with certain free parameters that you then want to estimate, given a sample of measurements: are there any efforts in these directions?

Yeah, so there is a field called Hamiltonian learning, which goes a little bit in the direction of what you're saying. What we've done recently, with Giacomo Torlai and others (I didn't put the reference here), is to try to solve a more general problem than that, which is gate tomography. There, you try to learn what the transformation is, a generic transformation, which can be unitary or actually even non-unitary. In that case, yes, these ideas can be generalized. It's a bit tricky, because you have to work with Choi matrices and other such objects, but in principle you can use that, for example, to characterize how good the gates are that you have implemented in your quantum hardware, which is a key problem in characterizing and understanding how well the hardware is performing. So this can be done, but to be clear, this kind of tomography is much more expensive, and it cannot be performed on 100 qubits, at least not in the general setting, yet; we hope to improve that.

So it's expensive in terms of the number of samples that you would need?

It's expensive because you have to work with a density matrix instead of a wave function, and you have to perform some operations that are a bit more complicated; you have to work with POVMs. Essentially, you have to work in a larger space where performing the elementary operations is more complicated, so the complexity is the square, essentially, of what we have here.

And you also hinted at this a bit during your talk, and you put it again on this slide: the problem with symmetries. You said before that there wasn't a good way to have some continuous symmetry implemented, or hardwired, into the network. And here you write fermions; maybe that's a hint at the, I don't know, spin statistics of the particles
that you have in your many-body system. Can you say a bit more about what the problem is with symmetries, or with fermions?

Sure. Unfortunately I don't have a slide about that, but the issue is that for fermions, you know that if you exchange two particles you want your wave function to acquire a minus sign. This is an intrinsic symmetry that you have to enforce in your neural-network wave function as well. There are several ways of doing that; some of them have been explored recently, also in London by DeepMind. And yeah, the problem is really enforcing this antisymmetric nature of the wave function in the neural network; this is a highly non-trivial problem. There are some ways around it. One way around it that we proposed recently (unfortunately I don't have the reference here, but it's simple-minded) is that you map fermions onto spins and then you use the spin Hamiltonian to find the ground state. This can be done using the Jordan-Wigner mapping, for example, or other mappings. But this kind of approach is not always ideal, because it typically maps a local fermionic Hamiltonian, for example the Hubbard model, to a very non-local spin Hamiltonian that takes into account these exchanges, and this is not the best thing you can do if you have symmetries like translation in the system. So essentially, combining these symmetries, let's say rotational and translational symmetry and the fermionic antisymmetry, at once, in the most general neural network, is I would say an open problem, and a very important one.

Very nice. Okay, any more questions? Anybody? Okay, I have one more then, the last one, on your second part. You spoke about your hope of being able to replace quantum computers with classical neural networks, to a certain extent. It's an open problem, and it's not clear for which applications this could be done and for which ones it's just impossible. Do you have any intuition for what kind of problems would very likely have an approximate solution of that type?

Yeah, so yes. If your task is performing a random quantum circuit, which is a task that is, unfortunately, very popular these days, then you must be aware of the fact that there is no way you can approximate that efficiently with any dimensionality-reduction scheme. This means that if your process is intrinsically random, you cannot compress it. At the beginning I had this slide with the green region: what you are doing with a random quantum circuit is going beyond this green region, into a region where, instead of having cats, dogs, and people, we have monsters. And we don't want monsters; we want physical states. Physical states are typically not random, and in this sense your quantum task should not be about describing random things. If it is about describing the ground state of a physical Hamiltonian, then I would say there is hope we can simulate the result with a neural quantum state.

So it's physics again that saves us here, the structure that's inherent to physics, that makes it possible?

Yeah, exactly, precisely. If your state is random, there is no way you can compress it down to a low-dimensional representation; this is clear. Even if you try to learn that state with another quantum computer, it's actually the same; that's the point I want to make. Even if you parameterize your state with another quantum circuit and you try to learn the
output of a random circuit, then most likely you won't be able to learn it, even if you are using the most powerful quantum ansatz. In that case it's really a matter of trying to learn something which has no structure, which is universal and does not depend on anything.

That makes sense, great, thanks a lot. Okay, there's one more question here in the chat window, by Alexey, who is asking, just out of curiosity: how did you come up with the idea of using RBMs in the first place? What was the hint there?

Yeah, that's a good point. Well, it's one of the easiest ways in: if a physicist gets interested in neural networks, the first thing you find are neural networks based on energy models, so-called energy-based models, essentially because these were devised and written down by physicists, and also the papers you find are written in ways that physicists can understand. So the first time I got interested in neural networks, instead of going directly into deep networks, deep learning, it was kind of natural to read what was easier for me to understand, and these were essentially based on Boltzmann weights; that's why I got interested in RBMs. I love RBMs, but at some point we ditched them, unfortunately, and moved on to deeper states, because they are also more expressive, as you've seen from the results. But it's clear that RBMs are nice because they allow you to do a lot of analytic calculations that you cannot do with deep states.

Okay, great. So if there are no more questions, then let me thank you once again, on behalf of the entire audience, for the wonderful talk; I learned a lot. So thanks very much, and the recording will be available on YouTube very soon for those who want to come back to it. Thanks a lot, goodbye, see you next week.