All right, thanks to the organizers for organizing this nice workshop. Is the microphone working? Yes. So what I want to talk about is three different projects that I'll try to combine into these 30 minutes. In particular, I'm focusing on developing methods that target, ideally, ab initio Hamiltonians and specifically electronic degrees of freedom. And so, as I said, I want to cover three projects, one for each of what I think are the big pillars of building quantum many-body solvers: eigenstate search, time evolution, and open quantum systems, and even time evolution within open systems. So let me focus on the first one. Eigenstate search has made a lot of progress in the last two or three years, with new algorithms to look for eigenstates, also specifically for electronic degrees of freedom. And what I specifically want to talk about, OK, that slide doesn't work, is one new way of optimizing any wave function in continuous space. So you want to model some molecules, you have the continuous-space electronic Hamiltonian, and you want to optimize the wave function in a better way. What I want to do first is recap a bit, in my own language, how I see imaginary time evolution in this setting. The reason I wanted to reformulate this is a discussion I had with Marcus Hallow, who told me that, in principle, if you optimize a wave function by doing imaginary time evolution, then everything has to go right. And I'm going to show you that this is actually not the case: there are better ways to optimize towards the ground state than imaginary time evolution. So I want to start from the imaginary time Schrödinger equation on the first line; I've just written that out in terms of some wave function. And since we have time-reversal symmetry and a Hermitian Hamiltonian, we can recast this in terms of real-valued wave functions, so I can actually rewrite everything in terms of probabilities. And this is where it gets really interesting, because this was work together with a bunch of machine learning researchers, for example Max Welling, and Kirill at the Vector Institute, who are pure machine learning researchers. So we discussed a bit: what is variational Monte Carlo now, really? We can recast this wave-function evolution entirely in terms of probabilities. What I get is a continuity equation, and this continuity equation tells me how probability flows throughout imaginary time. What I see there is that the evolution of my probability is proportional to the probability itself. And that will be a key factor that is a bit problematic, as I'll try to sketch on the next slide, where the problems really come from. And then, since we're working with a variational model, so a restricted variational manifold, we have to project the evolution back onto that manifold. We can do this, for example, by minimizing a KL divergence that reprojects this green arrow back onto the manifold, and that's how we obtain a new set of parameters with a better energy.
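As a minimal sketch of the picture just described, written in my own notation rather than the notation of the slides: for a (normalized) real wave function, the imaginary-time flow of the probability p = psi^2, followed by the KL projection back onto the variational manifold, reads

$$
\partial_\tau p_\tau(x) = -2\, p_\tau(x)\,\big(E_{\mathrm{loc}}(x) - \langle E_{\mathrm{loc}}\rangle_{p_\tau}\big),
\qquad
E_{\mathrm{loc}}(x) = \frac{(H\psi_\tau)(x)}{\psi_\tau(x)},
$$
$$
\theta_{t+1} = \arg\min_\theta \,\mathrm{KL}\big(p_{\tau+\delta\tau}\,\|\,p_\theta\big).
$$

The key point is that the right-hand side of the first equation is proportional to p itself: where the current probability is exponentially small, essentially nothing moves.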
So if I can reformulate VMC, what we're actually trying to do, and again this is the way to communicate with pure machine learning researchers who are not necessarily physicists, is to minimize a functional, the energy functional, in a space of probability distributions, and we do that over a variational manifold. Now there are a lot of cases in machine learning that are very close to this idea. The idea is that while you're minimizing this functional, you're following an ODE for your probabilities. And this is an interesting framing, because there are some famous examples. This is an example that I took from the Wasserstein GAN paper. If you've been in the machine learning community for a while, you'll remember generative adversarial networks being very popular around 2015 or 2016. They were very powerful, if I'm not mistaken, but they were almost untrainable; if you ever tried to train one, it just didn't work at all, yes, I see somebody nodding, until somebody realized the following. What we're actually trying to capture is data, images, which are high-dimensional arrays of pixels, but the data actually lives on a lower-dimensional manifold: there are a lot of correlations, so the actual data is lower-dimensional. And if we initialize a model that tries to capture this distribution, we typically initialize it on a manifold very far away from the actual data distribution. So as a thought experiment, we can think about a data distribution that is a single line, just the blue line here; that's the distribution I'm trying to capture. And the model I put forward to capture it is the same thing: a one-dimensional model living in two-dimensional space, parametrized by one parameter, theta, that says how far along the axis it sits. So I can ask myself: what are the typical distance measures or loss functions I would normally use? We all know, for example, the second one, the KL divergence, which gives you a distance between distributions. And if you look at what that distance is for this specific example, again we're in a two-dimensional space but the actual data lies on a lower-dimensional, one-dimensional manifold, you see that the KL divergence is infinite for any theta that is not exactly zero, and otherwise it makes a discrete jump. Of course, you cannot optimize this thing; you will never get anything close to the solution theta equals zero. So, of course, the machine learning community has thought about this, and the answer came from the field of optimal transport, which knows this kind of problem, where two distributions sometimes don't overlap at all. The way they formulate the distance, and I'm not going to explain at this point exactly what the formula means, is essentially: quantify the distance between two distributions by how much effort, let's say, it costs to transport one distribution into the other. And indeed, if you compute the Wasserstein distance, or Earth Mover's distance, between these two distributions, you get something proportional to theta. So this is great, because now I have something I can optimize: I can obtain theta in the right way.
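To make the toy example concrete, here is the standard version of it from the Wasserstein GAN literature (the exact constants on the slides may differ): take the data distribution uniform on the segment {0} x [0,1] and the model uniform on {theta} x [0,1]. Then

$$
\mathrm{KL}\big(p_{\mathrm{data}}\,\|\,p_\theta\big) =
\begin{cases}
0, & \theta = 0\\[2pt]
+\infty, & \theta \neq 0
\end{cases}
\qquad\text{while}\qquad
W_1\big(p_{\mathrm{data}}, p_\theta\big) = |\theta|,
$$

so the Wasserstein distance gives a usable gradient signal towards theta = 0, while the KL divergence does not.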
And so what we did is we took this example and said, OK, let's look at Wasserstein metrics in continuous space. This is a somewhat more complicated way of writing things down, but essentially the Wasserstein metric says: we have two distributions, p0 and p1, we want a distance measure between them, and we compute an expectation value where we minimize over all the possible ways of transporting the mass of one distribution onto the mass of the other. That is, in broad terms, what this equation says. And then again, once you define what distance means, you typically also get an ODE out of it. The ODE you get here is the one on the second line: now we have a continuity equation where the probability mass flows not along a term proportional to the density already present, but rather along a vector field. So we're implementing this idea of optimal transport: we now have a vector field that transports one mass onto the other. I'll show some visualizations later. And if we now try to minimize some functional with this approach, we get a functional-minimizing Wasserstein flow, and we can compute what this vector field actually looks like, depending on the functional we're trying to optimize, which in our case is the energy. So if you write it all down, and again to recap: what we had before, starting from the imaginary time Schrödinger equation, which I can just tell you is an evolution according to the Fisher–Rao metric, gives something proportional to the density itself. And what I'm saying now is that we basically add an extra term that helps us overcome this problem of mismatched manifolds. So again, this is another sketch of what I told you variational Monte Carlo is: defining a distribution and how it evolves through time, and then projecting back onto the variational manifold. What I've just told you is that instead of taking the Fisher–Rao gradient flow, which is what I would find if I look at the imaginary time Schrödinger equation, I can define another direction according to another metric, the Wasserstein gradient flow, and then do the same as before: project it back onto the variational manifold. So what is the difference between the two? In terms of the energy, on the left-hand side, I think everybody who has done variational Monte Carlo recognizes this, right? We have differences of local energies, but again multiplied by the existing probability. And on the right-hand side we now have our different continuity equation. The left-hand side is often described as teleporting mass. The idea is that, because the change of the density is proportional to the density itself, it's a sink-and-source kind of process: we remove some mass here and let it pop up somewhere else. So if we look at a two-dimensional bimodal distribution, the model first needs to overlap with part of the target distribution, and only then can it reduce mass in one place and put it somewhere else. Initially it goes quite fast, but at some point it really needs to transport all that mass, and that's when it becomes slow.
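Schematically, and again in my own notation rather than that of the slides, the two flows of the same energy functional E[p] differ only in the metric used to define steepest descent:

$$
\text{Fisher--Rao:}\quad \partial_\tau p = -\,p\left(\frac{\delta E}{\delta p} - \Big\langle \frac{\delta E}{\delta p}\Big\rangle_{p}\right),
\qquad
\text{Wasserstein:}\quad \partial_\tau p = \nabla\!\cdot\!\left(p\,\nabla \frac{\delta E}{\delta p}\right),
$$

where for the VMC energy functional the functional derivative plays, roughly, the role of the local energy. The first flow rescales the density in place, the sink-and-source picture; the second transports it along the velocity field v = -grad(delta E / delta p).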
On the other hand, if we do the same on the right-hand side, we don't do this teleportation but rather a continuous flow, and things behave a bit more nicely. It's now a simulation where we're just following the probability along the vector field, and we nicely morph between the two distributions without ever needing to teleport mass. OK, so if you're now curious and you do variational Monte Carlo, what do you need to do to implement this algorithm? It's actually pretty easy; it's almost a one-line change. In practice it's not, but in theory it is. Given that this is my variational Monte Carlo algorithm, I have some samples from a distribution q. I evolve my distribution in imaginary time, I project it back onto the variational manifold, and I go through this loop again. What I'm telling you now is that you can just change this one item in the loop and follow a different direction on the manifold of probability distributions, but then project back exactly as before. And so here are some results. It's a bit noisy because these are molecules; anybody who has done molecules knows these things are noisy. The blue curve shows the energies with traditional variational Monte Carlo. The green curve is purely Wasserstein, and the orange one is a hybrid version of both, where we take equal contributions from both directions. We see that in all cases we end up below the ground state energies obtained with normal variational Monte Carlo, and specifically for H10, where you have a lot of multimodality, we consistently go beyond it. You can see this more clearly in the variance: if the variance goes to zero, we know we're hitting an eigenstate, and you see, for example on the right-hand side, that we gain almost an order of magnitude improvement with Wasserstein variational Monte Carlo. So this is a really exciting approach that really helps. Yes, you can make it hybrid: in every loop you can choose which direction you follow; we didn't optimize over that, but in principle you could. So this was together with Kirill at the Vector Institute, and it was really exciting work that came just from making a connection, from trying to explain to each other what we're actually doing, and from using machine learning research and the theory from that field to improve our variational Monte Carlo algorithms. OK, so that was how we can obtain better ground state energies. Now the second thing I want to talk about is one of our recent works on doing time evolution, again focusing on ab initio electronic systems. For that, even though I used them in the previous slides in principle, let me zoom into the kinds of models we can use for electronic degrees of freedom. The reason I want to focus on that a bit more is that ground state searches are often easy, in a way, because the states don't contain too much correlation; you have a lot of locality. But if you start time evolving, things get way more difficult: correlations start building up and become longer range, and so the simple approximations, think of tensor networks, OK, they're not simple, but the approximations you make there, become quite problematic once you start doing time evolution.
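Going back to the eigenstate-search loop from a moment ago: here is a deliberately schematic Python sketch of where the "one-line change" sits, not the actual code from the paper; sample, local_energy, grad_local_energy and kl_project are placeholder names standing in for the real routines.

```python
import numpy as np

def vmc_step(theta, model, hamiltonian, dt, mode="fisher_rao"):
    """One schematic step of projected ground-state search.

    mode = "fisher_rao": follow the imaginary-time direction, where the
        change of the density is proportional to the density itself
        (reweight the current samples).
    mode = "wasserstein": follow the optimal-transport direction, where
        the density is transported along a velocity field
        (move the samples themselves).
    """
    x = model.sample(theta)                          # samples from p_theta
    e_loc = hamiltonian.local_energy(theta, x)       # E_loc(x)

    if mode == "fisher_rao":
        # sink/source picture: remove mass where E_loc is high and add it
        # where E_loc is low, at the current sample positions
        weights = np.exp(-dt * (e_loc - e_loc.mean()))
        x_target, w_target = x, weights
    else:
        # transport picture: push the samples downhill in local energy
        v = -hamiltonian.grad_local_energy(theta, x) # velocity field
        x_target, w_target = x + dt * v, np.ones(len(x))

    # project the evolved distribution back onto the variational manifold,
    # e.g. by a few steps of minimizing a (weighted) KL divergence
    theta = model.kl_project(theta, x_target, w_target)
    return theta
```

A hybrid update, like the orange curve in the results, would simply combine the two directions within the same loop iteration.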
So the traditional way of parameterizing things for electronic degrees of freedom: the simplest is Hartree-Fock, right? You take a single Slater determinant; the particles only interact through a mean field, so you have single-particle orbitals. We can make those time dependent and just evolve them through time. Again, there are no real correlations we can include except, you know, the interaction with the mean field. The simplest next step is to say, well, maybe one determinant was not enough, so let's do the same as before but with multiple determinants. But with a polynomial number of determinants, as you go through time and correlations start building up, you won't be able to describe anything anymore. The next thing people have done is to add a Jastrow factor, which we can also make time dependent. The difficulty there is that it handles correlations really well, but it's restricted in the sense that it inherits the entire nodal structure from the single determinant, so we're a bit limited. It's as if you were doing diffusion Monte Carlo on top of it: you cannot change where the nodal surface is. So the thing I'll show is to use the time-dependent backflow transformations that we introduced. You can think of a backflow transformation, in easy terms, as follows: we have an interacting system, we map the system of interacting particles onto a non-interacting system, and then we use the typical Hartree-Fock approach. So we take the original particle positions, map them to a new set of positions where each new position knows about the positions of the other particles, and then we just evaluate a single Slater determinant. That's how I see backflow transformations: the transformation going from interacting to non-interacting is the backflow transformation. We can make that time dependent as well, and now we can change the nodal surface and in principle describe whatever ground state we want. And by making it time dependent, we can also change the nodal surface throughout time and accurately capture the states of fermionic systems. So how do we evolve this thing through time? For that we use, in this case, well, the first slides will be based on this and the later ones on a new approach, the time-dependent variational principle. The idea there is that we look at the distance between an initial parameterized state psi of theta at time t and a very small time evolution of it, projected back onto the variational manifold, just like I did in the Fisher–Rao approach before. What follows from that is a linear system of equations for theta dot, the change of the parameters through time, i.e., how I need to update my parameters to move through time. To obtain that, I need to compute what are called the forces, the energy gradients, which I can estimate with Monte Carlo; I do that all the time in variational Monte Carlo as well. And on the left-hand side we have the geometric tensor, which is essentially the correlations between the gradients of your model. For that one, actually, there's a really nice poster here on how to compute it, so definitely check that out, because that method really made these calculations possible. Before, for me at least, it was not really feasible to do this in continuous space.
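Written out in the standard Monte Carlo estimator form (standard time-dependent variational Monte Carlo notation, not copied from the slides), with O_k(x) = d/d theta_k log psi_theta(x), the linear system mentioned above reads

$$
\sum_l S_{kl}\,\dot\theta_l = -\,i\,F_k,
\qquad
S_{kl} = \langle O_k^* O_l \rangle - \langle O_k^* \rangle\langle O_l \rangle,
\qquad
F_k = \langle O_k^*\,E_{\mathrm{loc}} \rangle - \langle O_k^* \rangle\langle E_{\mathrm{loc}} \rangle,
$$

where the expectation values are taken over |psi_theta|^2, S is the quantum geometric tensor and F are the forces, both estimated by Monte Carlo sampling.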
So without going into details, and there are a lot of details behind this approach, I just want to show some results and convince you that what we're doing makes sense and that it works. On the left-hand side we have the first benchmark case, where we want to compare to a solvable model. What I can do is take one-dimensional fermions, put them in a harmonic well, and make them interact through a harmonic interaction. It's an interacting system, and I can modulate this interaction in such a way that I get the breathing-mode behavior you see on the left-hand side: the monopole mode, the fermions moving out and in, driven by this modulation of the system. What you see, well, you don't see it, is the exact solution shown in orange, and then the green one, which is the neural-network quantum state approach, and you see no difference, which means that up to long times, even for an interacting system, we can reproduce the dynamics very well. Then we want to look at a more realistic example. In quantum chemistry, a lot of people have looked into the effect of shining an intense laser on a diatomic molecule, so we simulate that system. Let's maybe visualize it first: on the top you see the electric field, and on the bottom the distribution of the electrons, and as we move through time the electrons start moving around. This is shown here through the dipole moment, and I'm comparing with time-dependent Hartree-Fock and with exact diagonalization within a given basis set; we typically choose a basis set, which also restricts the accuracy we can reach. The first thing you can see is the ED result in the smallest basis set, which is the lowest green curve. If we use neural-network quantum states in that same basis, in second quantization, we reproduce the dynamics almost exactly; we don't see a difference. Now if I start increasing the basis set, so I look at a bigger Hilbert space, I get closer to the real dynamics, and what we see is that the oscillations get a bit damped, we see some interference structure on top, and at later times they keep oscillating a bit longer. Now, we don't actually have to be restricted like this, because we're working fully ab initio: we don't need to choose a basis set, so we can take a continuous model, it can be a neural network, whatever you want, and evolve that through time. And you see the curve which is now S plus J plus BF, so single determinant plus Jastrow plus the time-dependent backflow transformation, and we actually get really good results that go beyond exact diagonalization in a given basis set. The last example I want to show is the quantum dot. The idea there is that we take a few electrons, put them in a harmonic well, and let them interact with each other through the Coulomb potential. By changing the properties of the material the quantum dot sits in, we can quench the interaction; it's as if we were modifying the Coulomb potential, and that's this kappa. So we quench the system: we first look at the ground state, we quench, and we look at how it evolves through time.
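One plausible form of the quantum-dot model just described (the exact convention for kappa in the paper may differ): N electrons in a harmonic trap with a rescaled Coulomb repulsion,

$$
H = \sum_{i=1}^{N}\left(-\tfrac{1}{2}\nabla_i^2 + \tfrac{1}{2}\,\omega^2\, r_i^2\right)
+ \kappa \sum_{i<j}\frac{1}{|\mathbf{r}_i - \mathbf{r}_j|},
$$

where quenching kappa mimics a sudden change of the screening in the surrounding material: the system is prepared in the ground state at one value of kappa and then evolved in real time after kappa is switched.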
What happens is again a sort of monopole mode, as on the left, but I'm not going to show that. What I want to show is the effect of these time-dependent backflow transformations. On the top you see R squared, which is the integration error: the lower the R squared, the better we satisfy the time-dependent Schrödinger equation. The S means just the single Slater determinant; in blue I add a Jastrow, a time-dependent Jastrow; and then finally a backflow transformation. And you see that as we evolve through time, the accuracy improves by an order of magnitude, so using time-dependent backflow transformations we can really get accurate results. I always want to stress that in dynamics it's mainly the correlations that are important, and I want to highlight that by looking at the G2. The G2 is a quantity that measures correlations, and it's zero if you have no correlations. Indeed, the green curve confirms this: with a single Slater determinant there are never any correlations. And what we observe, if we look at the Jastrow and backflow results, is that in the very first few time steps a lot of correlations build up, and they then dominate the rest of the dynamics. So even simple approximations will typically already break down in those first few time steps. Then this is with another method that is not on the arXiv yet, so I didn't want to show the approach itself, but it's a new approach to the dynamics, shown here for a simple model: spinless fermions, let's say, interacting via the t-V model, so we have hopping terms and a nearest-neighbor interaction term. Again, we take three fermions and then quench the interaction, like changing the medium. And what we see in the different lines: the S again corresponds to a single Slater determinant, and things start deviating quite rapidly compared to ED. On the bottom plot I show the infidelity from step to step, where again lower is better, and you see that with time-dependent backflow transformations you can get essentially any accuracy you want. And then finally, this is a bit of a different topic, but I'm still enthusiastic about what we did there. It's not necessarily fermions, but it would be cool to extend this to fermions as well. The idea is to do the same as before, looking at quantum dynamics, but to see what the effect is of putting things at a finite temperature. The first thing we do is prepare a thermal state. So how do you prepare a thermal state? Typically that's done through imaginary time evolution. The idea behind that is that if you write down what a thermal observable looks like and rewrite things a bit, you get this sum over n, a sum over the Hilbert space, where we evolve an initial state, just a basis state, over an imaginary time beta over two according to the Hamiltonian we're considering. And there are multiple different approaches to do that.
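The formula behind this, in standard notation, is the rewriting of a thermal expectation value as a sum over imaginary-time-evolved basis states:

$$
\langle O \rangle_\beta = \frac{1}{Z}\,\mathrm{Tr}\!\left[e^{-\beta H} O\right]
= \frac{1}{Z}\sum_{n} \langle n|\,e^{-\beta H/2}\, O\, e^{-\beta H/2}\,|n\rangle,
\qquad
Z = \sum_{n} \langle n|\,e^{-\beta H}\,|n\rangle,
$$

so each term requires evolving a basis state |n> over an imaginary time beta/2, which is exactly what the sampling-based approaches below exploit.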
So there are some popular techniques. The first one is to take this formula as it is and indeed sample initial basis states from my Hilbert space, evolve each of them over beta over two, where beta is the inverse temperature, do this multiple times, and then the TPQ argument tells us that as the system size increases we actually need exponentially few samples. And METTS is another approach, commonly used with tensor networks, based on the same idea but as a Markov chain version of it, let's say: you evolve in imaginary time, you sample from the result, and you restart the imaginary time evolution from that sample. Now with neural-network quantum states we preferably don't do too many time evolutions, because it's honestly expensive. But what we can do better than these other methods is handle correlations, right? That's the whole point of neural networks: in principle they can handle volume-law entanglement. So I would prefer something different from these approaches, something that can go through highly entangled states. So we think again about what a thermal observable looks like. We look at the trace, and instead of sampling over these states and evolving them through time, we can rather introduce an auxiliary system that takes care of generating this trace. The idea is that we introduce what are called thermofield doubles. It's essentially as if you take the density matrix, you have the rows and the columns, and the thermofield doubles are, roughly, the columns; these are the column degrees of freedom. What this equation tells us is that if we start from the identity state, so in terms of the density matrix just the identity matrix, the diagonal matrix, and we evolve it over beta over two according to a Hamiltonian that only acts on, let's say, the rows, then we will actually be able to compute thermal observables. And what we need in order to start this evolution is the identity state, the infinite-temperature state, which is basically a product of pairs, where each physical spin, the sigmas, the system we actually consider, is maximally entangled with an additional auxiliary spin, the thermofield double, which again corresponds to the columns of the density matrix. So the first thing we need to solve is to make sure that the neural network we evolve through time can exactly represent this initial state. We have multiple ways of doing that now, but I don't necessarily want to show how well we prepare thermal states; I want to show the next cool thing, which I like even more, which is that once we've prepared the thermal state, we can also quench it and look at its quantum dynamics. The first question is: OK, we have a thermal state, and I've introduced these thermofield doubles; what is the Hamiltonian we need to evolve under? If you do the math, simple math, you get in the end something like this: what I'm doing is treating it rather like an open system, but rewriting it as unitary dynamics within this doubled space. And the Hamiltonian I need to evolve under is my Hamiltonian applied to the physical system minus my Hamiltonian applied to the unphysical system.
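A compact way to write the purification picture just described, in standard thermofield-double notation (the auxiliary copy carries a tilde):

$$
|\Psi_0\rangle \propto \sum_{n} |n\rangle \otimes |\tilde n\rangle,
\qquad
|\Psi_\beta\rangle \propto \big(e^{-\beta H/2}\otimes \mathbb{1}\big)\,|\Psi_0\rangle,
\qquad
\langle O\rangle_\beta = \frac{\langle \Psi_\beta|\,O\otimes\mathbb{1}\,|\Psi_\beta\rangle}{\langle \Psi_\beta|\Psi_\beta\rangle},
$$

and, as stated above, the subsequent real-time quench is generated within the doubled space by H_tot = H otimes 1 - 1 otimes H-tilde, with H acting on the physical copy and H-tilde on the auxiliary one.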
So again I just use the same machinery that I used in the previous slides for the quantum dynamics of electrons, and I evolve this thermal state through time. Here are some results on the four by four and the six by six transverse-field Ising model. What we've done is create thermal states of the transverse-field Ising model and then quench it with a longitudinal field, so we break the symmetry. The left-hand side shows the ZZ correlation and the YY correlation; YY is just very difficult in Monte Carlo approaches because it's a cancellation between off-diagonal terms, so it's one we really wanted to get right. By introducing a new way of doing imaginary time evolution, which we call PITE, I won't discuss the details, but check out the paper, a really stable method, we were able to do this; otherwise it wouldn't have been possible. There was already some work on this before, also from Yusuke Nomura, who was able to do the imaginary time part, but if we really want to do the real time afterwards, we need to do this far more accurately, with higher precision than was available before. And you see that we can accurately reproduce the METTS results that we can run for these four by four systems once we've evolved through imaginary time. We pick out some of the beta values and quench them in real time, and that's what you see on the right-hand side: the thermal observables through time. You can't see it that clearly, but the effect of the temperature seems to be that the oscillations get damped a bit for higher temperatures. Some nice side products of this project: we have a new way of representing physical density matrices, so avoiding going outside the space of physical density matrices, and specifically we found a new way of representing thermal states with autoregressive models, so if you want to study topological states and see the effect of temperature on them, this is the way to go; if you like that kind of stuff, read the paper. We can now actually represent any neural-network quantum state as a thermal state, so we're no longer restricted to previous work that just used RBMs. And the last point, as I mentioned: the only reason we were actually able to do this is that we invented a new way of doing imaginary time evolution, based on a sort of convolutional sampling, optimizing a fidelity. Again, I won't give the details, I just wanted to flash it, so if you're interested, read the paper. And I'll finish there. To recap, I've shown you three projects that I find somewhat related. The first one is a new way to do imaginary time evolution, well, something different from it really: a new way of doing variational Monte Carlo to look for ground states. The second one was quantum dynamics of ab initio electronic Hamiltonians, and the last one is accurately modeling the quantum dynamics of thermal states. I'll stop there. Thank you. Questions? Maybe I can start. Once you perform the backflow in the determinant, why do you also need the Jastrow factor? You don't need it; you can also absorb it. It's just that I always start from that one and then add the backflow to it.
It's just often a more efficient way of capturing things, but you can choose; we also did the electron gas, and there we don't even need a Jastrow factor. But in principle you don't need the Jastrow factor. Hey, so to combine your two topics of interest, just a speculative question: what would it take to do fermions in real space at finite temperature? Because that's what a lot of condensed matter physicists care about. And you know, if you can show some phase diagram of a decent number of fermions as a function of temperature, using ab initio methods like NQS, that would be really cool. Working on it, yeah. I have a method now, based on some of the topics that I've introduced, that will actually allow you to do that, but it's not an easy one, because we're in continuous space. The first question you have is how to represent the infinite-temperature state, right? So the first thing you need to do is find a way around that. But yeah, good question. Yeah, yeah, this is McLachlan. So it depends; they all coincide if you're looking at holomorphic functions, and some of the results here were actually with holomorphic neural networks, so then it doesn't really matter. McLachlan is a nicer one; some of the others don't respect, for example, energy conservation. So it depends on what you're after. For me, McLachlan makes the most sense; it conserves the energy in principle. Other questions? Maybe we can say thanks again to the speaker.