So, ladies and gentlemen, good afternoon — 17th September 2015. I will in fact give the third, fourth and fifth lecture today, since I found out at the end of the last session that I still had 20 minutes left; that will not happen today. I want to tell you a few things. One, of course, was the story about what tau is. I insist that all the computer-time problems, the computer bugs, cannot be diluted away in the fact that what we are doing is stochastic. You cannot say: well, there are errors everywhere anyway, so why shouldn't I put in a little programming error, why should I wait out the correlation time? I really insist on that. The second point I will treat today, before the break, is the relationship between integration and sampling. Again, I say sampling where others would say simulation — I try to avoid the word simulation, because I want to bring you to an understanding of the sampling aspect of Monte Carlo. We'll do this in very, very simple examples — that's where the word sample comes from — from Gaussians to Maxwell and maybe a little bit of Boltzmann. Then we'll see how it goes: I'll have a little interlude, before the break, on sampling in classical and quantum physics, to show you how things go there. And then we'll go into the subject of classical spin models, where I'll try to explain some of the cluster algorithms for those who don't know them yet. So I hope that everybody gets a little bit out of what I'm telling you — and please interrupt me if you want to. Most of the material that I will discuss was taken from our MOOC, which Alberto Rosso, Michael Köpf and Vivien Lecomte worked on with me last year and this year.

So let's start. Here is an integral, and we will make a test to find out whether you are the modern Monte Carlo kind of person or the more traditional kind. You have this integral — what do you think? The more traditional person says: well, I know, I equals two. Is that true, by the way? You say that because you are of the traditional kind: you are interested in finding out the value of the integral. Of course, it will be one — we'll see. But the real point is that this is not just an integral. It is a probability distribution of x. And the question is: can we sample it? Can we sample x? That is what I want you to understand. This is, of course, a very, very simple example of sampling.

So how do we do this integral? I think this is one of those things in life that you remember — you remember 9/11, you remember many things, and you remember the day you learned how to do this integral. I certainly remember it. Then I forgot, and that's probably why I gave the wrong answer. So: you square the integral, and then you have a few equations, and these equations are written down here. So you square the integral.
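For reference, here is the substitution chain he is about to walk through, reconstructed in standard notation (assuming the $1/\sqrt{2\pi}$ normalization of the Gaussian on the slide):

\[
I^2 \;=\; \frac{1}{2\pi}\int_{-\infty}^{\infty}\!dx\int_{-\infty}^{\infty}\!dy\; e^{-(x^2+y^2)/2}
\;=\; \frac{1}{2\pi}\int_{0}^{2\pi}\!d\phi\int_{0}^{\infty}\!r\,dr\; e^{-r^2/2}
\;=\; \frac{1}{2\pi}\int_{0}^{2\pi}\!d\phi\int_{0}^{\infty}\!d\epsilon\; e^{-\epsilon}
\;=\; \frac{1}{2\pi}\int_{0}^{2\pi}\!d\phi\int_{0}^{1}\!d\psi \;=\; 1,
\]

using $\epsilon = r^2/2$ (so that $r\,dr = d\epsilon$) and $\psi = e^{-\epsilon}$.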
It's just such a pleasure to do this each time. So you square the integral: you have independent integrals in x and in y, and you write them as one two-dimensional integral of e to the minus (x² + y²)/2. Then you transform from x and y into polar coordinates — it's such a pleasure to see things we actually know; why should we teach you new things when it's such a pleasure to teach you old stuff? Anyway, x and y become r and phi — I will get serious in just a second — so you have an integral d phi from 0 to 2 pi, and an integral r dr of the exponential; there are the brackets that were missing late last night. Then you do another substitution: r²/2 becomes epsilon, so that r dr becomes d epsilon, and the exponential of minus epsilon becomes psi. And then you have this last integral, and you get one. So the Gaussian integral comes out as it should.

Now, the point I want to make is that the Monte Carlo approach to the Gaussian integral — or to many other problems — is not to say: hey, I can do this integral. It is to say: hey, this phi comes with a flat probability distribution between 0 and 2 pi, and that is something I can sample. How do I sample it? Well, I pull my smartphone out of my pocket and I punch a random number between 0 and 1, multiply it by 2 pi, and this gives me a random number between 0 and 2 pi. And then psi — you see it here: of course the integral d psi from 0 to 1 equals one, but the important part is that psi is a random variable with a weight equal to one, a uniform random variable between 0 and 1. This is the essence of the sampling approach to Monte Carlo and, if you like, of the simulation approach. So we did all these transformations, and later we'll go through the classical problem of how this carries over to real simulations. It's always the same problem: you have an integral, you don't want to do the integral, or you cannot do it, so you try to sample it — by direct sampling, Markov-chain sampling, and so on.

Now the story becomes much more interesting. What we have is: phi is a uniform random number between 0 and 2 pi, and psi is a uniform random number between 0 and 1, down there in the corner. And what is really interesting is that the same substitutions and calculations that we do by manipulating integrals can also be done with samples. Just so that everything is clear: phi might be 3.6911, and psi might be 0.551. These are numbers; they are samples of the probability distribution — sample, échantillon for those who speak French, Stichprobe in German. This is extremely important. So now we can run through all the transformations. Take psi: psi equals the exponential of minus epsilon, so minus the logarithm of psi gives epsilon. And how do we go from epsilon to r?
We take the square root of 2 times epsilon, and this gives r, and so on, and we can go back up to x and y. Are you following? Did your high-school teacher already teach you that? What we're doing is: we have the substitution rules for integral transformations, and we apply these same substitution rules to the samples. And of course there's a computer program, right here — this is in my favorite computer language, called Python. Phi is a uniform random number between 0 and 2 pi, psi is a uniform random number between 0 and 1, epsilon is minus the logarithm of psi, r is the square root of 2 epsilon, x is this, y is this, and I return x and y. And I do this a big number of times — I'm just taking the rules I had for integral transformations and using them on the samples. Now, what is the distribution of x? Each time I do this calculation I get an x and a y, because I squared the integral, so let me plot x and y. Here is a point (x, y) computed from the samples phi and psi: you take the logarithm, then the square root of two times that, and so on, and you continue — check, check, check. This is the output of the program, and it has two qualities. First, it looks isotropic: you can turn it around — I don't have the program to do it, but you can imagine — it is a spherical cloud; we'll go back to the formula to see why. And this is of course not the case if you take a uniform random number in x and a uniform random number in y — that you cannot turn. Second, you look at the probability distribution of x, or of y, and you notice that it's a real nice Gaussian. Of course it is, because I told you so. And I'm telling you: anything you can do with the integral, you can also do with the samples — we'll have a few more illustrations of this later. Why is the distribution isotropic? Very easy: after the transformation you see it already here — you have an integrand that depends only on r and is flat in phi, so it must be an isotropic distribution in x and y. All right. This algorithm is in fact, more or less, the one that is programmed in the Gaussian routines available in Python and in any other language. It's called the Box-Muller algorithm. Again, it's just for pleasure, okay?
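To make this concrete, here is a minimal sketch of the gauss_test program he describes — the variable names follow the derivation above; the guard against log(0) is an implementation detail not in the original:

```python
import random, math

def gauss_test():
    """One Box-Muller step: two uniform samples become two Gaussian samples."""
    phi = random.uniform(0.0, 2.0 * math.pi)  # flat angle in [0, 2*pi)
    psi = 1.0 - random.random()               # flat weight in (0, 1], avoids log(0)
    epsilon = -math.log(psi)                  # invert psi = exp(-epsilon)
    r = math.sqrt(2.0 * epsilon)              # invert epsilon = r**2 / 2
    return r * math.cos(phi), r * math.sin(phi)

samples = [gauss_test() for k in range(100000)]  # a cloud of (x, y) pairs to plot
```

Plotting the x components (or the y components) of these pairs gives the nice Gaussian histogram he shows.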
So now, this isotropy of Gaussians in two dimensions can be checked — we just checked it, and we can prove it — so let's check it in three dimensions. We can of course turn our gauss_test program into a real program for generating Gaussian random numbers: we always generate two, we keep one in memory, and the next time around, instead of redoing the calculation with the logarithm and so on, we just hand out the one we kept in memory. That is exactly what is done in the Box-Muller method. So now let's check the isotropy of three-dimensional Gaussians, and just for fun I switched from our gauss_test program to the Gaussian random numbers built into Python, with mean zero and standard deviation one, and I plot them. Here I have a movie — I even have a movie on Gaussians, can you imagine. These are the samples in three dimensions as they are created; people like to look at these kinds of movies. And it works: three Gaussians generate an isotropic distribution, an isotropic cloud. I think the mathematician Rényi proved that this is actually the only such distribution: the Gaussian is the only function which, taken independently in x, y, z and so on in d dimensions, gives you an isotropic distribution. So next time you need a random point, randomly somewhere, never again use uniform in x, uniform in y, uniform in z. Remember what I told you: use a Gaussian in x, a Gaussian in y, a Gaussian in z — or in d dimensions if you need — and then you have a really isotropic distribution, not something that looks like a cube, which is what you would get using uniforms in x, y and z.

So now let's all get a haircut. Let me see — no haircut for me today. Look at this beautiful movie by Maxime Berman, who worked with me last year. We had this isotropic distribution, and we gave it a haircut: we cut all the vectors to length one. And what is the distribution that we have now sampled? Maybe one of you can give the answer — not those who know it already. Say it louder: it's a uniform distribution on the surface of the sphere. So now we have an algorithm for a uniform distribution on the surface of the sphere — in fact, on the sphere in any dimension. It's a beautiful algorithm, and the workhorse behind it is simply: Gaussians in x, y, z, in your d dimensions, then the haircut. Actually 'haircut' is not quite the right word, because you don't only cut: vectors shorter than one have to be pulled out to one with a little growing essence, and the longer ones actually get cut. This is the way to get random points on the surface of the sphere.
My little algorithm, direct_surface_3d.py — a beautiful algorithm. I think it is kept in the British Museum; it's the crown of the collection, it's such a beautiful algorithm that they keep it in the British Museum. So: you take random Gaussians — this is the mean value, this is the sigma — independently in x, independently in y, independently in z; then the haircut means that you compute the radius and you renormalize each of x, y and z by the radius. This gives direct_surface_3d.py, a beautiful little algorithm. Now let me show you how this works — very nicely, of course. So please remember that uniform objects are created like this; there is essentially no other algorithm able to generate random points on the sphere. How would you do it with a Markov-chain algorithm — wander through space and each time you randomly hit the sphere...? No: this is the beautiful algorithm. It is not a Markov-chain algorithm, it's a direct-sampling algorithm for random points on the sphere in any dimension, and it was invented by James Clerk Maxwell — I will show you this a little later.

So now let's do a little bit of math, just to understand, because the subject of today is integration and sampling. We have three Gaussians, and these Gaussians are not just objects that we simulate in some way. Each of these Gaussians samples an integral, and this integral is called the Gaussian integral: if x_k is a sample, it samples the distribution e^(-x_k²/2) from minus infinity to infinity. And d independent samples of the Gaussian distribution together sample the integral dx_0 ... dx_{d-1} of e^(-(x_0² + x_1² + ... + x_{d-1}²)/2). So again: it's not just d points simulated in space — they are representatives, samples, of this integral; this is the probability distribution and this is the measure. Now we do a transformation, a substitution rule: we go from x_0 ... x_{d-1} to r and the solid angle Omega. We do the same trick, or a similar trick, as before — but not only in the integral; we do it also on the samples. So this up here becomes r², the measure becomes r^(d-1) dr dOmega, and the d independent Gaussian samples, which sample the integral up here, also sample the integral of r^(d-1) e^(-r²/2) dr times the integral dOmega. Omega is the solid angle — for all those who never quite understood it, it should be something from astronomy, I think. Now, if you look at this: what is the function in front of dOmega? The function in front of dOmega is one. So this means that we have a flat distribution in solid angle.
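Since the weight in front of dOmega is flat, the Gaussian-plus-haircut recipe really does sample the sphere uniformly. A minimal sketch of the direct_surface_3d.py routine described above — any standard deviation works here, because the haircut removes the radial scale:

```python
import random, math

def direct_surface_3d():
    """Uniform random point on the unit sphere: three Gaussians, then a 'haircut'."""
    x, y, z = (random.gauss(0.0, 1.0) for k in range(3))  # isotropic Gaussian vector
    r = math.sqrt(x ** 2 + y ** 2 + z ** 2)               # its radius
    return x / r, y / r, z / r                            # renormalized onto the sphere
```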
So this proves that the distribution of our samples is a uniform distribution in Omega, times this gamma-like distribution in r. Again, the important point: you have integration and you have sampling, and you have substitution rules for both of them — and they are the same substitution rules, because implicitly we took our samples x, y and z, computed the sample of r, and used the substitution rule. So the d samples of x are the same thing as one sample of Omega and one strange sample of r. Now we can play a few more games with this. For example, we can replace the distribution in r by a haircut, a unit point: the distribution factorizes into a distribution in r multiplied by an independent distribution in Omega, and if we play around with the distribution in r, we don't change Omega and we conserve the isotropy. This is why the direct-surface algorithm works.

Now let's do another trick. Maybe we are interested not just in random points on the surface of the sphere, but in random points inside the sphere — imagine a chocolate ball filled with a uniform density of chocolate. Then we have to replace the distribution of r, which was r^(d-1) e^(-r²/2), by a distribution proportional to r^(d-1), because the uniform measure in d dimensions is r^(d-1) dr dOmega — dx_0 ... dx_{d-1} equals r^(d-1) dr dOmega. So we need the r's distributed not uniformly, but with a distribution r^(d-1), and let me tell you how to do it: you take uniform random numbers between zero and one and raise them to the power one over d. This gives you the third or fourth algorithm I'm presenting to you, direct_sphere_3d: you take random Gaussians — you have them here — you give x, y and z a haircut by dividing by the square root of x² + y² + z², and you multiply by a uniform random number raised to the power one third, because we are in three dimensions. You see what I'm doing here — I'm just playing games with this Gaussian distribution. This is the algorithm — you can look it up on my website — and now we look at the samples. Isn't this beautiful? So this is an algorithm to generate random points inside the sphere. An alternative algorithm, of course, would be to take random points in a cube — between minus one and one in x, minus one and one in y, minus one and one in z — and cut off all the points that fall outside the sphere. But that alternative algorithm gets into very, very big trouble in high dimensions.
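A sketch of the direct_sphere_3d routine just described — Gaussians for the direction, a haircut, then a radius drawn as a uniform random number to the power 1/3, so that it comes out distributed proportionally to r²:

```python
import random, math

def direct_sphere_3d():
    """Uniform random point inside the unit sphere."""
    x, y, z = (random.gauss(0.0, 1.0) for k in range(3))
    norm = math.sqrt(x ** 2 + y ** 2 + z ** 2)
    r = random.uniform(0.0, 1.0) ** (1.0 / 3.0)   # radius with distribution ~ r**2
    return r * x / norm, r * y / norm, r * z / norm
```

In d dimensions the only change is the exponent 1/d on the uniform random number.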
So after this little trick, let me go back to the physics, to where we almost stopped yesterday. I was a little bit disappointed that there was no remark, because I showed you this slide yesterday — I even had discussions over dinner about it — and claimed that molecular dynamics and Monte Carlo are exactly the same. This and this are exactly the same. But of course something is missing here, right? What is missing? Yes: there are no velocities. Maybe all of you know this already, but it's worthwhile spending three minutes: where are the velocities in Monte Carlo? Good question — a good exam question. How can I say these are the same if half of the variables, half the dimensions, are missing? So let's look at what is happening. On the left — that's also one reason why I put them in a box, not with periodic boundary conditions — you have the four particles moving around, and they conserve the kinetic energy. The kinetic energy of our molecular-dynamics system is conserved, which means that for our N particles (N equals four in our case) v_0² + v_1² + v_2² + v_3² is conserved, and each of these velocities has a component in x and a component in y. So v_0x² + v_0y² + ... + v_3x² + v_3y² is a constant. What equation is this? It is the equation of a sphere — a sphere in eight dimensions. So the day before yesterday, when I told you about the theorems proved by the mathematicians Sinai and Simányi, I told you only half of the story. Half of the theorem was that the positions are uniformly distributed — each set of four positions is equally probable. That was the first result. Second result: the velocities are random points on the hypersphere in eight dimensions. So here is the second part of the Monte Carlo sampling for hard spheres: we had the tabula-rasa algorithm two days ago for the positions, and now we need a new algorithm. What should this new algorithm do? Sample random points on an eight-dimensional sphere. Fortunately, we have just become big experts at doing this. The probability distribution pi(v_0, ..., v_{N-1}) is one if the velocities are legal — that is, if they lie on the surface of the eight-dimensional hypersphere — and zero otherwise. So we have to sample random points on the surface of a sphere in d times N dimensions: eight dimensions if the dimension is two and we have four particles. This means we choose the velocity components of each particle from the appropriate Gaussian, and the haircut I told you about even becomes unnecessary in the limit of N going to infinity, if you choose sigma correctly. So the Maxwell distribution simply comes from the fact that in the British Museum they keep this famous direct-surface algorithm, and that the Gaussian distribution is an isotropic distribution in N times d dimensions that can be used to sample random points on the sphere. And this is the original argument already given by Maxwell.
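A sketch of this velocity sampler, under the simplifying assumption that the total kinetic energy is normalized to one (the physical sigma set by the temperature is left out, as in the lecture):

```python
import random, math

def sample_velocities(N, d):
    """Random point on the surface of the (N*d)-dimensional velocity hypersphere."""
    v = [random.gauss(0.0, 1.0) for k in range(N * d)]    # isotropic Gaussian vector
    norm = math.sqrt(sum(vk ** 2 for vk in v))
    v = [vk / norm for vk in v]                           # 'haircut': fixes sum of v_k^2 to 1
    return [v[i * d: (i + 1) * d] for i in range(N)]      # one d-dim velocity per particle

velocities = sample_velocities(4, 2)   # four hard disks in two dimensions
```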
All right, so now I come to the conclusion of the first of three parts; maybe we can have a little discussion. We considered, for the first time — we'll go through it again — this concept of integration and sampling. Again, sampling doesn't mean that we do a simulation; sampling means that we draw examples from an integral. Then I showed you two examples of sample transformation: the substitution rules that we usually use for integration can also be used for samples. And we had algorithms for equal probability inside the sphere, equal probability on the surface of the sphere, and the Maxwell distribution. The algorithms we considered were gauss_test — the test program that used I squared — the Gaussians in 3D, direct_surface and direct_sphere. So, are there any questions?

[Question from the audience.] Okay, what it means is this: r is the norm built from N random numbers, and then it has a very narrow distribution. The sigma that I didn't spell out is the sigma that comes from the mean kinetic energy at the given temperature — I didn't want to go through it, but you see that the distribution r^(d-1) e^(-r²/2) becomes peaked, becomes a delta function, in the limit of N going to infinity. What is proven for a finite number of particles is that they are not described directly by the Maxwell distribution; they are described by a uniform distribution on the surface of the sphere. So for these four particles running around, the exact velocity distribution is something extremely close to the Maxwell distribution — but it is the distribution of random points on the surface of the sphere. And about that two-versus-one earlier: I didn't want to be unpleasant. Well, we learn this in our classes, sometimes less explicitly: the Boltzmann, or Maxwell, distribution applies to any system for which the Hamiltonian separates. You see it best in our case, hard spheres: the distribution of positions is really completely decoupled from the distribution of velocities. It is not obvious, but this is the underlying assumption of more or less all of statistical physics. If you have a potential energy plus a kinetic energy, then the probability e^(-beta E) factorizes into e^(-beta E_pot) times e^(-beta E_kin). In our case this is the separation: the probability distribution of positions and velocities is given by the product of a probability distribution of the positions times a probability distribution of the velocities. And the position part is the equal-probability distribution for hard spheres.
You can discuss it so clearly for hard spheres, but the same holds very generally: the equal-probability distribution for the positions, and the Maxwell distribution for the velocities. It is true — it is just not obvious. And now we learn something more: not only are the positions completely decorrelated — the velocities are completely decorrelated from where the particles are. So the full Monte Carlo algorithm for simulating hard spheres would be: one box with the tabula-rasa algorithm for the positions, and another box sampling velocities in eight-dimensional space, and I could randomly attach the velocity vectors to each of these balls — and this is the proven distribution for this system in the box. Comments? Objections? Okay.

I won't go into the details of the advantages of this algorithm, but if we had more time, we would see, for example, that the Gaussian algorithm becomes infinitely more powerful — let me try it. Okay, so this is the sphere inside a cube. What would be an alternative, less knowledgeable algorithm for generating random points on the surface? Let's do it in 2D — on the circle. We take a random number between minus one and one in x, and a random number between minus one and one in y — a random point anywhere in the square. Then we check whether x² + y² is smaller than one: if not, we take it away, so we cut out the corners of the square — a haircut just close to the ears, here and here. So we take random points inside the disk, and then we do the haircut, renormalizing onto the circle. This would also be a valid algorithm, an alternative to the Gaussian direct-surface method. Do you understand this point? And in fact, in the Box-Muller algorithm for Gaussians, it is this variant that is used, because it avoids computing the sine and cosine that appear in the version I showed you.
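A sketch of this alternative circle sampler — rejection from the square, then the haircut:

```python
import random, math

def direct_circle_point():
    """Uniform point on the unit circle: reject the corners of the square, renormalize."""
    while True:
        x = random.uniform(-1.0, 1.0)
        y = random.uniform(-1.0, 1.0)
        r_sq = x ** 2 + y ** 2
        if 0.0 < r_sq < 1.0:            # keep only points strictly inside the disk
            r = math.sqrt(r_sq)
            return x / r, y / r         # haircut onto the circle
```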
Okay, let me continue a little bit — one second, let me get this to compile. Now a little special on the relationship between classical and quantum systems. I was not supposed to talk about quantum systems, but let me make just a few remarks, to show you the great generality of what we are doing. You saw these hard spheres before. One way of looking at our business is to say: we are doing simulations of hard disks. That is, I think, the less fruitful way of looking at things. It is much better to think that we are doing classical physics: we have a partition function, and the partition function is the integral dx_0 ... dx_3 — each of these a two-dimensional vector — of the probability distribution pi(x_0, x_1, x_2, x_3), which is one if the configuration is legal and zero if it is illegal. This is what we are doing when we are sampling in Monte Carlo calculations. Of course, in general this becomes more complicated, and pi is replaced by the Boltzmann distribution. The special point is that in all cases of classical physics the distribution pi is completely explicit: you can write down pi for any configuration of your system. The real problem, in both cases, is doing the integral — computing the partition function — or doing the sampling. So again, let me say this: in classical physics we have Z, which equals the trace of e^(-beta E) — this is what I just wrote down for hard spheres — a multiple integral over the particles. This is the Boltzmann weight, the statistical weight, and this weight is completely explicit, easy to compute. The problem really is doing the integral, or doing the sampling.

Now, in quantum physics — there are other lectures where you see many subjects of quantum physics, and I did not want to push too much material into one course — this here is simply a single particle as a function of imaginary time: a one-dimensional particle in x, in a harmonic potential; we can go into details if you want. This is the path integral. And what does it mean to simulate the path integral? It means nothing else but to sample the quantum partition function Z, which is an integral over the intermediate time slices — chuck, chuck, chuck — of a product of density matrices between x_0 and x_1, x_1 and x_2, ..., x_{N-1} and x_0. So again we have the same kind of object. In classical physics Z was the integral of the Boltzmann distribution; in quantum physics Z equals the integral of the diagonal density matrix rho(x, x, beta) — I don't want to go into the details — written as the path integral I just put up here. So the sampling problem is exactly the same as before. In the hard-disk system, what was the naive algorithm? Markov disks: we take one disk at random and we move it in plus or minus x, plus or minus y, and we have a new configuration. What is the naive algorithm here? At each step, we take one point of the path at random and we move it a little bit, in minus x or plus x.
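A minimal sketch of this naive path-integral move, along the lines of the naive harmonic-path program he is about to discuss (the number of slices, beta, and the step size delta here are illustrative):

```python
import random, math

def rho_free(x, y, dtau):
    """Free density matrix between two neighboring slices (up to normalization)."""
    return math.exp(-(x - y) ** 2 / (2.0 * dtau))

beta, N = 4.0, 16                     # inverse temperature and number of time slices
dtau = beta / N
delta = 1.0                           # maximum displacement of one bead
x = [0.0] * N                         # the initial path
for step in range(100000):
    k = random.randint(0, N - 1)      # pick one bead at random
    k_prev, k_next = (k - 1) % N, (k + 1) % N
    x_new = x[k] + random.uniform(-delta, delta)
    old_w = (rho_free(x[k_prev], x[k], dtau) * rho_free(x[k], x[k_next], dtau) *
             math.exp(-0.5 * dtau * x[k] ** 2))           # harmonic external potential
    new_w = (rho_free(x[k_prev], x_new, dtau) * rho_free(x_new, x[k_next], dtau) *
             math.exp(-0.5 * dtau * x_new ** 2))
    if random.uniform(0.0, 1.0) < new_w / old_w:          # Metropolis acceptance
        x[k] = x_new
```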
I wanted to show this form for the few among you who may not do path-integral Monte Carlo simulations yet: you don't have to be afraid of these calculations. There's no reason to be afraid of them. There's a ten-line program that applies exactly the concept so familiar to us from hard spheres: it takes the object, moves it a little bit to the right, moves it a little bit to the left, and that is the program. So what does it do? Before, the object was four disks, and the algorithm consisted in taking one disk at random and moving it in a random direction. For the path integral, the object is a path — this is the imaginary time, and this is x — and the algorithm consists in taking one bead at random, trying to move it in x by a random amount between minus delta and delta, computing the new weight and the old weight, and accepting the move with probability given by the ratio of the new weight to the old weight. Is that clear? Each of us could sit down, program these ten lines, and have a valid experience with quantum Monte Carlo. This rho_free is simply a Gaussian — it's a Gaussian distribution because I wrote it up here. So what I'm doing here is calculating the free part and then the external potential: it's a complete calculation for a particle that thinks it is a free particle but has to learn that it lives in a harmonic external potential.

Now, what is really interesting in these quantum path-integral systems is that you don't only have Markov-chain Monte Carlo algorithms. You also have direct-sampling algorithms. So the problem is: how do you sample? In the classical system, you sampled the positions from the uniform distribution; in path-integral simulations, you have to sample the path from this distribution, with this weight. I showed you the tabula-rasa algorithm that did direct sampling for four disks. For these problems — a single free particle, or a particle in a harmonic potential — there is also a direct-sampling algorithm, which goes back to Lévy in the 1940s. Let me just show you. Say you have x_0 and x_N fixed — suppose you have these positions, fixed. You can sample the position x_1 exactly, from a Gaussian distribution. Then you can sample x_2 from a Gaussian distribution, then x_3 from a Gaussian distribution, and so on and so on.
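A sketch of that slice-by-slice construction — each intermediate point drawn from its exact conditional Gaussian — in the style of the levy_free_path program he mentions next:

```python
import random, math

def levy_free_path(x_start, x_end, dtau, N):
    """Direct-sample a free-particle path between two fixed endpoints (Levy construction)."""
    x = [x_start]
    for k in range(1, N):
        dtau_prime = (N - k) * dtau   # remaining imaginary time to the fixed endpoint
        # exact conditional distribution of slice k: a Gaussian with this mean and width
        x_mean = (dtau_prime * x[k - 1] + dtau * x_end) / (dtau + dtau_prime)
        sigma = math.sqrt(1.0 / (1.0 / dtau + 1.0 / dtau_prime))
        x.append(random.gauss(x_mean, sigma))
    x.append(x_end)
    return x

path = levy_free_path(0.0, 1.0, 0.1, 8)   # eight slices, no rejections anywhere
```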
So you can do a direct-sampling algorithm. This is called the Lévy construction. It's also freely available on my website, levy_free_path — here is the version for a free particle with the initial and final positions fixed, but there is also a version for harmonic paths. Now look at this: you can construct this path without any rejections. You have a direct-sampling algorithm for this path; you can build it without any use of Markov chains. And this is the basis of our exact sampling algorithm for bosons in a trap, which I could discuss with people who are interested.

So let me come to the conclusion of the second part. The concepts we considered here were classical and quantum physics. The main point of the path integral is that it makes the calculation of the weight — the analogue of the Boltzmann weight — completely explicit. The statistical weight in quantum physics is not given by the Boltzmann distribution; it is given by the density matrix. First point. Second point: path integrals give you a completely explicit weight for each configuration. And I showed you the Lévy constructions. The algorithms I considered are the naive harmonic-path algorithm, the Lévy construction, and I also have a version called levy_harmonic_path. So, are there questions, comments?

[Question.] You can do it for any particle in an external potential for which the weight becomes explicit. The point is you need an explicit formula; let's say the simplest version is for a harmonic external potential, but there are many versions — particles in a box, single particles, and so on. Now, where it becomes more interesting, of course, is what we are really interested in: interacting particles. The point is that for two interacting particles you still have direct-sampling constructions, but the corrections from many-particle interactions have to be taken into account through rejection methods or through biasing methods. But here is the point about this naive algorithm: the proposal knows nothing about quantum physics. It just takes a path, takes a bead, and tries to move it to the right or to the left. Nothing is included — everything about the Schrödinger equation, everything about the quantum nature of the system, has to be implemented through rejections, and this always cuts down on the acceptance. The Lévy construction, on the other hand — this algorithm here — already knows the free Hamiltonian. It makes no error: it proposes, as its proposal, exact paths of the free Hamiltonian.
And then, through the rejection step of the Metropolis algorithm, of the Monte Carlo algorithm, you only cut away the difference between the interacting system and the full non-interacting quantum system. I don't have it here, but I have a program — just a few dozen lines — that does N-particle boson simulations. Even before the rejections it already takes, for example, all the permutations into account; afterwards it is just the interactions — in fact the more-than-two-particle interactions — that have to be cut away, haircut-style, through rejections. More comments? Okay. So let me do the last twenty minutes of the final part of what I want to explain to you.

Now I will discuss a little bit of classical lattice spin systems, and the main point I want to make is to get the cluster algorithms completely clear for everybody. So let's look at the Ising model — I will be discussing the Ising model for most of the time. Of course, everybody knows it: spins sigma equal to plus or minus one, set up such that only neighbors interact, with a pair energy of minus one if two neighbors are aligned — like this or like this — and plus one if they are anti-aligned, like this or like that. For concreteness, let's look at the two-by-two Ising model, for simplicity, and for the moment without periodic boundary conditions. You can immediately see the energies. For example, this configuration has one, two, three, four aligned pairs, so the energy here is minus four. Then there are two configurations where all the neighbors are anti-aligned, like here: energy plus four, and energy plus four here also. And the others have energy equal to zero. So the two-by-two Ising model has sixteen configurations, and so on.

Now, in which order do I enumerate them? I start with one configuration — before doing the Monte Carlo, let's do the enumeration. Does anybody see which algorithm I used to enumerate these? I gave it away in the title. It is not done by taking the numbers zero, one, two, up to fifteen and writing down the binary representation of each. This is a very useful little trick — which way is it done? Does anybody see the pattern? Look: here everybody is minus — four minuses — and here there are three minuses, so it's this spin here that has changed. Going from this one to this one, it's this one that changed; this and this remain the same. So let's see whether we have a pattern. How many spins change from here to here? It's this one. From here to here? It's again this one. Isn't it funny? Each step, exactly one spin flips — this is a Gray code enumeration.
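A sketch of the flip bookkeeping behind this — the seven-line gray_flip routine he describes next, written in the style of the lecture's programs; tau is a helper list that tells you which spin flips next:

```python
def gray_flip(tau, N):
    """Return the index k of the next spin to flip in a Gray-code enumeration."""
    k = tau[0]
    if k > N:
        return tau, k                 # all 2**N configurations have been visited
    tau[k - 1] = tau[k]
    tau[k] = k + 1
    if k != 1:
        tau[0] = 1
    return tau, k

N = 4                                 # four spins: the 2 x 2 Ising model
tau = list(range(1, N + 2))           # initialization of the bookkeeping list
for step in range(2 ** N - 1):
    tau, k = gray_flip(tau, N)
    # ...flip spin k - 1, update the energy from that spin's neighborhood only...
```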
So you go from here to here: it's this one that changes, then this one, and so on and so on. This is an enumeration of all spin configurations that changes only one spin at a time — this is what is called a Gray code enumeration. And you see how it is done: this spin here changes every second step; this one changes with period four; this one with period eight; and this one with period sixteen, and so on. You can also flip the whole scheme over, if I'm not mistaken, and it still changes only one spin at a time. You can use it for larger systems, and the advantage is that to compute the energy of the new configuration, you only have to look at the local neighborhood of the one spin that changes. Very, very useful. It can be programmed — in fact, it is programmed — in seven lines; you see it here. You call this routine and it gives you the bit that changes. You start off with all spins minus one, with energy minus 2N; you initialize this tau that is used here; and then two-to-the-N times you ask the Gray code which spin changes, you compute the local field on that spin, you compute the energy change, and you continue. Of course this is still an exponential algorithm — it's not very useful for very large systems — but it lets you do the enumeration a little faster. And it allows you to compute the partition function, the sum over configurations of e^(-beta E) — much easier to write as the sum over energies of the density of states N(E) times e^(-beta E) — and also the mean energy; a one-over-Z is missing here in this formula. So you use the Gray code to compute the number of states at energy minus four, zero, and so on. This is a program called thermo_ising: it computes the density of states as a function of energy using the Gray code, and then for all temperatures it does a very simple calculation. This allows you to compute the specific heat, and already for small systems with periodic boundary conditions — two-by-two, four-by-four, six-by-six — you see a nice peak going up close to the exactly known transition temperature.

Okay, so now the last thing I'll do before the break: Markov-chain Monte Carlo. What does it mean? You take a configuration, just as we had yesterday or the day before; you take a configuration A, you propose a move from A to B, and the probability with which you have to accept the move is p(A → B) = min(1, e^(-beta (E_B - E_A))). And again, this is a little program — I think everybody should program this once in their life. There is some initialization to compute who is a neighbor of whom, and then the update loop, which the following sketch spells out.
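A sketch of such a program — the neighbor table here assumes a square lattice with periodic boundary conditions; lattice size and temperature are illustrative:

```python
import random, math

L = 6                                 # lattice side; N = L * L spins
N = L * L
# neighbor table: right, up, left, down, with periodic boundary conditions
nbr = {i: ((i // L) * L + (i + 1) % L, (i + L) % N,
           (i // L) * L + (i - 1) % L, (i - L) % N) for i in range(N)}
beta = 1.0 / 2.0                      # inverse temperature
S = [random.choice([1, -1]) for k in range(N)]   # random initial configuration
for step in range(100000):
    k = random.randint(0, N - 1)      # pick one spin at random
    delta_E = 2.0 * S[k] * sum(S[nn] for nn in nbr[k])   # energy cost of flipping it
    if random.uniform(0.0, 1.0) < math.exp(-beta * delta_E):
        S[k] *= -1                    # Metropolis acceptance: flip the spin
```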
The spin configuration starts as a random choice of plus one or minus one on each site. Then at each time step you take a random spin, you compute the energy it costs to flip it — this gives you delta E — and if a random number is smaller than e^(-beta delta E), you flip the spin: it is multiplied by minus one. Otherwise you don't do anything, and you continue. For a small system this calculation gives you, of course, exactly the same results as before, and you can do much larger systems. Okay, so here are the simulations. At high temperature you have plus domains and minus domains — you see them very nicely; this is a very basic calculation. At lower temperature you have either all-minus or all-plus configurations, with only single flipped spins. And around the critical temperature you have big regions of pluses and big regions of minuses.

Now, the convergence time of this algorithm becomes quite slow at the critical temperature — let me just give the first idea. Around the critical temperature — this is what I wanted to show — you start having big regions of minuses, for example here, that stretch over the whole system, connected to big regions of plus. So imagine you flip this one spin over: it is very difficult not to flip it back later, and it becomes extremely difficult to move the system forward in time. And so the idea of cluster algorithms came up. What do we want to do? We want to flip larger parts of the system, not just a single spin. The easiest idea — this is my first cluster algorithm for Monte Carlo — would be: you take one spin, this spin here, and you take all the neighbors that have the same sign — this one, and this one, and this one. But you don't stop there: it would be even better to take also this neighbor, and this neighbor, and this one — all the neighbors that are connected — and try to flip them all. Next step, you look for a new spin: you find all the connected like-signed neighbors — for example this neighbor, and this, and all the neighbors of those that are negative — and you flip them. And in three steps, everybody is plus. And then what's the next move? You take one spin, connect all the neighbors, and all the neighbors of the neighbors, and so on, and try to flip them: check — over, over, over. All right. So this is a bad idea, because it does not satisfy the Boltzmann distribution. This is not what I should do. But what is really amazing is that the minimal idea beyond this — and this is what I will do after the break — is Wolff's cluster algorithm. I'll show you later on — really amazing. You take one spin, and now you try to build a cluster.
So: in the Metropolis argument, before, we took one spin and tried to flip it. Now the idea of cluster algorithms is that you make a cluster of same-sign spins — all the plus spins here, say — and then you flip the whole cluster. The version I showed you a moment ago — you understood that it was a bad idea, right? But all the bad ideas are the precursors of the really great ideas. So now I do the following. I take this spin, and I consider this neighbor to the right, but I take it into the cluster only with probability p. To construct the cluster, I take this guy, along this bond, only with probability p. What does 'with probability p' mean? I draw a random number between zero and one: if it's smaller than p, I accept this new spin; if it's larger than p, I don't accept it. Okay, so now I accepted this one; this one I also accepted; maybe I also accept this one — these are the new spins in the cluster, next to the spins already present. And now, for example, this is the cluster: I rejected this guy, I rejected this guy, I rejected this one, I take this one, and so on. And then I compute the probabilities, and I flip this whole cluster. And — so as not to be done before the break — we'll look into how the concept of a-priori probabilities is able to cope with this super-brilliant idea of constructing a cluster not by firmly soldering together all the adjacent plus spins and flipping them, but by taking each spin only with a probability p. We'll find a general algorithm for any p, and this will be the Wolff cluster algorithm. So maybe we have five minutes of discussion, if there are comments. And we'll also arrive at a ten-line program of the Wolff cluster algorithm for this Ising spin model. Comments?

[Question.] No, no — what I'm saying is that p is a number. We'll find that there is a magic number, but we'll find an algorithm that is correct for any p. So maybe let's say p is one half, and we run this algorithm with p equal to one half. You misunderstood me: p is fixed, a number. What I was saying is that in order to decide whether I accept this spin into the cluster, I take a random number between zero and one — p is here — and if the random number is smaller than p, I accept the new spin into the cluster; if it is larger than p, I reject it. [Another question.] No, no — what I told you last time is that we have the Metropolis-Hastings algorithm; I was going quite fast. We have a way to arrange that for any p we get exactly the same outcome. So let me put it this way: I construct the cluster, and the cluster I construct depends on the value of p. Then I'll have a rule to decide whether I accept this move or whether I reject it. And the outcome will be that each configuration appears with probability e^(-beta E(sigma)). But it's an important point you raise — just like in the hard-disk system, where we didn't discuss it:
there, I moved a particle a little bit to the right or left, up or down, and there was a parameter. The physical outcome of the simulation did not depend on that parameter: I get exactly the same configurations. The efficiency of the algorithm depends on its value, but the samples I obtain are independent of it. If you look in the program, x went to x plus a random number between minus delta and delta. So the choice of delta influenced the speed of the algorithm, but not the results obtained with the algorithm. [Question.] I'll give you exactly the formula. There will be a magic value — of course there is a magic value — but I'll give you a formula that tells you with which probability to accept or reject the clusters that you construct. The first, stupid example was such that you had to accept it with probability zero — that was the slightly complicated bit. But now I'll show you the formula that allows you to run the Wolff cluster algorithm with any value of p. Sometimes it will be more efficient, sometimes less efficient — but the point is, it must be exact for any value of p. [Question: a random number?] Well, you could do that also; I would have to do a little calculation during the break. You can do any algorithm that you want — I'll give you the recipe. That was the idea, two days ago, when we discussed the subject of a-priori probabilities. Do you remember, in the three-by-three pebble game — were you here two days ago? I'm just asking; I'm not the police. Two days ago we had a completely idiotic algorithm, remember? In the original three-by-three pebble game, the pebble moved with probability one quarter to the right, one quarter up, one quarter left, and one quarter down. And some of you asked about these a-priori probabilities: what happens if I decide to move with probability one half to the right, one quarter up, one eighth to the left, and one eighth down? And I told this person — told you all — that nothing happens, because we have the choice of biasing our algorithm such that the result is completely independent. The point you raise is extremely important. This probability p, this thing here — this is just the algorithm. There are billions of algorithms around, and all of them are valid. Let me just say it again: in what we are doing here, P is the algorithm. It is the probability to move to the right, to move up, to move left, or to move down; it is the probability to grow a cluster; it is the probability, in the path integral, to move a bead to the left, and so on. P is the algorithm. And pi is the Boltzmann distribution in classical physics, the density matrix in quantum physics.
So the point is, up here there are billions of algorithms. Even within the class of cluster algorithms there are billions of choices: you can make p a random number, you can use p equal to 0.5, you can use the magic value of the cluster algorithm. The reason why we are in this room is that there is an infinite number of algorithms we can devise. Of course, the best algorithm will win, but all of these algorithms must be correct. To be correct, they need to satisfy global or detailed balance, aperiodicity, and irreducibility, and I will be able to enforce the condition of detailed balance for any choice of p. So there are millions of algorithms, and all of them must satisfy: one, global balance; two, irreducibility; three, aperiodicity. But all of these algorithms may have different tau, right? They may differ in tau, the correlation time, but they all have the same stationary probability distribution.

This is an extremely important point. How often do we confuse the properties of our algorithm with the properties of our physical system? They have nothing to do with each other: all of these algorithms satisfy one, two, three, and they all have the same stationary probability distribution. Go ahead. Yes, of course, I'll show you; there is a brilliant way of doing it. But the important point was that any algorithm has parameters, and you have to arrange those parameters; in the Metropolis algorithm, in the Monte Carlo algorithm, you always arrange them by rejection. The rejection is your handle to enforce detailed balance or global balance, and with rejection you can do anything. For example, take the hard-disk system: you want to move this particle to the right, but the particle cannot move to the right because of the overlaps, so you reject the move and leave the particle where it is. It is always the rejection mechanism that enforces the detailed-balance or global-balance condition. That's an important point here, I think.

Other points, maybe? No, no, it's not a decision. Let's move on a little bit. This is extremely important: the separation between these algorithms and the distribution must be understood, I think. Good idea to raise this point. Are there other points that should be brought to the audience? You want to say something? Yes, of course, that is what we will see; I'll show you. One goal is, of course... but it is not completely true; now you'll get me talking. One goal is, of course, to make as few rejections as possible, because the point is that an algorithm with a lot of rejections will not move. So what you do is make smaller moves, and when you make smaller moves, you have fewer rejections.
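Let me spell out, in formulas, the balance condition that this rejection machinery enforces. This is the standard Metropolis-Hastings bookkeeping, with $\pi$ the target distribution and $\mathcal{A}$ the a priori probability of proposing a move:

\[
\pi(a)\,\mathcal{A}(a \to b)\,p_{\mathrm{acc}}(a \to b) \;=\; \pi(b)\,\mathcal{A}(b \to a)\,p_{\mathrm{acc}}(b \to a),
\qquad
p_{\mathrm{acc}}(a \to b) \;=\; \min\!\left(1,\;\frac{\mathcal{A}(b \to a)\,\pi(b)}{\mathcal{A}(a \to b)\,\pi(a)}\right).
\]

For the cluster construction with bond probability $p$, all factors of $\mathcal{A}$ coming from bonds inside the cluster cancel between the forward move and its reverse, and only the boundary of the cluster survives. If the grown cluster has $n_{\mathrm{same}}$ boundary bonds to spins of the same sign and $n_{\mathrm{opp}}$ boundary bonds to spins of the opposite sign, the standard result, which I take to be the formula announced above, is

\[
p_{\mathrm{acc}} \;=\; \min\!\left(1,\;\left[\frac{e^{-2\beta J}}{1-p}\right]^{\,n_{\mathrm{same}} - n_{\mathrm{opp}}}\right),
\]

valid for any $p$, and equal to one for every cluster at the magic value $p = 1 - e^{-2\beta J}$: there, the algorithm is rejection-free.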
So then: it used to be said that having algorithms with no rejections is better than having algorithms with rejections; you can build a whole theory on that, you know what I mean. And somebody else mentioned the famous rule that you should aim for a rejection rate of about one half. Now, if you choose a stupid value of p, a bad value of p, then you'll hit this rejection rate already for very small clusters: you'll have clusters of size one, and you will not move a lot.

Or, for example, let's take a simpler algorithm. Instead of flipping a single spin, why stay with a single spin: let's take a plaquette of four spins and move those four spins together. Or let's take two spins in different parts of the system and move them. The reason these ideas never get anywhere is that you are trying to move bigger and bigger parts of the system with moves that are not adapted to the system. If you try to move larger and larger blocks, the rejection rate becomes so high that the effective size of the part you actually move becomes smaller than if you had just taken a single spin. So here we'll try to move in a smarter way.

When I say one rule, it is just a simple rule: the efficiency of an algorithm is loosely related to its rejections, in the sense that if you have few rejections for small moves, then you can go on to move larger pieces of your system. But this is only a rough guide; I told you about whole classes of algorithms that have zero rejections and are fantastic, and so on. So it is not just this rule about rejections; let's say it is one goal you could have: efficiency is related to having few rejections, so that you can move larger pieces in one time step. At the end of the day, though, everybody ends up with trial and error, right? Everybody has these great ideas, then you run them, and the real test is the comparison with what already exists; that is still, today, the way it goes.

Okay, so I'll continue. Let's take half an hour, and thanks for your attention. It's good that you push this point, because I'm a little bit alone in insisting on it: you must get a valid algorithm for any p, and I'll show it to you, okay? All right, thanks.
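For reference, since a ten-line program of the Wolff cluster algorithm was promised above, here is a minimal sketch at the magic, rejection-free value, in the same hypothetical conventions as before (spins a list of +1/-1, neighbors a table of lattice neighbors):

```python
import math
import random

def wolff_step(spins, neighbors, beta, J=1.0):
    """One cluster flip at the magic value p = 1 - exp(-2*beta*J), where the
    acceptance ratio equals one, so the grown cluster is always flipped."""
    p = 1.0 - math.exp(-2.0 * beta * J)
    seed = random.randrange(len(spins))
    sign = spins[seed]
    cluster, frontier = {seed}, [seed]
    while frontier:
        site = frontier.pop()
        for nbr in neighbors[site]:
            if (spins[nbr] == sign and nbr not in cluster
                    and random.random() < p):
                cluster.add(nbr)
                frontier.append(nbr)
    for site in cluster:            # flip the whole cluster, no rejection
        spins[site] = -sign
    return len(cluster)
```

For any other p, the same growth loop would simply be followed by the accept-or-reject test with the boundary-bond formula given earlier.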