So previously in this class, we've never talked about making absolute predictions. We've used the Boltzmann distribution and the partition function, and we've talked about sampling. Let's stick with that for a while. If I'm sampling instead of making absolute predictions, I can think of an energy landscape with a number of states, and I want to know the population, in particular in the local minima. Based on lecture five, you should now know that the maxima are not that important: they are so narrow that you immediately fall into the minima. You're not going to spend any time at the maxima; they're transition states. And in fact, you've done this already. You've invented a very smart way to do this: in the first hand-in task, you used a method called Monte Carlo sampling. All we need is a method that visits these states in such a way that their weights correspond to the Boltzmann distribution. The Monte Carlo method was first proposed by Nicholas Metropolis in 1949, and it was deemed so important by the US military that they even considered classifying it. What the Metropolis method says is that if I'm at one state and take a jump down in energy, I should always accept that move: it's always good to go downhill. You might then say I should never go uphill. But if I never went uphill, there would just be a drift towards the lowest-lying states, and that I don't want; I want some population in the higher-lying states too. So the other part of the algorithm is that you calculate e^(−ΔE/kT). If the energy difference ΔE is in the same ballpark as kT, this is a number between 0 and 1, and you compare it to a random number drawn uniformly between 0 and 1: if the random number is smaller, you accept the uphill move. I'm simplifying a bit here to avoid spending an entire board going through it, but basically, if the energy difference is small, you have a finite probability of moving upwards.
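The acceptance rule just described can be sketched in a few lines of Python. This is a minimal illustration, not the lab code; the function name and the convention of passing kT directly in the same energy units as ΔE are my own choices:

```python
import math
import random

def metropolis_accept(delta_e, kT):
    """Metropolis criterion: a downhill move (delta_e <= 0) is always
    accepted; an uphill move is accepted with probability exp(-delta_e/kT),
    by comparing against a uniform random number in [0, 1)."""
    if delta_e <= 0:
        return True
    return random.random() < math.exp(-delta_e / kT)
```

For an uphill move of exactly one kT, the acceptance probability is e⁻¹ ≈ 0.37, which is why moves that cost energy on the order of kT are still taken fairly often, while moves costing many kT are almost never accepted.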
And that depends on the random number you draw, so you sometimes go uphill, but mostly not. What you can prove is that this leads to so-called detailed balance: if you have a distribution that fulfills the Boltzmann conditions and you keep sampling this way, the flux over each of these barriers is going to be the same to the left and to the right. That might sound strange, but it just means that I have a much higher population in one state combined with a lower probability of having enough energy to go over the barrier, and a smaller population in the neighbouring state combined with a lower barrier to go over. If the flux over each barrier balances, I'm not changing the population in any of them. Individual atoms might still move from well to well, but I have a stationary distribution: the concentration in each well doesn't change. It's not at all obvious that Monte Carlo sampling with the Metropolis algorithm leads to this, so you're going to have to trust me on it. So that means we should use Monte Carlo sampling like you did in Lab 1? Not so fast. If I try this for a protein, making a large move to test a completely new conformation, in vacuum it actually works: if I'm unlucky, two atoms might bump into each other, but that seems pretty rare. In vacuum, this is going to work great. But in reality this protein has water around it, and there is no vacuum to move into. When I make a large move, my atoms instantly bump into water atoms. Okay, that means I reject the move and go back. I make a second move, it bumps into water too, and I go back again. The problem is that out of, say, a billion moves I might end up accepting one; even one in a million is probably too optimistic. It's going to be exceptionally inefficient. This algorithm will not work for a protein, sadly.
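The stationary-distribution claim can be checked numerically on a toy landscape of discrete states. The sketch below is purely illustrative (the function name, the neighbour-hop proposal, and the energies are my own choices, not the lab setup): a Metropolis walk over a row of states should visit each state in proportion to its Boltzmann weight.

```python
import math
import random

def sample_discrete(energies, kT, n_steps, seed=1):
    """Metropolis walk over a row of discrete states: propose a hop to a
    random neighbour, accept it with the Metropolis rule, and tally how
    often each state is occupied."""
    random.seed(seed)
    state = 0
    counts = [0] * len(energies)
    for _ in range(n_steps):
        trial = state + random.choice((-1, 1))
        if 0 <= trial < len(energies):
            delta_e = energies[trial] - energies[state]
            if delta_e <= 0 or random.random() < math.exp(-delta_e / kT):
                state = trial
        counts[state] += 1   # rejected moves count as staying put
    return counts
```

With two states separated by ΔE = 1 kT, the ratio of visits converges to e⁻¹ ≈ 0.37, exactly the Boltzmann population ratio: the fluxes over the barrier balance, and the occupancies stop changing.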
So it seems we have to go back to predicting structures after all, predicting individual motions. Or maybe we can combine the two, because all I said is that for this to work, I need some algorithm that generates conformations fulfilling the Boltzmann distribution. And it turns out that Newton's equations of motion will fulfill the Boltzmann distribution. Let me formulate that in a slightly more mathematical way. Newton's equations of motion have the drawback that each motion is going to be much smaller, but while each motion is smaller, I will accept virtually all of them; in fact, I always accept them. My protein atoms will gradually push the surrounding atoms out of the way. Written slightly more formally for a large system: the mass m_i of atom i, multiplied by the second time derivative of its position vector r_i (the x, y, z coordinates of that atom), is the force on it. And that force is minus the gradient, taken with respect to those atomic coordinates, of the potential V, which is a function of all atoms in the system: m_i d²r_i/dt² = F_i = −∇_i V(r_1, …, r_N), where i runs from 1 to N, and N could be a quarter of a million or so. This is pretty much all we need. It's not necessarily going to be fast, since I have to take very small time steps, but it guarantees that I sample this landscape correctly. It gets me something else too. One drawback of the Monte Carlo algorithm is that, sure, it gives me the relative populations in the valleys, but it says nothing about kinetics: I get no information about the actual time it takes to go over these barriers. Newton's equations of motion, on the other hand, do. So we use this, but the goal is not to exactly predict the path of an individual atom. I can't do that, and I'm not interested in it.
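A standard way to integrate these equations of motion numerically is the velocity Verlet scheme, which is what most molecular dynamics codes use. Here is a minimal one-dimensional sketch (the function name and the harmonic test force are illustrative choices, not code from this course):

```python
import math

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate m * d2x/dt2 = force(x) with the velocity Verlet scheme."""
    f = force(x)
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * (f / mass) * dt * dt   # position update
        f_new = force(x)                              # force at new position
        v = v + 0.5 * (f + f_new) / mass * dt         # velocity update
        f = f_new
    return x, v

# Illustrative check: a harmonic force -k*x with m = k = 1 has period 2*pi,
# so integrating for one full period should return close to the start,
# with the total energy conserved.
x1, v1 = velocity_verlet(1.0, 0.0, lambda x: -x, mass=1.0,
                         dt=0.001, n_steps=int(2 * math.pi / 0.001))
```

The small time step is the price mentioned above: in real simulations dt is on the order of femtoseconds, set by the fastest vibrations in the system, which is why sampling this way is correct but not necessarily fast.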
I would rather accept the following: on some timescale, let's assume there is some absolutely correct atomic path, which I could know if I were psychic. I know I won't integrate exactly that path. The only question is: if I start close to it, will I in general integrate trajectories that stay reasonably close? And the answer is yes, over reasonable time frames and with reasonable accuracy. Eventually some of these paths will start to diverge, but the vast majority of the atoms will come in and follow the true track for a while. There is even a concept here called shadow trajectories, but now we're getting into very advanced MD, and Birkes has a complete separate class on that, which I suggest you take if you're interested. What this gives me is a focus on sampling: I don't worry about an individual atom, but in general I will have good sampling, and I will also have a good estimate of the kinetics of the motions, even though for an individual atom I might occasionally get it wrong. In the limit of many atoms and good statistics, this is going to be awesome.