Good, so let me start by just briefly introducing what molecular dynamics is all about. So first of all, let me put my title, Molecular Dynamics, and my name, just in case you forgot it, Sandro Scandolo. By the way, feel free to send me emails if you need to contact me: it's just my last name at ictp.it, simple as that. I'm also on Facebook, in case you want to reach me quicker than email. All right, so molecular dynamics. Molecular dynamics essentially boils down to trying to solve the problem of the dynamics of, for example, a simple molecule like this one. This is ethane. I'm using ethane just as an example; it doesn't need to be this. So you have a molecule, and you want to study the dynamics of a molecule like this one in real time. You want to study the vibrations, how this molecule moves, and so what you want to do is, for each atom in this system, solve Newton's equation: force equals mass times acceleration. Now, of course, there are a number of considerations to be made at this point. If I take this as atom i (I can label the atoms with different numbers, 1, 2, 3, and so on) and I look at the force acting on atom i, which is the mass of atom i times the acceleration of atom i, there are a number of considerations I already have to make. Suppose the force is like this, for example; actually I have colored chalk, so I can use colors here: f_i. The first thing I have to confess is that I completely forgot about quantum mechanics, at least the quantum mechanics of the nuclei that compose this molecule, because when I say that the dynamics of this hydrogen atom is determined by Newton's equation, I'm already saying that quantum effects on that proton are negligible. Now, this, unfortunately, is not the case, particularly for hydrogen, particularly for light elements, as you probably know.
And so, in fact, in studying the dynamics of a system like this, I'm deliberately using a system containing protons, just to make you aware of the problems that molecular dynamics sometimes has. The stretching of the C-H bond has a frequency of about 3,000 wave numbers, which corresponds to something of the order of 0.4 electron volts. So h-bar omega for this mode is 0.4 electron volts, which is much higher than k_B T at standard conditions. So for all practical purposes, this stretching mode is an entirely quantum vibration, right? This is just to give you an example of where molecular dynamics might fail. Molecular dynamics, at least in its simplest formulation, is about the classical dynamics of the particles, all right? So it's already clear that for some vibrations, molecular dynamics, as we write it in this form, is not going to be valid. But people still use it for a number of practical purposes. There is another consideration I would like to make, and I'll come back to it extensively later on. When I write this, acceleration is an easy and obvious concept: how this particle accelerates. We're going to see how to determine this acceleration, and from it how this nucleus moves. But we shouldn't forget that in order to determine the acceleration, we need to know the force. And the force is minus the derivative, with respect to the position of that particle, of some potential of interaction between the particles in the system. I'll come back to that extensively later on, OK? But I just wanted to clarify that there are actually two ingredients when one does molecular dynamics.
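To put rough numbers on this argument (the physical constants below are standard values, not from the lecture), one can compare h-bar omega for a 3,000 wave number mode with k_B T at room temperature:

```python
# Compare the C-H stretch quantum (~3,000 cm^-1, as quoted above)
# with the thermal energy k_B * T at room temperature.
HC_EV_CM = 1.23984e-4    # h*c in eV*cm, so E[eV] = h*c * wavenumber[cm^-1]
KB_EV_K = 8.61733e-5     # Boltzmann constant in eV/K

hbar_omega = HC_EV_CM * 3000.0   # ~0.37 eV, the "0.4 eV" of the lecture
kT_room = KB_EV_K * 300.0        # ~0.026 eV at T = 300 K

print(hbar_omega, kT_room, hbar_omega / kT_room)   # ratio ~ 14
```

So h-bar omega exceeds k_B T by more than an order of magnitude, which is why this stretching mode is essentially frozen in its quantum ground state at room temperature.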
One is the solution of this equation, which is a second order differential equation in time, with the acceleration and the force. The other problem is determining the force acting on that particular particle. And that's an entirely different problem; in most cases, much more complicated than solving the second order differential equation, Newton's equations, with respect to time. All right, so this is really the basics of molecular dynamics. In fact, the name molecular dynamics is historical: people started molecular dynamics by studying molecular systems. But you can, of course, extend molecular dynamics also to atomic systems. You can study, I don't know, liquid metals, for example. Sodium. Sodium is monatomic, of course, and we still call it molecular dynamics, even if it's not about molecules, it's about atoms. OK, so we call it MD. But in fact, it's just a historical name that remains; more appropriately, we should call it atomic dynamics. Actually, even more appropriately, we should call it nuclear dynamics, because what really moves in our picture here is the nuclei. The electrons just follow the nuclei in an adiabatic way, and I'll come back to that later on. OK, so first problem now. We've seen what basic molecular dynamics is. This is, of course, a school on condensed matter physics, so we don't just want to limit ourselves to molecules. We want to study large systems, 10 to the 23rd particles, say. We want to study systems that are large, and we don't want to confine ourselves to a small number of particles. And therefore, you immediately see that if you want to solve this problem for 10 to the 23rd particles, not even the fastest and most powerful computer nowadays will allow you to do that.
So you have to confine your system. You have to carve out of these 10 to the 23rd particles that you would like to study, in principle, a much smaller system that allows you to extract interesting information about that particular system. And of course, the size of the system that you carve out will depend on the kind of properties you're interested in. If you want to study a simple chemical reaction, for example, in a fluid, in a solvent, in water, for example, well, it's enough to carve out of your big solvent at least the molecules that are involved in the chemical reaction, plus a sufficiently large number of molecules surrounding the reaction, if you want to consider also the effect of the solvent. So we're talking about 100, 200, 300 molecules or atoms, of the order of a few hundred. But there are, of course, properties that require much larger systems to be understood. Think of extended defects in solids. This is a school about condensed matter physics, so I'm sure you're aware that dislocations are extremely important in terms of mechanical properties of solids. Dislocations are extended objects. They can extend sometimes even to macroscopic distances, microns, if not millimeters. So in principle, if you want to study a dislocation, you would have to carve out of your solid a portion which contains millions, billions, or whatever number of particles. So the size of the system you're going to focus on, you would like to make it as small as possible, because you want to be able to reproduce it in a computer. But at the same time, the size of the system has to be compatible with the kind of problem you want to study. Now, that brings me, of course, to the question of how you're going to truncate your macroscopic system. So you have a big macroscopic system containing 10 to the 23rd molecules.
And you're trying to carve out of this system a small subsystem which contains 100,000, a million, a billion, whatever, but definitely much smaller than the full system. Now, the big problem you're going to be facing is how to truncate. Choosing the number of particles is not difficult. The problem, of course, is what to do at the boundaries of this subset of particles that you are choosing. The most obvious choice, if you think about it, is just to have open boundaries. You carve your subset out of your system and you just have open boundaries. So you remove completely the other 10 to the 23rd minus one million particles, and you're left with a sort of cluster, standing alone in vacuum, with some surfaces, of course. This works as long as the system is large enough that whatever happens in the middle of your cluster is not affected by the properties of the surface, right? And of course, you have a much smaller number of particles at the surface than in the bulk, if your system is large enough. Then your simulation is safe, in the sense that the way you cut the surface is irrelevant as far as the properties of the bulk of this cluster are concerned, okay? Of course, there's another possibility, which is actually much more popular than this one, and most of you are probably already familiar with it. The idea is to say: well, instead of an arbitrary shape, let me take a shape, for example a cube, which I can think of replicating periodically. So instead of removing the other 10 to the 23rd minus one million particles, let me say that I have a million particles here, and then outside, instead of having vacuum, I have the same cube repeated periodically. So I have another cube here, which behaves in an identical fashion to the central one.
I have another one here, which behaves identically, and so on and so forth. So I fill space in a periodic fashion. Now, of course, this is also an artifact, because the real system is not a periodic system. At least it's not periodic in this way. It can be periodic because we have crystals, but that is a completely different kind of periodicity. What we are introducing here is fictitious, an artifact, a periodicity that has nothing to do with the real system. It's just an artifact of our computational approach: we want to limit the size of the system, and so we introduce this periodicity, okay? Now, there are advantages and disadvantages in the two approaches. Let me just list them. The cluster approach requires you to determine interactions only between the particles that are inside your cluster, because there's nothing outside. The periodic approach requires you to determine the interaction between all the particles that are inside the cube and all the particles that are outside the cube. It's true that they behave in the same way, so when I look at the dynamics, I only have to follow the particles inside the cube, but I have to determine their interaction with all the other ones. I cannot forget that there are other atoms outside, even though they behave in exactly the same way in the periodic cells. But probably the most obvious advantage of the periodic approach is that I don't have a single particle sitting at a boundary, right? All particles in this cube behave as if they were in the bulk of the system. The only approximation I'm introducing is that if I sit on a given particle, there will be a particle sitting at a distance corresponding to the size of the cube which behaves exactly the same way as myself, right? But that particle sits far away from me if my box is sufficiently large.
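In practice, with a cubic repeated cell and short-ranged forces, the interaction with all the periodic copies is usually handled through the minimum image convention: each pair of particles interacts only through the nearest of the periodic images. A minimal sketch (the function name and the numbers are just for illustration, not from the lecture):

```python
def minimum_image(dr, box):
    # Fold each component of a raw displacement into [-box/2, box/2):
    # this picks the nearest periodic image of the other particle.
    return [d - box * round(d / box) for d in dr]

box = 10.0                                     # edge length of the cubic cell
print(minimum_image([9.0, -9.5, 0.3], box))    # nearest image: [-1.0, 0.5, 0.3]
```

A particle 9 box-units away to the right is really only 1 unit away to the left, through the neighboring copy of the cell.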
If the physical interaction between myself and that particle is negligible, I can assume that I behave essentially as if I were in the bulk. And all the particles inside this cube have this property. If you take the open-boundary approach instead, only the particles inside the inner region will behave like bulk. All the ones sitting at the surface will behave as if they were at the surface, of course, which is definitely not bulk-like behavior. Now, take a medium-sized system, something like 1,000 particles, for example. Suppose you want to do a simulation with 1,000 particles, and take a cluster of 1,000 particles. Let's approximate it with a cube, okay? Just for simplicity. So we have a cube with 1,000 particles. How many particles out of 1,000 in a cube will be sitting at the surface of this cube? Can you guess? 600, a bit less than that, right? You have six faces, and each face has 100 particles, so approximately 600, minus, of course, the shared edges and corners. So it's actually 488, since only the 8 by 8 by 8 interior particles are not at the surface. So nearly half of the particles of a cube of 1,000 particles are sitting at the surface. So in order to converge the cluster approach, you actually need many more than 1,000 particles: to have a large fraction of your particles behaving like bulk, you need much more than 1,000. So if in your simulation you're going to use of the order of 1,000 or fewer particles, clearly the periodic approach is much more effective, because many more of those 1,000 particles will behave bulk-like. And again, if you think of a cube of size 10, each particle will have a particle behaving exactly the same way only 10 atomic distances from itself. And 10 atomic distances is a pretty large number.
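The surface count above is easy to verify: in an n by n by n cube, only the (n minus 2) cubed interior particles are not on a face. A quick check:

```python
def surface_count(n):
    # Particles on the surface of an n x n x n cube:
    # the total minus the (n-2)^3 interior ones.
    return n**3 - max(n - 2, 0)**3

print(surface_count(10))          # 488 of the 1,000 particles sit on the surface
print(surface_count(10) / 1000)   # just under half
```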
It's sufficient to say that there is no physical interaction between those two particles. So for medium-sized systems, this approach, which we call periodic boundary conditions, as opposed to open boundary conditions, is definitely much more effective. For larger systems, it's open to discussion. I would say 90% of the research in molecular dynamics, in condensed matter physics at least, is actually done with periodic boundary conditions, if you want to study condensed matter systems. Okay, so let me now come to the problem of solving the second order differential equation, Newton's equation. For the time being, I'm not going to discuss the problem of determining the force. I assume that I have a black box, a subroutine, that provides me with the force, okay? If I tell the subroutine where the positions of the atoms are, it gives me the force, okay? So I don't have to worry about that; I'll come back to it in a moment. But let me now discuss how we solve the second order differential equation: mass times acceleration equals force. So let me assume we use periodic boundary conditions. We put everything in a box. We have particle i here, and there is a force acting on particle i. My goal is now to solve this problem: mass times the second derivative of the position of that particle equals the force. Let me be a little bit more explicit here. This force, which is the derivative of the potential, will depend on the positions of all the other particles, right? So it will depend on the positions of particle one, two, up to the total number of particles in my system. Maybe obvious, but it's worth remembering that this force will change as soon as I move the atoms, because it is a function of all the particles in my system. I'm assuming I have n particles, so my problem is now to solve this equation for the n particles.
It's a second order differential equation. It's coupled, because I have r_i here and I have all the possible r's on the other side. Okay, but it's not that difficult, it's just second order. So in order to solve this equation, what I'm going to do is discretize time. There are other ways to solve it, but 99% of molecular dynamics calculations, actually 99.9%, are done by discretizing time and solving the second order differential equation as a sequence of time steps. Okay, so essentially, if I am at a given time t, I can expand the position of particle i at time t plus delta t. Suppose I know everything about the system at time t; my goal is to determine what the system will look like at time t plus delta t, at the next step. So there will be r_i of t, plus the velocity of the particle at time t times delta t, plus the acceleration at time t times delta t squared over two, plus terms of higher order, terms of order delta t cubed and higher. This is just a Taylor expansion. The first terms I know well: they are the velocity and the acceleration, and the higher ones are higher derivatives of the position with respect to time. So in principle, if I know the position, the velocity, and the acceleration at a given time, I can estimate the value of the position at the next step with an accuracy of order delta t cubed. And notice that I essentially know all these quantities; I certainly know the position at time t.
Velocity: well, there are different approximations, and I don't want to enter into the details, but if you know the velocity at the previous step, at time t minus delta t, you can extrapolate the velocity at time t as well. And the acceleration is obviously the force divided by the mass, so that's something you know from the positions at time t. So all three quantities are more or less known; velocity, to be discussed, but anyway. Certainly, you don't know anything about the higher orders. So at least to second order in delta t, you can extrapolate the value of the position at time t plus delta t. Now, this is a very crude approximation. If you try to do it this way, you will either be forced to use an extremely small delta t, such that the higher-order terms are completely negligible, or you will be forced to try and estimate the higher-order terms as well. So this is not going to work. In fact, there are much more sophisticated ways to extrapolate the position at time t plus delta t, and let me just give you an example of one possible way to go to higher orders in delta t in this extrapolation. To do that, let me write the expansion for r_i of t minus delta t. You might wonder why I'm calculating this; well, you'll see in a minute. This is r_i of t, minus the velocity at time t times delta t, plus the acceleration term again with a plus sign, and then a term of order delta t cubed whose leading part is exactly the same as before but with the opposite sign. Of course the higher terms will differ: the even powers keep the same sign and the odd powers flip sign. But for the leading one, the delta t cubed term, I can certainly say it is the same as before with the opposite sign. Let me now sum these two expansions.
Actually, let me write it in a slightly better way immediately, solving for r_i of t plus delta t. I sum the left-hand sides and the right-hand sides of the two equations, and I bring r_i of t minus delta t to the right-hand side. So I get: r_i of t plus delta t equals minus r_i of t minus delta t, plus twice r_i of t. The v and minus v terms cancel. The acceleration term remains with a factor of 2, times delta t squared over 2, so the factor of 2 cancels and I'm left with the acceleration at time t times delta t squared. Now what about the higher terms? As I just said, the delta t cubed term is exactly the same in the two expansions with opposite sign, so plus and minus cancel. The first surviving correction is the one of order delta t to the fourth. So what remains is: r_i of t plus delta t equals 2 r_i of t, minus r_i of t minus delta t, plus a_i of t times delta t squared, plus something of order delta t to the fourth. And with this trick, I got rid of two problems at the same time. First, I got rid of the velocity, which, as I said before, I'm not even sure how to calculate at a given point without introducing the position at the previous time. Second, and very important, I've increased by one power of delta t the accuracy of the estimate. And notice I'm just using the position at the previous step and two steps before, plus the acceleration at time t, which I know if I know how to calculate the forces: it's the force at time t divided by the mass of the particle, something I can calculate. So all three terms I know very well; I just need to keep in memory the position at time t minus delta t. This algorithm is called the Verlet algorithm. It's a very stable algorithm; it was invented by Verlet long ago. It turns out to be extremely stable, actually even more stable than the delta t to the fourth error would let you guess.
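The update just derived, r(t + dt) = 2 r(t) - r(t - dt) + a(t) dt^2, is easy to test on a case with a known answer. Here is a sketch on a unit-frequency harmonic oscillator (x'' = -x, exact solution cos t), with the two starting positions seeded from the exact solution; the step size is just an illustrative choice:

```python
import math

def verlet_step(x_t, x_prev, a_t, dt):
    # Position Verlet: r(t+dt) = 2 r(t) - r(t-dt) + a(t) dt^2, error O(dt^4).
    return 2.0 * x_t - x_prev + a_t * dt * dt

dt = 2.0 * math.pi / 100         # 100 slices per oscillation period
x_prev, x = math.cos(-dt), 1.0   # the two positions the algorithm needs: x(-dt), x(0)
for _ in range(100):             # evolve for one full period
    x, x_prev = verlet_step(x, x_prev, -x, dt), x

print(x)   # back close to cos(2*pi) = 1, and no velocities appear anywhere
```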
And it's very simple, because it only requires you to keep in memory the positions at two previous times, one step before and two steps before, and to be able, of course, to calculate the force. That's something you always have to be able to do if you want to evolve the dynamics through this second order differential equation. Now, I'm not going to go into better approximations; let me just give you a flavor of how you can devise better approximations than this one. You might have noticed that in order to gain a factor of delta t here, I had to introduce t minus delta t. It's easy to show, though I'm not going to do it, that if you want to gain another factor of delta t, you just have to introduce t minus 2 delta t. So there is an expression, which I'm not going to write down on the blackboard, in which if you also keep in memory the position at t minus 2 delta t, so three sets of positions instead of two, you'll be able to kill an additional power of delta t, increasing the accuracy to delta t to the fifth, and so on and so forth. In fact, if you think about it, this is a Taylor expansion: the more you keep in memory about the past, the better your approximation of the future will be. The only tiny bit of new information you need is of course the second derivative, which appears only in the acceleration term, times delta t squared. All the rest is essentially extrapolation, a better and better extrapolation of your history, just like a Taylor expansion. But this one is already quite good as an approximation. And it really depends on the kind of simulation you want to do. If you really want to do an accurate simulation where everything is properly conserved, and I'll talk about conservation in a minute, you might wish to go to higher order integrators. In most cases, again, 90% of the molecular dynamics simulations you'll see in the literature typically use a Verlet or velocity Verlet algorithm.
These are very similar algorithms: they're all based on keeping in memory two sets of positions, with velocities derived from them, and they're good up to this accuracy. Now, of course, the question remains: how do I choose delta t? This is still an expansion with an error of order delta t to the fourth, so I have to be careful that delta t isn't too large, because if it's too large, my trajectory will immediately depart from the truth, from the real, continuous integration. So let me briefly discuss the choice of delta t. Suppose this axis is time and this is one coordinate. I'm choosing just one; there are, of course, 3n such coordinates in my system. So when I say r_i, I mean 3n numbers, the three Cartesian coordinates of n particles. Let me just pick one, and suppose that I know the solution, that I know how this function behaves. Well, in a molecular system there will of course be vibrations, so this position might just fluctuate according to the vibrations. In a fluid, this position might fluctuate but also drift, because in a fluid the particle moves away from its initial point. Anyway, it's going to be something like that. And of course, the physics is what determines the typical oscillation times, the typical timescales of the problem. For example, this oscillation here might be a molecular vibration. If this were fluid ethane or fluid water, for example, if this were the coordinate of a proton in water, this could be the oscillation of the proton due to the stretching mode inside the water molecule. And then superimposed on that, there would also be a translation of the entire molecule, some sort of fluctuation with a much larger amplitude. But let me just focus on this one, just to give you an example.
So now the question is: you know, roughly, the physical time scale of this. Here the time scale is the molecular vibration; each problem will have its own time scale. If you're dealing with molecules, this will be the period of a molecular vibration, so we're talking about femtoseconds: 10, 20, 50, 100 femtoseconds, depending on the system you're dealing with. Actually, it's never a single femtosecond; molecular vibration periods are at least 10, 20 femtoseconds. But anyway, this is just to give you an order of magnitude of what this time scale could be. And you need to be able to integrate this equation. The way we've done it is by saying: okay, let's start at time t. I know the value of r_i at time t, and I want to extrapolate the value at time t plus delta t. My goal now is: how am I going to choose delta t? Of course, the larger delta t is, the better, in terms of the number of time steps I will have to evolve my system for in order to describe it for a given physical time. There are actually two time scales in a molecular dynamics simulation, typically. One is the time scale of the internal vibrations, the fastest time scale at which particles move. And then there is another important time that you have to know, or at least be able to guess at some point: the characteristic duration of the simulation you have to carry out in order to extract the appropriate information about your problem. If your problem is to extract, for example, the frequency of vibration of this molecule, this simulation is probably enough to come out with a reasonable guess of the molecular vibration frequency.
You just have to run your system for two or three vibrations, measure the period, and you'll be able to come out with a reasonable estimate of the frequency of vibration. But if your problem is, for example, to determine the diffusion coefficient of water in the fluid, then of course you have to run your system for a much longer time in order to extract statistical information about the diffusion of particles in your system. That can be 100 times this period of vibration, if not 1,000 times. So the physical problem you're interested in determines the time scale, the total duration of your simulation. Let me call tau the period of vibration; then there will also be a big T, the total duration of your simulation, which is also determined by the problem you're interested in. And typically the total duration is much larger than the period of oscillation of your system. So assuming you more or less have a guess of the total duration you want to achieve, having a longer delta t will definitely help you, because the number of steps you will have to calculate will be equal to the total duration T divided by delta t. You want to minimize this number of steps. So the longer delta t, the better: fewer steps, fewer calculations of the force. Computationally, this is going to be more efficient. At the same time, of course, you don't want to make delta t too large, because clearly, if you try to extrapolate with a delta t that big, you're definitely not going to be able to use a Taylor expansion up to fourth order. That is not going to work. So there is a threshold, and that threshold depends on the fastest degrees of freedom of your system, because those are the ones you have to be able to integrate using, for example, your Verlet integrator.
So what determines delta t is actually the fastest vibration of your system, the fastest time scale of your system. You have to be able to integrate, with your Verlet integrator, the fastest possible degree of freedom of your system. Clearly, a delta t comparable to tau is not going to work, because it's not going to give you a proper extrapolation of, say, a sinusoidal curve. Now, in fact, people have studied this extensively, and if you take a simple sinusoidal curve and try to integrate a simple oscillator using the Verlet algorithm, it turns out that there is a sort of magic number you might wish to use: delta t has to be of the order of tau divided by 30. Some people use 25, some 40, some 50. You can immediately see that this is a reasonable estimate of the longest acceptable delta t if you want to integrate a sinusoidal oscillation of period tau: you discretize your sinusoidal oscillation with roughly 30 or 40 slices, depending on how accurate you want your simulation to be. And the Verlet algorithm is typically quite robust with this choice of delta t. Which means that if you know this time scale, and hence delta t, you can now also estimate the number of steps. Unless your problem is extremely simple, like determining the frequency of oscillation of a molecule, which in principle requires only 100, 200 steps, for typical problems in molecular dynamics you'll immediately see that the number of steps has to be at least of the order of 10,000, 100,000, if not a million. So a standard molecular dynamics simulation requires you to evolve your Verlet algorithm, and therefore to calculate the force on each particle in your system, 10,000, 100,000, a million times in order to complete a molecular dynamics simulation.
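Putting the two time scales together gives a quick back-of-the-envelope step count. The numbers below are hypothetical, just to illustrate the arithmetic: a 10 femtosecond fastest vibration and a 100 picosecond target trajectory.

```python
tau = 10e-15            # fastest vibration period: ~10 fs (hypothetical)
dt = tau / 30.0         # the tau/30 rule of thumb from the lecture
T_total = 100e-12       # desired physical duration: 100 ps (hypothetical)

n_steps = T_total / dt  # number of Verlet steps = number of force evaluations
print(dt, n_steps)      # dt ~ 0.33 fs, about 3e5 steps
```

So even this modest target already costs hundreds of thousands of force evaluations, which is why the cost of the force calculation dominates the simulation.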
How do I know whether my delta t is good or bad? Well, in principle, it would be trivial if you knew the real trajectory. But you don't know the real trajectory, right? You're only going to approximate it with your delta t. So whether this or that choice of delta t is correct, you need something else, something external, to tell you whether you're doing a good job. And this something else is actually extremely simple: it's the total energy of your system. In a simulation like this, where you're evolving your particles simply with Newton's equations, there is an obvious integral of motion, which is the total energy of your system, kinetic energy plus potential energy. So there will be a quantity, which I will call the total energy, which is the sum over i of one half m_i v_i squared, plus V of r_1, r_2, up to r_n. And this V is the potential that gave me the force. Of course, I'm assuming the system is conservative, okay? No magnetic moments; it's a standard conservative system. I know that if this is the case, and if I'm evolving the system using these Newton's equations, then the sum of the kinetic energy and the potential energy, the total energy, must be conserved: it must be a constant throughout the trajectory. And any error in integrating the equations I was writing here will show up as a non-conservation of this number, the total energy of the system. So the total energy is an extremely important indicator that tells you whether you're doing a good job or a bad job with your delta t, whether the choice of delta t was a good one or a bad one, whether it was too long. If it's too short, this will also tell you, because you will see that the total energy is conserved down to the 20th digit from time step to time step. That is too much.
I mean, you don't want to be so precise. You want to extract physical information, and the information you extract will have some error anyway, due to the choice of the potential, due to all the approximations you made: you have been studying a quantum system with a classical model, so there will be a number of approximations. So there's no reason to conserve the total energy down to the 30th significant digit. But typically you still want to have this conservation up to the fourth, fifth, sixth digit. It really depends on who you ask at this point, but four or five significant digits are typically OK throughout your molecular dynamics simulation. If it's worse than that, if the second or third significant digit, or even more, changes from time step to time step, then of course you're in trouble, and that means you have to reduce your time step, make your delta t shorter, in order to allow your Verlet integrator to give you a more precise estimate of the positions at the different time steps. Of course, I didn't write it explicitly here, but what I meant is that these quantities are all time dependent. So when I say that this is a constant, I'm saying that it is a constant in spite of the fact that velocities and positions are changing with time throughout my molecular dynamics calculation. So the total energy is an extremely good indicator of the accuracy of the calculation. It's also a good way to determine the best possible delta t, in case you have difficulty extracting it from the fastest vibration of your system. All right, so what is the standard protocol now of a molecular dynamics simulation? You have to start from something. And in classical mechanics, if you want to determine the initial state of a system in statistical terms, what you have to provide is the positions at time 0 and the velocities at time 0.
So once you give me this list of 3N plus 3N numbers, I can start a molecular dynamics simulation properly, because I have all the relevant information. Now, if you remember the way I described the Verlet algorithm, the Verlet algorithm doesn't work with velocities. You can extract velocities a posteriori, but the Verlet algorithm never has velocities in it: it always deals with positions at time t and positions at time t minus delta t or t plus delta t. Of course, a velocity is the difference between positions at different time steps divided by delta t. So typically, if you start a calculation with the Verlet algorithm, instead of using this initial set of conditions, you use positions at time 0 and positions at time minus delta t. You give two sets of positions, at time 0 and at time minus delta t, and that allows you to start constructing the positions at time delta t, two delta t, and so on and so forth. Now, of course, these two sets of initial conditions are essentially equivalent, because you can extract the velocity from the difference between the two positions. Not exactly, though: if you want to be precise, the difference between these two positions divided by delta t gives you a best estimate for the velocity at time minus delta t over 2. But we shouldn't really be worried about this subtlety in the choice of the initial conditions, because we will then evolve the system and, as I will show in a minute, you will have to equilibrate your system anyway. So the choice of the initial conditions is rarely important to the point that you need to be careful about the distinction between the velocity at time 0 and the velocity at time minus delta t over 2.
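In practice, then, you convert the physicist's initial condition (r(0), v(0)) into the pair of position sets that Verlet wants. A minimal sketch (my own illustration; the function and variable names are just for this example) using the backward Taylor step r(-dt) = r(0) - v(0) dt + 1/2 a(0) dt^2, which is consistent with the accuracy of the Verlet algorithm itself:

```python
import numpy as np

def initial_positions_for_verlet(r0, v0, forces, masses, dt):
    """Convert (positions, velocities) at t = 0 into the two position sets
    the Verlet recursion needs: r(0) and r(-dt), via a second-order
    Taylor step backwards in time."""
    a0 = forces / masses[:, None]              # accelerations at t = 0
    r_minus = r0 - v0 * dt + 0.5 * a0 * dt**2
    return r0, r_minus

# toy example: two free particles (zero force), so r(-dt) = r0 - v0 dt
r0 = np.zeros((2, 3))
v0 = np.array([[1.0, 0.0, 0.0], [0.0, -1.0, 0.0]])
masses = np.ones(2)
forces = np.zeros((2, 3))
_, r_minus = initial_positions_for_verlet(r0, v0, forces, masses, dt=0.01)
print(r_minus)
```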
So typically, instead of using positions and velocities, at least if you use Verlet, your initial set of conditions is the positions at time 0 and at time minus delta t. Of course, from a physical point of view, you may wish to start working with a system which is already at a given temperature. I'll come to that in a second. So suppose you want to study a fluid. You start by placing your atoms in some places, randomly, perhaps; if you study a fluid, you don't know exactly how to place them. If you knew the best way to place the atoms in a fluid, you wouldn't have to do molecular dynamics: it is precisely your goal to determine the typical set of configurations of a liquid. So your starting configuration will be something you'll have to guess in some way; perhaps you can put particles in random positions inside your box. What about velocities? Or, equivalently, what about the choice of the positions at time minus delta t? Well, if you want to study a system at a given temperature, you know from statistical physics that each degree of freedom carries, on average, one half of kB T of kinetic energy. Here there are 3N degrees of freedom, so the average total kinetic energy, the sum over i of one half m_i v_i squared, equals 3N over 2 times kB T, which means kB T equals 1 over 3N times the sum over i of m_i v_i squared.
So kB T is given by the average of the squares of the velocities, weighted by the masses, just from statistical physics. So if you want to study a system at a given temperature, suppose you want to study fluid water at 300 Kelvin, for example, how would you choose the velocities at the beginning? Well, I would choose them as close as possible to the velocities the system would have at equilibrium at 300 Kelvin. So I pick them from a standard Boltzmann distribution at that temperature: I choose the velocities as random variables out of a Boltzmann distribution, in such a way that this sum corresponds to 300 Kelvin. That's actually the most efficient way to start a molecular dynamics simulation at a given temperature. Now, let me digress very briefly. In statistical physics, a system is defined by the set of positions and by the set of velocities. Now, for the velocities we are very fortunate: we know exactly how velocities are distributed in classical statistical physics. It's the Boltzmann distribution. For the positions, however, the way particles are distributed in phase space is also determined by an exponential, but that exponential contains the potential, because the partition function contains the potential in the exponent. So it would be impossible for us to construct, immediately, a configuration that samples this distribution precisely. In fact, it is precisely the goal of molecular dynamics to use dynamics to construct configurations that satisfy this statistical property of the distributions. If we had a way to come up with configurations that satisfy this distribution automatically, we wouldn't need to do molecular dynamics. In fact, Monte Carlo addresses the same problem: Monte Carlo is precisely used to sample configurations out of this Boltzmann distribution. It's not an easy task.
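The velocity initialization just described is easy to write down explicitly. Here is a minimal sketch (my own illustration, in reduced units with kB = 1; the function names are just for this example) that draws each velocity component from the appropriate Gaussian and then checks the resulting instantaneous temperature via equipartition:

```python
import numpy as np

def boltzmann_velocities(masses, T, kB=1.0, seed=None):
    """Draw initial velocities from the Maxwell-Boltzmann distribution:
    each Cartesian component of particle i is Gaussian with variance
    kB*T / m_i. The net (centre-of-mass) drift is removed afterwards."""
    rng = np.random.default_rng(seed)
    sigma = np.sqrt(kB * T / masses)[:, None]                 # shape (N, 1)
    v = rng.normal(0.0, 1.0, size=(len(masses), 3)) * sigma
    v -= (masses[:, None] * v).sum(axis=0) / masses.sum()     # remove drift
    return v

def instantaneous_temperature(masses, v, kB=1.0):
    """kB T_inst = (1 / 3N) sum_i m_i v_i^2  (equipartition)."""
    return (masses[:, None] * v**2).sum() / (3.0 * len(masses) * kB)

masses = np.ones(5000)
v = boltzmann_velocities(masses, T=300.0, seed=0)
print(f"T_inst = {instantaneous_temperature(masses, v):.1f}")
```

For 5000 particles the sampled temperature lands within about one percent of the 300 Kelvin target, with relative fluctuations of order one over the square root of N, exactly as discussed below.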
And of course, you'll see plenty of examples of Monte Carlo later on. Molecular dynamics is another approach to the same problem: you want to sample a statistical distribution for the positions, which is determined by the partition function written in this way, and you use molecular dynamics to generate configurations that sample this distribution at a given temperature T. So molecular dynamics and Monte Carlo are essentially complementary in some way. They are both classical, of course: classical molecular dynamics and classical Monte Carlo. Both methodologies allow you to sample a large configuration space according to the distribution determined by statistical mechanics. End of the digression. So let me go back to the problem I have. I want to start with a configuration which I don't yet know, because I don't yet know exactly how to sample this distribution, so I choose positions by some guess. Of course, if I have molecules, I at least try to place the atoms already arranged as molecular species, so that the molecules don't have to assemble themselves along the way. And then I take velocities out of a Boltzmann distribution determined by the temperature, because that I know, that I can generate very easily. Then I start evolving my system, and I can start looking at some quantities. So this axis is time. These 3N particles will be moving, and we'll be accumulating data: with Verlet, you store only the 3N positions at the different time steps. And you can start looking at quantities. For example, you may wish to start with the total energy. The total energy has to be flat; if it's not flat, there's a mistake. So this is the first thing you look at during your molecular dynamics: it has to be flat down to the fourth, fifth, sixth, or seventh significant figure of the total energy. Suppose this is the case.
Then delta T was chosen properly, and I can trust my simulation. Let me look at something else. Well, one of the first things I typically look at is the temperature. When I said that the temperature is determined in this way, what I had in mind was an average over the velocities. But at each molecular dynamics step I can evaluate this quantity instantaneously. So let me now make it time dependent: kB T(t), where T(t) is the instantaneous temperature at time t, computed from the velocities at that time step. Now, if I did my job correctly at time 0 and I chose my velocities out of a real Boltzmann distribution, this sum will be exactly 300 Kelvin if I want to simulate my system at 300 Kelvin. So if I now also plot the instantaneous temperature as a function of t, the initial value will be exactly 300 Kelvin. But as time evolves, the kinetic energy plus the potential energy will be a constant, the potential energy will fluctuate, and so will the instantaneous temperature. So my temperature will actually be oscillating over time. Actually, this is a very lucky situation. In most cases, the temperature does not do this, but something else. Let me use a different color now. It will do something perhaps like this. This is another typical situation. What this means is that the way I placed the atoms at the beginning was not properly chosen. Suppose I make a mistake in the choice of the bond length of water, and I start with water molecules which are all systematically stretched by 10% by mistake; I didn't know exactly what the equilibrium value was, and by mistake I started with all the molecules 10% stretched. You see immediately that these vibrations will immediately start to acquire a lot of kinetic energy, much more than I gave to the system at the beginning, because they are totally out of equilibrium.
But because the kinetic energy plus the potential energy has to be a constant, if the system acquires a lot of kinetic energy, the potential energy has to go down, right? And if I plot the instantaneous temperature, it will do something like that, rising until the system equilibrates at a rather high temperature, due to the fact that I gave a lot of energy to the stretching modes at the beginning. So at some point, the system will equilibrate. There are two concepts, actually, that I would like to convey with this picture. The first, and very important, one is that in a simulation there is a characteristic time, which is the equilibration time. It takes a while for your system to equilibrate, and that time will depend on how clever you were in choosing the initial coordinates. If you made a mistake and started with 10% longer bonds in water, it will take a long time for the system to equilibrate, because those vibrations will have to transfer their energy to the other degrees of freedom, to the translational degrees of freedom, to the rotational degrees of freedom, and it might take picoseconds or even more for your system to equilibrate. So the time it takes for the system to equilibrate depends on how clever you were in the choice of the initial coordinates: a better choice will lead to shorter equilibration times. The typical indicator of whether you're equilibrated, not the only one, and you shouldn't really be looking only at this one, but one of the first indicators you look at, is the instantaneous temperature. The instantaneous temperature, if I made a mistake in the bond lengths, for example, will increase, then level off and start to oscillate. By the way, notice that these oscillations are actually a property of what we can now call the microcanonical ensemble. Let me introduce some additional words.
The microcanonical ensemble is the ensemble in which I evolve the coordinates using simply Newton's equations, and the total energy is conserved. Let me write the name here: we call it the microcanonical ensemble. Whenever the total energy is conserved and I'm using Newton's equations to evolve my particles, I am in the microcanonical ensemble. It can actually be proved mathematically that the relative amplitude of these temperature oscillations scales like one over the square root of the number of particles. What that means is that you'll never get rid of fluctuations in relative terms. (In absolute terms, for an extensive quantity like the energy, the fluctuations actually grow like the square root of N.) But in relative terms, if the temperature is of order one, the fluctuations are of order one over the square root of N, a percentage of the value of the temperature. So you'll never be able to get rid of those fluctuations. It's a theorem; it's proved; it has to be like that. So if you have a system of 100 particles, for example, your temperature will continue to oscillate with fluctuations of the order of 10%. You shouldn't worry about that. Of course, the fluctuations of the average value of the temperature, the average that you accumulate by running a very long simulation, will decrease as a function of time, so that number will get closer and closer to the temperature of the system. But the actual instantaneous fluctuations will continue to be of the order of one over the square root of N. So suppose you've been running your system for a while. You started from 300 Kelvin, you made a mistake, you didn't know what the equilibrium bond of the water molecule was, and after some time you reach a value which is, I don't know, 400 Kelvin. Well, say 415: it's not something you impose, it's something you calculate. So it's 415 Kelvin. Gosh, I wanted to simulate water at ambient conditions, not at 415 Kelvin. So how do I actually manage to reach 300 Kelvin in a simulation?
As you have seen, how close I end up to the temperature I started from depends on how clever I was; it's actually the system that will bring me wherever the system wants to bring me. If my initial guess was too wrong, it will bring me too far from 300 Kelvin. There's no way, at least using this methodology, in which I can tell the system: get to 300 Kelvin, because I want to simulate my system at 300 Kelvin. Now come the methodologies that actually go beyond the microcanonical ensemble. I'm not going to give you an overview of all the possible methodologies; there are millions, and you'll find plenty of them in the literature. I'm just going to give you a flavor of how these methodologies actually work. So the goal is: I want to do molecular dynamics at 300 Kelvin, because I'm interested in water at ambient conditions, not at 415 Kelvin. How do I play with my equations in such a way as to bring my system to 300 Kelvin? There are plenty of methodologies. One possibility, perhaps the simplest, is this: while you're evolving your dynamics, calculate the temperature at every step. Suppose at some time you calculate that the temperature is, say, 380, and you want to run at 300. Well, rescale all the velocities. Actually, with Verlet you don't play directly with velocities; you play with the positions at time t and at time t minus delta t. You just change them a little, in such a way that the instantaneous temperature is scaled abruptly to 300 Kelvin. So you bring the temperature down to 300 Kelvin, the target value. Now, this may not be enough yet, because you have changed the velocities, bringing them down to the correct Boltzmann distribution at 300 Kelvin, but your positions are still the ones that were typical of a system at 380 Kelvin.
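A minimal sketch of this abrupt rescaling step, in the two-position Verlet representation just mentioned (my own illustration, reduced units with kB = 1 and unit masses; the names are just for this example):

```python
import numpy as np

def rescale_to_target(r, r_minus, masses, T_target, dt, kB=1.0):
    """Crude velocity-rescaling thermostat step for a position-Verlet code.

    Velocities live implicitly in the difference (r - r_minus) / dt, so we
    rescale that difference by lambda = sqrt(T_target / T_inst), which sets
    the instantaneous temperature abruptly to the target value."""
    v = (r - r_minus) / dt
    T_inst = (masses[:, None] * v**2).sum() / (3.0 * len(masses) * kB)
    lam = np.sqrt(T_target / T_inst)
    r_minus_new = r - (r - r_minus) * lam
    return r_minus_new, T_inst

# toy check: 100 particles whose implicit velocities are "too hot"
rng = np.random.default_rng(1)
masses = np.ones(100)
r = rng.normal(size=(100, 3))
dt = 0.01
v_hot = rng.normal(0.0, np.sqrt(380.0), size=(100, 3))   # roughly T ~ 380
r_minus = r - v_hot * dt
r_minus_new, T_before = rescale_to_target(r, r_minus, masses, 300.0, dt)
v_new = (r - r_minus_new) / dt
T_after = (masses[:, None] * v_new**2).sum() / (3.0 * 100)
print(f"T before: {T_before:.1f}, after rescaling: {T_after:.1f}")
```

The rescaling sets the instantaneous temperature to the target exactly, but, as the lecture points out next, the positions are still those of a hotter system, so the step has to be repeated as energy flows back from the potential.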
So, for example, the stretching modes of the water molecules are still excited: the velocities were reduced, but the positions are still those of an excited system. It's a bit like a harmonic oscillator. If you want to reduce the temperature of a harmonic oscillator, you can stop the particle while it is moving, but the oscillator still has energy stored in its displacement, because you stopped it at a given position, and that energy will come back and contribute to the temperature of your system. Anyway, you can continue: you stop it again, and you bring it down. This is a very crude way of eventually achieving the correct temperature in your simulation: every now and then you rescale the velocities and bring them down to the target temperature. You'll have a series of ups and downs until, at the end, the system is forced to learn that you wanted to run a simulation at 300 Kelvin. A more efficient, more clever, and smoother way to achieve this goal is to modify the dynamics of each particle in a non-abrupt way. What do I mean? Well, we know that mass times acceleration equals the force. But if to the force I add a term which is minus gamma times the velocity, I'm going to slow down the particles: this is standard friction. So I may consider changing the dynamics to m a = F minus gamma v, with the friction term on the right-hand side. I can decide to introduce a friction. Now, a friction with a fixed positive gamma will only slow down the particles, so it will only work if my temperature is higher than the target temperature. So I can think of switching on the friction when the temperature is above the target temperature. Vice versa, if my temperature happens to be below, which is also possible, I can change the sign: I can accelerate the particles using this same friction coefficient.
In fact, you can easily see that if I take gamma equal to a constant C times (T_instantaneous minus T_target), where T_instantaneous is the temperature I calculate at that particular step and T_target is the target temperature, for example 300 Kelvin, I get a friction that depends linearly on this difference. If the instantaneous temperature is larger than the target temperature, the friction coefficient is positive and, with the minus sign in front, it slows the system down. Vice versa, if the instantaneous temperature is below the target temperature, gamma is negative, the term becomes positive, and the force accelerates the particles. So with something like this I achieve a very simple goal: reaching a target temperature in a smooth way, without perturbing the system too much, without abruptly changing anything, just by introducing some friction. It turns out that there is an even more elegant way to achieve this, which is even smoother and has some very important properties, which I'll describe in a moment. Let me write it here. The idea is to introduce an additional variable. So now I have m a = F_i minus gamma v_i, and this gamma I write as the time derivative of a fictitious new variable, s. The dynamics of this variable is driven by the temperature difference: the acceleration of s is C times (T_instantaneous minus T_target). In other words, instead of using the temperature difference directly as the quantity that slows down or accelerates my particles, I use a variable whose derivative is driven by the difference between the temperatures. When the temperature is above the target temperature, this fictitious object starts to accelerate; it does not become positive immediately, it starts to accelerate. And vice versa, if the temperature goes below the target, it decelerates. Its value is determined by the whole history.
And notice that what enters the friction is the velocity of that fictitious variable, s dot, not its position. Well, why am I doing this? Because it works. You can show that it works, and that it is actually much more efficient than the simple proportional scheme in reaching the correct temperature. And even better, you can show that this method allows you to sample exactly the canonical ensemble. So I run a dynamics made of these equations, with this additional particle, which we call a thermostat: s is the thermostat. You can show that, precisely as the plain Newtonian dynamics was sampling the microcanonical ensemble, the ensemble in which the total energy is a constant (I hope you're familiar with this standard concept in statistical physics: the microcanonical ensemble is the ensemble in which the energy is constant, or, if you wish, NVE, number of particles, volume, energy constant), these new equations sample the canonical ensemble. By the way, it is this equation, not the simple proportional friction, nor the one with the abrupt jumps: this one you can show mathematically, in the large-N limit, samples exactly the canonical ensemble. Of course, the dynamics of the particles will not be the correct dynamics, because real particles obey Newton's equations and this friction is an entirely fictitious concept. It's just something that allows you to sample, in statistical terms, the canonical ensemble. So the configurations you'll be generating using these dynamics will correspond to a proper sampling of the canonical ensemble, that is, the ensemble with Boltzmann weight e to the minus beta E, at a given temperature. By the way, this trick goes under the name of the Nosé thermostat. So there are some important historical figures in molecular dynamics.
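To make the feedback idea concrete, here is a sketch (my own illustration, not code from the lecture) of the simpler proportional-friction scheme, gamma = C (T_inst - T_target), applied to free particles in reduced units with kB = 1. The full Nosé thermostat would instead promote gamma to the dynamical variable s dot, but the feedback principle is the same: the sign of gamma drives the temperature toward the target.

```python
import numpy as np

def friction_thermostat_run(T0, T_target, C=0.01, dt=0.01,
                            n_steps=5000, N=1000, seed=0):
    """Evolve N free particles (F = 0, m = 1) under dv/dt = -gamma v with
    the proportional feedback gamma = C * (T_inst - T_target).

    NOTE: this is the simple friction scheme, NOT the full Nose thermostat
    (which makes gamma a dynamical variable); it just demonstrates how the
    feedback drives the instantaneous temperature to the target value."""
    rng = np.random.default_rng(seed)
    kB = 1.0
    v = rng.normal(0.0, np.sqrt(kB * T0), size=(N, 3))
    history = []
    for _ in range(n_steps):
        T_inst = (v**2).sum() / (3.0 * N * kB)
        gamma = C * (T_inst - T_target)
        v *= (1.0 - gamma * dt)          # forward-Euler friction step
        history.append(T_inst)
    return history

history = friction_thermostat_run(T0=400.0, T_target=300.0)
print(f"T start: {history[0]:.1f}, T end: {history[-1]:.1f}")
```

Starting near 400, the temperature relaxes smoothly onto the 300 target: positive gamma damps the velocities while the system is too hot, and the feedback switches off as the target is approached.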
One is Nosé, who introduced this concept of thermostats. Another is Verlet; you've seen him before. And there are several others that I will forget to mention, just because they contributed to different aspects of the development of molecular dynamics. But this is a very important concept, because by introducing this trick you can now tell the system: go to 300 Kelvin. And the system will automatically and spontaneously reach a temperature of 300 Kelvin, and it will do it in a way that samples exactly the canonical ensemble. Of course, the total energy will no longer be conserved, because you're adding a friction term to the equations, so the motion is no longer conservative and this total energy is not conserved. Actually, in this theory there is another quantity that is conserved, something slightly more complicated: you just add an additional term here, which depends on the velocity of the thermostat. I'm not going to go into the details, but there is a quantity which is conserved even with the thermostatted dynamics. So as far as the accuracy of the calculation is concerned, you still have a very good indicator that your integration is done properly. You don't have to worry: going away from the microcanonical ensemble still gives you the chance to check whether you're doing the calculation properly, by using a different form for the total energy, one which includes also the dynamics of the thermostat. [Question from the audience about time reversibility.] Yes, correct, very important. So the Verlet algorithm, without friction, is in fact time reversible. In this case it is not. No, because there is an explicit velocity dependence here.
Here you have the velocities, so there is an explicit velocity dependence there. [Another question, about whether the velocity enters squared.] Oh, that's a good point. I don't know. Does anybody know the answer? [Further exchange with the audience.] Correct, yes: because you are introducing a term which contains the velocity, the algorithm will have to change. It can actually be changed within the spirit of Verlet in a rather simple way, but you will, of course, lose time reversal when you do that. I forget now the exact way in which you introduce the friction term into the Verlet equations of motion; I would have to look it up. OK. How much time do I have here? Shall I stop now for the coffee break? You know what, unless you're specifically interested in this topic of exploring interiors, I might actually continue with molecular dynamics later on at the blackboard, if you're OK with that. Up to you. If you find this interesting, I can just continue at the blackboard; I have several other things to mention. OK, so let's do this. Your question about the phase changes I will answer later on, then. And I guess we can now go for the coffee break. Thank you.