We know that a gradient in concentration, in a situation where we're looking at distance scales which are long compared to the mean free path of a particle moving in some material or fluid or substrate, means the motion is diffusive: the particle makes a kind of random walk. Its direction of motion gets renewed with some characteristic time scale, during which it travels a characteristic length scale, the mean free path. And then there will be a flux which is proportional to the diffusion constant and to the gradient of the concentration of the particles. The particles tend to flow from the region of high concentration to the region of low concentration, for purely statistical reasons. And the constant of proportionality in that equation is called the diffusion constant, or diffusivity. Using the continuity equation, the fact that the particles are conserved, we write down a partial differential equation governing the concentration. It states that the concentration, regarded as a function of position and time, has a time derivative which is given by D times gradient squared of the concentration; that's the diffusion equation. We discussed last time how the diffusion equation can be derived, or interpreted, from a microscopic viewpoint, in terms of the probability distribution for particles in the system. To keep things simple, suppose we have a one-dimensional system, and we think of particles as making a random walk. There's some characteristic time for the direction of motion to be reset; I'll call that epsilon — the time step, the reset time. And we can imagine describing the motion on a lattice of equally spaced sites, with a spacing between the sites which I'll call delta. In each time step, the particle takes a step either to the neighboring site on the right or the neighboring site on the left, each occurring with probability one half.
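In symbols, the two statements just described — a flux down the concentration gradient (Fick's law) and particle conservation — combine into the diffusion equation:

```latex
\mathbf{J} = -D\,\nabla n, \qquad
\frac{\partial n}{\partial t} + \nabla\!\cdot\!\mathbf{J} = 0
\quad\Longrightarrow\quad
\frac{\partial n}{\partial t} = D\,\nabla^{2} n .
```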
If we consider a probability distribution which is broad on the scale of the lattice — which changes slowly as a function of position on the scale of the lattice spacing — then in each time step the probability distribution will be updated slightly. We can describe that in the formal limit where the lattice spacing delta and the time step epsilon go to zero. We have the probability distribution for, in this case, the position in one dimension, which is an integer times the lattice spacing delta, and the time in our discretized model is an integer multiple of epsilon. If we take the formal limit in which the steps are small, we obtain a differential equation which is just the diffusion equation, satisfied by the probability distribution. The time derivative of the probability distribution at a particular site is given by the diffusion constant times — in this case, since we're in just one dimension — the second partial derivative with respect to x of this function of x and t. And we also learn that we can express the diffusion constant in terms of the parameters of our discretized model: the diffusion constant is the lattice spacing delta squared divided by twice the time step epsilon. So when we take the limit of a very broad distribution, we make a small lattice spacing, and we take that limit with D held fixed, in order to get the situation that's described by the diffusion equation for the probability. And if you like, you can think of this as being one half times the speed of motion of the particle times a characteristic mean free path: the particle hops left or right by the distance delta in the time epsilon, so its speed is delta over epsilon. So v is the speed, and you can think of delta as the mean free path. In other words, it's the distance the particle travels before its direction of motion gets randomized.
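The lattice update and its continuum limit can be written out explicitly. In one time step the site probability becomes the average over its two neighbors one step earlier; Taylor expanding both sides for small delta and epsilon gives the diffusion equation with the stated constant:

```latex
P(x,\,t+\epsilon) = \tfrac{1}{2}\,P(x-\delta,\,t) + \tfrac{1}{2}\,P(x+\delta,\,t)
\quad\Longrightarrow\quad
\frac{\partial P}{\partial t} = \frac{\delta^{2}}{2\epsilon}\,\frac{\partial^{2} P}{\partial x^{2}}
= D\,\frac{\partial^{2} P}{\partial x^{2}},
\qquad
D = \frac{\delta^{2}}{2\epsilon} = \tfrac{1}{2}\,v\,\delta,
\quad v = \frac{\delta}{\epsilon}.
```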
So if you have an initial probability distribution at time zero which is just a delta function in position — the particle starts out at the origin with probability one — then after time t, the probability distribution will be given by a Gaussian: one over the square root of 4 pi D t, times the exponential of minus x squared over 4 D t. The width of that Gaussian depends on the time. And you can think of this as mathematically equivalent to the problem we discussed a long time ago, of an unbiased coin flipped many times. The coin flip corresponds to whether the particle decides to go left or right, and the position after t steps is the excess of steps to the right over steps to the left. So it's just like our distribution for the spin excess in our simple magnet model from the first week of the class. If we look at the root mean square distance that the particle travels in time t, it grows like the square root of the time, with the diffusion constant soaking up the dimensions, turning the square root of time into a distance. That's characteristic of diffusive motion: the distance traveled goes like the square root of the time. We can consider the three-dimensional situation, and it's really the same thing. We can imagine the particle now taking a random walk in which its direction of motion gets randomized, and we can describe that on a lattice, so in effect it is making independent random walks in the x, y, and z directions. In d dimensions — I'll use d for the dimensionality, so d is 3 here — the diffusion constant is essentially the same expression, except the 2 becomes 2d, which for three dimensions is 6: D is delta squared over 2d epsilon. The idea is just that if you consider a cubic lattice in three dimensions, there are six directions in which you can step in each time step: you may go up or down, left or right, forward or back.
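As a sanity check, here is a minimal Monte Carlo sketch of the one-dimensional walk (the step counts, walker count, and seed are arbitrary illustrative choices, not values from the lecture), confirming that the mean square displacement grows as 2Dt:

```python
import random

# Monte Carlo check of <x^2> = 2 D t for a 1-D lattice random walk.
random.seed(0)
delta, eps = 1.0, 1.0          # lattice spacing and time step
D = delta**2 / (2 * eps)       # diffusion constant of the model
steps = 500                    # number of time steps, t = steps * eps
walkers = 5000                 # independent particles

total = 0.0
for _ in range(walkers):
    x = 0.0
    for _ in range(steps):
        # each step: left or right with probability 1/2
        x += delta if random.random() < 0.5 else -delta
    total += x * x
mean_sq = total / walkers      # estimate of <x^2>

t = steps * eps
print(mean_sq, 2 * D * t)      # the two numbers should agree to a few percent
```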
And so the probability distribution will have a factor of one sixth instead of one half, because there are six ways you can go, which are all equally probable. And our expression, if you start out with probability one of being at the origin — a delta function in three dimensions — is that at time t the probability distribution for the three-dimensional position is just the product of this distribution taken three times, governing the x, y, and z displacements. We're interested in the excess of steps up versus steps down, steps left versus steps right, and steps forward versus steps back. So it would be 1 over 4 pi D t to the three halves, and then the exponential of minus the vector x squared — x squared plus y squared plus z squared — divided by 4 D t. And that solves the three-dimensional version of the diffusion equation, in which the second derivative becomes gradient squared, for this function of x, y, z, and t. If I consider the typical distance traveled, now we add up the distances traveled in the x direction, the y direction, and the z direction — that is, we add up the squares. There's a contribution from the mean value of x squared, and the mean value of y squared, and the mean value of z squared. Those mean values each grow like 2 D t, and they add up, so if I take the square root, I'll have the square root of 6 D t. So again, the distance from the origin grows like the square root of the time, and the coefficient of that square root of time is determined by the diffusion constant. Now, what Einstein realized back in 1905 is that we can use this picture of a random walk to understand Brownian motion. Brownian motion, interestingly, is aptly named: it was discovered by Brown, who, in the early 19th century, was looking at pollen grains suspended in water under an optical microscope. A pollen grain is about a micron in size, and he noticed that it dances around and makes a crazy walk.
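Written out, the three-dimensional distribution is the product of three one-dimensional Gaussians, and the mean square displacements add:

```latex
P(\mathbf{x},t) = \prod_{i=1}^{3}\frac{e^{-x_i^{2}/4Dt}}{\sqrt{4\pi D t}}
= \frac{1}{(4\pi D t)^{3/2}}\; e^{-\mathbf{x}^{2}/4Dt},
\qquad
\langle \mathbf{x}^{2}\rangle
= \langle x^{2}\rangle + \langle y^{2}\rangle + \langle z^{2}\rangle
= 6Dt .
```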
And if you wait a minute or so, it'll typically move about 6 microns — about 6 times its radius. I'm not sure whether Brown observed this, but others had by Einstein's time: the distance traveled goes like the square root of the time. Brown thought that maybe the pollen grains are alive and they're swimming. But Einstein had a very different picture, which is that this poor little pollen grain is being pummeled from all sides by random collisions with molecules. At any given time, fluctuations make the molecules hit a little bit harder from the left or from the right, from up or from down, forward or back. And so it executes this random walk precisely because of fluctuations. And he gave something like this derivation of the diffusion equation, from a microscopic point of view. He also did another important thing: he explained what the diffusion constant is in terms of other independently measurable quantities. That's the Einstein relation, which says the diffusion constant is equal to the temperature times something called the mobility. The reciprocal of the mobility is a measure of dissipation — how much friction is encountered by a particle which is moving through the fluid. If you think of, say, our little pollen grain falling under the force of gravity through the water, there will be resistance to that applied force coming from, in this case, the viscosity of the fluid — the fact that the fluid resists flow. As a result, the particle will reach a terminal velocity which is proportional to the force. What that means is that the frictional force impeding the motion of the particle balances the force applied to the particle when it reaches the terminal velocity; the terminal velocity is proportional to the force, and the constant of proportionality is b, the mobility. So the larger b is, the less resistance there is — the less dissipation. Small mobility means high dissipation, low terminal velocity.
Okay, so where does this relation come from? I'll describe two ways to derive it, or understand it. The first: you can imagine a steady-state situation where there's a force being applied to the particles. Let's say there are many particles in our fluid, and they're diffusing around because of the fluctuations. We'll reach a steady state where the diffusive flow, driven by the gradient in density, is exactly opposed by the terminal drift of the particles. In other words, if we have a gradient — more dense here, less and less dense as you go up — then diffusion will make the density want to spread upward. In the steady state, that will be exactly opposed by the drift of the particles, the terminal motion in which the dissipation matches the applied force. So the total flux in the steady state is equal to zero, and there are two contributions to that flux. There is the diffusive contribution, Fick's law of diffusion: a flux proportional to minus the gradient of concentration. And there's the contribution which is the concentration times the drift velocity, which is what the flux would be if there were no diffusive contribution. This is, if you like, the drift flux, and the other is the diffusive flux. Once we reach an equilibrium situation, they add up to zero — they're equal and opposite. Now, when we're in equilibrium, the distribution of particles in the fluid should be given by a Boltzmann distribution, since we're in equilibrium with a reservoir at some specified temperature: the concentration is proportional to the Boltzmann factor associated with the conservative potential that produces the force. In other words, the force being applied to the particles is minus the gradient of this potential, and in equilibrium, the probability of a particle being at a given position should be proportional to the Boltzmann factor, with the energy determined by the potential.
So that means if I take the gradient of the concentration, I differentiate this exponential, and I get minus the gradient of the potential divided by the temperature, times the concentration. And that's just equal to the force divided by the temperature, times the concentration. So we know that in equilibrium, since the two contributions to the flux must cancel, the diffusion constant times the gradient of the concentration equals the concentration times the drift velocity. But the left side is just the diffusion constant times the force divided by temperature, times the concentration. And the right side, since the terminal velocity is just mobility times force, is the concentration times mobility times force. So the concentration and the force drop out of the equation, and the conclusion is that the diffusion constant divided by the temperature is equal to the mobility — the Einstein relation. Now let's try to understand what's going on from a microscopic point of view. As I said, why are these particles actually moving around diffusively? It's because they're being hit by molecular collisions. When a force is applied, think of the particle as moving a little ways freely until a collision randomizes its motion; during that time it will accelerate a little bit. The acceleration will just be given by the force divided by the mass — m is the mass of the drifting particle, not of the molecules. And there's going to be a reset time, the microscopic one we discussed last time; we'll call that epsilon. That's the time the particle moves before a molecular collision randomizes its velocity and sends it off in a new direction. That acceleration, applied for that time, is going to let the particle attain a component of its velocity in the direction of the force which is the acceleration times the time, a epsilon. Actually, its average speed during the acceleration will be half that.
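The zero-flux argument of the last two paragraphs, written in equations — the Boltzmann profile fixes the gradient, and cancelling the concentration n and the force F yields the Einstein relation:

```latex
n \propto e^{-U/\tau},\quad \mathbf{F} = -\nabla U
\;\Rightarrow\; \nabla n = \frac{\mathbf{F}}{\tau}\,n;
\qquad
\underbrace{-D\,\nabla n}_{\text{diffusive flux}}
\;+\;
\underbrace{n\,b\,\mathbf{F}}_{\text{drift flux}} = 0
\;\Longrightarrow\;
D\,\frac{\mathbf{F}}{\tau}\,n = n\,b\,\mathbf{F}
\;\Longrightarrow\; D = b\,\tau .
```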
If we imagine it started out with no contribution to its speed coming from the acceleration — starting at zero velocity along the force, winding up at velocity a epsilon — the average speed due to the acceleration is going to be one half the acceleration times the reset time: one half f over m times epsilon. So I can think of this as mobility times f, and we can identify the mobility as epsilon over 2m. Now suppose we use the model in which the diffusion constant, in terms of the reset time and the mean free path in d dimensions, is the square of the spatial step size — the distance the particle goes before its velocity gets reset — divided by 2d times the reset time, as I wrote over here. I'd like to invite you to think of that as 1 over 2d times the square of the speed of the particle times the reset time; here v is the speed of our diffusing particle. Since this is v squared epsilon over 2d, I can write epsilon in terms of the diffusion constant: epsilon equals 2d — the dimension — times the diffusion constant divided by the speed squared. Okay? Now remember, the mobility, we said, would be epsilon divided by twice the mass of the drifting particle. Let's put in this expression for the reset time epsilon in terms of the diffusion constant. Then we have the number of spatial dimensions times the diffusion constant divided by m v squared. So now let's make the following leap. Before, when we talked about the equipartition principle, we imagined applying it to the motion of the molecules themselves — a kinetic energy of one half tau in each quadratic degree of freedom for the molecules in the gas. We can also apply equipartition to the kinetic energy of our drifting particle. Okay?
So in thermal equilibrium, I can say that, because of equipartition, the mean kinetic energy one half m v squared of the drifting particle will, if we're in d spatial dimensions, pick up one half tau for each of the d components of its velocity. In other words, for m v squared I can substitute d tau, by equipartition in classical statistical mechanics, into our expression for the mobility. We get the Einstein relation again: the mobility is the diffusion constant divided by tau. Now, this argument is a little more crude than the one we gave initially, which is a bit more respectable, so we had no right to get exactly the same answer — if we'd gotten the same answer up to some constant like 2 or pi, I wouldn't have been too disturbed. But we can see why it makes sense for the mobility to go like the diffusion constant divided by tau, from the point of view of the microscopic model, if we think of the diffusing particle as having a kinetic energy set by equipartition. Yes — the question is how I can drop the averaging brackets. Well, I'm saying that we're watching the particle over times long compared to the typical fluctuations in the speed. The particle's speed is actually fluctuating, but we're interested in its average drift speed, with those fluctuations averaged down, and so I'm just going to take the average kinetic energy of the diffusing particle in the system. There was another question — yes, this was kind of interesting: you could estimate Avogadro's number, which was a big deal at the time, because various people, using various points of view, were trying to estimate Avogadro's number and get a better understanding of whether they should believe in the reality of atoms.
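The microscopic chain of substitutions can be summarized in one line — the mobility from the free acceleration between collisions, the reset time expressed through the diffusion constant, and equipartition for the last step:

```latex
b = \frac{\epsilon}{2m},\qquad
D = \frac{v^{2}\epsilon}{2d}\;\Rightarrow\;\epsilon = \frac{2dD}{v^{2}}
\;\Longrightarrow\;
b = \frac{d\,D}{m\,v^{2}}
\;\xrightarrow{\;\;m\langle v^{2}\rangle \,=\, d\,\tau\;\;}\;
b = \frac{D}{\tau}.
```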
So, as we've seen, what the diffusion equation tells us is that if I look at the mean square distance traveled by a particle — let's say now in three dimensions — and divide it by the time, I get six times the diffusion constant. And the Einstein relation tells us how to express the diffusion constant in terms of the temperature and the mobility. Uncharacteristically, I'll write in Boltzmann's constant, for a reason we'll see in a minute: for tau I write kT, so D is kT times the mobility. Now, Einstein knew what the mobility of a sphere suspended in water is, because there was a formula that Stokes had derived, which I won't try to explain here because it requires explaining what viscosity is, which is a whole lecture in itself. But according to Stokes, if I consider a sphere moving through a fluid which has some viscosity, or resistance to flow, eta, then the mobility can be written as one over six pi times the viscosity times r, the radius of the sphere. A bigger sphere means it's harder to push the ball through the fluid because of the viscosity, and therefore a smaller mobility. And from looking in the microscope, Brown, or others, knew by that time how big a pollen grain is — about a micron in size — and the viscosity of water was also known from other measurements. So the mobility was known. What was not known so well at the time was Boltzmann's constant, so Einstein used this to derive the value of Boltzmann's constant. And you can translate that into a statement about Avogadro's number, because Avogadro's number is the ideal gas constant R divided by k_B, the Boltzmann constant: if you write pressure times volume not as N tau, the way we usually do, but as nRT, where now n is the number of particles measured in moles, then the constant R was known, and R is just the number of particles in a mole times Boltzmann's constant.
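Plugging in round numbers makes the chain concrete. These are my own ballpark values for room-temperature water and a micron-sized grain, not figures from the lecture, and the result comes out at the same few-micron-per-minute order quoted above:

```python
import math

# Ballpark Stokes-Einstein estimate: how far does a micron-scale
# grain diffuse in a minute in room-temperature water?
k_B = 1.38e-23     # Boltzmann constant, J/K
T   = 293.0        # room temperature, K
eta = 1.0e-3       # viscosity of water, Pa*s
r   = 1.0e-6       # grain radius, m (roughly a micron)

b = 1.0 / (6 * math.pi * eta * r)   # Stokes mobility
D = k_B * T * b                     # Einstein relation, D = kT * b
t = 60.0                            # one minute of watching
rms = math.sqrt(6 * D * t)          # sqrt(<r^2>) = sqrt(6 D t)

print(D)     # ~2e-13 m^2/s
print(rms)   # a few microns, the order quoted in the lecture
```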
So that determined Boltzmann's constant, and hence Avogadro's number. How did he manage to get the value of capital D? Capital D? Yeah — he got it from observation, and then used the Einstein relation. He knew that the temperature in units of energy is Boltzmann's constant times the temperature in degrees Kelvin, but he didn't know Boltzmann's constant; he used this to find it. Okay, right — here's how he did it. He said: here's a particle, and it has a size of about r equals 1 micron — a micrometer — and we look at it under a microscope. It's big enough that you can see it with a good optical microscope, and it jiggles around. And then you can, with a stopwatch, verify that the distance it typically travels scales like the square root of the time — or the distance squared scales like t — with a constant of proportionality which you can then determine by observing how far it goes. That's how 6D can be extracted from the data. If you put in the viscosity of water and r equals a micron, then the typical distance traveled is about 6 microns in a minute. So you just have to watch the particle move around for a few minutes, and do it many times, and you can get a pretty good estimate of this expected value of x squared. And that means you know D from observing the diffusing particle. And then, since Einstein had derived that D is kT times b, and b was known, kT was determined. That's how the number was determined, and he got about 6 times 10 to the 23. In this case, that determination of Avogadro's number was especially compelling evidence for the reality of atoms, because Einstein understood — should I say it again? — that the interpretation was that the pollen grain is getting knocked around by all these molecular collisions, and that's why it is drifting around.
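Running the logic backwards gives a sketch of Einstein's extraction. Using the rough "6 microns in a minute" figure quoted in the lecture as the observation — and with inputs this crude, the answer only lands within a factor of a few of the modern values:

```python
import math

# Invert the Brownian-motion observation to estimate k_B and N_A.
# The observed displacement is the rough figure from the lecture,
# so this is an order-of-magnitude estimate, not a precise value.
eta = 1.0e-3          # viscosity of water, Pa*s
r   = 1.0e-6          # grain radius, m
T   = 293.0           # temperature, K
R   = 8.314           # gas constant, J/(mol K), already known in 1905

rms_obs = 6.0e-6      # observed rms displacement, m ("6 microns...")
t_obs   = 60.0        # ("...in a minute"), s

D_est  = rms_obs**2 / (6 * t_obs)        # from <r^2> = 6 D t
b      = 1.0 / (6 * math.pi * eta * r)   # Stokes mobility
kB_est = D_est / (T * b)                 # Einstein relation: D = kT * b
NA_est = R / kB_est                      # N_A = R / k_B

print(kB_est)   # within a factor of a few of 1.38e-23 J/K
print(NA_est)   # within a factor of a few of 6.02e23 per mole
```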
It was a controversial subject in the early 20th century whether atoms were real. Philosophically-minded people questioned it, like Mach, because, he said, you can never see them — it's just a theory. Now we have lots of ways to see them, but this was the first hint that you really could, because the jiggling around of the pollen grain was evidence of the pollen grain being subject to molecular fluctuations. So this Einstein relation was one of the first examples of something that's called a fluctuation-dissipation relation: the diffusion constant is the mobility times the temperature. There's a connection between these two things. Mobility has to do with dissipation — the resistance to the flow of the particle through the fluid. Actually, the reciprocal of the mobility is the measure of dissipation: high mobility means low dissipation. And we see, from the interpretation of diffusion in terms of a random walk — from this behavior you can observe when a particle is suspended in a diffusive medium — that the diffusion constant is a kind of measure of fluctuation: the particle is walking around because of these random fluctuations. And there's a relationship between the two things, and it involves the temperature. So if you fix the dissipation — in this case, the mobility — then as you crank up the temperature, the fluctuations arising from the molecular collisions pummeling our pollen grain get stronger and stronger. Now, we talked about another fluctuation-dissipation relation. Do you remember? It was a while ago; maybe this will remind you. We can think of it this way, that there's some power being dissipated by the motion of the particle. I'll give it away: Johnson noise. Johnson noise, right. In other words, the power dissipated by the particle that's shaking around is like a force times a speed.
And if you think of the speed as being the terminal velocity, that power is v squared divided by the mobility, according to the definition of the mobility. So if I look at the mean square distance traveled by a diffusing particle and divide it by t squared, that's like v squared, and dividing by b gives something like the power dissipated by the diffusing particle subject to a force. In d spatial dimensions, it would be equal to 2d times the temperature times 1 over t. You can think of t as a characteristic time over which you're watching the motion of the particle, and 1 over t is then kind of like a frequency bandwidth. So this is analogous to the case of Johnson noise, where, according to Nyquist's formula, the power — i squared r — is proportional to the temperature times the frequency bandwidth over which we're watching the fluctuations of the current and the voltage: if you like, one over some characteristic observation time for the fluctuating current. In both cases, the temperature controls the scale of the fluctuations. There's something that measures dissipation: here it's R, the resistance of the circuit; there it's 1 over b, a measure of dissipation for the particle moving through the fluid. And the size of the fluctuations scales with the temperature — higher temperature, larger fluctuations. That's a fluctuation-dissipation relation: tau controls the size of the fluctuations. Before I go on, let me remind you, in case I forget to do so at the end of class, that there's a final exam. I expect to post it today, and it's due at the end of the exam period, on Friday. The format is similar to the midterm; the rules are the same. It'll be four hours this time — you can take a 15-minute break that doesn't count toward the four hours. And there will be four questions. There's a lot of material to test, so: four questions, four hours. What are the questions about? The content of the course, of course, and it's kind of cumulative. But chances are — if I were writing this exam, and I did —
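The analogy can be displayed side by side. Dividing the mean square displacement by t squared and by the mobility gives something with the units of power, proportional to the temperature times an inverse observation time, just as Johnson noise power is proportional to the temperature times a bandwidth:

```latex
\frac{1}{b}\,\frac{\langle \mathbf{x}^{2}\rangle}{t^{2}}
= \frac{2dDt}{b\,t^{2}}
= 2d\,\tau\,\frac{1}{t}
\qquad\longleftrightarrow\qquad
P_{\text{Johnson}} = \langle I^{2}\rangle R \;\sim\; k_{B}T\,\Delta f .
```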
I would draw more heavily from some of the later topics. What were those? Let's see — we talked about phase transitions, the quantum gases, things like that. Sorry? What about kinetic theory? Oh yeah, good point, that's exactly right: and kinetic theory. I'm glad you reminded me. Also, you'll fill out the course evaluations — it helps me — and I guess you'll get an email telling you how to do it. I got some feedback from that survey you did earlier, and there are two main things I learned. Actually, I want to see if, among you, my biggest fans, who are actually here, there's a consensus about these things. One is the homework: some people don't like the problems in K and K very much; they like it better when I write the problems. On some problem sets, you could do either a K and K problem or a problem I wrote. How many liked the K and K problems better? They kind of walk you through the steps. How many liked the problems I wrote better? Okay — so that's consistent with the survey. The biggest complaint about the lectures surprised me: my monotone voice. I need to modulate my voice more. How many would like me to modulate my voice more? How many would like my voice less modulated? I'll modulate for the rest of the lecture. And about the exam — right, it means four hours where you're actually sitting, working, writing, and 15 minutes where you go to the bathroom or have a snack or something like that: four hours and 15 minutes total. Okay, so now let me turn to the second law, the foundations of thermodynamics. The fun of this course has been to see how we can recover thermodynamics from a microscopic point of view. And really, the germ of that idea arose in the 19th century, particularly through the work of Boltzmann and Maxwell.
But Maxwell had some qualms about the microscopic view of the second law of thermodynamics, which he expressed in an article in 1871 — limitations of the second law. Maybe it doesn't always work, he said. He imagined the following situation. Suppose I have gas in a box, and suppose the box is divided into two parts: there's a partition in the middle. I'll call the part to the left of the partition A and the part to the right of the partition B. And there's a little door in the partition: when a molecule is heading towards the door, I can open it and let the molecule pass, or leave it shut so the molecule bounces off the door and doesn't pass. Since you probably don't believe that I can do that, Maxwell said: imagine a demon with extraordinary observational powers, who can watch every single one of the molecules in the gas at all times — he's watching where they go. For the purposes of this discussion, we can think of the dynamics as being completely classical, okay? So you really can, in principle, know exactly what the position and the velocity of every molecule is, and that's what the demon knows. Now, there's a Maxwell distribution of velocities, right? Some molecules are going faster than the mean, some are going slower. So when the demon sees fast molecules heading towards the door, he allows them to pass if they're going from A to B; but if they're trying to go from B to A, he doesn't let them pass. He does let slow molecules through, though: the molecules with kinetic energy less than the typical kinetic energy are allowed to pass from B to A, and the ones with kinetic energy greater than the typical kinetic energy pass from A to B. The demon keeps doing this for a long time, and what happens? A is going to cool down, and B is going to heat up. We started out with all the gas molecules at the same temperature, governed by a single Maxwell distribution, but after a while, A is cooler and B is warmer.
As far as one can discern, there's no reason in principle why the demon couldn't do this while expending a negligible amount of work. He's only got to move this tiny little hatch, open and shut, to let one molecule through — it doesn't sound very hard, okay? And he's a very powerful demon, an excellent engineer; maybe he makes a very efficient, frictionless little door. He's watching the molecules all the time — that, you know, requires very good vision, but watching things doesn't necessarily cost work. Look, I'm watching you — am I doing work? No. So it's not clear that he has to expend more than a negligible amount of work. So the work is negligible, or could be; one side cools off, the other heats up. That violates the second law if there's no dissipation, because the entropy of the universe is going down. And in fact, if I'm eager to, you know, lower my dependence on imported oil, I could do this for a while, then let the heat flow from the warmer side back to the cooler side and run a heat engine. So something seems wrong here, and Maxwell thought maybe there was something fishy about this, but he couldn't quite put his finger on what was wrong with it. So it seems like, under these circumstances at least, the second law of thermodynamics can be violated. Granted, it requires a fairly advanced civilization, maybe, to carry out this program — but somebody centuries from now may well be able to do it. And then, you know, it's all garbage; the second law is wrong. Yes? What's the argument that the work is negligible here? Well, I guess that's exactly what we want to examine more closely: we'd like to understand the real limitations on carrying out this procedure. Why did I say the work was negligible? Well, first of all, whether observing the position and momentum of all the molecules requires a significant expenditure of energy is not clear — that's something we should think about, right? Just watching. Okay.
And secondly, I have to operate the little door. How do you let a molecule through? You just wait until one is coming. Oh yeah, right — they're bouncing around, right? So every once in a while, if I keep the door closed, they're going to bounce off the door. And instead of letting them bounce, I'll either let them bounce — if, coming from one side, they're too hot, or coming from the other side, too cold — or otherwise I'll let them through. Okay? And to do that, I have to move the door. So that's going to cause some flow of heat from one side to the other, right? One side will get hotter, the other side will get colder. The second law says you shouldn't be able to do that without expending some work. The question is: why is there some unavoidable expenditure of work in this process, if indeed there is? That was the question that Maxwell raised but did not resolve. Some years later — quite a few years later, actually — Leo Szilard took up the question. This was a few years before he became, technically, the first person to conceive of a nuclear chain reaction. He thought about Maxwell's proposal, and he made a number of important conceptual contributions in his paper. For one thing, he for the first time suggested a kind of connection between entropy — thermodynamics — and information, because he seemed to feel that it was very important, in doing our entropy accounting, to think about the demon himself and how his thermodynamic state might be changing: it might be changing because the demon is acquiring information when he observes the state of the molecules in the gas. That was an idea which, 20 years later, was taken further by Shannon, who founded modern information theory in 1948. Szilard also, apparently, was the first person to think of the concept of a bit — that we can express information in terms of two-level systems. He didn't call them bits, but he had the idea.
And he had the idea that we can quantify information by how many bits we need to express it. And so Szilard said, well, it's helpful to think about Maxwell's idea in concrete terms. Let's make it as simple as possible, which is an obviously good thing to do if you're a physicist: if I can boil the idea down to its simplest realization, I have the best chance of understanding it. He focused on the question: what is it that's really irreversible in what's going on here? He noted that the demon is going to have to measure, and he wondered, is there some dissipation associated with the demon acquiring the information? He noted that the demon is going to have to store information in a memory, and he wondered whether that involved some kind of irreversibility. He didn't quite nail it when it came to this last one, but he also seemed to realize that the demon will need to reset himself, which means that information will have to be erased. And he tried to put his finger on what the catch was, why in fact the total entropy of the universe maybe isn't going down if we analyze this process carefully. Now here's a version of the demon machine that he envisioned. He said, it's a gas in a box. What's the simplest gas? An ideal gas, yes, but even simpler would be just one molecule. Zero molecules wouldn't do, but with one molecule, we can still talk about a gas. Now of course, if we want to do statistical mechanics, we want to have many molecules. So Szilard said, well, okay, we can consider many of these one-molecule gases and apply statistical mechanics to that ensemble. But let's just look at one molecule at a time. So there's a box and there's a molecule in it. It's bouncing around. It's in contact with a reservoir at temperature T. So we're going to consider isothermal processes at temperature T, and here's how we'll try to get some useful work essentially for free, which would lower the entropy of the universe. 
So there will be four steps. The first step is we start out with a box which has no partition. There's a molecule in there somewhere, and then we suddenly close a door, partitioning the box into two parts, the left part and the right part. And when we do so, we capture the molecule in either the right side or the left side, the two outcomes occurring with essentially equal probability. But we'll catch it on one side or the other. Okay, that's the first step, partitioning the box into two parts. But we haven't at this stage observed whether we caught the particle on the right side or the left side. So the second step is to measure: the demon learns whether the molecule got caught on the right or the left. Which of the two possibilities is it, stuck on the right side or on the left side? Let's say we find out that it's on the right side. So then what I'll do is I will allow the partition to move. I've got a molecule on the right side bouncing around off the walls, and a now-movable partition attached, by a rope going over a pulley, to a weight. And now I quasi-statically allow the gas to expand. The molecule is bouncing around, it's going to kick the movable wall once in a while, and it's going to slowly move the wall to the left, lifting the weight along the way. Okay. So the third step is isothermal expansion. The wall moves all the way to the left side, and now we have no wall and we have a single molecule bouncing around inside the unpartitioned box. How do we know it will expand all the way like that? How do we know it won't just come to equilibrium? Well, it's an isothermal process, so the molecule won't actually lose energy. It stays in contact with the reservoir at temperature T, so it'll keep bumping into the wall and pushing it. So you took the energy to lift the weight out of the reservoir? 
Well, the weight itself doesn't really have to be in contact with the reservoir, but the energy to lift the weight is coming from the reservoir. The molecule is doing the work of lifting the weight, but the energy is coming from the reservoir, because the molecule stays in thermal equilibrium with the reservoir at temperature T, and in order to stay isothermal while it keeps pushing the wall, it has to acquire energy from the reservoir. So the total amount of work that's going to be done, I'll just state it for the time being, is Boltzmann's constant times the temperature times log 2, that is, kT log 2. This is actually a reversible isothermal process of the sort that we talked about before; it just happens to involve only one molecule. Before, we said that when an expansion occurs at constant temperature, the entropy increases by, and this is now the conventional entropy, so I should put in a factor of Boltzmann's constant, Nk times the log of the final volume after expansion divided by the initial volume before expansion. I guess I didn't explicitly say so, but I was imagining that I put the partition right in the middle. So when the wall gets pushed all the way to the left side, the volume has doubled. So in this case, when N is equal to 1 and the ratio of the final volume to the initial volume is 2, the entropy increase is just k log 2, and the amount of work done isothermally is kT log 2. The work, according to the first law, is just the temperature times the change in entropy. So there's heat flowing from the reservoir into our one-molecule gas, and that heat is doing the work. We lifted the weight, but there was no waste heat. We did this at temperature T; we had a reservoir at temperature T, but no colder reservoir. We didn't have to produce any waste heat, so: no waste heat. With only one temperature, we were able to steal kT log 2 of energy from the reservoir and do useful work with it. 
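As a sanity check on that kT log 2, here is a small sketch (not from the lecture; the function name and the numerical method are my own) that integrates the one-molecule ideal-gas pressure p = kT/V quasi-statically as the volume doubles:

```python
import math

def isothermal_work(k_B, T, V_initial, V_final, steps=100_000):
    """Quasi-static isothermal work done BY a one-molecule ideal gas.

    For N = 1 the ideal-gas law gives pressure p = k_B * T / V, so the
    work is the integral of p dV, evaluated here by the midpoint rule.
    """
    dV = (V_final - V_initial) / steps
    work = 0.0
    for i in range(steps):
        V_mid = V_initial + (i + 0.5) * dV
        work += (k_B * T / V_mid) * dV
    return work

# In units where k_B = T = 1, doubling the volume yields ln 2 ~ 0.693,
# the kT log 2 of the lecture (log meaning the natural log here).
W = isothermal_work(k_B=1.0, T=1.0, V_initial=1.0, V_final=2.0)
print(W, math.log(2))
```

Running the same function with V_final smaller than V_initial gives a negative number, which is the work we'll have to supply during the compression step of the erasure discussed below.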
So we did work, and the number of states of the gas expanded by a factor of 2, so the entropy of the gas increased, and that increase in entropy we were able to use to do useful work without any waste heat, and that violates the second law. This is better than the Carnot efficiency: we had no cold reservoir, but we could still do work. We just drew heat out of the reservoir at temperature T. And then the fourth step is we set the whole thing up again and do it again. So we can keep doing it over and over; each time we get work kT log 2 out. About half the time, the molecule, when we measure it in step two, is going to be on the left, and half the time on the right. We use that information to know which way to load the movable wall, so that the weight always gets lifted. And every time we get kT log 2, until, you know, I'm tired of running my laptop. Well, the reason I emphasize that we can repeat this is because, as in our analysis of a heat engine, we want to consider a cycle that we can repeat over and over and over again. So we want to come back to the initial situation, so that everything is the same at the end of the cycle as at the beginning. It seems like I can do that here, right? I started out with a molecule in the unpartitioned box, I captured it, I let the gas expand, and then I'm back to a molecule in the unpartitioned box. So I can do the thing over again. Now, what we should always ask about a heat engine or a refrigerator that seems to do something paradoxical is: is this cycle really closed? Is everything the same at the end of the cycle as at the beginning, or has something changed that we need to keep track of when we run the cycle over and over? Well, let's think about the demon himself or herself. Is the demon in a different situation, a different state, at the end of the cycle than at the beginning? At the beginning, well, at the beginning there's uncertainty about which side the molecule is on, but at the end he's 100% certain. 
Yes, he has recorded information. The demon stored information in his memory. He had to store it, right? He made the measurement to figure out whether the molecule was on the left or the right. He wrote that down. Then he was able to figure out how to load the wall during the isothermal expansion. What he stored was one bit, left or right, when he measured it. So really, the system that we're considering is not just the gas, the one molecule in the box. It is the one molecule in the box and the demon's memory. We brought the gas back to its original state, but not the memory. Now the demon might have a lot of memory, but if he has a finite amount of memory, eventually he's going to use it up and he won't be able to do this anymore. He won't be able to store another bit unless he erases one. So the demon must eventually erase, if he wants to reuse his memory. If he doesn't, if he just stops after he runs out of memory, well, then we have to remember the memory itself, and we have to ask how we should model that. If it's really a perfect memory, it doesn't have any fluctuations. It's a system at zero temperature, effectively a zero-temperature reservoir. So we shouldn't be so shocked that the demon was able to get some useful work out, because he had a low-temperature reservoir: his memory itself. So if he wants to keep doing this over and over again and he has only a finite amount of memory, eventually he's going to have to erase. And so we have to ask the question: what is the thermodynamic cost of erasure? This was a question that was asked by Rolf Landauer in the early '60s, and his answer was that erasing a bit requires work, at least kT log 2. Log 2 because we're erasing one bit. So if Landauer was right about that, then there's really no paradox. 
If we want to really have a closed cycle, and we want to put the demon's memory, as well as the box, back where it started, then we're going to have to pay back, during the erasure, the k log 2 of entropy that we withdrew from the reservoir. The erasure itself will be a dissipative process. It will dissipate heat; entropy will go back to the reservoir. Everything ends up back where it started, but we got no net useful work out, because the work that we gained using the one-molecule gas, we had to pay back when we erased the demon's memory. Okay, so the question is why. This is called Landauer's principle; he formulated it in 1961. Well, he didn't specifically discuss Maxwell's demon. Landauer's main point was something quite interesting, and I'll come back to it at the end if we have time, which maybe we won't, so maybe I should say it now. He worked for IBM, okay, and his job was to think about computing, and he said that there's a thermodynamic cost to computing which is unavoidable, because when you compute, you necessarily erase information: logic gates are not logically reversible in general, they can destroy information. He said that it would cost about kT in energy each time we execute a gate in a computer, and so you'd never be able to reduce the energy cost of computing to zero, and that was his main point. He was wrong about that, but he did state correctly that whatever is logically irreversible has some nonzero thermodynamic cost, in particular erasure. Erasure is something that, once it's done, you cannot undo; that's the sense in which it's logically irreversible. His statement was that if a process is logically irreversible, then it is thermodynamically irreversible. But it wasn't until another 20 years later that Charlie Bennett really completely explained what was going on, namely that Landauer's principle is the key to understanding why Maxwell's demon doesn't violate the second law. 
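Bennett's accounting can be put in one place. Here is a minimal bookkeeping sketch (my own tabulation, not from the lecture, in units where k_B = T = 1) of the entropy and work flows over one full cycle, expansion plus Landauer erasure:

```python
import math

ln2 = math.log(2)  # k log 2 and kT log 2 in units where k_B = T = 1

# Signed entries over one full cycle: positive = gained by that party.
ledger = [
    ("expansion: entropy of reservoir", -ln2),  # heat ln2 leaves the reservoir
    ("expansion: entropy of gas",       +ln2),  # the molecule's volume doubles
    ("expansion: work we extract",      +ln2),  # kT log 2 of useful work
    ("erasure:   entropy of memory",    -ln2),  # two states compressed to one
    ("erasure:   entropy of reservoir", +ln2),  # heat dumped back
    ("erasure:   work we extract",      -ln2),  # we must pay kT log 2 to erase
]

net_entropy = sum(v for name, v in ledger if "entropy" in name)
net_work = sum(v for name, v in ledger if "work" in name)
print(net_entropy, net_work)  # both zero: the cycle closes with no free work
```

The signs are the whole point: the work extracted during the expansion is exactly cancelled by the work paid during the erasure, so the total entropy of the universe never goes down.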
To explain why erasure is thermodynamically irreversible, why we need work to do it in particular, the simplest thing to do is to consider some model of a memory and ask what we need to do to erase it. So, along the same lines as Szilard's machine, I'll imagine that the demon uses a similar machine for his memory, a Szilard box. He wants to record a bit. He puts a molecule in a box which is divided into two parts. He either puts the molecule on the right side, and calls that a zero, or he puts the molecule on the left side, and calls that a one. Now imagine playing this Szilard cycle many times. Each time, he stores another molecule in a box to record the information, which he's got to have on hand when he loads the partition: whether the molecule was on the left side or the right side. So I'm using the box two ways: I'm using one box as the object that's running a heat-engine cycle, but I'm also using another box as a memory, in which I can record information. Now, after I've run this many times, I've got lots of molecules in boxes, each one stored on the left or the right, okay? Therefore the cycle isn't closed. Every time I run the cycle, I record another bit. Now imagine I want to erase all the bits. How do you erase a bit? Well, in this case, I have to do something, some physical process, which, irrespective of whether the initially recorded value was zero or one, will take the bit to the same final value. It gives me a zero, irrespective of whether I had a zero or a one to begin with. That's what erasure is. It's a map from a bit to a bit, and no matter whether the bit is originally zero or one, it maps it to the same value; let's call that zero. So when I say it's irreversible, I mean this is not an invertible map. If you know the output of the erasure, namely zero, you don't know whether it was initially a zero or a one. They both got mapped to the same value. So it's a non-invertible map, and we call that logically irreversible. 
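The non-invertible map can be written down directly. A toy sketch (the function names are mine): erasure sends both bit values to zero, collapsing two possible states into one, and the lost bit is exactly the Shannon entropy of the pre-erasure distribution.

```python
import math

def erase(bit: int) -> int:
    """Reset a bit to 0 regardless of its prior value: not invertible."""
    return 0

def shannon_entropy(probs):
    """Entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Both inputs collapse onto a single output, so the map cannot be undone.
outputs = {erase(0), erase(1)}
print(len(outputs))  # 1

# Before erasure the bit is 0 or 1 with equal probability: 1 bit of entropy.
# Afterward it is certainly 0: 0 bits. The missing bit reappears as
# thermodynamic entropy k log 2 dumped into the reservoir.
lost_bits = shannon_entropy([0.5, 0.5]) - shannon_entropy([1.0])
print(lost_bits)  # 1.0
```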
So, how do I erase the information? I've got to do something like this: no matter whether the molecule was on the right or the left initially, I have to make sure it's on the right finally. And here's a way to do the erasure; the analysis would apply no matter how you do it. Erasure means that I have a way of storing a zero or a one, and no matter which was stored in the beginning, I want to have a zero at the end. And I have to apply a procedure which works the same way no matter whether it was a zero or a one to begin with, because if it worked differently depending on whether it was a zero or a one, I would already have to know that information. Can't you just remove the partition? Well, I could remove the partition, but that will not take the memory back to its original configuration. If I say that the zero state is the state where the molecule is on the right side, and the one state is where it's on the left side, and I just remove the partition, now it's neither on the left nor the right; it's bouncing around the whole box. I could try to force it to be on the left or the right, but I can't just put the partition back, because then I'd wind up with a random bit. It might be over here, or it might be over here. So I want to do something which will make sure that it's always on the right, no matter what. So what should I do? What I do is compress the gas. I start out with a movable wall way over here on the left. The molecule is somewhere in the box. I'm in contact with a reservoir at temperature T, and I compress, quasi-statically pushing the wall toward the middle. When I'm done compressing, the wall is in the middle again, and it's guaranteed that the molecule is on the right side. When I did the compression, I had to do work. I was actually reducing the entropy of the molecule, reducing the volume that it occupies by a factor of two. 
So the entropy of the gas is reduced under the isothermal compression; the final volume is half the initial volume, so the entropy of the gas goes down by k log 2. So where does that entropy go? It goes back to the reservoir, right? Because I'm compressing, I'm doing work on the gas, but it stays at temperature T, so there's heat flow from the gas back to the reservoir: W = kT log 2 of heat goes into the reservoir, and the entropy of the reservoir increases by k log 2. So: when I got work out of operating the Szilard machine, I stole entropy k log 2 from the reservoir, and I used that heat to do work kT log 2, energy kT log 2 that I drew from the reservoir. But it wasn't really a closed cycle, because I wound up with a bit stored in the demon's memory. To get a really closed cycle, I've got to erase that bit. The erasure of the bit, though, is a dissipative process. It sends, even when it's done as reversibly as possible, entropy k log 2 back to the reservoir, and it requires that we do work kT log 2. The general principle is that erasure is a procedure that compresses the phase space, that reduces the number of possible states of the system, right? That's what we mean by erasure. We have a bit, meaning there are two possible states, the molecule here or here. When we erase, no matter what, we want it to end up here. Before, there were two states; afterward, there's only one. We've lost one bit of information, and so we produce entropy k log 2. And there's no way to get around it. So that's the resolution. It's a little bit surprising when you first hear it, because you think that remembering things is hard, you know, like studying for an exam, and that forgetting things is easy. But it's actually the other way around. It doesn't cost any energy to remember something, but it costs kT log 2 to forget a bit. So if you work really hard, you can forget everything I told you. And if you start now, in 10 years or so you may succeed, I hope. Okay, thank you.