of all the x's and time. And this is the kind of stuff that you would usually solve just by integration. But when you look at chemical kinetics, a couple of things change. For example, in one dimension, you will have dx dt equal to some f of x. But in chemical kinetics, we have to explicitly keep track of the total number of molecules, and the number of molecules of every type can increase or decrease, and the increase and decrease steps are totally different kinds of processes. They have totally different physical implications, right? One is about creating a molecule; the other is about a molecule that's already there being removed from the system. So that is the standard dynamical system, but for chemical kinetics, we generally write these equations as dx1 dt equals some f1 of all the x's and maybe t, minus g1 of all the x's and maybe t; dx2 dt equals f2 minus g2; and so on. This notation contains a lot of compressed information. What it's saying is that for each chemical species, you have a different equation, and those equations are split up explicitly into creation terms and degradation terms. The rate at which a molecule is created can depend, in principle, on how many other molecules there are in the system at that moment. And similarly, the rate at which a given molecular species is removed can depend, in principle, on all the other chemical species at that moment. So these functions could be very complicated. And by the end of today's class, we'll be writing down a particularly nice example of such a function that has relevance to this problem of gene expression, which I mentioned the day before yesterday. A little bit about units. Henceforth, whenever I write equations like this, these x's will actually represent the number of molecules of a particular species in some fixed volume. And I'm not going to explicitly include the volume. It could be a bacterial cell, whose volume is about 1 cubic micron.
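As a minimal sketch of this split notation, here is an Euler integration of a two-species system with separate creation and degradation terms. The particular rate forms (constant creation of species 1, first-order degradation, species 1 driving creation of species 2) are illustrative assumptions, not taken from the lecture:

```python
# Deterministic chemical kinetics: dx_i/dt = f_i(x) - g_i(x), with each
# species carrying separate creation (f) and degradation (g) terms.
# The specific rate forms below are illustrative assumptions.
def f1(x1, x2): return 5.0          # constant creation of species 1
def g1(x1, x2): return 0.1 * x1     # first-order degradation of species 1
def f2(x1, x2): return 0.02 * x1    # species 1 drives creation of species 2
def g2(x1, x2): return 0.1 * x2     # first-order degradation of species 2

x1, x2, dt = 0.0, 0.0, 0.01
for _ in range(100_000):            # integrate out to t = 1000 with Euler steps
    dx1 = (f1(x1, x2) - g1(x1, x2)) * dt
    dx2 = (f2(x1, x2) - g2(x1, x2)) * dt
    x1, x2 = x1 + dx1, x2 + dx2

print(x1, x2)   # steady state: x1 -> 5/0.1 = 50, x2 -> 0.02*50/0.1 = 10
```

Note that each equation carries its own f and g; keeping them separate is exactly the point the lecture makes later about noise.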
But I'm just going to count how many such molecules of a transcription factor or protons or whatever there are. So these x's, going forward, are going to be actual numbers. And since the volume is fixed, this may seem a bit odd, but these numbers are dimensionless. That's going to be important going forward. So we had that. And a very simple example of such a process, we looked at this one: simple, standard creation of a molecule at a constant rate. And we wrote down the solution to this equation: obviously, starting from some x0, let's say 0, it's a straight line. But I mentioned at the end of last class that since molecule numbers are discrete, the answer cannot possibly be a straight line, because you can't have 0.5 molecules. So it must be some step-like function, right? So is it this function, the simple sawtooth? It's actually not that function. This is wrong. It's wrong because what you're really imagining here is a system where molecules are being created from some pool of available substrates; it doesn't matter exactly what's going on there. But a sawtooth curve like this seems to imply that the system is aware of a fixed amount of time between successive creation events and is able to precisely tune the times between the release of each new copy of the molecule. And that kind of thing implies all kinds of underlying complexity. There must be a timekeeper somewhere. And since there isn't, it's just a tube and molecules are just being created by standard chemical kinetics, it cannot look like that. So we stepped back a little, and what we did was try to understand that, in a large time interval t, assuming this rate is alpha, the total number of creation events must have been alpha times t. Given sufficiently large t, this is a very, very good approximation. And we split this up into n steps, each of width dt, so that n dt is equal to big t.
And these creation events I'm going to represent as little arrows, like so, because they represent the idea of creating a molecule, and the wrong picture has these arrows operating on a clock, like a metronome. So we're going to try and find out where these arrows lie. If these little intervals are sufficiently small, then we can always assume that two events don't fall in the same bin; this is one of these standard moves. In that case, we know the total number of events is alpha t, the number of bins is n, and therefore the probability of a creation event in one bin of width dt must be equal to alpha t over n, which is alpha dt. So what's happened here is that what was previously a rate, say 10 molecules per second, has been shifted to something else, which is a probability per unit time. So alpha by itself is not meaningful, but alpha times some dt, let's say dt is 0.01 seconds or something sufficiently small, is an actual probability. It's a number between 0 and 1, and in fact it's a number that's very, very small, so it's much closer to 0 than to 1. We already looked at something like this at the beginning of class yesterday. So this is precisely an example of something called a Poisson process. A Poisson process is one where each event occupies an infinitesimal amount of time. We're assuming that the creation event of a molecule itself is something that doesn't take up time, which is a fairly decent approximation compared to the time interval between creation events. And in that case, the relevant question is: how many creation events have there been up to a certain finite amount of time? This curve is simply calculated by adding up all these arrows.
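This bins-and-coin-flips picture is easy to simulate directly. The specific numbers below (alpha = 10 events per second, a 100-second window) are illustrative choices:

```python
import random

random.seed(0)

alpha = 10.0    # propensity: probability per unit time (illustrative value)
T = 100.0       # total observation window
dt = 1e-4       # bin width, chosen so that alpha*dt << 1
p = alpha * dt  # probability of one creation event per bin

# One biased coin flip per bin; heads = a creation event (an "arrow").
events = sum(1 for _ in range(int(T / dt)) if random.random() < p)

print(events)   # close to alpha*T = 1000, up to a spread of ~sqrt(1000)
```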
So in a sense, if you think of each of these arrows as little delta functions with area 1, then the integral of this function should give you the actual number of x's in the system. This is a sum of delta functions. And the question is, again, how many incremental steps there have been up to a given point. So this is very easy to solve, because it's a Bernoulli process: heads means you created a molecule, tails means you didn't create a molecule. There are a lot more tails than heads, because it's a rare event, but nevertheless we can go ahead and calculate the whole thing. So the probability that there are, let me call it n plus, creation events in n bins, the probability that there are n plus of these little arrows when the total number of bins is n, is given by n factorial over n plus factorial times n minus n plus factorial, times p to the n plus, times 1 minus p to the n minus n plus, where p is exactly this little p, alpha dt. Any questions about this? It's just the Bernoulli process, and this is just the binomial distribution. The only interesting thing is that this p is very, very small. So in the limit of small p, how does this actually work out? Well, n plus is going to be much less than n, p is going to go to zero, and n is going to go very large. So you have to take a limit and see what happens here. So this is then approximately equal to, in the limit that n goes to infinity while np is held constant at some value, the following. Let me think of the easiest way to get us to the solution.
So we have p to the n plus, which is fine, and 1 minus p to the n minus n plus, which is approximately 1 minus p to the n, because n is much bigger than n plus. And we know the limit of this: it's e to the minus np. This is just the standard limit when p goes to zero and n goes to infinity. There is no problem with that. Rather than waste time fighting through the rest of the derivation at the board, let me just give you the answer; you can calculate it yourself as a simple limit. So the chance of there being n plus reaction events in n bins turns out to be, if we let lambda equal np, which is the expected number of events: the probability of there being n plus events is lambda to the n plus, over n plus factorial, times e to the minus lambda. And this can be derived as a simple limit of the binomial. In the limit that p goes to zero and n becomes big, you get e to the minus np, which is e to the minus lambda, and this n plus factorial comes from the binomial coefficient. Are there any questions? Yeah? So you can see where all the pieces come from, right? This term, n factorial, contains n factors; that term, n minus n plus factorial, contains n minus n plus factors; so you have just n plus factors left over, each approximately n, which gives n to the n plus. Multiplied by p to the n plus, that gives you the lambda to the n plus. And this is called the Poisson distribution. You can commit this to memory; it's actually fairly straightforward. You can easily check that it's normalized, because when you add this up over all values of n plus, the sum just evaluates to e to the lambda, multiplied by e to the minus lambda, which is 1. Everything is cool. We already worked out the mean and the variance of this distribution. Yeah?
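You can check the limit numerically: for large n and small p with np held fixed, the binomial probabilities sit right on top of the Poisson formula. The values n = 100000 and lambda = 5 below are arbitrary illustrative choices:

```python
from math import comb, exp, factorial

n = 100_000
lam = 5.0
p = lam / n          # small p, large n, with np = lambda held fixed

def binom_pmf(k):
    # Exact binomial probability of k successes in n trials.
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k):
    # Poisson limit: lambda^k / k! * e^(-lambda).
    return lam**k / factorial(k) * exp(-lam)

for k in range(8):
    print(k, round(binom_pmf(k), 6), round(poisson_pmf(k), 6))
```

The two columns agree to several decimal places, with the discrepancy shrinking as n grows.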
So the mean of this Poisson distribution is lambda, and the variance of this Poisson distribution is also lambda. We worked this out last time just by taking the standard sum of random variables. So this is important; I'll write that down here. The probability of n plus events is lambda to the n plus, over n plus factorial, times e to the minus lambda, where lambda is equal to alpha times t. It's the total number of events you expect in the whole time interval. Let's just keep this up here for now, because I want to calculate something else about this process. Another interesting quantity to ask about is how long you have to wait between successive events. So one thing you could do is actually do this experiment. This is the standard experiment you would do if you had a very big block of radioactive material: over the first few radioactive decays, the total amount of material hasn't decayed too much, so the rate at which you get clicks on your Geiger counter per unit time is constant. And what you could do is just run the experiment and record how much time elapsed between the first and the second click, the second and the third click, and so on. So you make a list of these times. This list also has statistical properties. I could ask: what is the probability that a given time interval falls between tau and tau plus d tau? In other words, I could ask for the probability density of these time intervals. I could take all these times, make a histogram with some choice of bins, and ask what these time intervals look like. So we can calculate that; it's actually fairly straightforward. What we want to work out is the chance that a certain amount of time tau elapses with nothing happening, and then an event actually happens in the final step. And I can write that down.
So what is the chance that you have a certain number of bins, let's say A bins, with nothing happening, and finally in the last bin something happens? Remember, the chance of something happening in a bin is alpha dt, so the chance of nothing happening in each one of those bins is 1 minus alpha dt, and the total number of such bins is A. Now, if the total amount of time here is tau, then tau is A times dt, so A is tau over dt; that's the number of little bins where nothing happened. And finally, in the last bin something did happen, and the chance of that is simply alpha dt. So it's literally the Bernoulli process again; it's just that there's a very, very small chance of getting a head, and most of the time you're getting tails. So in the limit, 1 minus alpha dt to the power tau over dt, times alpha dt, goes to e to the minus alpha tau, times alpha d tau. Now what is this thing? This thing is the answer to a question which can be framed as an English sentence: what is the probability that, starting now, nothing happens up to some time interval tau, and then something happens in the very next time interval of width d tau? There's a reason why this d tau is stuck in there: the answer is actually a probability density. If your bins are very, very small, the chance of something happening in a bin is itself very, very small, but that's just a matter of your choice of bin. The probability density should be independent of your bin, okay? So I'll write that down again as an English sentence: the probability of nothing happening for an interval tau, and then something happening in a bin of width d tau, is alpha e to the minus alpha tau, d tau.
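The limit taken above, (1 - alpha dt)^(tau/dt) approaching e^(-alpha tau), is easy to verify numerically; alpha = 3 and tau = 0.7 are arbitrary values:

```python
from math import exp

alpha, tau = 3.0, 0.7   # arbitrary illustrative values

# Survival probability: nothing happens in any of the tau/dt bins.
for dt in (1e-2, 1e-3, 1e-4):
    survival = (1 - alpha * dt) ** (tau / dt)
    print(dt, survival, exp(-alpha * tau))   # converges to e^(-alpha*tau)
```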
So tau is the label I'm using to designate the end of this interval. You could call it t also, yeah? It's the same variable. When you're integrating, the variable you're integrating over is the one you label the differential with, so think of it as d tau; it's perfectly fine. So what is this? This thing, when graphed, is simply this decaying exponential curve, times alpha. That alpha is very important. And you can check that this is a correctly normalized probability density, because the integral of alpha e to the minus alpha tau is one, where the time interval could be anywhere between zero and infinity. In principle, even though the rate is finite, you might have to wait arbitrarily long for something to happen. There's a very small chance, but there is a chance of that. The limits of the integral are waiting time zero and waiting time infinity, okay? So, stepping back: we've derived it, but I just want to explain again how you would get something like this in a real experiment. In a real experiment, you have a block of radioactive material. It's a sufficiently big block that we're going to assume the total number of radioactive nuclei hasn't changed much over the course of the experiment. Therefore, the decay events over the course of your experiment occur with some constant probability per unit time alpha. You're measuring with a Geiger counter, recording precisely when those events happen. Then you're taking the difference of times between successive events. And if you do this experiment for long enough, you'll have a very large number of inter-event time intervals written down. You can then do the standard thing where you plot that as a histogram with some sort of bin, right?
But in the limit that the bin size becomes small, this histogram goes to a probability density, and this is the probability density that you get, yeah? The mean value of tau, by the way, is 1 over alpha. You can get that just by integrating against the exponential. So let me also write that down. We've now calculated two things about the Poisson process, where a Poisson process is one in which the reactions happen with a constant probability per unit time. First: in a finite time interval, the total number of events is itself a random variable. Every time you do this experiment, you're going to get a different value of n plus, a different number of creation events. And that random variable has a distribution; here's the distribution. The chance of getting zero creation events in a time interval big t, for example, is simply e to the minus alpha t, okay? So that can happen; you can get zero events. The chance of getting a very, very large number of events is, of course, rather small, because the factorial takes over. Fine. Now, the other thing we calculated is this idea of inter-event times, or waiting times. That is also a random variable, and it has a distribution, and that distribution is exponential, yeah? Now, when you read the literature, there's a little bit of confusion. This is the standard thing; you must have seen this before. If you have a memoryless process, a process that doesn't depend on what happened in the past, the standard Markov assumption, then the waiting times will be exponentially distributed. That's what we just calculated. So when people say you have a Poisson process, they mean any of several things. They mean there is a constant probability per unit time of an event happening. They also mean that the waiting time between events is exponentially distributed in this fashion.
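The same bin-by-bin simulation reproduces the exponential waiting times and their mean of 1/alpha; alpha = 2 per second is an illustrative choice:

```python
import random

random.seed(1)

alpha, dt = 2.0, 1e-3
p = alpha * dt   # probability of an event per bin

# March through bins, recording the elapsed time between successive events.
waits, since_last = [], 0.0
while len(waits) < 5000:
    since_last += dt
    if random.random() < p:
        waits.append(since_last)
        since_last = 0.0

mean_wait = sum(waits) / len(waits)
print(mean_wait)   # close to 1/alpha = 0.5 seconds
```

Histogramming `waits` would show the exponential density alpha e^(-alpha tau) directly.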
They also mean that the total number of events in a fixed time t is given by this formula, okay? All those are equivalent ways of saying it's a Poisson process. Only this equation is called the Poisson distribution, right? The Poisson distribution is merely the distribution of the total number of events you would get from a Poisson process after some fixed time has elapsed. These are all different ways of using the same word. It sometimes leads to confusion, but it shouldn't, okay? Any questions? Yes? This one to this one? Well, no, because as I'll show you later, this distribution can arise in many, many different circumstances, not just this one. So this one does not contain all the information about the Poisson process, whereas that one, in a sense, does. So now that we've got that far, we're going to add a little bit of complexity. Suppose now that we had dx dt equal to some f minus some g. It's slightly more complex, yeah. To the other question: yes, those are totally equivalent statements; there's no way to get exponential waiting times other than from a constant probability per unit time. What's not equivalent is this count distribution to those two, okay? So we're going to split this into two terms, yeah? And now, by the way, what would this curve actually look like? This curve would look not like a smooth metronome type of curve, but more like what's called popcorn noise. If you put popcorn in the microwave, then you get short waiting times, then you get long waiting times; you've all done this. So you're going to put arrows at random intervals. And therefore the curve should actually look, I'm not going to draw all the steps, like some jiggly thing; it won't go down. It'll look like that, or some discretized version of it, yeah? This is correct. And you could ask: on average, does this thing look like that? In other words, does it still go through the same straight line? You could ask such questions, right?
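One way to generate such a popcorn staircase is to sample the exponential waiting times directly and accumulate them into event times; the cumulative count then steps up by one at each of those irregular times. The values alpha = 5 and T = 10 are illustrative:

```python
import random

random.seed(5)

alpha, T = 5.0, 10.0   # illustrative propensity and observation window

# Accumulate exponential waits (mean 1/alpha) into event times; the
# cumulative count x(t) steps up by 1 at each of these times.
t, event_times = 0.0, []
while True:
    t += random.expovariate(alpha)   # exponentially distributed wait
    if t > T:
        break
    event_times.append(t)

print(len(event_times))   # about alpha*T = 50 events, at irregular times
```

Plotting the staircase x(t) = number of entries in `event_times` below t would give exactly the jiggly curve described above, tracking the straight line alpha t on average.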
And the answer, of course, is that it does, because the expectation value will always match. If you repeat this experiment many, many times, then these curves are all going to track the same straight line. So that's what chemical kinetics always looks like. And now we're going to add just one additional complication. Apart from a creation process, we're going to add a destruction process, and then simply ask: now what happens? The creation process is itself a Poisson process. The destruction process is itself a Poisson process. These rates are also called propensities; the alpha over here is called a propensity. What used to be called a rate turns out to be a probability per unit time, for which this word propensity is used. So imagine what's going to happen. You have some starting value x1 at, let's say, t1. And I wait for some time, and I take an observation at t2, and I ask my standard question: where will x be at t2, given that it's at x1 at t1? And given that there's some constant rate of creation, for the moment it's constant, and some constant rate of degradation over that time interval. So in this case, the curve doesn't have to just go up; the curve could go up and come down. So in general, you get some sort of squiggly, jiggly curve, and you're just asking where you're going to end up. Well, the answer can be fairly complicated if these f's and g's are complicated. But for the moment, like I said, let's assume they're constant. And let's also assume that this time interval is sufficiently large that there have actually been many creation events and many degradation events within it. And then we're going to try and use this to work out what the distribution looks like at exactly that observation time. The usual game we've always been playing. So how many n pluses do you expect? Let's call this interval delta t.
The expected number of n pluses is simply f times delta t, because the creation events are themselves a Poisson process. The expected number of n minuses is simply g times delta t, because again, they're totally independent processes: creation events happen independently, and destruction events happen independently. You're just getting this curve by taking a cumulative sum of a bunch of plus ones and minus ones and a lot of zeros. So that's nice. So clearly the expectation of delta n, which is n plus minus n minus, or delta x, will simply be f minus g times delta t: the expected change in x. So this thing won't be flat; it might even be increasing. That's the overall change delta x that you accumulate due to these creation and destruction events. Now, that's not so interesting; that's just good old-fashioned chemical kinetics. What's really interesting is the question of how wide this distribution is. So if I call that width sigma, then sigma squared, the total variance, is simply the variance of this random variable, n plus minus n minus. And you know that variances add. So the total variance, sigma squared, must be equal to the variance of n plus, plus the variance of n minus. You know that means add, and the means come with their plus and minus signs; variances also add, and both these variances are positive by definition. So the total variance sigma squared that you pick up from this process is the sum of the variances of the creation and the destruction events. This is simply the addition-of-random-variables formula we spent so much time on yesterday. The only additional thing I'm going to say is that we actually know the values of these variances. Because the up-arrow and down-arrow processes are Poisson processes, the variance of n plus must simply be equal to the expectation value of n plus, and the variance of n minus must simply be equal to the expectation value of n minus.
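Here is a sketch of that bookkeeping: sample n plus and n minus from independent Poisson distributions and check the mean and variance of their difference. The propensities f = 8, g = 3 and delta t = 1 are illustrative assumptions, and since the Python standard library has no Poisson sampler, one is built with Knuth's multiply-uniforms method:

```python
import random
from math import exp

random.seed(2)

def poisson_sample(lam):
    """Knuth's method: count uniforms until their product drops below e^-lam."""
    threshold, k, prod = exp(-lam), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= threshold:
            return k
        k += 1

f, g, delta_t = 8.0, 3.0, 1.0   # illustrative creation/destruction propensities
trials = 20_000

# delta_x = n_plus - n_minus for many independent repetitions of the interval.
deltas = [poisson_sample(f * delta_t) - poisson_sample(g * delta_t)
          for _ in range(trials)]

mean = sum(deltas) / trials
var = sum((d - mean) ** 2 for d in deltas) / trials
print(mean, var)   # mean ~ (f-g)*dt = 5, variance ~ (f+g)*dt = 11
```

The means subtract but the variances add: exactly the f minus g versus f plus g distinction the lecture stresses.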
And that's because, oh, I forgot to write it down, right? The mean and the variance are the same for the Poisson distribution. So this must be equal to f plus g times delta t. Are there any questions? So the variance, not the standard deviation but the square of the standard deviation, must be equal to f plus g times delta t. So if I were to then write down x at t plus delta t, that will be equal to x of t, which is where you started, plus the increment that you got, which is this f minus g term, plus some amount of noise, some fluctuation whose variance is given by this quantity. Now we're going to make an approximation. So far, we haven't made any approximations; all of this is exact. Remember I said that there's a certain number of events in this interval. Let's assume there are 10 or 20 or 100 events, enough that we can reasonably approximate this as a true sum of a lot of random variables, which gives a Gaussian distribution. So then you're going to have a Gaussian random variable multiplied by this standard deviation. So we just churned through and calculated this. It's fairly straightforward, and it was exact right up to this point; this is where we actually made an assumption, an approximation. I'm not going to prove whether this approximation is well controlled or not, and part of one of your homeworks is to check whether this kind of thing gives the same result as a more exact solution. So what did we do? We took standard chemical kinetics, where you're not just having creation events but creation minus degradation. And notice now that it's very important that we separate these f's and g's correctly, because these terms contain f minus g and f plus g, and you would not know those two things separately if you didn't know the actual values of f and g. In particular, I can't arbitrarily add a huge amount to f and the same huge amount to g.
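Put together, the update rule can be sketched as a few lines of simulation. The constant propensities f = 8, g = 3, the step size, and the end time are illustrative assumptions:

```python
import random

random.seed(3)

f, g = 8.0, 3.0   # constant creation/destruction propensities (illustrative)
dt, T = 0.1, 50.0

# Update rule: x(t+dt) = x(t) + (f - g)*dt + Gaussian noise whose
# variance is (f + g)*dt, i.e. standard deviation sqrt((f + g)*dt).
x, t = 0.0, 0.0
while t < T:
    x += (f - g) * dt + random.gauss(0.0, ((f + g) * dt) ** 0.5)
    t += dt

print(x)   # near (f-g)*T = 250, with a spread of sqrt((f+g)*T), about 23
```

Note the drift uses f minus g while the noise uses f plus g; adding the same large amount to both f and g would leave the drift alone but change the noise.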
Although f minus g is kept constant by that, f plus g is not. So this is highly constrained. Once you write down that this is the chemistry that's going on underneath, you're literally driven to this equation. Even though traditionally, deterministically, I could have added anything to f, added the same thing to g, and got the same deterministic equation, it would mean that the process was happening at, say, 100 times the rate. But that's not what you're allowed to do. You have to literally go in and see what the rates of creation and destruction are, because you're literally interested in how many of these separate types of events have occurred. Are there any questions about this? Yes. Well, yes. No, this is what I'm saying: you can't do that. Look, this is the deterministic equation, and the solutions to the deterministic equation are things like these straight lines. And these straight lines cannot possibly represent the number of molecules in the system, because that's what ostensibly my variable x is trying to capture. So, your question is correct: when you first see this, it looks really odd. Why don't I just call this difference alpha, let's assume f is bigger than g, and go for it? You can't do that, because what's important is the separation. If f and g were both equal, if f was alpha and g was alpha, this f minus g term would be zero, but that f plus g term would still be there. So the separation actually matters. And it matters because we're dealing with individual molecules that are being created and destroyed. In principle, if I had a very powerful microscope, and we have these things these days, I could go in and watch something being created and watch something being destroyed.
And you can't pretend that that's the same as having 1,000 things created and 999 things destroyed. It's not the same. So you're inexorably driven to this description of chemical kinetics; there's literally no other way you can do it. So, okay, this is interesting, right? Yes? This one? So what I'm asking is: I start at some x1 at time t1, and my usual question is, at time t2, what's going to be the distribution of possible outcomes if I run the same thing many, many times with the same initial condition? Now, where do the x's come from? The x's, by definition, are given by how many creation events there were, n plus, minus how many destruction events there were, n minus. I know the expectation value of n plus, because it's a Poisson process: it's just f delta t. I know the expectation value of n minus. I know the expectation value of the increase, because it's just n plus minus n minus. So this part is easy. But remember, n plus and n minus are themselves random variables. Every time I run this, there's going to be a different n plus and a different n minus. That leads to a broadening of the final outcome. So the variance, which is the width of this, must be given by the variance of the sum of these two variables. This formula comes from the fact that variances add; that formula comes from the fact that these processes are Poisson processes. So you have to unpack this into a series of reasons why I've written down all these equalities. So far, all exact; this last step is now an approximation. It's not exact. And in general, even for large collections of equations like this, this kind of development is a reasonable way to think about chemical kinetics, except that f and g will depend on the current state of the system and on time, so they'll become functions instead of constants. Yes? They can. So, good question.
So, you've already seen, over the course of the other lectures in this school, people writing down stochastic differential equations. You get this dW term; you get Brownian motion. Yesterday, we saw an example of Brownian motion for a particle that had angular fluctuations. In those cases, the Brownian motion is literally Brownian motion: it comes from a diffusion type of process, and its distribution is exactly Gaussian, which is why equations like this are pretty much exact for those physical processes. In chemistry, there are other sources of noise. It could be that the temperature in the tube is varying according to some slow process, which is influencing the rates of all the reactions, and influencing them in a random way. So if I repeated the experiment, this increase would have been different, not just because of this intrinsic kind of noise, but because the temperature fluctuations were also noisy. What's important to understand is that for those external sources of noise, you have to have some other law to find out what they are. But the internal sources of noise are absolutely fixed. When it comes to chemistry, you don't add this noise term in; you don't say, I'm adding a fake amount of noise. The noise is not injected. The noise is very simply a consequence of the fact that molecules are discrete, and there's no other way it can happen. So it's not additive noise put in by hand; this noise term has to come along with these propensities. Other questions? So this is almost looking like a differential equation, the differential equation we had earlier, but not quite. We'd like to write it as a differential equation: something like dx dt equals f minus g plus some noise. We'd like to do that. We haven't quite got there, but we'd like to. So the question is: what is this noise term?
And the answer comes from just going back and looking at these curves. In fact, just look at this curve. Let's go back to one of those experiments where this straight line is the expectation and the actual system is some wiggly thing. In a very obvious sense, dx dt is simply the derivative of that squiggly line; by definition, this is x, this is t. So where's the problem? The problem is that the derivative of the squiggly line contains one nice part, which is the deterministic part, and then it contains other parts, which are terrible. They're terrible because they're sums of delta functions in the plus and minus directions. So this term, you can write it down: it is very simply a sum of delta functions whose times are random. They occur at random times, according to where in this time interval creation or destruction happened. That's what this eta is. So sure, you can write this down as a differential equation, but then you've just shoved all the complexity into this eta. And this eta admits no simpler description than the entire full-blown machinery of stochastic processes that we've been looking at. So although it's sort of nice to write it down like this, it makes it look too simple, because you don't know whether the assumptions people usually write down for these noise terms are actually valid. In particular, the Gaussian assumption is not valid, and for very short delta t's, things won't work. That's one piece. Second thing, let me just pause on this before I go on. If you had to integrate this differential equation, you could easily do it numerically: just pick a small enough delta t, use Euler's method, and add the whole thing up. And why is Euler's method so good? Because it's guaranteed that for sufficiently small delta t, the curves that you calculate will converge to a fixed limit. That's the reason why Euler's method is good.
It's not good because it's easy to do; it's good because you're guaranteed to converge. Now look at this expansion. Just look at it for a moment, and think of it as a numerical recipe for figuring out the result of this stochastic process. When you start the numerical simulation, you'll pick some value of delta t. The delta t has to be sufficiently small that the total number of creation events in such a delta t is smallish, and you just add up over a large number of delta t's. Here is the one thing I want to draw your attention to. For a fixed choice of bin size delta t, all this says is: x at the next time step is x at this time step, plus some term, plus some other term times a random number. It's very simple. You add it all up and you're going to get some result. But what if you now make delta t 100 times smaller? If you make delta t 100 times smaller, unlike in your traditional Euler's method, this first term will become 100 times smaller, but that second term will only become 10 times smaller. And this different type of scaling is absolutely essential for the curve that you calculate to converge to a reasonable limit. If you just pretended this was a standard crank-the-handle differential equation, and when you make the bin size smaller you make this term 100 times smaller and that term also 100 times smaller, what will you get in the limit? A perfect straight line, with no stochastic part at all. So this is very important. I just want to highlight the fact that this delta t sits under the square root sign, and when you're doing a numerical approximation, which I think you'll have to do for one of your homeworks, you need to keep this in mind. Other than that, this is extremely straightforward. This is the way I like to think of a stochastic differential equation. I don't like to think of it as one of those very clever-looking dW's, the derivative of a Wiener process, and then start using Itô calculus.
I don't like to think of it like that, because it hides all the approximations, all the machinery of what's going on. This is what's going on: you know where you are, and you want to know where you get to at the next time step. It's easy. You add a deterministic part, which by definition is multiplied by delta t, and then you add a stochastic part, for which you need to draw a random number; under a certain approximation that random number can be drawn from a Gaussian distribution with this standard deviation. If you think of a stochastic differential equation like this, rather than trying to make everything behave so it looks like a traditional differential equation, you should be fine. So my message to you is: think of this equation as the way it is written down, as an integral equation, already integrated over delta t. Don't think of it as trying to massage an equation and understand the properties of that eta, because the properties of that eta are really terrible. They're very hard to write down; mathematically, they take you into territory that we may not all be trained to understand. Any questions about this so far? Yes. No, you can't make delta t as small as you want. OK, very good question. So what's your optimal delta t? Why can delta t not be too small? Because for a very, very small delta t, obviously, there are going to be intervals in which nothing happens. In other words, your Gaussian approximation fails: the Poisson distribution is very happy with nothing happening, but the Gaussian is not happy with nothing happening. So already, what is the nature of the approximation in this equation? There is one simple part and one rather dangerous part. The simple part is that you've waited long enough that the sum of random variables approaches something like a Gaussian distribution.
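As a concrete version of this recipe, here is a minimal sketch in Python (mine, not from the lecture), for a hypothetical birth-death process with constant creation rate `alpha` and per-molecule degradation rate `beta`, so that f = alpha and g = beta*x. The one thing to notice is exactly the point above: the stochastic increment carries sqrt(dt), not dt.

```python
import math
import random

def euler_maruyama(x0, alpha, beta, dt, n_steps, seed=0):
    """Integrate x(t+dt) = x(t) + (f - g) dt + sqrt((f + g) dt) * N(0,1)
    for f(x) = alpha (constant creation) and g(x) = beta * x (removal).
    Note the sqrt(dt) on the noise: halving dt shrinks the deterministic
    increment by 2 but the stochastic one only by sqrt(2)."""
    rng = random.Random(seed)
    x = float(x0)
    traj = [x]
    for _ in range(n_steps):
        f = alpha
        g = beta * x
        x += (f - g) * dt + math.sqrt((f + g) * dt) * rng.gauss(0.0, 1.0)
        x = max(x, 0.0)  # crude fix-up: molecule counts cannot go negative
        traj.append(x)
    return traj

# The trajectory fluctuates around the deterministic steady state alpha/beta.
traj = euler_maruyama(x0=0, alpha=10.0, beta=0.1, dt=0.01, n_steps=10000)
```

The parameters are illustrative only; the point of the sketch is the scaling of the two increments, not the particular numbers.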
That's the simple part of the approximation. The dangerous part is that this x is no longer an integer; it's again a real number. So the motivation I gave, that the molecule numbers have to be discrete and so on, has already gone out the window. To go back from this to integers, you're going to have to do some other trick; for example, you could just round it off. Now, rounding off this stochastic equation turns out to give a much closer answer to the truth than rounding off the deterministic equation, because rounding off the deterministic equation just gives you the regular staircase, which is wrong. So there are a lot of hidden assumptions, approximations, and potential mistakes that can be made in this view of the system. But it is very intuitive at one level. What I'm going to do today, let's see, what time is it, 10 o'clock, 45 minutes left: I'm going to show you, apart from this one, three different ways to think about this equation, all of which will give you a recipe for calculating the results. This way is not exact, and one of the other ways is not exact, but two of them will be exact. So there's actually no excuse in chemical kinetics for using this kind of description. OK, a couple of other points before I move on. Sometimes one is interested not in generating the entire curve, which is what I'm always focusing on, but just in the moments of the distribution. If I'm only interested in the moments, then there are many tricks you can use to solve the system, and in those circumstances this kind of equation can actually be quite useful. OK, now I want to make a connection to what I wrote down on the first day, which was the stochastic differential equation for Brownian motion. But now I'm going to write down a stochastic differential equation for chemical kinetics.
We have to assume something about this noise term. So, is everybody happy with this? If you only follow up to this point and you don't want to track what I'm saying for the next five minutes, that's perfectly fine. What I'm trying to do is make a connection. Unfortunately, these are the kinds of equations people have been taught to write down, with some eta, and usually the eta is given properties like these, with some strength q. These are the standard ways of writing down what eta is. So the question is, what has this got to do with that? It's not totally obvious what this numerical recipe for predicting what happens in the chemical system has to do with this very cute but dangerous analytic-looking recipe. So the question is: what's the value of q? And to find the value of q, we have to understand what this equation implies if it is used as a recipe for this process. I derived the first equation from first principles; this one is just a nice-looking thing which I'm hoping will also apply to the same situation. It's worse than an approximation: it's an ansatz which might be wrong. So we have to check what this equation implies for what's going on here. So let's just do it. Let's integrate this equation from time point t1 to time point t2, see what it predicts, and then make sure that all the coefficients and terms in this equation match what we know to be the case from the underlying microscopic description. So fine. Delta x, which is the integral from t1 to t2 of dx/dt dt, as it is written, must be the integral of f minus g dt plus the integral of eta(t) dt. I'm just seeing what the implications of writing notation like this are. Now, we know that this delta x is a random variable, so every time I run this, I should get a different answer. The first piece is fine, but the second piece has all the random stuff.
So I'm not really allowed to write this down without putting expectation values everywhere. That's the only thing I could, even in principle, calculate with such a compressed description, because after all, this eta is defined only statistically. I don't have a formula for it. Well, this part is easy: the expectation of delta x is f minus g times delta t, because the expectation value of the eta term is 0. All we're saying is that this eta, the noise term, averages to zero, so that on average we recover what we already know as the deterministic law. That's why this eta has expectation value 0. The other thing we'd like to calculate is the variance of delta x, sigma squared. Sigma squared is the variance around the new position; we've already accounted for the deterministic change, so the sigma squared can only come from the eta piece. And since these contributions are uncorrelated, sigma squared, which is the variance of a sum of uncorrelated random variables, must, just by integrating this equation, be equal to a term like this: you just expand it, you get x minus the mean value of x multiplied by the same piece again, and you get this sort of double integral. No, because what I'm calculating separates out this piece; this piece is gone, because I'm only interested in this amount of separation. I mean, I could have written this as sigma of delta x squared, which is a very odd notation, so it's just sigma squared. The sigma squared part is literally the integral of that eta part, which is what this is. You can, and that's how you get this kind of thing. That's precisely what you do: you take eta of t times eta of t prime.
And that's how you get this: then you set t prime and t equal. But this is what it would be if eta were a traditional function. If we just pretend that it is, this is the exact formula that you get. But it's not a traditional function, right? Because this thing, once I put the expectation value in (sorry, this should be an expectation value), I know statistically what happens to it. Oh, sorry, I missed a very important point there. So then this collapses. The delta function kills one of the integrals: the term becomes q times delta of t minus t prime, so one of the dt's in the integration goes away, t prime becomes equal to t, you put a q inside, and you leave one more integration to do. You integrate that up to delta t and you get q delta t. And we know that this must be equal to what we knew earlier: it must equal (f plus g) delta t. So this q must be equal to f plus g. Now, I'm not a fan of these kinds of manipulations; they seem very forced. I like the recipe a lot, but this derivation is just to make connections to existing descriptions of stochastic differential equations some of you might have seen. If for some odd reason you want to write down a traditional deterministic chemical kinetic equation plus a noise term, you can, with all these gymnastics. But as for the statistical description of the noise term, the strength of that noise, you have no choice in the matter: it will be given by f plus g. For the rest of this class, you never have to use any of this; you will only have to use the recipe. Any questions? So let's keep going.
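The matching argument just described can be written out compactly (this summarizes the board calculation; the ansatz and its correlator are as stated above):

```latex
% Ansatz:  \dot{x} = f - g + \eta(t), \quad
% \langle\eta(t)\rangle = 0, \quad
% \langle\eta(t)\,\eta(t')\rangle = q\,\delta(t-t').
%
% Mean change over \Delta t (the eta term averages away):
\langle \Delta x \rangle = \int_{t}^{t+\Delta t} (f-g)\,dt' = (f-g)\,\Delta t .
%
% The variance comes only from the noise term; the delta function
% collapses one of the two integrals:
\sigma^2
 = \left\langle \Bigl( \int_t^{t+\Delta t}\eta(t')\,dt' \Bigr)^{\!2} \right\rangle
 = \int_t^{t+\Delta t}\!\!\int_t^{t+\Delta t} q\,\delta(t'-t'')\,dt'\,dt''
 = q\,\Delta t .
%
% Matching the microscopic result \sigma^2 = (f+g)\,\Delta t  gives
q = f + g .
```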
This is also sometimes written, of course, as dx dt, sorry, dx, equals (f minus g) dt plus some constant times dW, where W is a Wiener process. This is only for those of you who have seen this notation. And this constant is the same as that constant: it has to be the square root of f plus g. A Wiener process is nothing but diffusion, and dW is just the increment of a diffusion process. That's all it is. All the rest of that notation doesn't matter; I don't like it, and it leads to more confusion than necessary. This is the only important thing. OK, I need all this, and I guess I need that, so now I'm going to be forced to erase something; let me see how far I can get with this part of the board before I have to erase anything. Note that this is written as the integral: instead of writing dW, you write eta dt and describe eta statistically. dW is a more explicit way of making the assumptions straightforward. The fact that the random numbers you add are Gaussian variables is equivalent to saying that this whole thing is dW times a constant, where dW is the increment of a diffusion process with unit variance. So this constant has to be there: you get the square root of f plus g times dW if you've used this kind of notation before. All of this is just commentary for people who've used these notations. The real thing that matters is that we started from dx/dt = f minus g, and we inexorably reached this recipe for finding the value of x at a subsequent time, which involves, as usual, the use of a random number generator, in this case drawing from a normal distribution. All is good. There are other ways to do the same thing, so let me quickly derive one other way. What we are interested in is the probability distribution of x at time t plus delta t. These probability distributions can themselves be described as curves.
So rather than a stochastic differential equation, we'd like to write down a partial differential equation for the shape of this curve. How would we do that? We need to start thinking slightly differently. I'm going to take this vertical axis for x and flip it around to horizontal; just bear with me here. And I'm going to maintain, for now, the idea that these things are discrete, which the stochastic differential equation threw out the window, but I'm going to bring it back in. I'm going to write down an absolutely equivalent formulation, which shows you exactly how much complexity is contained in this equation. This is called the master equation, if you've seen it before. So imagine that I have a bunch of bins that I keep track of, and these are different values of x: 0, 1, 2, 3, and so on. These are all the different states that the system could be in at any time. This, in other words, is simply the vertical axis rotated and stretched out. I've just taken this vertical axis and done that. And because it's chemistry, there are no negative numbers. Now I make a similar kind of move to the one Einstein made when he was doing his diffusion calculation: instead of worrying about probabilities, I'm going to start thinking about many copies of the same process running at the same time. It's a slightly different way of thinking. Why is it reasonable to think of infinitely many copies of the process running at the same time when actually you were interested in one curve? This is a very subtle point. When you study stochastic processes, everybody immediately moves to the limit I'm talking about, where you imagine the entire distribution changing over time. But actually, a stochastic process is not a distribution changing over time. A stochastic process is a single trajectory.
Whenever you find yourself calculating distributions when working with a stochastic process, remember that these distributions are merely the conditional probability distributions that you need in order to move from one time point to the next and draw a random number. That's why you calculate distributions. The distributions are not the answer; they are one step in getting the answer. But to calculate distributions, luckily, we can consider a large number of independent processes and see where that takes us. So the way I like to think about this is: imagine that you have a million bacterial cells sitting in a tube, and in each cell you're looking at the amount of some protein. That protein is being created at rate f per unit time and degraded at rate g per unit time, where f and g might also be functions of the existing state. At the initial time point, I look in my tube, and these days you can actually do this microscopically: you look at each cell and see where it sits on the vertical axis over there, or the horizontal axis over here. In other words, you make a histogram of all the cells. It's a real thing. So you see there are n0 cells here, n1 cells here, n2 cells here, and so on: n sub i minus one, n sub i, n sub i plus one. What are these n's? They are the numbers of cells in each of those states. The sum of all the n's, from n0 to n sub infinity, must equal the total number of cells, which let's say is one million. Actually, you can get about 10 to the 9 cells in a tube, so these are not absurd numbers. So let me just write these n's inside the bins, n sub two, n sub i minus one, n sub i, n sub i plus one, and leave myself some space for further writing. Now, it takes a little bit of thinking to see what to do next.
At any point in time, any one of these cells, all of which are behaving independently, could gain a molecule, and any one of them could lose a molecule. So a cell could transition from one of these bins to the next. In a small time interval delta t, how many cells will actually transition from here to there? The number of cells that transition must be proportional to the number of cells already there, because all the cells are behaving independently, times the probability that any one cell adds a molecule in that time interval. And this is where it's very important to keep in mind that standard chemical kinetics has rates, but stochastic processes have probabilities per unit time. If the time interval is very small, most cells do nothing; some cells gain one molecule; and no cells gain two or three or four molecules, because we're working in such a small time interval. Mostly nothing happens, but some cells gain one molecule. Therefore, the chance that a cell moved from this bin to that bin is just f times delta t. Are there any questions? This is a bit odd, and I want you to really internalize it. There are N_i cells here. Each one of those cells has a chance f delta t of gaining a molecule in the time interval delta t. None of these cells has a chance of gaining two molecules in delta t, none of them, because delta t is so small, okay? And just to be very safe: this f could depend on i. f could depend on i because the chance of gaining a molecule could in principle (I haven't used this so far) depend on the current state, okay?
Now, in the same way, a cell in this bin could lose a molecule and end up in the bin defined by cells that have i minus one molecules. So if you previously had a thousand molecules, you could go to a thousand and one, or you could go to 999. And this loss term has exactly the same form as the gain term: it's N_i times g_i delta t, yeah? And the same game happens all the way along the axis: you have N0 f0 delta t, N1 f1 delta t, and so on, and you have N_{i-1} g_{i-1} delta t, and so on. If you wrote down your equations correctly for f and g, you will find that there's no way to go below zero: if you have zero molecules, you cannot get minus one molecules. That's another way of saying we're dealing with chemistry rather than some arbitrary axis. Chemistry does two things: it makes this axis discrete (this is a random walk, but a discrete one), and it puts a hard boundary on this side. You cannot have minus one molecules; it makes no sense at all. Yes. So in this equation, the rates of gain and loss could depend on the current state of the system. If the current state of the cell is that it has a thousand molecules, then the rate of gain could be indexed by that. So the index i represents the current state of the system. But you see where I'm going with this now. So now you have a flux of cells. You have a million cells, so these fluxes are nearly deterministic, cells moving back and forth, and there's nothing stochastic here to puzzle us. So now we have to write down an equation, a rather strange equation: the rate of change of N_i. We are no longer writing down the rate of change of x, which is this kind of movement along the axis. Remember, I took that axis and made it horizontal.
So previously we were writing down the rate of change of x, which became the rate of change of these i's; remember, x went to i when things became discrete. We're no longer doing that. Now we're keeping track of how many cells there are in each bin: we're writing down the rate of change of the heights of the histogram bins. So we've totally flipped the situation around. What is this equation? Let's do it properly. N_i at t plus delta t, where you are at the next time point, must be equal to N_i of t, because that's where you started, plus a certain number of contributions. How many are there? There are actually four. There are two gain terms: N_{i-1} f_{i-1} delta t, cells coming up from below, and N_{i+1} g_{i+1} delta t, cells coming down from above. This is already looking a bit odd, because g represents degradation, but it enters this equation with a plus sign. It does that because cells here are coming in from the right. And then there are two minus signs, which are very simple: minus N_i f_i delta t, and minus N_i g_i delta t. So this is nice, it's looking very good. In particular, the dynamical variables are the N_i's, and this equation is totally linear. So I'll move things around, move the delta t to the bottom, and write down the whole equation. I get dN_i/dt equals minus (f_i plus g_i) times N_i, which covers those two loss terms, that guy and that guy. So what did I do? I just moved this to this side, moved the delta t to the bottom, replaced the difference with a derivative, and wrote down the terms.
Plus f_{i-1} N_{i-1}, plus g_{i+1} N_{i+1}. So this replaces what we previously had. Again, just look at this. This is the equation we started off with: dx/dt is some f minus g, where in principle the f could depend on x and the g could depend on x. I moved from a continuous description to a discrete one, so the x's are replaced by i's; but whereas there the x was the dynamical variable, here the i's are just labels. The dynamical variables are the heights of the histogram that I'm going to draw on top of all these bins. What I've written down is how the number of cells in each state changes over time, according to this differential equation. The dynamical variables are the N_i's. How many equations are there? Infinitely many. This little expression actually means an infinity of equations. That's why I'm saying that the little stochastic differential equation formalism hides the true complexity. This is literally exact, and it's the whole situation. By the way, how am I allowed to assume that the change in N_i is exactly this? I'm allowed because I've taken an expectation value over billions of cells. So, strangely, there's no more noise. This equation has no random numbers, which is strange, since we're dealing with stochastic processes. And you know why: because this equation is just a way to calculate conditional probability distributions, which obviously involves no random numbers. We only use a random number to collapse that distribution into a single observation, okay? So: infinitely many equations, linear in the dynamical variables, and obviously coupled. The value of N_i is coupled to the value of N_{i+1} and to the value of N_{i-1}. So I'm going to do one last thing.
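As a sanity check on this infinite set of coupled equations, here is a small Python sketch (mine, not from the lecture) that integrates a truncated version of the master equation forward in time, using the hypothetical rates f_i = alpha (constant creation) and g_i = beta*i (first-order degradation). The hard wall at zero appears automatically because g(0) = 0, and the truncation at the top is harmless as long as essentially no probability lives up there.

```python
def master_step(p, f, g, dt):
    """One forward-Euler step of the one-step master equation
        dp_i/dt = -(f_i + g_i) p_i + f_{i-1} p_{i-1} + g_{i+1} p_{i+1},
    truncated to the states i = 0 .. len(p)-1."""
    n = len(p)
    out = []
    for i in range(n):
        dp = -(f(i) + g(i)) * p[i]
        if i > 0:
            dp += f(i - 1) * p[i - 1]   # cells arriving from the state below
        if i < n - 1:
            dp += g(i + 1) * p[i + 1]   # cells arriving from the state above
        out.append(p[i] + dp * dt)
    return out

alpha, beta = 2.0, 1.0
f = lambda i: alpha        # creation does not depend on the state
g = lambda i: beta * i     # g(0) = 0: the hard wall at zero molecules
p = [1.0] + [0.0] * 29     # every cell starts with zero molecules
for _ in range(5000):      # integrate to t = 5, several relaxation times
    p = master_step(p, f, g, 0.001)
mean = sum(i * p[i] for i in range(len(p)))
# mean should approach the steady-state value alpha / beta = 2.
```

Dividing the N_i's by N_total, as done next in the lecture, is exactly why this can be written directly for probabilities p_i.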
Now that we've derived this almost-deterministic equation for the numbers of cells N_i, I'm just going to go from N_i to N_i over N_total. The total number of cells is fixed; all they're doing is wandering up and down this space. So N_total is fixed, and I define P_i as N_i over N_total. Since this is a linear equation, I can get away with dividing every term by N_total without changing anything. This is one of those few cases where things work out nicely. So I'm just going to call this P_i, this P_{i-1}, and this P_{i+1}. And we're done. What can I erase? Let me see. Let me erase this and write down that, just so it's preserved somewhere on the board: x of t plus delta t equals x of t, plus (f minus g) delta t, plus the square root of (f plus g) delta t times a random number R. So I can erase this. Doing well. Yes? Can they also depend on time? Oh, they can depend on time, and that merely adds an explicit time dependence here. Is that fine? They can depend on time, and the only complication then is that you have to resolve these conditional distributions for every possible starting time, which is another huge pain. Okay, so look: pure chemistry. It's just this. The only thing that we got from chemistry is that it has to be f minus g, nothing else. And that leads inexorably, there's no other way, to this master equation, in the sense that the stochastic differential equation is merely an approximation to it; we'll see in what sense. Now, how do you solve an equation like this? Formally, it's very straightforward. This is a linear equation, and every system of linear equations can be written with a matrix: dN/dt, for n = 0, 1, 2, 3 and so on, equals some matrix times the current values of the N's. Yeah?
And these terms couple the different elements of the matrix. It's not a diagonal matrix, but it's a matrix with nonzero terms only on the main diagonal and one position above and below it. So this whole thing is a very simple, nearly diagonal matrix; that's why it's called a one-step process, one step up or down, okay? In principle, this is trivial to solve: you write down the matrix, exponentiate it, and you get the answer. In practice, it's horrible, for the sole reason that the matrix is infinite in size; otherwise it's fairly straightforward. Everybody's happy with matrix exponentiation, right? So you can solve this by matrix exponentiation. In practice, you never do that, because you'd like closed-form formulas. So let's see how that goes. Let me write down i minus one, i, i plus one, and I have f_{i-1} p_{i-1}, f_i p_i, g_{i+1} p_{i+1}, and g_i p_i. I can erase this too. Okay, nearly there. Now again we're going to make an approximation. There are a few kinds of approximations we could make, but for now we take this one. After all, what is this equation describing? It's describing the heights of all these histogram bins. And traditionally, when we write down such histograms, we don't write them as discrete bins; we write them as some formula, like a binomial distribution or a Gaussian distribution or a Poisson distribution. You write down an analytic formula of this type. So let's assume that f and p and so on can be written as analytic functions, even though they are only valid at discrete values of their argument. Everybody's happy with this, right? The Poisson distribution is an analytic function; even n factorial has an analytic continuation, so you know how to evaluate that. So let's assume that. In that case, okay, we're going to use very simple approximations.
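For those who want to see the "write down the matrix and exponentiate it" route explicitly, here is a from-scratch Python sketch (mine; the alpha/beta rates are the same hypothetical constant-creation, first-order-loss example, and in practice you would use a library routine such as scipy.linalg.expm rather than hand-rolling the exponential):

```python
def mat_mul(A, B):
    """Plain dense matrix product."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, order=12, squarings=20):
    """exp(A) via scaling-and-squaring with a truncated Taylor series.
    A crude stand-in for a proper library matrix exponential."""
    n = len(A)
    s = 2.0 ** squarings
    B = [[A[i][j] / s for j in range(n)] for i in range(n)]
    E = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in E]
    for k in range(1, order + 1):
        term = [[v / k for v in row] for row in mat_mul(term, B)]  # B^k / k!
        E = [[E[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        E = mat_mul(E, E)
    return E

# Truncated one-step generator, already multiplied by the time t:
# dp/dt = A p with creation alpha and first-order loss beta*i.
n, alpha, beta, t = 30, 2.0, 1.0, 10.0
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = -(alpha + beta * i) * t
    if i > 0:
        A[i][i - 1] = alpha * t            # gain from the state below
    if i < n - 1:
        A[i][i + 1] = beta * (i + 1) * t   # gain from the state above

p0 = [1.0] + [0.0] * (n - 1)               # all cells start at zero molecules
E = expm(A)
p = [sum(E[i][j] * p0[j] for j in range(n)) for i in range(n)]
# By t = 10, p is essentially the steady state, a Poisson with mean alpha/beta.
```

The truncation to 30 states is the only cheat: it works here because almost no probability ever reaches the top of the grid.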
So if you have any function h, then h evaluated at i plus one or at i minus one is approximately equal to h evaluated at i. Any function h evaluated at some value i must be close to its value at i plus one and close to its value at i minus one. We're assuming that if, say, there are a thousand molecules in the system, the difference between how many cells have 999, a thousand, or a thousand and one molecules can't be that big, okay? This might look like an uncontrolled approximation; it ought to disturb you, and I'll tell you how to live with it in a second. So h of i plus or minus one equals h of i, plus or minus dh/di evaluated at i times one, plus one half d squared h by di squared times plus or minus one squared, plus higher terms. This is a Taylor expansion. Usually you would never do a Taylor expansion where the step size is one; it's of order unity, so it ought to disturb you. But really that's not the point, because these higher terms are actually very small. It is a bit awkward, and the way you should actually do this expansion, which I don't have time to get into in this course, is as an expansion in the total size of the system. When you do an expansion in the total system size, an incremental step of size one compared to, say, a thousand molecules is rather small, and in that sense this becomes a controlled approximation, okay? Everybody fine? If you are disturbed by this, I share your pain, but we have to use it and move on. I mean, the real question is why those higher terms are small; take my word for it that they are. Now, i is discrete, or it used to be discrete, right? So remember what I said: the Poisson distribution looks like this.
Zero, one, two, three, four: the Poisson distribution is defined on these discrete points. But you can write down a formula, an analytic function, that actually passes through all of them. So what I'm doing is pretending that I'm working with the full analytic function, which does have well-defined derivatives at all points, but I'm only going to evaluate it at the discrete integer values, okay? This kind of move is perfectly allowed. There are infinitely many analytic functions that go through these points, so this is not the only function that will do it. I agree, you should be disturbed, okay? The real reference on why this kind of nonsense works is the book by Van Kampen, where literally the entire second half goes into why this kind of expansion works, what the next term in the expansion is, and so on. If you've read that book, I sympathize; it's a very difficult book, but it deals with these questions. For the moment, let's just leave it. If I'd written x instead of i, you would have had no problem, because d/dx looks okay; d/di certainly looks very bad, yeah? Okay, so there are two different functions I want to keep track of. In particular: f_{i-1} p_{i-1}, which is this term, must equal f_i p_i minus d/di of (f p) plus one half d squared/di squared of (f p). All I did was substitute f times p for h, and the minus signs appear because we step down by one. And similarly, substituting g times p for h: g_{i+1} p_{i+1} equals g_i p_i plus d/di of (g p) plus one half d squared/di squared of (g p), plus dot dot dot; here the minus signs go away. As long as you're happy with this Taylor expansion, which trust me is totally fine, these things follow immediately. And now you see what happens.
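Written out, the expansion and the cancellation that follows go like this (this second-order truncation is the standard Kramers-Moyal route to the result derived next):

```latex
f_{i-1}\,p_{i-1} \;\approx\; f_i p_i
  \;-\; \frac{\partial}{\partial i}\bigl(f p\bigr)
  \;+\; \tfrac{1}{2}\,\frac{\partial^2}{\partial i^2}\bigl(f p\bigr),
\qquad
g_{i+1}\,p_{i+1} \;\approx\; g_i p_i
  \;+\; \frac{\partial}{\partial i}\bigl(g p\bigr)
  \;+\; \tfrac{1}{2}\,\frac{\partial^2}{\partial i^2}\bigl(g p\bigr).
% Substituting into the master equation
%   \dot{p}_i = -(f_i + g_i)\,p_i + f_{i-1}\,p_{i-1} + g_{i+1}\,p_{i+1},
% the zeroth-order pieces f_i p_i and g_i p_i cancel exactly, leaving
\frac{\partial p}{\partial t}
  = -\frac{\partial}{\partial i}\Bigl[(f-g)\,p\Bigr]
  + \tfrac{1}{2}\,\frac{\partial^2}{\partial i^2}\Bigl[(f+g)\,p\Bigr].
```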
These two terms enter with a minus sign and they exactly cancel these two terms, yeah? And you're left with a rather simple equation which says d/dt of p, now of i, and of course it's a function of time. I'd suppressed that time variable here, but it's obviously a function of time, right? p now becomes not indexed by i but a sort of analytic function of i, yeah? Where i can now range over the entire positive real axis; you're only going to evaluate it at integers, right? These two terms cancel, and you're just left with: dp(i, t)/dt = -d/di [(f(i) - g(i)) p(i, t)] + (1/2) d^2/di^2 [(f(i) + g(i)) p(i, t)]. I think I got all the signs correct. The first term has a minus and the second doesn't; the drift comes in as f minus g, that's why I put the minus, and the diffusion comes in as f plus g. This d/di and that d/di give d^2/di^2, and there's the half, okay? Great. So this is called the Fokker-Planck equation. How did we get it? We got it by expanding a giant, infinite set of coupled linear differential equations. This is also a linear equation: the whole diffusion operator and the drift operator act on p, okay? I've already used the words diffusion and drift, so let me assume I've done that, but let me just show you what they are. So if you're uncomfortable with this way of writing it, let me just write it like this. I want you to stare at this for a second, right? This is actually a controlled approximation; it's a Taylor expansion, okay? This, the stochastic differential equation version, is not a controlled approximation. It says, hoping for the best, that the increments are Gaussian random variables, and this is the way the differential equation should look. So my advice to you is, when you look at a stochastic differential equation like this, look at it in only two ways. Either you look at it and say, I'm using this as a recipe to implement this numerically, and then make sure you pick time intervals delta t that are reasonably well founded, right?
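The "recipe" reading of the stochastic differential equation can be sketched as an Euler-Maruyama loop: over each delta t, the increment is drawn as a Gaussian with mean (f - g) dt and variance (f + g) dt. The particular rates below (constant creation alpha, first-order degradation beta times x) are illustrative choices, not anything fixed by the lecture:

```python
import numpy as np

def euler_maruyama(f, g, x0, dt, n_steps, rng):
    """Chemical Langevin recipe: x += (f - g) dt + sqrt((f + g) dt) * N(0, 1).

    f, g: creation and degradation rates as functions of x (assumed forms).
    Only trustworthy while molecule numbers stay well above ~10.
    """
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        drift = f(x[k]) - g(x[k])
        diff = f(x[k]) + g(x[k])
        x[k + 1] = x[k] + drift * dt + np.sqrt(max(diff, 0.0) * dt) * rng.normal()
    return x

# Example: constant creation (alpha) with first-order degradation (beta * x).
alpha, beta = 100.0, 1.0                      # hypothetical rates
rng = np.random.default_rng(0)
traj = euler_maruyama(lambda x: alpha, lambda x: beta * x,
                      x0=0.0, dt=0.01, n_steps=20000, rng=rng)
# The steady-state mean should hover near alpha / beta = 100.
print(traj[-5000:].mean())
```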
Or you simply look at it and say, aha, this actually means the Fokker-Planck equation, okay? That's what it actually means. The stochastic differential equation says that the increments are always Gaussian, right? But the Fokker-Planck equation is able to account for all kinds of funny behaviors of the f's and the p's and so on. So this is the most important point. It's a very, very accurate way to solve these equations: as long as the molecule numbers are, you know, 10 or more, this turns out to be pretty good. Let me put it differently: as long as the chance of having between zero and 10 molecules is reasonably small, this turns out to be pretty good. Now, looking at it again, you of course notice that this happens to be a diffusion equation, where the first term is the drift term and the second term is the diffusion term. And you see that in this picture also, right? If I didn't have the second term, what would the first operator do? It takes any probability distribution and simply moves it with velocity f minus g along the axis. That's what the drift term of a diffusion equation does, right? The way you see this is that this term is actually the derivative of a flux. If the second piece weren't there, this would just be a constant flux of cells, a net flux of cells, moving in a certain direction. And therefore it can't change the shape of the distribution; the whole distribution simply moves with this velocity. Okay, everybody happy with that? Because this is the derivative of a flux. Any questions about this? You've seen all this before, right? Is the minus sign confusing? The minus sign has to be there because you're accounting for cells coming in from one side and leaving on the other side, okay? That's why the minus sign is there. So what does the first term do? It just moves the distribution, on average, at the rate f minus g. What does the second term do? The second term is a standard diffusion term. It simply spreads the distribution out as it moves.
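To make the "drift only translates the distribution" statement concrete: with the diffusion term dropped, what remains is a continuity equation whose flux is velocity times density, and for constant velocity its solution is a rigid translation:

```latex
% Drift term alone, written as the derivative of a flux J = v\,p:
\frac{\partial p}{\partial t} = -\frac{\partial}{\partial i}\bigl[v\,p\bigr],
\qquad v = f - g.
% For constant v this is solved by a rigid translation of the initial
% distribution: the shape is unchanged, only the position moves,
p(i, t) = p(i - v t,\, 0).
% The minus sign is the bookkeeping of probability flowing in from one side
% of a bin and out of the other.
```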
In fact, the bottom end of the distribution could well end up below where you started, and often will for short times. So this formulation then brings to bear all the standard ways you're used to solving partial differential equations, right? For a chemical kinetic process, very good. And it allows you to calculate these conditional probability distributions numerically on a computer using those standard tools. This Fokker-Planck equation plus a random number generator lets you generate individual particle trajectories at desired observation times, at will. There's nothing complicated about it. Solving PDEs has its own problems, but in principle it can be done, and there are a lot of efficient ways to do it. Note that the random number generator is not used in the propagation at all. Remember what I said: the way you do a stochastic process, yeah, is that you start somewhere and generate a distribution; then you start from the value you picked and generate a distribution; then start from there, and so on, right? Each time you pick a value from a distribution, you use the random number generator, right? But between observation times, you use this equation to propagate the distribution, okay? So that's why, mysteriously, although it's a stochastic process, it doesn't appear to have any random number generators in it: it's merely giving you the conditional probability distribution. That's what it's doing. By the way, if you're not happy with this, the original master-equation framework works just as well; it's just matrix exponentiation. In fact, in the afternoon, we'll be using that formulation to find the dynamical behavior and steady states of a certain simple biological system, okay? Question? No, no, it won't. It won't at all, because the f's and g's are not linear, right? So if you actually solve these, and you're going to do it in your homework, the solutions you get are nowhere near Gaussian.
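The "it's just matrix exponentiation" remark can be sketched as follows: truncate the state space at some N, assemble the master-equation rate matrix, and propagate p(t) = exp(M t) p(0). The birth-death rates (alpha, beta times i) and the cutoff N = 50 are illustrative assumptions; this assumes SciPy is available:

```python
import numpy as np
from scipy.linalg import expm

# Master equation for a birth-death process, truncated at N molecules.
# Hypothetical rates: creation at alpha, degradation at beta * i.
alpha, beta, N = 5.0, 1.0, 50
M = np.zeros((N + 1, N + 1))
for i in range(N + 1):
    if i < N:
        M[i + 1, i] += alpha      # creation: i -> i + 1
        M[i, i] -= alpha
    if i > 0:
        M[i - 1, i] += beta * i   # degradation: i -> i - 1
        M[i, i] -= beta * i

p0 = np.zeros(N + 1)
p0[0] = 1.0                       # start with zero molecules
p = expm(M * 10.0) @ p0           # propagate to t = 10, many lifetimes 1/beta

mean = np.arange(N + 1) @ p
print(mean)   # steady state is Poisson with mean alpha / beta = 5
```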
It's a full-blown partial differential equation, right? f is a function of i, p is a function of i; that's enough curvature to generate non-Gaussian solutions which are very, very close to the exact solutions. It's Poisson here. You're adding these to get the total increment, right? x is a cumulative sum of these, a cumulative sum of creation events, which will lead to variance in this direction, right? You'll have more or fewer of these. So Poisson distributions are always like this, but of course I had to tilt it this way to write this down. Not "should": you do, you do, right? If you put in this alpha and you just solve this equation, you get something very close to a Poisson distribution, plus these correction terms, right? So where is the approximation? This approximation assumes nothing but the idea that the steps, of size one, are very small compared to the number of molecules that are already there, right? So in chemical kinetics, the number of molecules is the large parameter, right? The number of molecules in a given volume. Now, if I'm considering the number of molecules in this room, that's a very bad idea, because different parts of the room are not mixing over the time scales I'm interested in. So obviously the largest volume I can consider is a volume that is effectively mixed by diffusion over the time scales that I'm interested in, right? So there's a reason: you know, this is chemical kinetics, there's no diffusion here, but diffusion is implicit. Diffusion sets the largest volume I'm allowed to use before this description fails, yeah? So these are all interesting subtleties about the use of this. For this course, I'm considering a small volume, a bacterial cell of one cubic micron, and in that cubic-micron cell these are all excellent approximations once you have more than a few molecules. Excellent numerical approximations.
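One concrete check of "very close to a Poisson distribution plus correction terms": for pure creation at a constant rate alpha (f = alpha, g = 0), the exact answer is Poisson with mean alpha t, while the Fokker-Planck solution is a Gaussian with the same mean and variance alpha t. With alpha t = 30 molecules the two already agree pointwise to a few parts in a thousand. A numerical sketch; the values of alpha and t are arbitrary choices:

```python
import numpy as np
from math import exp, factorial, pi

# Pure creation at constant rate alpha: exact P(k) is Poisson(alpha * t).
alpha, t = 3.0, 10.0
lam = alpha * t                  # mean and variance are both alpha * t = 30
ks = np.arange(0, 61)

poisson = np.array([exp(-lam) * lam**k / factorial(k) for k in ks])
# Fokker-Planck solution: Gaussian with the same mean and variance.
gauss = np.exp(-(ks - lam) ** 2 / (2 * lam)) / np.sqrt(2 * pi * lam)

print(np.abs(poisson - gauss).max())  # worst pointwise disagreement over k
```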
I'm not saying these are excellent descriptions of what's going on in a cell; that has to be tested by experiment. Okay, it's 10:45, so we can stop, and I'll see you all in the afternoon. I didn't get the,