Okay, so before the break — one of the main discussions last class was an introduction to Ito integrals and Ito's lemma in a very simple context. Let me remind you of the main points, then we'll go a little beyond and develop more of the theory, and hopefully by next class we'll be able to come back to an application and ground all of this theoretical work. Recall that we defined something called an Ito integral. If we have a function g of two arguments, we need to define the integral ∫ g(s, W_s) dW_s carefully, because dW doesn't actually make sense on its own: the Brownian motion W is not a differentiable object, so it doesn't make sense to talk about the d of that object. Instead, we defined this notation to mean the following: put down a partition, and take the limit, as the L-infinity norm of the partition (the mesh) goes to zero, of the sum of the integrand evaluated at the left-hand point of each interval — it's very important that this is the left-hand point — times the increment of the Brownian motion over that interval. I'm going to keep using the notation ΔW_k, as a reminder, for the increment of the Brownian motion over the k-th interval, and again, we evaluate the integrand at the left-hand point. I think when we discussed this last time, we didn't have this t in here — if I flip back to our previous lecture where we defined stochastic integrals, I only defined it for functions of the form ∫ h(W_s) dW_s, and what I'm doing now is also putting a time dependence in there as well. That's perfectly allowed; all it means is that my function is now a function of two things: where you are in time, as well as where the Brownian motion happens to be.
So it's a function of these two arguments, and one of the things we're going to do today is develop an Ito's lemma for these kinds of transformations. To be really precise about the definition: what is π? π is a set of times t_0, t_1, up to t_n, and they're ordered: t_0 = 0 < t_1 < ... < t_n = t. That is what we mean by a partition. What we demonstrated last time was that if we define a new process — and I'll use the same notation from last time — x_t = h(W_t), a function of W alone, where h(y) is twice differentiable, then we can write down a differential equation for the process x in terms of the function h and the original Brownian motion. If you treated W as an ordinary differentiable object, the naive answer would be dx = h'(W) dW. But because W is in fact a Brownian motion — it's not differentiable, and its quadratic variation is finite and nonzero — we pick up an extra term: dx = h'(W) dW + (1/2) h''(W) dt. That (1/2) h''(W) dt is our Ito correction, and you can think of this result as one of the first forms of Ito's lemma — Ito's lemma 1, if you want to think of it that way.
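Since the Ito integral is defined as a left-endpoint sum, it can help to see numerically that the choice of endpoint genuinely matters. Here is a small sketch of my own (not from the lecture, and using numpy just for convenience): on one simulated Brownian path, the left- and right-endpoint sums for ∫ W dW differ by exactly the discrete quadratic variation Σ (ΔW_k)², which is close to t rather than zero.

```python
# Illustrative sketch (not from the lecture): the endpoint in the
# defining sum matters.  On one simulated Brownian path we compare the
# left-endpoint sum  sum_k W(t_k) * dW_k  with the right-endpoint sum;
# their difference is exactly the discrete quadratic variation
# sum_k (dW_k)^2, which converges to t, not to 0.
import numpy as np

rng = np.random.default_rng(0)
n, t = 200_000, 1.0
dw = rng.normal(0.0, np.sqrt(t / n), size=n)  # Brownian increments
w = np.concatenate([[0.0], np.cumsum(dw)])    # path at t_0, ..., t_n

left_sum = np.sum(w[:-1] * dw)   # integrand at left endpoints (Ito)
right_sum = np.sum(w[1:] * dw)   # integrand at right endpoints

print(right_sum - left_sum)      # close to t = 1, not close to 0
```

For an ordinary differentiable integrator the two sums would converge to the same limit; here the gap survives refinement, which is exactly why the definition must pin down the left endpoint.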
Now, always keep in mind that this equation is a stochastic differential equation: it's a differential equation because it relates the d of one object to the d of other objects, in this case dW and dt, and it's stochastic because W itself is a stochastic process. That's why we call these things stochastic differential equations. The point I made last time, and will keep reiterating whenever we talk about the d of a Brownian motion, is that this equation is not actually a mathematically correct statement to write down, because dW doesn't make sense on its own. What we really mean is its integrated form: if you integrate the left side you get the increment of x, and if you integrate the right side you get the stochastic integral of h'(W) plus an ordinary integral with respect to dt — a Riemann integral, or a Lebesgue integral, it doesn't matter — which we know how to define. So that is what we really mean by the stochastic differential equation. Most of the time you'll find the differential form useful for quick calculations; to prove anything, you actually have to use the integral representation in the round brackets. Now, this course is not a course on proofs per se. There are a few things we did prove, but that's not the focus. So when I ask you to show things using Ito's lemma and so on, it's perfectly valid to keep using the differential form and then, when necessary, interpret the last line back in its integral form. We did a couple of examples of using Ito's lemma directly, and I'll go quickly through one again since we've done it already. What if x = W²? How can we write dx?
To do that we said: okay, the function that relates x to W is the squaring function, h(y) = y². That's our function. So we simply apply that version of Ito's lemma up there: h' = 2y gives 2W dW, and one half times the second derivative, which is just the constant 2, gives dt, so dx = 2W dW + dt. That's the SDE for x, but it also gives us another useful representation. Writing it back in integral form — integrating both sides — we get x_t − x_0 = 2 ∫₀ᵗ W_s dW_s + ∫₀ᵗ ds, which gives us a way to write the integral of W dW in terms of W itself and time. Rearranging, with x_t = W_t², x_0 = W_0² = 0, and ∫₀ᵗ ds = t, we get this nice little expression: ∫₀ᵗ W_s dW_s = (1/2)(W_t² − t). That is not the standard calculus result — for a function f, the integral of f df would be (1/2) f² — instead we get (1/2)(W² − t), and again, the extra −t/2 is simply our Ito correction term. This equation can be viewed as an integration by parts formula. It basically is one: it's the analog of the standard calculus result ∫ u dv = uv − ∫ v du, once we isolate the term I've underlined. So sometimes on my old exams you'll see me ask you to derive an integration by parts formula for something; what I typically do is say, here's an integral, derive an integration by parts formula for it.
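The formula ∫₀ᵗ W_s dW_s = (1/2)(W_t² − t) is easy to sanity-check numerically. Here's a quick simulation sketch of my own (illustrative only, numpy assumed): the left-endpoint sum on a fine partition should land close to the closed form, pathwise, on a single simulated path.

```python
# Sketch (my illustration): pathwise check of
#   int_0^t W dW = (W_t^2 - t) / 2
# by comparing the left-endpoint Ito sum with the closed form.
import numpy as np

rng = np.random.default_rng(1)
n, t = 500_000, 1.0
dw = rng.normal(0.0, np.sqrt(t / n), size=n)  # Brownian increments
w = np.concatenate([[0.0], np.cumsum(dw)])    # Brownian path

ito_integral = np.sum(w[:-1] * dw)   # left-endpoint (Ito) sum
closed_form = 0.5 * (w[-1]**2 - t)   # result from Ito's lemma
print(ito_integral, closed_form)     # the two agree closely
```

Note the −t/2: using the naive standard-calculus guess (1/2) W_t² instead would leave a gap of about t/2 that no amount of refinement removes.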
And what I mean by that is you have to think of the appropriate h function that lets you write the integral in terms of other things. In this case, the appropriate h function is y². So if I ask a similar question — find an integration by parts formula for ∫₀ᵗ W_s⁴ dW_s, say, one we haven't done — you might see a question like that, and the approach to take is this. If it were standard calculus — and this is the way you should always start — what would that integral equal? It would be (1/5) W_t⁵. The 1/5 is just a constant, so you don't really need it as part of the construction; it's natural to consider W_t⁵, and the corresponding h function is h(y) = y⁵. Now just apply Ito's lemma: one of the terms will be exactly the integral we're looking for, so we can isolate it and get an integration by parts formula. That's what all of these kinds of questions come down to. So if I take dx here, what does Ito's lemma give? The first term, h'(W) dW, gives 5W⁴ dW. The second term, the correction term with h'', gives (1/2)(5)(4) W³ dt = 10 W³ dt. So dx = 5W⁴ dW + 10W³ dt. All I need to do now is isolate. We can see right away that the term we want, W⁴ dW, shows up there, so all I need to do is put everything else on the other side of the equation and write it in integral form.
Doing it all in one step, since we've seen it so many times: ∫₀ᵗ W_s⁴ dW_s = (1/5)[x_t − x_0 − 10 ∫₀ᵗ W_s³ ds], where x_t − x_0 is W_t⁵ minus W_0⁵, which is just W_t⁵ — and note the integral on that last term. All I've done is integrate everything and put the last term on the right-hand side. And that's it — that's my integration by parts formula. Fairly straightforward in these kinds of examples. Okay, are there any questions about this particular kind of application of Ito's lemma? It seems like a not-so-useful one, actually, but it's a standard application. Now let me show you a useful application of Ito's lemma — those were just for finding integration by parts formulas. Suppose a process S_t satisfies the SDE dS = σ S dW, with σ a constant. The question we're going to ask is: what is the solution to this SDE? And we'll see that we can use Ito's lemma to answer that. So what do I mean by a solution to an SDE? Let's ask that question for ODEs first; once we see what it means for ODEs, you'll understand what I mean for SDEs. As a reminder: solve the ODE. I'm going to write down the exact same equation, but for an ordinary differentiable function, and so we don't immediately confuse the two, let's call it z: dz_t = σ z_t dg_t, where g is just some nice differentiable function — everything is well behaved. How would I go about solving that? What does a solution mean, first of all? A solution means getting an equation z_t = (something) in which z_t does not appear on the right-hand side; z_t only appears on the left.
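The W⁴ formula just derived, ∫₀ᵗ W_s⁴ dW_s = (1/5)(W_t⁵ − 10 ∫₀ᵗ W_s³ ds), can be checked the same way as the W dW one. A small sketch of my own (illustrative, numpy assumed), comparing both sides on a single simulated path:

```python
# Sketch (my illustration): pathwise check of the integration by parts
# formula  int_0^t W^4 dW = (1/5) * (W_t^5 - 10 * int_0^t W^3 ds).
import numpy as np

rng = np.random.default_rng(2)
n, t = 500_000, 1.0
dt = t / n
dw = rng.normal(0.0, np.sqrt(dt), size=n)
w = np.concatenate([[0.0], np.cumsum(dw)])

lhs = np.sum(w[:-1]**4 * dw)        # left-endpoint Ito sum of W^4 dW
riemann = np.sum(w[:-1]**3) * dt    # ordinary integral of W^3 ds
rhs = (w[-1]**5 - 10.0 * riemann) / 5.0
print(lhs, rhs)
```

The right-hand side involves only W_t and an ordinary (Riemann) integral along the path, which is exactly the point of an integration by parts formula here.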
And what appears on the right-hand side could depend on σ, could depend on g, maybe on its integral, maybe on its derivative — a bunch of stuff — but z itself does not appear there. That's what solving an ODE means: you have a functional form for z in terms of everything else you know. Okay, so in this case, in standard calculus you're tempted — or hopefully you're tempted — to write this in the form dz/z = σ dg. Why is that a useful representation? Because from standard calculus, the derivative of the log of something differentiable is just the d of that something divided by itself: d(log z) = dz/z. So right away you see that d(log z) = σ dg. You identify these two objects together, and you've immediately reduced the complexity of the problem: instead of having dz and z as part of your problem, you've reduced it to the d of log z. So call log z some new thing — that's the process you want to solve for. Or, alternatively, simply integrate the far left-hand side and the far right-hand side of that equation from zero to the current time. What does the left-hand side become? The fundamental theorem of calculus tells us: you're integrating the d of something, so if you think of it as a Riemann sum, you're summing increments, they telescope, and you end up with the last point minus the first point — log z_t − log z_0. Nothing else. The right-hand side turns out to be the same story: a constant times increments of g, a telescoping sum again, giving σ(g_t − g_0).
So now I can just put log z_0 on the other side and exponentiate everything, and you end up with z_t = z_0 e^{σ(g_t − g_0)}. That's our solution, because the right-hand side does not depend on z — it depends on the initial point z_0, and that's it. And you probably recognize it as the solution to equations of this kind: these are basically exponential functions. So that's our standard calculus result. What I'd like to do now is see how this result changes when we go to stochastic calculus — in other words, how do we solve the actual original problem, dS = σ S dW? The hint comes from the regular calculus solution: the relationship we found there was d(log z) = σ dg. So for our original equation, dS/S = σ dW, we'd like to do the same thing and investigate d(log S). What is it equal to? If things were standard calculus, it would equal σ dW. But things are not standard calculus, so we cannot write that relationship down — it's not true; there are going to be extra terms. Our question is how to decide what those extra terms are. So let's see: how would I formulate this in an Ito's-lemma-type form? Well, log S looks like h(S), agreed? It's h(S) where the function h is the logarithm. Now, there's a little bit of a problem, and to solve it I need to tell you one more form of Ito's lemma. The problem is that although we can write this as the d of h(S), can we immediately apply the Ito's lemma we've learned so far?
Let me slide back up to the statement. The Ito's lemma we have at this point says: if x = h(W), where W is a Brownian motion and h is twice differentiable — and our log function is certainly twice differentiable — then dx = h'(W) dW + (1/2) h''(W) dt. Does our current problem fall into that class? There's something not quite right. Our current problem has h(S) — but is S a Brownian motion? It's not. We don't even know what S is; that's what we're trying to find. All we know is that S satisfies this stochastic differential equation — I'll write both forms here — but we don't know what S actually is. We don't yet have the technology to do that calculation. What we need is one more form of Ito's lemma: we need to know how to apply Ito's lemma to a stochastic process about which all you know is the differential equation it satisfies. Let me back up and remind you why we've run into this problem. Our original Ito's lemma is about transformations of a Brownian motion. What we have here is a transformation not of the Brownian motion, but of something else — this object S. S is not a Brownian motion, and all we know about it is that it satisfies a stochastic differential equation. We need a new Ito's lemma for this; let's call it Ito's lemma 2. At the end of the day I'm going to give you one giant Ito's lemma which covers all of these, but we'll build it up slowly. Ito's lemma 2 says the following. Suppose we have a process which satisfies a stochastic differential equation of this kind — it doesn't assume you know how to solve that equation. Actually, let's not even put in the μ (drift) term here; I don't want that for now. Just suppose dy = σ(t, y) dW.
So y somehow satisfies that. You now introduce a new process x which is a transformation of y. You see how we're going one extra layer of complexity: we started off transforming just the Brownian motion; now we take a process which is built out of Brownian motion — that's the y process — and transform it. So x = h(y), with, as always, h twice differentiable. Then dx equals — well, let me ask: from standard calculus, what would your answer be? How do you take the d of a compound function, h of y, where y is a function itself? You'd get h'(y) dy, right? Just from standard calculus. And since dy itself equals σ(t, y) dW, we could write it back in terms of the underlying Brownian motion. Does everyone agree? That's what you'd have from standard calculus rules alone. The Ito correction term that shows up here is very, very similar to what you saw before — pretty much identical. It's (1/2) h''(y), but we have to multiply by the square of σ: dx = h'(y) dy + (1/2) h''(y) σ(t, y)² dt. To show why that's true, we more or less use the same techniques we used to show that the quadratic variation equals t, and to show that ∫ W dW = (1/2)(W² − t). Remember, we actually proved that one: we put a partition down, computed the error, and showed that the error is a random variable that vanishes almost surely. You use that same technique for this kind of process. It's a little more involved, and I'm not going to go through it here — I don't think it's worthwhile getting to that level of detail in this course.
But I'd like you to know that the techniques you've already learned can essentially be applied to show this same result. Let me put Ito's lemma 1 here as a counterpoint, as a reminder of what it looked like: if x = h(W), with h twice differentiable, etc., then dx = h'(W) dW + (1/2) h''(W) dt. That's what we had before. Now hopefully you can see that Ito's lemma 2 actually contains Ito's lemma 1. In Ito's lemma 2, if you chose σ to be identically equal to one, then dy = dW, so y = W — if they both start at zero, for example. Then the σ² factor simplifies to one, and you get the same result. So right away you realize there's no need to know both Ito's lemma 1 and 2: just know Ito's lemma 2 and you've already covered Ito's lemma 1. And you'll see that happen every time I introduce a new one — it will contain the previous. Okay, is there anything I can clarify about this formula? You can take it as a formula, something you memorize — in fact, you won't have to memorize it; I will give it to you on the final exam. I always give you an exam cover sheet with a bunch of standard formulas, some of them fairly lengthy, that you don't have to remember. Okay, so how does this help us with our current problem? We got to this point because we were trying to solve a stochastic differential equation, and we realized — I'm sliding up here — that what we needed to do was take a transformation of S, where S satisfies the stochastic differential equation above.
So now all we need to do is use Ito's lemma 2, where y is S and h is the log function. That's all — we just apply that result. I'll write it down below and separate it out. If we define x_t = log S_t, my h function is the logarithm, and this form of Ito's lemma tells me dx_t = h'(S) dS + (1/2) h''(S) σ(t, S)² dt, where h' is 1/y with S substituted in, because we're talking about the process. Now, for the σ in the formula we need the coefficient of the dW term, and the coefficient of the dW term is everything in front of it. Notice that when you write the formula down, there is no "divided by y" on the left-hand side, so you have to make sure you read off everything in front of the dW. It's not the constant σ alone, because the equation dS/S = σ dW is not in the canonical form dS = (something) dW — but the representation dS = σS dW is of that form. So the σ function that goes into Ito's lemma is S times the constant: σS. Let's do this in two steps. We have dx = h'(S) dS + (1/2) h''(S) (σS)² dt, and the second derivative of h is −1/S², so the correction term is (1/2)(−1/S²)(σS)² dt = −(1/2)σ² dt. Now I can plug in dS = σS dW, and what do I find? dx = σ dW − (1/2)σ² dt. So again, the −(1/2)σ² dt is our Ito correction term. Remember, when we had just the ordinary differential equation, with no stochasticity, the d of the log of our function was simply σ dg. Let me slide back up.
Here, this middle equation: we had d(log z) = σ dg. The analog of that equation is what we now have: dx, which is d(log S), equals σ dW − (1/2)σ² dt. Hopefully you can see the parallel between the two — and then there's that correction term. Yes? Next step. The next step is to notice that the right-hand side does not contain x at all. It contains just W, the underlying source of randomness, and t. So I can integrate both sides of this equation and get an equation for x: x_t − x_0 = σ(W_t − W_0) − (1/2)σ² t, since the integral of dW is just the increment of W, and the integral of dt is just t. Now I simply write x back in terms of S and exponentiate. By the way, W_0 = 0, but I'll keep it in there so we can parallel the earlier result. The standard calculus result from before was z_t = z_0 e^{σ(g_t − g_0)}, found from the equation dz = σ z dg. What we find now, from the equation dS = σS dW with W a Brownian motion, is S_t = S_0 e^{σ(W_t − W_0) − (1/2)σ² t}. So you can see the difference between these two things. The only real substantial difference is the existence of that term, the −(1/2)σ² t, which, if you trace it backwards, comes from the Ito correction. It's that Ito correction term showing up. So this is what I call a useful application of Ito's lemma: it tells you how to solve something, as opposed to the integration by parts formulas, which just write one integral in terms of another integral. So let's try to solve a couple of other equations and see where we get. By the way, this is a canonical equation that you see in finance — a key equation that always shows up. And why?
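The solution S_t = S_0 e^{σW_t − σ²t/2} can be checked against a direct discretization of the SDE. Here's a sketch of my own (illustrative, not from the lecture): an Euler-Maruyama simulation of dS = σS dW — stepping S_{k+1} = S_k (1 + σ ΔW_k) — compared with the closed form on the same Brownian path.

```python
# Sketch (my illustration): simulate  dS = sigma * S dW  by the
# Euler-Maruyama scheme  S_{k+1} = S_k * (1 + sigma * dW_k)  and
# compare with the solution from Ito's lemma,
#   S_t = S_0 * exp(sigma * W_t - sigma^2 * t / 2).
import numpy as np

rng = np.random.default_rng(3)
n, t, sigma, s0 = 200_000, 1.0, 0.3, 1.0
dw = rng.normal(0.0, np.sqrt(t / n), size=n)  # Brownian increments

s_euler = s0 * np.prod(1.0 + sigma * dw)      # discretized SDE path
exact = s0 * np.exp(sigma * dw.sum() - 0.5 * sigma**2 * t)
print(s_euler, exact)
```

Notice that the exponent carries the −σ²t/2 Ito correction; the naive guess S_0 e^{σW_t} would systematically overshoot the simulated path.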
Because in finance — let me give you a little intuition — when you write it in the form dS/S = σ dW, it has a financial interpretation. The left-hand side, the increment in the asset price divided by its current price, is the instantaneous return of the asset. On the right-hand side, dW is my source of randomness, my uncertainty in that return, and σ tells me the size of that uncertainty — the size of the fluctuation. That's my volatility. Do you get the connection? The left-hand side is instantaneous return; the right-hand side is basically noise. Over a very small time step Δt, the instantaneous return would be normally distributed with mean zero — because the increment of a Brownian motion is normal, with mean zero and variance Δt. So it's not a terribly interesting asset, because it has an expected return of zero. The asset dynamics most often used are therefore not exactly this equation; rather, you add a drift: dS = μS dt + σS dW, where μ is the drift, the expected instantaneous return. This is why I've spent some time on it — it actually has relevance to us in finance. So now we need to know how to solve this second equation. To do that, we need to introduce yet another Ito's lemma — the more general version with a drift. What I'm going to do is write down one relatively general version that's valid for pretty much everything with one asset. I'll have to introduce another when you have two of them, but I'll leave the two-asset case for later. So here's the version of Ito's lemma that you're going to use over and over again, and it encompasses all those other ones.
So we say: suppose y satisfies the SDE dy = μ(t, y) dt + σ(t, y) dW, and x_t = h(t, y_t), a function now of two variables, time and y — we'll see that having the time dependence in there is sometimes useful — where h is once differentiable in t and twice differentiable in y. Now that there are two arguments, we have to talk about differentiability in each argument separately. Then we end up with this formula for the increment of x. It's a bit much to write down, so I want to break it into two pieces. First, ∂h/∂y dy — that's what you'd have from standard calculus: a function of two arguments, you take the d of it, so you need the d of each one separately. Then ∂h/∂t dt — again, standard calculus. And now here's our Ito correction term: (1/2) ∂²h/∂y² σ(t, y)² dt. You can expand this, of course, by plugging in what dy is in terms of the Brownian motion; I won't bother writing it out, but you can just put that expression in there. Once again, you can prove this by the same techniques we've used before. Questions about the statement of the lemma? It looks a little daunting, but it's not too bad once you break it up into those two classes of terms: the ones from regular calculus on the first line, and the Ito correction term. The correction looks almost exactly like what you had in Ito's lemma 2 — a half times two derivatives of h times σ² dt — except now you have to be careful that it's two derivatives with respect to the y argument, the spatial argument, and not the time argument.
So let's see, first of all, how we can use this to solve an integration by parts problem, and then we'll see how to use it to solve a stochastic differential equation. Example: find an integration by parts formula for ∫₀ᵗ s W_s dW_s. Here the underlying process is just Brownian motion, so things simplify a lot for Ito's lemma: the y process is the Brownian motion itself. What does that mean? μ = 0 and σ = 1. Maybe I'll put a little note at the bottom, but it should be clear — look at the equation for y: if you make μ zero and σ one, then y is a Brownian motion. And that simplifies things a bit, not a whole lot, but a bit. For one, the dy term is now just dW, and the last line is just (1/2) ∂²h/∂y² dt — there's no σ² because σ is 1. So we can just apply this. The only question left is: what is the appropriate h function here? Is it W_t² times t? Is it t² times W_t? A linear combination of those? Something completely different? How do you identify the appropriate transformation? Do you remember, in the previous examples, which term always turned out to be the one you're looking for when you apply the integration by parts approach? It's always the dy term. So all you need is to choose an h such that when you take its partial derivative with respect to y, you end up with the integrand. Look at the top of the screen: you have ∂h/∂y dy, and we know that with μ = 0 and σ = 1, dy really is dW, so the term is ∂h/∂y (t, W) dW.
So what you want is for ∂h/∂y (t, W) to be the integrand, tW. So what does that tell you h should be? It should be h(t, y) = t y². You could put in a factor of a half — it doesn't matter; as we've seen, the coefficient out front never makes a difference, because it comes out at the end of the day. If I take the partial derivative with respect to y, I get 2ty, which is exactly the integrand up to that constant. So that's how I choose my h, and now all I do is apply the lemma. With x_t = h(t, W_t) = t W_t² — I'll keep the lemma on the screen so you can see it — dx_t equals: the y-partial gives me 2t W_t dW_t; the time-partial gives y², with y = W substituted, so W_t² dt; plus one half times two y-derivatives, which give a factor of 2t, times σ², which is 1 here — so t dt. That's it. Of course, the last two terms can be collected — in the general form of Ito's lemma, the second and third terms can always be collected, since they both carry dt. So dx = 2tW dW + (W² + t) dt. All right, now we're pretty much done. All we need to do is integrate both sides of this equation and isolate the tW dW term — or do the isolation first and then integrate, it's up to you. We get tW dW = (1/2)[dx − (W² + t) dt], and so ∫₀ᵗ s W_s dW_s = (1/2)[x_t − x_0 − ∫₀ᵗ (W_s² + s) ds]. We can then substitute in the expression for x directly. That's it — there's nothing else to do. That's the integration by parts formula: all we've done is isolate the s W_s dW_s term, the one we wanted. Make sense? Okay, I think it's a good time to do your little quiz.
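The formula just derived, ∫₀ᵗ s W_s dW_s = (1/2)[t W_t² − ∫₀ᵗ (W_s² + s) ds], can also be checked on a simulated path. A sketch of my own (illustrative only, numpy assumed), comparing the left-endpoint Ito sum with the right-hand side:

```python
# Sketch (my illustration): pathwise check of
#   int_0^t s W_s dW_s = (1/2) * ( t*W_t^2 - int_0^t (W_s^2 + s) ds ),
# the integration by parts formula from h(t, y) = t * y^2.
import numpy as np

rng = np.random.default_rng(4)
n, t = 500_000, 1.0
dt = t / n
times = np.linspace(0.0, t, n + 1)            # partition times
dw = rng.normal(0.0, np.sqrt(dt), size=n)
w = np.concatenate([[0.0], np.cumsum(dw)])

lhs = np.sum(times[:-1] * w[:-1] * dw)        # Ito sum of s * W dW
riemann = np.sum(w[:-1]**2 + times[:-1]) * dt # int (W^2 + s) ds
rhs = 0.5 * (t * w[-1]**2 - riemann)
print(lhs, rhs)
```

Here x_t = t W_t² and x_0 = 0, so the right-hand side is purely an ordinary integral plus endpoint values, as an integration by parts formula should be.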
Let's pull out your paper; it should be fairly straightforward based on what we've covered so far. Okay, so here's a model that's used in financial markets to represent asset prices. It basically says that the expected instantaneous return is mu and the instantaneous volatility is sigma; that's what this basic model corresponds to. And in fact, this model is what is known as the Black-Scholes model, and not what you may have identified as the Black-Scholes model before. Before, you may have identified "S is lognormal in distribution" as the Black-Scholes model. That's not actually the model; that's a consequence of the model. The true model is a dynamical model that tells you how asset prices actually evolve through time. What we'd like to do is solve the stochastic differential equation and then see if we can conclude that S is in fact lognormally distributed. Okay, so we'll attempt to do that. How do we go about it? Well, we can see that if we divide by S, assuming S is in fact positive, or at least non-zero, otherwise we can't divide by it, then we have this form of the equation, and the hint, the suggestion, is that perhaps we should investigate the logarithm of S and ask what stochastic differential equation the logarithm of S satisfies. Because the left-hand side looks like the d of log S from standard calculus. But we know that's not quite correct. So let's define x to be the log of S. From Ito's lemma, what will we get? What's the underlying h function here? It's log. So from Ito's lemma, what we should have is the partial derivative of h with respect to y, which is 1 over y, where we substitute S, times dS, plus the partial derivative of h with respect to t, which is 0, times dt. So far this is just so you recognize what I'm writing here. Now, some people get confused when they see this notation.
You see the partial derivative with respect to t of h(t, S_t), and what I see some people tempted to write is that this equals h prime of S times the partial of S_t with respect to t. I see this happening all the time. It's incorrect. What that partial derivative means: remember, h is a function which maps R plus cross R to R; it takes the point (t, y) to h(t, y). Partial derivatives of this object are partial derivatives with respect to an arbitrary real argument. And when you use the notation with S_t in the slot, what you mean is: take the partial derivative, and then evaluate at y = S_t. So you do not treat S as a function of time when you take this partial derivative; S is just an arbitrary argument. Think of it as taking the partial derivative without the S there at all, and then plugging in S at the end. That's why this term equals 0. So this is more of a side note and a point to be aware of. Then we have the Ito correction term. What is the Ito correction term? It's one half times two derivatives with respect to y, which gives minus 1 over y squared, where we have to substitute S, times the volatility of S, all squared; and just to be complete, I'll write it out as two derivatives. So this is straightforward from Ito's lemma. Now I substitute in dS: it's mu S dt plus sigma S dW, and you can see some cancellations are going to happen here. After doing this for a while, you'll be able to skip a lot of these intermediate steps, but for the sake of completeness and to walk you through it, we're going through a lot more detail. Collecting a couple of terms, we get dx = (mu minus a half sigma squared) dt plus sigma dW_t. So again, we notice that the right-hand side of this equation does not contain any x dependence whatsoever.
And so I can just integrate both sides, and I'll have an expression for x in terms of W, which is our final goal. We get x_t minus x_0 equals (mu minus a half sigma squared) times little t, plus sigma W_t, since the integral of dW is just W. And recall that x is the logarithm of S, so if you exponentiate everything here, you get S at time t: S_t = S_0 e to the (mu minus a half sigma squared) t plus sigma W_t. So we see that this solves the SDE. This relationship between SDEs of that form and an exponential expression of this kind shows up again and again and again. I don't want you to memorize it per se, but you should try to recognize how the components come together. First, you can see that the sigma dW term in the SDE is basically responsible for the sigma W_t term in the exponent; one is responsible for the other. The mu dt term is what shows up as mu t. And the minus a half sigma squared is coming from the Ito correction term: this is the Ito correction. If W were not a Brownian motion, if it were just differentiable, what you would end up with is only the mu t term plus the sigma W_t term; that's it. But because W is a Brownian motion, we get that extra correction, minus a half sigma squared t. That is our Ito correction, sometimes called a convexity correction. Okay, so we can do a few different things with this expression. One is to ask: what is the distribution of S at some fixed point in time? Because note, this is not merely a relationship in distribution; it's a relationship path-wise. If you give me the path for W, then the path for S is expressed that way. You take the W path, scale it by sigma, add (mu minus a half sigma squared) t, exponentiate, and multiply by S_0, and you end up with the expression for S. So that's the path-wise behavior.
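Because the solution is pathwise, a direct Euler-Maruyama discretization of dS = mu S dt + sigma S dW, driven by the same noise, should land close to the exponential formula. A small sketch, with parameter values that are my own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, S0, T = 0.05, 0.2, 100.0, 1.0
n = 5000
dt = T / n

# One Brownian path.
dW = rng.normal(0.0, np.sqrt(dt), size=n)
W_T = dW.sum()

# Euler-Maruyama on dS = mu*S dt + sigma*S dW, using the same increments.
S_euler = S0
for dw in dW:
    S_euler += mu * S_euler * dt + sigma * S_euler * dw

# Closed-form pathwise solution from the lecture.
S_exact = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * W_T)

rel_err = abs(S_euler - S_exact) / S_exact
```

The relative error shrinks as the time step goes to zero, confirming that the exponential expression really is the pathwise solution of the SDE, not just a distributional statement.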
But we also know the behavior in distribution. First of all, at a fixed capital T we still have this relationship, and we've talked a little about distributional properties before. We can say that this equals, in distribution, S_0 e to the (mu minus a half sigma squared) capital T plus... if I want to write this in terms of a standard normal, what would I put here? Sigma square root of T times Z, where Z is a normal(0, 1) random variable. Now, at this point we've simply been working under one fixed probability measure, and W is a Brownian motion under that fixed measure; I'm not specifying Q or P or anything like that, it's just the measure we're using. So in distribution you have this property; going from the first line to the second is not an equality path-wise, it's simply an equality in distribution. That's one thing we can answer. And we obviously know how to compute the expected value of S; we've done this before, using the distributional property and the moment-generating function, and you'll end up with the expected value of S being S_0 e to the mu T. I'm not going to go through the details of this calculation; it's work you should all be very comfortable doing. There's something else we can do with this equation. What we've done is explicitly construct the solution. But suppose I asked you the following: take S equal to the expression at the top of the screen, and I ask you, what SDE does S satisfy? Basically I'm asking you to go the opposite way. I'm not giving you an SDE and asking you to solve it; I'm saying, here's an expression, tell me what stochastic differential equation that S satisfies. Now, you know you should of course end up with dS = mu S dt + sigma S dW. But what I want to do is use Ito's lemma to actually construct the SDE. So let's do that.
So we're given, this is what we're given now: S_t = S_0 e to the (mu minus a half sigma squared) little t plus sigma W_t. Find the SDE for S_t. This is going the opposite way. So how do we do it? Well, you realize that S is a transformation of W: S_t equals some h(t, W_t). And what is that function h? It's h(t, y) = S_0 e to the (mu minus a half sigma squared) t plus sigma y; S_0 is a constant. It's that function. And we can ask what dS is based on this relationship. All we have to do is use Ito's lemma, and in fact we can use the simplest form of Ito's lemma, because it's a transformation of a Brownian motion itself. So from Ito's lemma, what's our first term? Let's see if you can start to recognize it. First term: standard calculus, the partial derivative with respect to y of h, times dW. Plus, still standard calculus, there's an explicit time dependence here, so the partial with respect to that time argument, times dt. Plus the Ito correction term, which is one half the second derivative with respect to y, times dt. Because the underlying process is Brownian motion, there is no sigma anywhere in the lemma itself; sigma comes in through the function h. So it's a transformation of a Brownian motion, and the Ito's lemma I use is Ito's lemma for Brownian motion. And now I simply substitute all of these in. The partial derivative with respect to y of h: what do you end up with? You end up with sigma times h again, right? y only appears in that last piece of the exponent. So I get sigma times h, but h evaluated at W is really S, so I end up with sigma S dW. Plus, the partial with respect to time: I pull down the (mu minus a half sigma squared) and again get h back, which evaluated at W gives me S back. Plus one half times two derivatives with respect to y: I get a sigma squared times h, which again evaluated at W gives S. Now you see a nice cancellation here.
The one half sigma squared S from the last term cancels the minus one half sigma squared S from the middle term, and you end up with dS = mu S dt + sigma S dW, which is exactly the equation we said we solved. So it's nice to see that you can go both forward and backward: start with the SDE and solve it, or start with the solution and show that it actually satisfies the SDE. Questions? Okay, so let's do a slightly harder SDE. Slightly harder; maybe significantly. I'm going to introduce a type of process which is used for interest rate modeling. So far in our interest rate modeling in discrete time, did we talk about the Vasicek model? I don't recall if we did or not. Who thinks we did? I see a few hands, and the rest of you think we didn't. It doesn't matter either way; I'm going to cover it and continue. I just wanted to know if you'd heard the name. Okay, so this is a model called the Vasicek model, and it addresses one of the issues that arise for interest rate models. When we talked about Ho-Lee and Black-Derman-Toy — come on, I know for a fact we did those; I'm just not sure which lecture they were in. Ah, here we are. I see: I didn't give you a name, that's what happened. So this one that we covered before is called the Ho-Lee model; let me slide it to the top of the screen. What you do is that the interest rate at one time step equals where you were before, plus a drift, plus or minus sigma times the square root of delta t. Okay, so this is the Ho-Lee interest rate model. I'm surprised I didn't write the name down; I apologize for not giving it to you. Okay, so that's the model. Now, there's something not very nice about that model, and I think we mentioned it.
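The forward direction (the cancellation that recovers dS = mu S dt + sigma S dW) can also be checked symbolically. A minimal SymPy sketch, where the symbol names are my own choices, verifying that the dW coefficient reduces to sigma times h and the dt coefficient to mu times h:

```python
import sympy as sp

t, y, mu, sigma, S0 = sp.symbols('t y mu sigma S0', positive=True)

# The proposed solution, viewed as h(t, y) with y standing in for W_t.
h = S0 * sp.exp((mu - sigma**2 / 2) * t + sigma * y)

# Ito's lemma for a transformation of Brownian motion:
#   dS = h_y dW + (h_t + (1/2) h_yy) dt
dW_coeff = sp.diff(h, y)
dt_coeff = sp.diff(h, t) + sp.diff(h, y, 2) / 2

# dW_coeff should equal sigma*h and dt_coeff should equal mu*h,
# i.e. the GBM SDE dS = mu*S dt + sigma*S dW with S = h(t, W_t).
```

The minus a half sigma squared in the exponent is exactly what the one half h_yy correction needs in order for the drift to collapse to mu times h.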
If you build this kind of tree and go out far enough, then eventually the interest rates can become negative, can't they? Interest rates at every point in time just go up or down from wherever they were before; it doesn't matter where you are in the tree. If I'm at a node, I get some sort of drift, maybe up, maybe a little down, and then I just go up or down by the same amount, sigma times the square root of delta t. But real interest rates don't behave like that. Do interest rates ever just blow up and become 200%? No, they don't. They do sometimes get quite low, as we've seen in the recent past, where interest rates were very, very low due to some particularly unusual circumstances in the market. But they don't tend to blow up. What happens is that interest rates tend to stay around a typical level, and the government actually pushes interest rates toward those levels in order to curb or stimulate inflation; you want to keep inflation in a particular zone. So interest rates tend to do what's called mean reversion: they don't just fluctuate off to something very large or very small. The Vasicek model is designed to address that particular issue, mean reversion. In discrete time, the Vasicek model looks like this; in fact, I'll write it in terms of an increment. The interest rate increment is some constant, times another constant minus the current level of the interest rate, times delta t, plus or minus sigma times the square root of delta t. This is the discrete-time version, and then we'll talk about the continuous-time version. So what does this do for you? Well, draw a sample path and ask yourself about the behavior of the interest rate: here is the level theta, this axis is r, and this axis is t.
So if you're in this region above theta, what is the sign of that drift term? Kappa, theta, sigma are all positive numbers. It's negative, isn't it? If I'm in this region here, then the coefficient is negative, because I'm above theta: theta minus a larger number is negative, multiplied by a positive kappa. So I get a negative push. That coefficient is effectively the drift, the push of the process. So in this region you get pushed downwards, and in the region below, r is less than theta, so you get pushed upwards. There's a tendency for the process to be pulled toward the level theta, and this is what's called mean reversion. If you actually draw a sample path of this process, it does something like this, as opposed to a Brownian motion or the Ho-Lee type model, where the interest rate could wander off or blow up. If I run this over a long period of time, it just fluctuates around that level theta, and this, of course, is noise. Oh, sorry, I forgot the little X_n's here, the little Bernoullis; don't forget the Bernoullis that hang around to tell you whether you go up or down. And I don't need the plus or minus if I put the X_n in, right? So this is a mean-reverting process. Mathematically, these kinds of processes have a much more complicated name: Ornstein-Uhlenbeck processes. Ornstein and Uhlenbeck were the ones who actually invented the concept of mean-reverting processes, not in the context of finance; Vasicek took that idea into finance much, much later. So what I want to do now is write down a continuous-time version of this model and analyze it. If you were to write down a continuous-time model driven by Brownian motion, what would you write down? What's the natural thing?
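The mean-reverting pull is easy to see in a simulation of the discrete-time scheme r_{n} = r_{n-1} + kappa (theta - r_{n-1}) delta t + sigma sqrt(delta t) X_n with Bernoulli up/down moves X_n. Here's a short sketch; the parameter values and starting point are my own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)
kappa, theta, sigma = 2.0, 0.05, 0.01
dt, n = 0.01, 20000

r = np.empty(n + 1)
r[0] = 0.10                                # start well above the long-run level theta
X = rng.choice([-1.0, 1.0], size=n)        # the Bernoulli up/down moves X_n

# Discrete Vasicek: increment = kappa*(theta - r)*dt + sigma*sqrt(dt)*X
for k in range(n):
    r[k + 1] = r[k] + kappa * (theta - r[k]) * dt + sigma * np.sqrt(dt) * X[k]

# After the initial transient, the path just fluctuates around theta.
long_run_mean = r[n // 2:].mean()
```

Starting from 10%, the path decays toward theta = 5% and then oscillates around it, which is exactly the mean-reversion picture drawn on the board.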
If you write down a stochastic differential equation for r, it seems natural that the left-hand side is dr, the increment of r. On the right-hand side, notice that kappa times (theta minus r_{n-1}) is already evaluated at the left-hand point of the interval. It doesn't actually have to be the left-hand point for that term, because it becomes a Riemann integral, but it's already in the form we have for the Ito integral, the left-hand point. So this is kappa (theta minus r) dt plus sigma dW, right? We know that the square root of delta t times the X_n's, when we sum up a bunch of them, become something like a Brownian motion; that last term is what gets converted into Brownian motion. So this is our continuous-time model, and what I'd like to do is first solve the stochastic differential equation and then analyze its distributional properties. Okay? Any ideas? One way to think about this is like what we did with the GBM, with this model here; by the way, let me give you the names, since I seem to have failed to give you names before as well. In finance this is the Black-Scholes model; mathematically, it's a geometric Brownian motion, or GBM, okay? For the geometric Brownian motion, what we did was notice that the left-hand side looks like the d of log S when we divide out by S. In other words, we used what we knew from the situation where W is not a Brownian motion, where it's just standard calculus. That's what we'd like to try here as well: see if we can use some technology from standard calculus. In particular, what happens if sigma is zero? Suppose sigma were zero; could someone tell me the solution of this differential equation? Now it's just an ordinary differential equation; there's no stochasticity in here, no Brownian motion.
What's the form of the solution anyway? Isn't r exponential somehow? Right? You should be able to recognize that as an exponential form. Okay, so let's solve it; I'll solve this equation for you explicitly, and there are a few ways to do it. One is via so-called integrating factors. Actually, suppose theta were zero as well. If both theta and sigma are zero, maybe you recognize it now: clearly the solution is r_t = r_0 e to the negative kappa t. Agreed? If I take the derivative of that, I get negative kappa times itself, which is exactly what this equation tells me. Or you could just solve it explicitly: divide by r, take the d of log r, and you're done. So you get this exponential form. One thing you could try, then, is to introduce an exponential factor into your solution. We know this won't be the full solution; we know there is some correction to it. So you could try r_t = e to the negative kappa t times some new process g, and the idea is: let's see if we can figure out what SDE g satisfies. That will turn out to be something simpler. So this is one thing we could try. There's another thing we could try: first, let's try to get rid of the theta term, because we see things are much simpler when theta is zero. How could you possibly get rid of theta? You might want to add something time-dependent to r that removes theta. No, these are not techniques you've learned in your ODE courses, way back in second year; well, they should have been. And then we could try a combination of the two. So you could try: suppose r_t equals some function of t plus another function of t; or in fact, sorry, let me write it a different way.
Let's define... no, actually, I did it the way I wanted: r_t is equal to some deterministic function of t plus a process g_t. Okay, I want to choose the deterministic piece such that the equation for g has no theta term in it. Okay, and I'm seeing confusion in the audience, so maybe let's just focus on one of the techniques: the first one, the integrating factor. So just focus on the case r_t = e to the negative kappa t times g_t. If you take the d of r, what must that be equal to? This is actually a tricky question: g is a process that we don't know, right? We don't know what g is; r is basically a transformation of g. So what you can do is think of r as a function of t and of g. What is that function? It's h(t, y) = e to the negative kappa t times y. And what do you know about g? Nothing, in fact. What you're after is the equation that g satisfies; that's what we're trying to do here. We're starting with an equation for r, we want to simplify it somehow, and we're introducing a transformation that will eventually tell us the equation the transformed process satisfies. So this is one approach. Alternatively, we can simply say: let g_t = e to the kappa t times r_t, and view this as a transformation of r. It's the same relationship as before; the point I've written below is the same as the point above. But now I view it as a function of r, and that's helpful because I know the SDE for r. In the first form, I don't know the SDE for g, so I can't go right ahead and apply Ito's lemma; but if I write it this way, I can see that this really is exactly in the form for Ito's lemma. So let's calculate: what is dg, according to Ito's lemma?
The partial derivative with respect to y of h(t, r_t), times dr, plus the partial derivative with respect to t of h(t, r_t), times dt; I'm just writing out the same Ito's lemma again, nothing new here. Plus one half, two derivatives with respect to y of h(t, r_t), times the squared volatility of r; since the equation is already in the form dr = ..., I don't need to scale anything here, so that's just sigma squared dt. Now we go in and fill in the blanks. What's the partial of h with respect to y? It's simply e to the kappa t. Agree? h is e to the kappa t times y; I've written the expression for it. And dr = kappa (theta minus r) dt plus sigma dW. The partial with respect to time gives me kappa e to the kappa t times r_t, dt. And one half times two derivatives with respect to y is in fact zero, which is nice: there is no Ito correction term in this particular case. So if I just collect terms here, I hope you can see there's a cancellation: the minus kappa e to the kappa t times r dt, coming from the e to the kappa t times dr piece, cancels the plus kappa e to the kappa t times r dt from the time derivative. So we end up with dg = kappa theta e to the kappa t dt plus e to the kappa t sigma dW. This is a nice equation, because the right-hand side depends only on the Brownian motion and not on the process g, which is what you're trying to solve for. So now I can just integrate both sides and get an equation for g. I get g_t minus g_0 equals kappa theta times the integral from 0 to t of e to the kappa s ds; what's that integral? It's (e to the kappa t minus 1) over kappa, but I'll just write it as the integral first. Plus sigma times the integral from 0 to t of e to the kappa s dW_s. Is there anything you can do to that last term, from our discussions before? No. There's nothing you can do.
You simply have to put down a partition and sum the increments of the Brownian motion, scaled by that integrand at each point, and so on. We're going to see that we can characterize the distribution of that object, but we can't actually write down what the object is exactly; we just know its distribution. Okay, so now we have g_t minus g_0 equals kappa theta times (e to the kappa t minus 1) over kappa, plus a pure stochastic integral. And what is the difference of g's? g_t is e to the kappa t times r_t, and g_0 is just r_0. So you can write r_t: put everything else on the other side of the equation and multiply by e to the negative kappa t. You'll get r_t = r_0 e to the negative kappa t, plus theta times (1 minus e to the negative kappa t), plus sigma times the integral from 0 to t of e to the negative kappa (t minus s) dW_s. What I've done is: when I take this term and divide it across, I get the factor e to the negative kappa t, and since it's a constant with respect to s, I can pull it inside the integral; there's no problem. Written this way, it has a nicer interpretation, because the exponent measures the distance between the endpoint t and the time s you're at. But alternatively, you could simply leave that last term in the form e to the negative kappa t times the integral from 0 to t of e to the kappa s dW_s, if you want. Could you integrate by parts? You could, but the integration by parts isn't going to help you very much; you'll see if you try it. You don't really get much out of it. The problem is that it depends on the whole path: you'll end up with a Riemann integral of W times an exponential ds, so you'll still have an integral with a stochastic integrand. It doesn't help. These types of integrals, and their properties, are the last point I want to talk to you about today.
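The closed-form solution pins down the first two moments of r_t: the mean is r_0 e^(-kappa t) + theta (1 - e^(-kappa t)), and (anticipating the isometry result coming up) the variance is sigma² (1 - e^(-2 kappa t)) / (2 kappa). A quick Monte Carlo sketch checking both against an Euler discretization of the SDE; all parameter values here are my own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(3)
kappa, theta, sigma, r0, T = 1.5, 0.04, 0.02, 0.10, 2.0
n_steps, n_paths = 400, 20000
dt = T / n_steps

# Euler-Maruyama on dr = kappa*(theta - r) dt + sigma dW, many paths at once.
r = np.full(n_paths, r0)
for _ in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    r += kappa * (theta - r) * dt + sigma * dW

mc_mean, mc_var = r.mean(), r.var()

# Moments implied by the closed-form solution.
mean_exact = r0 * np.exp(-kappa * T) + theta * (1 - np.exp(-kappa * T))
var_exact = sigma**2 * (1 - np.exp(-2 * kappa * T)) / (2 * kappa)
```

The simulated mean sits on the exponentially decaying curve toward theta that we drew in blue, and the variance matches the isometry-based formula.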
And they're very special. Okay. So that's why I don't want to touch it that way. Like I said, you could in principle write down an integration by parts formula for that last integral, but it will not actually simplify things completely; try it on your own and you'll see. Okay. So this is our result, and this is the form these results usually take. I'm erasing all that other junk just so you can focus on it. There are some bits we can easily identify and some we can't, and we need to build a little bit of theory in order to do that. So if I ask you what the expected value of r is: if, in principle, I could interchange expectations and integrals and somehow put the expectation just on the dW, then that last term would have expected value zero, right? If I could do all of that. And that sort of implies that the first two terms here are the expected value of the process. What does that look like as a function? This is theta; if this is my r_0, above it, the mean exponentially decays towards theta. That's what the curve, maybe I'll draw it in blue, would look like: just an exponential function that decays towards theta. And if r_0 happened to be below theta, it would exponentially increase toward theta. So as t goes to infinity, what's underlined in blue clearly converges to theta. The question is how to deal with the stochastic term: can we say anything about its distributional properties? Okay, and that's where the last half hour of our class is going to take us. It turns out that integrals of that kind, where you have a deterministic integrand in a stochastic integral, are in fact normally distributed; their means are zero, and their variance has a very nice form. In fact, there's an even more general form for the variance when the integrand is a function of s and W_s. So let's work on that, okay?
And then we'll come back and answer questions about the distribution. So, sorry, a little more theory; this is related to something called Ito's isometry. Okay, so here's the statement. It says: if you compute the expected value of the square of the integral from 0 to t of g(s, W_s) dW_s, this equals the expectation of, surprising result, the integral from 0 to t of g(s, W_s) squared ds. Seems odd, doesn't it? You take a stochastic integral, square it, compute the expected value, and the result is the same as computing the expected value of the integral of the integrand squared; you get rid of the stochastic integral and change it to a Riemann integral. That's a really interesting result. And in fact, there's a slightly more general version, and I think we might as well just do the slightly more general version, because everything else follows as a special case of it. Suppose you have two Brownian motions, W and B, that are correlated with one another; we talked about them before and investigated various properties. Suppose you take a stochastic integral of g with respect to one of the Brownian motions, and a different stochastic integral of h with respect to the other Brownian motion, take their product, and take the expectation. Anyone want to guess what the answer is? There's a natural guess. It's the expected value of the integral from 0 to t of g(s, W_s) times h(s, B_s) times rho, ds. Okay, so let me put a little note here that W_t and B_t are correlated Brownian motions, and the correlation is rho. That is the result, and that's Ito's isometry. There's one other related fact: if you compute the expectation of a stochastic integral, it's zero. And this is true in the general case too: the expectation of the integral of h(s, B_s) dB_s is zero as well.
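The basic isometry is easy to check numerically for a concrete integrand. A sketch of my own, taking g(s, W_s) = W_s, so the claim is E[(integral of W dW)²] = E[integral of W² ds], and both should equal t²/2 at t = 1:

```python
import numpy as np

rng = np.random.default_rng(4)
T, n_steps, n_paths = 1.0, 200, 20000
dt = T / n_steps

dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)

# Left-point Ito sum for the stochastic integral of W dW, one value per path.
ito = np.sum(W[:, :-1] * dW, axis=1)

lhs = np.mean(ito**2)                               # E[(int_0^T W dW)^2]
rhs = np.mean(np.sum(W[:, :-1] ** 2 * dt, axis=1))  # E[int_0^T W^2 ds]
```

Both sides come out near 0.5, i.e. T²/2, since E[integral of W_s² ds] = integral of s ds. The squared stochastic integral really does have the same mean as the Riemann integral of the squared integrand.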
Okay, so let's see if we can prove these statements, or quasi-prove them. Let's start with this one: the expected value is zero. So what would your first step in an attempt at a proof be? Not quite; that's not the natural thing to do first. First, put down a partition. Right, put a partition down first; then interchange the expectation and the summation. So you take a fixed partition pi, because then I can certainly interchange and compute this expectation of the sum over k. I've just written the stochastic integral in terms of that partition, and I'm going to use shorthand notation: I'll simply write G sub t_{k-1}, meaning the integrand evaluated at t_{k-1}; there's no need to write out all the explicit arguments. And here we have delta W_k. So if I can show that this is zero for every partition pi, then I'm done: I don't even have to worry about interchanging limits and expectations, because if every single element in the sequence is zero, the limit has to be zero. There's no chance otherwise. Okay, so this equals: it's a finite sum, so I can interchange the expectation and the sum; I'm allowed. So we end up with that. Now, is there any special property of Brownian motion we can use to simplify that expression? What are the properties of Brownian motion? Remind me; there are a few key ones. It starts from zero; that's the first easy one. Second property: stationary and independent increments, right? That's the key property here. And the last ones: W_t is normally distributed with mean zero and variance t, and it has continuous paths. Okay, independence: the increments are independent. Here's t_{k-1}, and this is t_k. G is being evaluated here, at t_{k-1}; delta W_k depends on the change over this interval. This increment has to be independent of the Brownian motion's path up to t_{k-1}, right? By the definition of Brownian motion.
Therefore, this expectation is just a product of expectations. This is another reason why the Ito integral is defined in such a way that you use the left-hand point for the integrand: if it were the middle point, I could not make this argument. If G were evaluated in the middle, I couldn't separate delta W_k and G as two independent random variables. And the second expectation is zero, right? We demonstrated, and in fact it's part of the definition, that an increment of the Brownian motion is independent of any other increment, as long as the intervals touch at most at one point. If there's an overlap of the intervals, then there's a problem; if they just touch, you're fine, they're independent. The increment of W, like W_1 minus W_0, is independent of W_2 minus W_1; it doesn't matter where W_1 is. So this is in fact zero. And that's it, really: we've basically shown the result. Take the limit as the mesh of the partition, modulus pi, goes down to zero. Okay, now the slightly harder one, this bugger. So what do we do? Same idea: take a partition, and use the same partition for both integrals. So if we do that, we're computing the expectation of the sum over k of G_{t_{k-1}} delta W_k, times the sum over l of H_{t_{l-1}} delta B_l. That's what we're computing. And that equals: I can interchange the expectation and the double sum. Now, here's the thing: I want to break up that double sum into three distinct sections. It's not obvious that that's what you should do, but if you look at this for a bit and use this idea of independence of increments, you realize there are really three relevant areas. On this axis I have the integers k, and on this axis the integers l; these are all just discrete points, and the double sum runs over this lattice. So first of all, isolate the points along the diagonal, where l equals k. Why are those of interest?
Because look, we have increments of Brownian motions. We know the increments of Brownian motions behave in certain ways when they're over the same interval, right? When they're over the same interval, these are increments of correlated Brownian motions. So they have to be treated separately from when they're on non-overlapping intervals. And then there's everything else in this section, and everything else in this section. In here, we've got k bigger than l. In here, we've got l bigger than k, right? So let's write the sum as those three sums. So let's just take the sum, and we've got the sum over k of the expected value of g tk minus one, h tk minus one, delta wk, delta bk. That's the diagonal element, okay? Then we have the sum over k bigger than l of the expected value of g tk minus one, h tl minus one, delta wk, delta bl. And then plus the same term with l interchanged with k, right? There's no need to write the second one down because they're completely symmetric. So here's my argument. Here's my statement. This is zero and this is zero. Why? The fundamental property of the independence of increments of the Brownian motion again. This is tl minus one. This is tl. In this region, k is bigger than l, right? So here's tk minus one, tk. The closest the intervals can get is when k equals l plus one, in which case these two points touch, right? They will touch at that critical point, but they will never overlap, by construction, since I've chosen k to be bigger than l. In this integrand, we're evaluating g at this point. We're evaluating h at that point. We have the increment of b here and we have the increment of w there. This delta wk is independent of everything else by the independence of increments of the Brownian motion, right? Because there's no overlapping period. I cannot say the same about g and delta bl and h. Those are all somehow dependent on one another.
Because g depends on the Brownian motion up to time tk minus one, which would have some correlation with the increments of b, which would have some correlation with h. But delta wk has to be independent of everything else. That's the key argument. And if that's the case, then clearly I can write this expectation as a product. See, I'm not being very formal here, right? I'm just giving you sort of the hand-waving argument, but it's correct. This is the same thing as the expected value of g tk minus one, h tl minus one, delta bl, times the expected value of delta wk. And this is equal to zero. The other way that you could do this is via iterated expectation. You could condition on everything that you know up to time tk minus one, and you get the same result, because the increment is independent of the path. That's the key thing. Okay, so then we're left with just this one sum. The second term there, by the way, when l is interchanged with k, works the same way: then delta bl would be independent of everything else, because it's the one that's off to the right. So we're left just with this one sum, and now let's play with that sum. This is where we can use independence again. The expected value of g tk minus one, h tk minus one, delta wk, delta bk. This equals, again by independence, the expected value of g tk minus one h tk minus one, times the expected value of delta wk delta bk. And what is the expected value of the product of increments of correlated Brownian motions? Rho delta tk. So therefore, we end up with, let's call this thing i, the object that we were trying to calculate from the beginning. What we've shown is that i, for a fixed partition, is the sum over k of the expectation of g tk minus one h tk minus one, times rho delta tk, and that is the same thing as the expected value of the sum, because it's a finite sum. And now, so this is for a fixed pi, and now we take the limit. And here, there's a bit of a subtlety.
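That cross-product formula can also be checked numerically. This sketch is my own illustration (names like `I_g` and `I_h` are just for this example): it uses the deterministic integrands g(s) = h(s) = s against two Brownian motions with correlation rho = 0.6, and compares the Monte Carlo average of the product of the two Ito sums to rho times t cubed over 3, which is what the integral of g times h times rho ds gives in this special case.

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps, T, rho = 100_000, 100, 1.0, 0.6
dt = T / n_steps
t_left = np.arange(n_steps) * dt  # left endpoints t_{k-1}

# Two Brownian motions with correlation rho, built from independent normals
dZ1 = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
dZ2 = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
dW = dZ1
dB = rho * dZ1 + np.sqrt(1.0 - rho**2) * dZ2

# Ito sums for g(s) = h(s) = s (deterministic integrands, left endpoints)
I_g = np.sum(t_left * dW, axis=1)
I_h = np.sum(t_left * dB, axis=1)

# Sample average of the product vs. rho * integral of s^2 from 0 to T
print(np.mean(I_g * I_h), rho * T**3 / 3)
```

The two printed numbers agree up to discretization and sampling error; taking rho = 1 and g = h recovers the isometry used below for the variance calculation.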
You have to make sure that the integrands are well-behaved enough so that you can, in fact, interchange the limit and the expectation. So here's where there's a subtle argument that I'm going to leave out. And hopefully, you can recognize that limit under the expectation. That is just a Riemann integral, right? Zero to t of g of s, ws, bs, times h of s, ws, bs, times rho, ds. Done. That's the proof of the formula, more or less the proof. So what I'd like to do now is use it in the context of this Vasicek model. So let's scroll back. We had this result, and we have that expression underlined. Now, we can say that this integral has to be normal with some mean and some variance. Why do I know it has to be normal? Because the definition of this integral is: put down a partition and take a sum of the increments of Brownian motion multiplied by this exponential evaluated at the left-hand point. And what is that? That's a sum of scaled normals. So it has to be normally distributed. So if it converges, and we know it does converge because the stochastic integral wouldn't be defined if it didn't converge, it must converge to a normal. And the only question is, what is the mean and what is the variance of that normal? So we can now answer that question. What is the mean of the normal? According to the general result we just showed, the expectation of a stochastic integral is always zero. And the variance, what does that equal? Well, since the mean is zero, it's the same thing as the expected value of this thing squared, which equals, we use the isometry. You can always think of this square as that integral written twice, in which case the g and the h functions in our previous calculation are just equal, and the w and the b are equal, so rho is one. And since g and h are the same, we just get the integrand squared. And since there is no stochasticity in the integrand, the expectation is irrelevant. It's simply just this integral.
And that is sigma squared times 1 minus e to the negative 2 kappa t, all over 2 kappa. So let's put that together with r itself and talk about the entire distribution of r. So now we know, what we've demonstrated is that r is normal. Its mean is that, from the previous calculation. And this is its variance. Okay, so let's sketch that out. Suppose we were to ask, what does the one standard deviation band for the process r look like? And let's suppose we started there. Well, first of all, we know what the mean looks like. It's exponentially decaying, as we already discussed. This here is telling us about the variance. So the square root of it will tell me my volatility. And do you notice, as t gets larger and larger, what does this variance converge to? What does the variance of Brownian motion converge to? It doesn't converge. Brownian motion has variance t. It grows. It blows up. The volatility of the Brownian motion just keeps growing and growing, in terms of distribution. Here, what we have is that the volatility becomes confined. It stays finite. This is what I was talking about before, that interest rates don't just blow up and become arbitrarily large. They stay in a band. This is a reflection of the fact that interest rates tend to stay in a band. And the band gets wider over time. It starts off at little t equals zero with volatility zero. You can see, if you do a Taylor expansion here, for small t, or let's say for t much, much less than one, I can Taylor expand that exponential and get one minus two kappa t. The ones cancel, the two kappas cancel, and this is exactly sigma squared t. So for small time, it looks like Brownian motion. But for large time, it converges. So the envelope is going to do something like this. It gets larger and then eventually flattens out. And here it goes up and eventually flattens out. It's always hard to draw this. There we go. It looks something like that. That's what the band looks like. So paths tend to stay in that window.
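Putting numbers to this, here is a minimal check of my own (using the lecture's sigma = 1% and kappa = 1, with the function name `vasicek_var` chosen just for this sketch) that the variance sigma squared times (1 minus e to the minus 2 kappa t) over 2 kappa behaves like sigma squared t for small t and flattens out at sigma squared over 2 kappa for large t:

```python
import numpy as np

kappa, sigma = 1.0, 0.01

def vasicek_var(t):
    # Variance of sigma * int_0^t e^{-kappa(t-s)} dW_s, from the Ito isometry
    return sigma**2 * (1.0 - np.exp(-2.0 * kappa * t)) / (2.0 * kappa)

t_small, t_large = 1e-4, 50.0
print(vasicek_var(t_small), sigma**2 * t_small)        # ~ sigma^2 t (Brownian-like)
print(vasicek_var(t_large), sigma**2 / (2.0 * kappa))  # -> sigma^2 / (2 kappa)
```

The first pair of numbers matches because the Taylor expansion of the exponential cancels as described above; the second pair matches because the exponential has decayed away, which is the flat part of the band.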
And this is what I meant by: if you wait long enough, the process will converge to theta and it will fluctuate within this band. And this is why people like to use it for interest rate modeling. So let's do a quick little computer experiment to finish off the day and show you what sample paths actually look like. So let's go out for 20 years, say. Let's take 1,000 steps. Let's put in a volatility of, say, 1%. Let's start interest rates out at 2%. Let's say the long run level that interest rates converge to is, say, 5%. And kappa is 1. We'll talk about the role of kappa via this simulation. And simulations, let's do, say, 10 paths for now. So we'll start off, OK. So our new interest rate, starting from time step 2, is equal to our old value plus an update. Basically what I'm going to do is just discretize, where is the stochastic differential equation? There it is. OK. This stochastic differential equation, I'm simply going to discretize it. So dr is the same thing as r at step n minus r at step n minus 1. In other words, dr equals kappa theta minus r dt plus sigma dw becomes: r tn minus r tn minus 1 equals kappa theta minus r tn minus 1 delta t plus sigma delta w, and delta w in distribution is the same as the square root of delta t times a standard normal. OK. So we'll generate a new value by taking the old value, putting it on the other side of the equation, and simulating forward. That's all. In fact, we can rearrange this as: r tn equals kappa theta delta t, plus 1 minus kappa delta t times r tn minus 1, plus sigma square root delta t times z, where z is a standard normal. So we'll simulate forward like this. So it's kappa times theta times delta t. And we'll do this for all simulations at once. And here we have 1 minus kappa times dt times the old value, plus sigma times the square root of dt times a bunch of random numbers. I think that should do it for us.
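The discretization dictated above can be written out as a short script. This is my own reconstruction of the in-class experiment, not the instructor's actual code; the parameter values (20 years, 1,000 steps, sigma = 1%, r0 = 2%, theta = 5%, kappa = 1, 10 paths) are the ones stated in the lecture.

```python
import numpy as np

rng = np.random.default_rng(42)

# Parameters from the lecture's experiment
T, n_steps, n_paths = 20.0, 1000, 10
sigma, r0, theta, kappa = 0.01, 0.02, 0.05, 1.0
dt = T / n_steps

r = np.empty((n_paths, n_steps + 1))
r[:, 0] = r0
Z = rng.standard_normal((n_paths, n_steps))

# Euler step: r_n = kappa*theta*dt + (1 - kappa*dt)*r_{n-1} + sigma*sqrt(dt)*Z
for n in range(1, n_steps + 1):
    r[:, n] = (kappa * theta * dt
               + (1.0 - kappa * dt) * r[:, n - 1]
               + sigma * np.sqrt(dt) * Z[:, n - 1])

print(r[:, -1].mean())  # paths end up fluctuating around theta = 0.05
```

To reproduce the plots discussed next, one would plot each row of `r` against the time grid `np.arange(n_steps + 1) * dt`, e.g. with matplotlib.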
And then we just need to plot r against the time grid from 0 to t. OK. Good. So this is a collection of 10 sample paths. And why did it not work? I should say n plus 1. There we go. OK. Yeah. So this is a collection of 10 sample paths. And you can see this sort of band that I was trying to draw before. Right. This is starting lower than the long run level. If I started above the long run level, say 7%, and I ran it. Oh, sorry, I didn't start all of the paths at r 0. There we go. So this is starting higher than the long run level. But you can see they all fluctuate around it and come back. And if I increased kappa, what do you think would happen? So let's look back at the stochastic differential equation. If I increase kappa, what happens? It's definitely going to give us a smaller variance. We calculated that the invariant distribution, meaning the distribution as t goes out to infinity, has variance sigma squared over 2 kappa. So larger kappa is going to make this band tighter. You could also think about it intuitively from the drift in the equation. You can see that even if I'm not that far away from theta, if kappa is large, I still get pushed to theta very strongly. The larger kappa is, the more strongly you get pushed to theta. So that means your variability is tighter. So let's run it. And there you can see you almost immediately come down to a level of 5% and just fluctuate very, very close to it. And let's see, what other features can we look at here? Those are pretty much the main ones. So you can see that large kappa pushes you toward theta and holds you there; small kappa lets you fluctuate more, have more variability. OK, so we'll call that a day.
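To see the effect of kappa quantitatively rather than just by eye, one can compare the sample standard deviation of the simulated rates at the final time against the stationary value sigma over the square root of 2 kappa. This sketch is my own (the helper `simulate_vasicek` is hypothetical, and I start the paths at theta so they are already in the stationary band); the agreement is only up to Euler discretization and sampling error.

```python
import numpy as np

rng = np.random.default_rng(7)

def simulate_vasicek(kappa, theta=0.05, sigma=0.01, r0=0.05,
                     T=20.0, n_steps=1000, n_paths=5000):
    # Euler scheme, vectorized over paths; returns the rates at time T
    dt = T / n_steps
    r = np.full(n_paths, r0)
    for _ in range(n_steps):
        r = r + kappa * (theta - r) * dt \
              + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
    return r

for kappa in (0.5, 5.0):
    r_T = simulate_vasicek(kappa)
    # sample standard deviation vs. stationary value sigma / sqrt(2 kappa)
    print(kappa, r_T.std(), 0.01 / np.sqrt(2.0 * kappa))
```

Raising kappa by a factor of 10 shrinks the band by a factor of sqrt(10), which is exactly the "larger kappa makes the band tighter" observation from the simulation above.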