So, in the previous lecture we were looking at types of random variables. Random variables are classified based on the nature of the probability law they induce on the real line. We stated a non-trivial theorem from measure theory which says that any probability measure on R can be decomposed into three components: a discrete component, a continuous component and a singular component, although we did not really define what these are. We only looked at discrete measures, that is, discrete random variables on R. We said that a random variable is discrete if it takes values in a countable set with probability 1, and that for a discrete random variable you can completely specify the probability law by specifying the probability mass function, which gives you the probability that X equals e_i for each point e_i of the countable set E. So discrete random variables are the simplest kind of random variables. Then we have continuous random variables, which are a little more subtle, and then singular random variables, which are completely bizarre. So, continuous random variables. What do you understand by a continuous random variable? Can someone tell me, in your current understanding, what a continuous random variable is? A random variable that takes values in a continuous range? That is not quite a continuous random variable; there is actually a very specific definition. Any other response before I tell you what it is? See, a random variable is said to be a continuous random variable if the probability law P_X assigns probability 0 to every Borel set of Lebesgue measure 0. So, if you take all Borel sets of Lebesgue measure 0 on the real line, and P_X assigned to every such zero-Lebesgue-measure set is 0, then that random variable is said to be a continuous random variable. That is the correct definition. So, I will state a couple of definitions.
So, let mu and nu be measures defined on (Omega, F). You have some measurable space (Omega, F), and let us say you have 2 measures defined on this measurable space. Then we say nu is absolutely continuous with respect to mu if for every N in F such that mu(N) = 0 we have nu(N) = 0. So, this is the definition of absolute continuity of one measure with respect to another. We are not at all talking about probabilities here; I will come to that in a minute. You have a measurable space and you have 2 measures mu and nu on it: (Omega, F, mu) is a measure space, and (Omega, F, nu) is some other measure space on the same sigma algebra, and we say that nu is absolutely continuous with respect to mu if every mu-null set, every set of mu-measure 0, also has nu-measure 0. The other way round need not hold: we are saying that nu is absolutely continuous with respect to mu, but not necessarily the other way round. It may not be that every nu-measure-0 set has mu-measure 0; that is not necessary. This is the notion of absolute continuity of measures. So, now I am going to define a continuous random variable. For continuous random variables the action is on (R, B(R)). Just to give you a recap: you have your (Omega, F, P), and then your random variable X, a map such that the pre-image of every Borel set is F-measurable. That gives us our familiar measurable space (R, B(R)). Now, on this measurable space there are 2 measures we can talk about. One is our familiar Lebesgue measure lambda on (R, B(R)), which is just length. The other is the measure induced by the random variable X, which is the probability law P_X. So, on this measurable space I can speak about 2 measures: Lebesgue measure and P_X.
P_X is a probability measure; Lebesgue measure is not. So, I am going to say that the random variable X is continuous if, in the definition above, I replace nu with P_X and mu with lambda. That is the definition: a random variable X is said to be continuous if P_X is absolutely continuous with respect to the Lebesgue measure lambda. In other words, X is said to be a continuous random variable if for every Borel set N of Lebesgue measure 0 we have P_X(N) = 0. Is the definition clear? A random variable is said to be continuous if any Borel set of 0 Lebesgue measure, of 0 length, has P_X-measure 0: the probability measure sitting on any 0-length set must in fact be 0, for every 0-length set. Then the random variable is said to be continuous. Is that clear? Any questions? Now, it is a popular misunderstanding that a random variable is continuous if all singletons have probability 0; that is not correct. In particular, you can see from here that singletons have 0 Lebesgue measure: if N is a singleton set, containing just 1 point on the real line, it has 0 Lebesgue measure and therefore must have 0 probability measure. So it is true that for a continuous random variable every single point on the real line has probability 0. But that does not define a continuous random variable: you need probability measure 0 for every Lebesgue-measure-0 set. In fact, even for singular random variables it is true that singletons have 0 probability. Is this clear? Now, I am going to state another important result from measure theory, which again I will state without proof; it is good for us to know what it states rather than how it is proven. It is a very important theorem called the Radon-Nikodym theorem. I will state a special case; the Radon-Nikodym theorem actually holds for arbitrary measure spaces with sigma-finite measures mu and nu.
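The two definitions just stated can be summarized compactly; the notation below is a restatement of what was said, not anything new:

```latex
% Absolute continuity of measures on a measurable space (\Omega, \mathcal{F}):
\nu \ll \mu \quad \iff \quad \forall N \in \mathcal{F}: \ \mu(N) = 0 \implies \nu(N) = 0.

% Definition of a continuous random variable (take \nu = P_X, \mu = \lambda on (\mathbb{R}, \mathcal{B}(\mathbb{R}))):
X \text{ is continuous} \quad \iff \quad P_X \ll \lambda
\quad \iff \quad \forall N \in \mathcal{B}(\mathbb{R}): \ \lambda(N) = 0 \implies P_X(N) = 0.
```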
I am going to state it only for probability measures. Let X be a continuous random variable, i.e., let P_X be absolutely continuous with respect to Lebesgue measure. Then there exists a non-negative measurable function f_X from R to [0, infinity) such that for any Borel set B we have P_X(B) equal to the integral over B of f_X d-lambda. So, actually the Radon-Nikodym theorem says that if you have 2 sigma-finite measures, with nu absolutely continuous with respect to mu, then you have a similar result: nu of any set is equal to the integral of some non-negative measurable function with respect to the other measure mu. I have stated this for the particular case of probability measures, and you just need to understand what it roughly says at this point. As a matter of fact, you will not quite completely understand what this says, because the main problem is that you do not yet understand what the integral with respect to lambda means. We will get to this in a few weeks, but let me try to explain in plain English what it really means. You have a continuous random variable, so P_X is absolutely continuous with respect to Lebesgue measure; everybody is with me. For every Lebesgue-measure-0 set, P_X of that set is 0. Now, the Radon-Nikodym theorem says: take any Borel set B you want, not necessarily a 0-measure Borel set, and say you want to compute P_X(B), the probability law on B. For any such Borel set, you can write P_X(B) as the integral over B of a non-negative measurable function with respect to the Lebesgue measure. This is what is known as a Lebesgue integral. We have not really encountered this so far. For now, if the d-lambda bothers you, you can just think of it as f_X dx: you are just integrating over the real line.
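The special case of the theorem stated above can be written out as:

```latex
% Radon-Nikodym theorem (probability-measure special case):
% If P_X \ll \lambda, there exists a non-negative measurable
% f_X : \mathbb{R} \to [0, \infty) such that
P_X(B) = \int_B f_X \, d\lambda \qquad \text{for every Borel set } B \in \mathcal{B}(\mathbb{R}).
```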
You are integrating f_X dx over that Borel set, where this f_X is some non-negative measurable function. Now, you do not know what integrating over a Borel set means; you only know integrating over intervals from a to b. So, if you want to simplify further, you can think of this Borel set as being an interval for now. Say it is an interval (a, b); then the Radon-Nikodym theorem says that P_X of (a, b), the probability of the random variable lying in (a, b), is equal to the integral from a to b of f_X(x) dx. But it is true not just for intervals; it is true also for Borel sets, which can be quite bizarre, as you know. Let us leave that alone for the moment; you will understand precisely what this integral means in a few weeks, and at that point I will point out what is going on. For now I will only say the following: this is true for every Borel set, so it should be true for the generating class. What is the generating class? Semi-infinite intervals, or all open intervals; it does not matter. Let us take the semi-infinite intervals in particular: P_X of (minus infinity, x], which is nothing but the CDF at x, is equal to the integral from minus infinity to x of f_X(t) dt. You are integrating over the set (minus infinity, x], and for an integral over an interval the Lebesgue integral coincides with the Riemann integral, which you are familiar with. So, for our practical purposes at this point, all you need to understand is that the Radon-Nikodym theorem tells you that your CDF can be written as the integral of a non-negative measurable function: there exists some non-negative measurable function whose integral is your CDF, and this is true for all x. With me? Any questions?
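As a small numerical sanity check of the identity F_X(x) = ∫ f_X(t) dt over (−∞, x], here is a sketch with a density of my own choosing, f(t) = 2t on [0, 1] (so F(x) = x² there); none of this is from the lecture itself:

```python
def pdf(t):
    """An example density (my own choice): f(t) = 2t on [0, 1], zero elsewhere."""
    return 2.0 * t if 0.0 <= t <= 1.0 else 0.0

def cdf(x):
    """Closed-form CDF of this density: F(x) = x^2 on [0, 1]."""
    if x < 0.0:
        return 0.0
    if x > 1.0:
        return 1.0
    return x * x

def integral_of_pdf(x, n=100_000):
    """Midpoint Riemann sum approximating the integral of pdf from 0 to x."""
    h = x / n
    return sum(pdf((i + 0.5) * h) for i in range(n)) * h

x = 0.7
print(abs(cdf(x) - integral_of_pdf(x)) < 1e-6)  # True: integrating the density recovers the CDF
```

The same check works for any interval (a, b]: cdf(b) - cdf(a) equals the integral of the density over that interval.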
So, this f_X is a non-negative measurable function: it only takes values greater than or equal to 0, and it can take any non-negative value. If you integrate f_X you get the CDF, and in general if you integrate f_X over some Borel set you get the probability of that Borel set. The only problem is that you do not know precisely what that means at this point, which I will tell you later; but this much you will understand: over an interval it is just a Riemann integral, which everybody understands. I have a non-negative function and I am integrating over an interval, or some semi-infinite interval. Now, the CDF, capital F_X, is a function that goes from 0 to 1; it is a probability after all, so it has to be between 0 and 1. But this little f_X can be any non-negative measurable function. All the theorem is saying is that the CDF can be written as the integral of a non-negative measurable function. And this little f_X has a name: it is called the probability density function, the PDF. Are there any questions at this point? So, I have defined a continuous random variable as a random variable which assigns 0 probability to all Lebesgue-measure-0 sets, and then I invoked this theorem from measure theory which says that my CDF, my probability law, can be written as the integral of a non-negative function, and that non-negative function is called the probability density function of the random variable X. Now, in slightly more elementary treatments you often find that continuous random variables are defined by this property: there are books that say X is a continuous random variable if its CDF can be written as the integral from minus infinity to x of f_X(t) dt for some non-negative function f_X. Actually, that is also a correct definition.
They are both equivalent definitions; in fact, Grimmett and Stirzaker use this as the definition of a continuous random variable. What I have tried to do here is not to go down into the nuts and bolts of the Radon-Nikodym theorem, which is not important for us, but to place in context where this density function is coming from. All of you know the probability density function; you have encountered it in some form or the other. I am just bringing it out in a natural way from a measure-theoretic theorem. Any questions? Now, this probability density function is not uniquely defined. Given the CDF of a continuous random variable, the density is only unique up to a set of Lebesgue measure 0. For example, if you take a PDF you can change its value at a few points, say a finite number or a countable number of points, and the value of the integral will not change; if you redefine a function at one point, the integral does not change, as you know. In fact, even if you change the value of the density function on any 0-measure set, the integral will not change. So there is not really a unique PDF; it is unique up to sets of Lebesgue measure 0. We will refer to "the" PDF, but there is really a whole equivalence class of PDFs: you can change the functional values at a finite number of points, a countable number of points, or any measure-0 set of points, and the integral will still be the same. Are there any questions at this point? Now, the fact that the CDF can be written in terms of an integral like this imposes certain properties on the CDF itself. You can show that, because of this representation, the CDF is differentiable almost everywhere: it has a derivative except possibly on a set of Lebesgue measure 0.
And you can show that F_X is not only a continuous function; it actually satisfies a stronger form of continuity known as absolute continuity. So, for a continuous random variable the CDF is not just continuous but absolutely continuous, and furthermore it is differentiable almost everywhere, meaning that its derivative exists everywhere except possibly on a set of Lebesgue measure 0. For a continuous random variable, capital F_X is a continuous function; it cannot have any jumps. In general a CDF can have jumps; in fact, for a discrete random variable we saw that it only has jumps: it is either flat or it jumps. For a continuous random variable, F_X is not only continuous, it is actually also absolutely continuous, and in fact differentiable except possibly on a set of Lebesgue measure 0. These properties can be shown. By the way, I think I said this last class: the CDF is continuous at a point if and only if the probability of that singleton is 0. So if you are asserting that capital F_X is continuous everywhere, it just means that all singletons have 0 probability, which is true for continuous random variables, but it is not the defining property of continuous random variables, because it also holds for singular random variables, as we will see. So, not only is F_X continuous, it is also absolutely continuous and differentiable almost everywhere. I will give some examples now. Again, in the examples I will only specify P_X; I will not talk about (Omega, F, P). As I said, once your random variable is realized, you can afford to just talk about the measure induced on (R, B(R)). So I will only specify P_X. The first example is that of a uniform random variable. So, the measure lives on (R, B(R)); let me see.
So, this is a measure on (R, B(R)); let me just specify the CDF, that is a little easier. The CDF of the uniform random variable is F_X(x) = 0 for x less than or equal to 0, F_X(x) = x for 0 < x <= 1, and F_X(x) = 1 for x greater than 1. So this is your CDF, and it corresponds to a uniform random variable on the [0, 1] interval. Essentially, P_X (I did not write it properly earlier) is the Lebesgue measure restricted to [0, 1]; after all, P_X itself is Lebesgue measure on the [0, 1] interval. This is called a uniform random variable on [0, 1]: a uniform random variable is a random variable which induces the uniform measure on [0, 1], which we have already studied; that is why it is called a uniform random variable. Its CDF is what I have written out there. So if you plot F_X(x) against x, it stays at 0, then rises linearly up to 1, and stays at 1; that is your CDF. Now, this function I have drawn is differentiable everywhere except at 2 points: there is no derivative at 0 and no derivative at 1, but everywhere else it is differentiable. And in fact you can see that this CDF can be written as the integral of its derivative. So if you want to plot the little f, the probability density function: it is 0 to the left of 0, certainly 0 to the right of 1, and in between it is 1. At the two endpoints the derivative is not defined, but you can define the density value there however you want, because you are going to integrate it.
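The uniform CDF just written down can be coded directly; the density below is one representative of the equivalence class, and setting its value to 1 at the two endpoints is my own arbitrary choice:

```python
def uniform_cdf(x):
    """CDF of the uniform random variable on [0, 1]: 0 below, x inside, 1 above."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return x

def uniform_pdf(t):
    """One version of the density: 1 on [0, 1], 0 elsewhere.
    The value at the endpoints 0 and 1 is arbitrary; changing the density
    on any Lebesgue-null set leaves every probability unchanged."""
    return 1.0 if 0.0 <= t <= 1.0 else 0.0

# P(a < X <= b) = F(b) - F(a) = length of (a, b] intersected with [0, 1]
print(uniform_cdf(0.75) - uniform_cdf(0.25))  # 0.5
```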
So, you can put the functional value at 1 there if you like, or at 0 if you like; no problem. After all, once you integrate the probability density function, you are going to get the right probabilities anyway; it does not really matter. You can afford to change the functional value at a few points and the integral will not change. In between, the functional value is 1. Now, I want to emphasize that the probability density function has no interpretation whatsoever as a probability. It has the interpretation of a probability only when integrated over a Borel set: integrate f_X over a Borel set and you get the probability of that Borel set; integrate it over an interval and you get the probability of that interval. But f_X itself is not a probability. In particular, for example, I could talk about the uniform measure on [0, 1/2]. In that case your CDF will be 2x on the interval [0, 1/2], and your density will be 2 on the interval [0, 1/2]. So clearly the value of little f_X itself has no meaning as a probability. It has the interpretation of a probability only when you integrate it. So do not pick off values from the PDF and say that is the probability; you have to integrate it to get probabilities. Any questions? Yes: I am saying capital F_X is differentiable almost everywhere for a continuous random variable; the derivative of F_X need not exist everywhere. It exists everywhere except perhaps on a set of measure 0; here it does not exist at 2 points. The second example I want to give is that of an exponential random variable. This is a very commonly encountered random variable. By the way, you can also define a uniform random variable on any [a, b]; there is nothing sacred about [0, 1] or [0, 2]. For any a to b you can define such a continuous random variable.
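The point that a density value is not a probability shows up already in the uniform measure on [0, 1/2] mentioned above, whose density is 2 on that interval; a minimal sketch:

```python
def pdf_half(t):
    """Density of the uniform random variable on [0, 1/2].
    Its value 2 exceeds 1, so it cannot itself be a probability."""
    return 2.0 if 0.0 <= t <= 0.5 else 0.0

def cdf_half(x):
    """Corresponding CDF: F(x) = 2x on [0, 1/2], clamped to [0, 1] outside."""
    return max(0.0, min(1.0, 2.0 * x))

print(pdf_half(0.3))                  # 2.0 -- a density value, not a probability
print(cdf_half(0.3) - cdf_half(0.1))  # about 0.4 -- probabilities come only from integrating
```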
And it will have this kind of PDF, except the height will be 1/(b - a). So, the exponential random variable has the following CDF: F_X(x) = 1 - e^(-lambda x) for x greater than or equal to 0, where lambda > 0 is some given parameter, and F_X(x) = 0 for x less than 0. So if you plot F_X(x) against x, it is 0 everywhere to the left of 0 and then rises towards 1 like 1 - e^(-lambda x). You can see that at x = 0 the derivative does not exist, but everywhere else it does. And the PDF, little f_X(x), will be lambda e^(-lambda x) for x greater than or equal to 0, and 0 otherwise. So it starts at lambda and then decays. If your lambda is very big, you start at a very high value and decay very quickly; if lambda is very small, you start small and decay slowly. That is the exponential random variable. Now, this exponential random variable is of great importance because of a very special property it possesses, known as memorylessness. Intuitively, memorylessness means the following. The exponential random variable is a non-negative random variable, and it actually occurs a lot in practice as well. If you take radioactive decay of some material, it emits particles once in a while; you can measure when these emissions occur, and it is well known that the time interval between two consecutive emissions is distributed like an exponential random variable. So, the memoryless property is the following property.
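The exponential CDF and density just described can be sketched as follows; the rate value 1.5 is an arbitrary choice of mine for illustration:

```python
import math

lam = 1.5  # the rate parameter lambda > 0 (arbitrary choice)

def exp_cdf(x):
    """CDF: F(x) = 1 - exp(-lam * x) for x >= 0, and 0 for x < 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

def exp_pdf(t):
    """Density: f(t) = lam * exp(-lam * t) for t >= 0, and 0 for t < 0."""
    return lam * math.exp(-lam * t) if t >= 0 else 0.0

print(exp_pdf(0.0))  # 1.5 -- the density starts at lam and then decays
print(exp_cdf(0.0))  # 0.0 -- the CDF starts at 0 and rises towards 1
```

A larger lam makes the density start higher and decay faster, exactly as described in the lecture.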
So, suppose you have these radioactive emissions, and suppose you have waited for some amount of time, say t seconds, and you have not seen an emission. Then, from that time onwards, it is not more likely or less likely that an emission will occur soon: the remaining waiting time from that point is distributed exactly like the original inter-emission time. To say this more precisely, I will introduce the memoryless property. We say that a non-negative random variable X has the memoryless property (this is only for non-negative random variables) if it satisfies, for all s, t greater than or equal to 0, P(X > s + t | X > t) = P(X > s). I hope I have written it correctly: X greater than s plus t, given X greater than t; that is correct. So, I am just defining a property called the memoryless property. Here you are obviously conditioning on an event: the event that omega is such that X(omega) is greater than t. You are looking at the probability that the random variable exceeds s + t, given that it has already exceeded t, and that conditional probability is the same as the unconditional probability of X exceeding s in the first place. So what am I saying? My X has exceeded t; given that it has exceeded t, the probability that it will exceed t + s is the same as the probability that it exceeds s in the first place. If that property is satisfied, then X is said to have the memoryless property. It is by no means a property satisfied by all random variables; you can verify that this property is satisfied by the exponential random variable.
So, exercise: show that the exponential random variable is memoryless. It is actually a very easy exercise, because you can compute the conditional probability of A given B as P(A intersection B) / P(B). What is the probability of {X > s + t} intersection {X > t}? It is simply the probability that X > s + t, which is e^(-lambda(s + t)). In the denominator you have P(X > t), which is e^(-lambda t). If you divide one by the other, you get e^(-lambda s), the probability that X > s. So it is a simple exercise to show that the memoryless property holds for the exponential random variable, which is what makes this random variable very important and useful in practical applications. It is also a random variable that occurs a lot in nature; as I said, radioactive decay is a very good example: given that you have not seen an emission in a long time, it is not more likely or less likely that an emission will occur soon. Now, what is somewhat more remarkable is that the exponential random variable is the only continuous random variable which has the memoryless property; there is no other continuous random variable with this kind of memoryless property. In fact, in the discrete world you can show that the geometric random variable has the memoryless property, and in the continuous world the exponential random variable has it. And you can prove that if you restrict attention to continuous random variables, the memorylessness equation itself implies that the random variable is exponential: if you are given a continuous random variable and I tell you it is memoryless, you can prove that it has to be exponential. The parameter lambda can be anything, but it has to be an exponential. You can take that also as an exercise; maybe it is a homework, actually.
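The computation in the exercise can also be checked numerically; the particular values of lambda, s and t below are arbitrary choices:

```python
import math

lam, s, t = 0.8, 1.2, 2.5  # arbitrary choices of rate and times

def tail(x):
    """P(X > x) = exp(-lam * x) for the exponential random variable."""
    return math.exp(-lam * x)

# P(X > s + t | X > t) = P(X > s + t) / P(X > t),
# since the event {X > s + t} is contained in {X > t}
conditional = tail(s + t) / tail(t)

print(abs(conditional - tail(s)) < 1e-12)  # True: the memoryless property holds
```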
What you have to do is solve the resulting functional equation. So, I think this may be a good place to stop. Next class I will give a couple of more important examples. I just want to ask if there are any other questions on memorylessness or anything we have discussed today. No questions? All right, see you tomorrow.