Let us start from where we left off. In the last classes we said what the CDF of a random variable is, and then we listed the properties that a CDF satisfies. The statement we gave was an if and only if: if a function is the CDF of a random variable, it has to satisfy certain properties, and if you come up with a function that has these properties, there is an associated random variable for which this function is the CDF. What we are going to do today is build on this and characterize some more properties of random variables using the CDF and another function, the PDF, which I am going to introduce in this class.

Before that: in all the examples I gave so far, and in the way I defined the CDF, the random variable took only a finite number of outcomes. For example, in the coin toss it was just heads or tails, which we mapped to 1 or 2, and in the case of the die it was 1, 2, 3, 4, 5, 6. So the set of outcomes was finite. But it is not necessary that the random variable X takes finitely many outcomes; it could take a continuum of outcomes. For example, take the height of the population in India. We said some time back that the possible values are somewhere between 2 feet and 8 feet, and any value in between is possible.

So today we will make this notion of discrete and continuous random variables a bit more formal. We are going to say a random variable X is discrete if it takes either a finite set of values or a countably infinite set of values. Say the values are x_i, with i coming from an index set I. This index set I could simply be 1, 2, 3, up to 6, and then the random variable takes 6 different values. Or I could consist of just 1 and 2, in which case my random variable takes only 2 possible values, x_1 and x_2. Or it may happen that this I is countably infinite.
For example, this I could be all the positive integers 1, 2, 3, 4, 5, 6 and so on, and I have that many possible outcomes that my random variable can take. Suppose you want to model the number of atoms that you are going to see in some experiment. There is no bound on how many atoms you could see, so the count can be arbitrarily large. But you are still able to index the outcomes: 1 atom, 2 atoms, and so on. You can enumerate them. So you understand the meaning of countably infinite, right? It means you are able to enumerate all the possible outcomes.

So whenever my random variable takes finitely many values, or a countably infinite set of values, we are going to call it a discrete random variable. In this case I know there are only certain values that can be realized when I conduct my experiment, and probability is associated only with these points, the points that the outcome can take. Any other value has no associated probability, because that value is never going to arise.

When we have such a discrete random variable, instead of looking directly at the CDF you may just want to look at something called the probability mass function: for each x_i with i in I, it gives P(X = x_i).

Do I need to add anything more for the probabilities to sum to 1? Not really, because I have already said this is for all i coming from the set I. If I did not have that and wrote it without the index set, then yes, I would need to include it. But the way I have written it earlier, it is x_i with i taken from I, which already includes all the possible values that my random variable takes, and that is why the total probability is already 1. So the probability mass function simply gives the probability of each of the points taken by this random variable X.
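A minimal sketch of the two discrete cases just described, one finite and one countably infinite. The fairness of the die and the particular PMF p(k) = (1/2)^k on k = 1, 2, 3, ... are assumptions chosen only for illustration; the function names are hypothetical.

```python
from fractions import Fraction

# Finite case: a six-sided die, assumed fair for illustration.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}


def total_mass(pmf):
    """Sum of P(X = x_i) over every value x_i in a finite PMF table."""
    return sum(pmf.values())


def geometric_pmf(k):
    """Hypothetical PMF on the positive integers: P(X = k) = (1/2)**k."""
    return Fraction(1, 2) ** k


def geometric_partial_mass(n_terms):
    """Mass carried by the first n_terms outcomes k = 1, ..., n_terms."""
    return sum(geometric_pmf(k) for k in range(1, n_terms + 1))


print(total_mass(die_pmf))         # 1
print(die_pmf.get(7, 0))           # 0: a value the die never takes has no mass
print(geometric_partial_mass(20))  # 1048575/1048576, approaching 1
```

In the countably infinite case the outcomes can still be enumerated one by one, and the partial sums of the probabilities get as close to 1 as you like.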
So what we are basically doing is this: I am going to call something the probability mass function, which just gives me the probability of each point that my random variable takes. Let us say X takes only 5 outcomes. If I give the probability for each of the 5 possible values, I take that as the probability mass function. The set of values could be countably infinite also, but if at each of the points I have the corresponding probability, that defines the probability mass function of that random variable.

Is there a relation between this probability mass function and the CDF? We defined something called ΔF_X(c) in the last class. What did we say? We said it is nothing but P(X ≤ c) - P(X < c); we also defined what we mean by P(X < c), the probability that X is strictly less than c. Do you see any relation between the probability mass function and this ΔF_X(c)? They are the same, right? ΔF_X(c) is just P(X = c). So what I have done now is express my probability mass function in terms of my CDF.

Do you think I can express my CDF in terms of the probability mass function? How? What should F_X(c) be? Look at all the points that are less than or equal to c and add up their probabilities: F_X(c) is the sum of P(X = y) over all y ≤ c. This gives the cumulative probability up to the point c. It is basically accumulation.

Recall that for a discrete case we had an example where the CDF looked like a set of steps, rising in jumps up to the value 1. We said that the jump at a point, let me call it c, corresponds to ΔF_X(c).
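The two directions discussed here, CDF from PMF by accumulation and PMF from CDF by taking the jump, can be sketched as follows. The fair die is an assumed example, and `cdf`, `cdf_strict`, and `jump` are hypothetical helper names.

```python
from fractions import Fraction

# Assumed example: PMF of a fair six-sided die.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}


def cdf(c, pmf):
    """F_X(c) = P(X <= c): add P(X = y) over the values y with y <= c."""
    return sum((p for y, p in pmf.items() if y <= c), Fraction(0))


def cdf_strict(c, pmf):
    """P(X < c): the same accumulation with a strict inequality."""
    return sum((p for y, p in pmf.items() if y < c), Fraction(0))


def jump(c, pmf):
    """Jump of the CDF at c: P(X <= c) - P(X < c), which equals P(X = c)."""
    return cdf(c, pmf) - cdf_strict(c, pmf)


print(cdf(3, die_pmf))     # 1/2
print(cdf(3.5, die_pmf))   # 1/2: the CDF is flat between the jumps
print(jump(3, die_pmf))    # 1/6: recovers P(X = 3)
print(jump(3.5, die_pmf))  # 0: no jump where X takes no value
```

Only the points carrying positive probability contribute to the sum, so the accumulation runs over the table entries rather than over every real number below c.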
Now, we are calling exactly those jumps, at all the points where they happen, the probability mass function. And we know that the CDF is nothing but the accumulation of all these jumps up to any point. Suppose I take this point as c; then I get F_X(c) by accumulating all the jumps up to c, and that is what this sum is.

About how to interpret it: a summation normally means we are adding finitely many, or at most countably many, terms. But here I am writing it as a sum over y ≤ c, and there are uncountably many such y. The way to interpret it is that we are summing only over those y where P(X = y) is positive.

So when you have a discrete random variable, whether you give me the CDF or the probability mass function, they carry the same amount of information. Most of the time, whenever we are dealing with a discrete random variable, we will just deal with the probability mass function. It is basically saying: you have these discrete points as possible outcomes, and the probability mass function tells you the probability that each of these points is taken by my random variable. That is it.

Now let us move on to continuous random variables. We want to make precise the probability that a continuous random variable falls in some interval. Let us say B is some subset of the real numbers. If X is a continuous random variable, I expect it could take any real value in some range; it is not restricted to a finite or countably infinite set of values. Now let us say this B is some interval, or whatever set.
Now I want to ask this question: what is the probability that X belongs to that set B? Let me write it in this fashion: if there exists a function f such that, for all B which is a subset of R, P(X ∈ B) equals the integral of f(x) dx over B, then I am going to call my random variable X continuous. All of you understand what I mean by this?

Let us ponder what we are trying to say here. In the discrete case, if I want the probability that X takes a value in some set, I just add the probabilities of all the possible values in that set that X can take. There you are able to add because X took at most finitely many or countably many values. But now you do not have that. Now you have to deal with X taking a continuous set of values, uncountably many, so you have to define the probability through integration. That is where this integral is coming from.

And you are saying: if there exists such an f such that this holds for any B. You may ask whether X belongs to B for whatever B you like; it is not for one specific B. If this holds for any B that is a subset of R, then you say that X is continuous. Do not worry, this is just the formal definition. What we mean by a continuous random variable is something that takes values in a continuum.

For example, your B could simply be an interval. Your random variable is continuous, and you ask: what is the probability that X takes a value in the interval [a, b]? It is simply the integral from a to b of f(x) dx.
This is just the area under the curve f over the interval [a, b]. So in a way this function f is acting like a probability, but it is not exactly a probability; it only gives us that notion in some sense. So I say my random variable X is continuous if I can come up with such an f for which this is satisfied.

Now let us look at this further. I had taken B to be the interval [a, b]. Suppose a equals b. Then what is this integral? It is 0. And what is the event? It is X = a. So we are saying P(X = a) is 0, because the integral is 0. So if X is a continuous random variable, the probability of it taking any particular value is 0. It can take values in an interval with positive probability, but the probability of a particular value is 0.

Think of the height of the population in India; the population of India is over a billion. If you ask what is the probability that a selected person is exactly of height 5 feet, that is a negligible fraction compared to the entire population. In that way the continuous random variable captures the notion that the probability of taking a particular value is 0. But if you ask what is the probability that the height lies in an interval, let us say 4 to 5 feet, then there is a non-negligible amount of mass there, and you expect this probability to be strictly positive. For a particular height value, you expect it to be very small, negligible compared to the size of the population.
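To see the interval probability and the zero probability of a single point numerically, here is a rough sketch. It assumes, purely for illustration, that height is uniformly distributed between 2 and 8 feet (real heights are certainly not uniform), and it approximates the integral with a midpoint Riemann sum. The function names are hypothetical.

```python
def uniform_height_pdf(x, low=2.0, high=8.0):
    """Hypothetical density: uniform on [low, high] feet, zero elsewhere."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


def prob_in_interval(pdf, a, b, n=10_000):
    """Approximate P(a <= X <= b), the integral of pdf over [a, b], by the midpoint rule."""
    width = (b - a) / n
    return sum(pdf(a + (i + 0.5) * width) for i in range(n)) * width


print(prob_in_interval(uniform_height_pdf, 4.0, 5.0))  # close to 1/6: strictly positive
print(prob_in_interval(uniform_height_pdf, 5.0, 5.0))  # 0.0: a equals b, so the area is zero
print(prob_in_interval(uniform_height_pdf, 2.0, 8.0))  # close to 1: the whole range
```

Note that the density value at 5 feet is 1/6, not 0, while the probability of exactly 5 feet is 0. The density is not itself a probability; only its integral over a set is.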
By the way, we are going to call this f the PDF, the probability density function. And to denote that this is the PDF of an associated X, we subscript f with X and write f_X.

Now I want to connect this to the CDF. By definition, the CDF of my random variable is F_X(c) = P(X ≤ c). Let me express this in terms of a set: if I want to write it as P(X ∈ B), what should B be? It is the interval from minus infinity to c. Suppose my random variable X is continuous, so I know the definition already holds. Applying it, F_X(c) is the integral from minus infinity to c of f_X(x) dx. So my CDF is the integral of my PDF. And differentiating both sides with respect to c, we get d/dc F_X(c) = f_X(c), wherever the derivative exists.

Now, to get a bit more intuition about what this function f means. Mathematically we have said that there exists a function which needs to satisfy this property, and we also said that this f seems to be a proxy for probability in the continuous domain. But what does that actually mean?

Take this probability: P(c - epsilon/2 ≤ X ≤ c + epsilon/2). I am asking X to be in the interval from c - epsilon/2 to c + epsilon/2. Let us plot f_X in this region; say f_X looks like some curve here. Applying the formula, this probability is the integral of f_X(x) dx from c - epsilon/2 to c + epsilon/2. Now suppose I let epsilon shrink.
If I let epsilon become smaller and smaller, this interval keeps shrinking. If I choose epsilon small enough, can I approximate this integral by f_X(c) times epsilon? Yes, I can approximate it as epsilon times f_X(c).

First notice: if you let epsilon go to 0, we are basically asking whether X is exactly equal to c, and that probability is 0, as we have already observed. But what this is saying is: if you look at a small neighborhood of c, the area under f_X over that neighborhood gives you the probability that X takes a value in that interval. So f_X itself is not directly a probability, but it captures the probability of this event through this area. Assume that within the small interval f_X is almost constant, just taking the value f_X(c). Then f_X(c) comes out of the integral and you are left with the length of the interval, epsilon/2 plus epsilon/2, which gives you epsilon.

So we can interpret f_X(c) as a measure of how likely it is that the random variable takes a value near c. If you look at the curve and its value at some point is large, we can say that my random variable is more likely to take values around there. But that is only when we interpret it in this fashion: take the interval very small, look at the probability, and that probability turns out to be approximately epsilon times f_X(c). If f_X(c) is large, this probability is also large. In that way f_X(c) is a measure of how likely it is that the random variable takes a value close to c.
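A small numerical check of this approximation, and of the PDF as the derivative of the CDF. The exponential density f_X(x) = e^(-x) for x ≥ 0, with CDF F_X(c) = 1 - e^(-c), is an assumption picked only because both functions have closed forms; it does not appear in the lecture. The function names are hypothetical.

```python
import math


def exp_pdf(x):
    """Assumed example density: f_X(x) = exp(-x) for x >= 0, else 0."""
    return math.exp(-x) if x >= 0 else 0.0


def exp_cdf(c):
    """CDF of the same example: F_X(c) = 1 - exp(-c) for c >= 0, else 0."""
    return 1.0 - math.exp(-c) if c >= 0 else 0.0


def prob_near(c, eps):
    """Exact P(c - eps/2 <= X <= c + eps/2), computed from the CDF."""
    return exp_cdf(c + eps / 2) - exp_cdf(c - eps / 2)


def pdf_from_cdf(c, h=1e-5):
    """Central-difference estimate of d/dc F_X(c)."""
    return (exp_cdf(c + h) - exp_cdf(c - h)) / (2 * h)


c = 1.0
for eps in (0.5, 0.1, 0.01):
    exact = prob_near(c, eps)
    approx = eps * exp_pdf(c)
    # The ratio exact / approx moves toward 1 as eps shrinks.
    print(eps, exact, approx, exact / approx)

print(pdf_from_cdf(c), exp_pdf(c))  # the two values agree closely
```

Both the exact probability and epsilon times f_X(c) go to 0 as epsilon shrinks, but their ratio goes to 1, which is the sense in which f_X(c) measures how likely X is to land near c.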