So, now that we have defined a random variable, with whatever we have discussed so far we can distinguish between a discrete random variable and a continuous random variable. When things are discrete they are a lot easier to visualize: the random variable takes only finitely many (or countably many) values, and if I describe the probability of each possible value, I have essentially all the information. To move to the continuous case we make things more formal: when I say a random variable is continuous, it can take uncountably many possible values, and to make that precise we introduced the probability density function associated with the continuous random variable and the interpretation attached to it. Now, often we may not be interested in the exact value taken by the random variable. If you go gambling, you do not care which match you won and which you lost; what you care about at the end is how much money you made, the aggregate behaviour. If this happens in intra-IIT sports, I do not care in which event we lost or won; what matters is whether we got the largest number of medals across all the competitions, so that we are the champions. Or, what is the average rainfall in Mumbai? You do not care how much it rained today or yesterday; the water gets stored in the lakes, and what interests you over the year is how much water got accumulated, not the rainfall on each individual day. So, instead of the micro information we often want macro information, an average, and that is what we will look at next: the expectation of a random variable. If I want a global piece of information about a random variable, what is a natural way to define its expectation? If I conduct the experiment again and again, what outcome do I see on average? After the fact, that is just the sum of the outcomes divided by the number of repetitions. But a priori, before conducting the experiment, I only have its description: the possible values and the associated PMF. Can I say something from that alone? One natural thing to do is to multiply each outcome by an associated weight and sum, and the natural weight is the probability itself, because each outcome occurs with a certain probability. For a discrete random variable this is straightforward: it takes either finitely many or countably infinitely many possible values x_1, x_2, ..., and whatever it is, I can form the sum E[X] = sum over i of x_i * p(x_i), which could also turn out to be infinite.
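As a small illustration of this weighted-sum definition, here is a minimal sketch in Python for a fair die; the outcomes and PMF below are just the standard die example, not anything specific from the lecture.

```python
# Weighted-sum definition of expectation for a discrete random variable,
# illustrated on a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]        # possible values x_i
pmf = [1 / 6] * 6                    # P(X = x_i); the weights must sum to 1

expectation = sum(x * p for x, p in zip(outcomes, pmf))
print(expectation)                   # 3.5
```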
Here the x_i's are the possible outcomes and p(x_i) is the PMF of the random variable, giving the probability associated with each of them. This value, which is nothing but the weighted sum of the outcomes with the probabilities as weights, is what I will call the expected value of my random variable X. If X takes only finitely many values, the summation runs over that finite number of outcomes; in general it runs over countably many terms, and the quantity could be finite or infinite. Note that even though the random variable always takes finite values, the expectation itself could be infinity. Can you think of an example where the expectation is infinite for a discrete random variable? You need the series not to converge, but the weights still have to form a probability mass function. Take the outcomes to be 1, 2, 3, 4, ... and put mass proportional to 1/n^2 on the outcome n. Summed over n, 1/n^2 gives pi^2/6, so we scale by the constant 6/pi^2 to make it a valid PMF: p(n) = (6/pi^2) * 1/n^2. Now apply the definition: E[X] = sum over n of n * (6/pi^2)/n^2 = (6/pi^2) * sum over n of 1/n, which diverges, so the expectation is infinite in this case. On the other hand, even with countably infinitely many possibilities the expectation can still be finite: why should the outcomes be n, they could just as well be 1/n, so the possible outcomes are 1, 1/2, 1/3, 1/4, and so on. With the same PMF the sum becomes (6/pi^2) * sum over n of 1/n^3, which converges, so in that case the expectation is finite.
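A quick numerical sketch of these two cases, using the same PMF p(n) = (6/pi^2)/n^2: the partial sums of the expectation keep growing when the outcomes are n, and settle down when the outcomes are 1/n.

```python
import math

c = 6 / math.pi ** 2                  # scaling that makes p(n) = c / n**2 a valid PMF

def partial_expectation(outcome, terms):
    # Partial sum of sum_n outcome(n) * p(n) over the first `terms` outcomes.
    return sum(outcome(n) * c / n ** 2 for n in range(1, terms + 1))

for terms in (10 ** 2, 10 ** 4, 10 ** 6):
    print(terms,
          round(partial_expectation(lambda n: n, terms), 3),      # keeps growing: diverges
          round(partial_expectation(lambda n: 1 / n, terms), 6))  # approaches a finite limit
```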
So, we have the expectation defined for a discrete random variable like this. Now, what does this number mean? In everyday language, if I ask what you "expect" when you throw a die, you might say the answer should be one of the possible values 1 through 6; but the formula gives 3.5, which is not a value the die can ever show. Even if the definition happened to give one of the possible outcomes, say 4, that outcome may simply not come up on a given throw, so your "expectation" in the everyday sense would not be met. What we mean by expectation here, and the interpretation we attach to it, is this: if you do the experiment again and again and average all the values you get, what value do you see on average? Think of a gambler: you went to the casino on the first day and won or lost something, you are now addicted and you go every day, and every day you win or lose some amount. You do not care what happens on an individual day, unless you go bankrupt; what you care about is the average over all of it. Say on day 1 you got some amount, on day 2 another, and so on up to day n; you add all of them and divide by n. Another way of reading that ratio is: instead of looking at the value on each day, look at the total accumulation divided by the number of days, and interpret it as "this is what I would have got per day, on average". That is the interpretation we want to come out naturally from this formula. And the final point: this expectation need not be one of the possible outcomes; it can be any real number. Now, how do we define it for continuous random variables? As you would expect in the continuous version, integration naturally comes into the picture. There are different notions of integration, Riemann's and Lebesgue's, and the definition here is really in the sense of Lebesgue integration: E[X] = integral of x dF(x) = integral of x f(x) dx, where f is the pdf, since we have already seen that dF(x) is nothing but f(x) dx, the derivative of the CDF times dx. So, through this definition of the expectation, or mean value, of a random variable, we are trying to capture a global observation one can make about X, not the individual values it takes, but the value you would get on average, the expected value.
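Here is a minimal numerical sketch of that interpretation: the long-run average of repeated die throws hovering around 3.5, and, for the continuous case, a direct numerical evaluation of the integral of x f(x) dx for an illustrative pdf (an Exponential(1) density, chosen here only as an example; its mean is 1).

```python
import math
import random

# Long-run average interpretation: repeat the experiment many times and average.
random.seed(0)
rolls = [random.randint(1, 6) for _ in range(200_000)]
print(sum(rolls) / len(rolls))        # close to 3.5

# Continuous case: E[X] = integral of x * f(x) dx, approximated numerically
# for the pdf f(x) = exp(-x), x >= 0 (an Exponential(1) variable, mean 1).
dx = 1e-4
total = 0.0
for i in range(1, 200_000):           # integrate from 0 out to about 20
    x = i * dx
    total += x * math.exp(-x) * dx
print(total)                          # about 1.0
```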
Now that we have defined it this way, what properties does this expectation satisfy? Just as we listed the properties of the CDF when we defined it, let us list the properties of the expectation. By the way, we will also call this the average of the random variable, or the mean of the random variable, or the expectation of the random variable; they all mean the same thing. The first is the linearity property. Take a constant c; then E[cX] = c E[X]. What this says is: whatever random variable X I had, I scale it by c, and the expectation of the scaled random variable is nothing but c times the expectation of the original one. Further, if X and Y are random variables, then E[X + Y] = E[X] + E[Y]. The first part should be fairly straightforward from the formula: when I take cX, I replace each x_i by c x_i in the sum, and the constant c just pulls out; the same thing holds for a continuous random variable. Now, how would you verify E[X + Y] = E[X] + E[Y]? It is tempting to say we just replace x_i by x_i + y_i, but the probabilities involved will be different; and note that I am not assuming independence, only that X and Y are random variables. Say Z = X + Y is my new random variable. This Z may take values different from what X and Y take: let X be the throw of one die and Y the throw of another; X takes the values 1 through 6, Y takes the values 1 through 6, but Z takes values from 2 all the way up to 12, and to find the expected value of Z you have a different probability mass function. Do you still believe this separates out so that E[Z] = E[X] + E[Y], and why should that be true? Think about it; let me complete the other properties. The next one: suppose the probability that X is greater than or equal to Y is 1. What does that mean? X and Y are two random variables defined on the same sample space, and on any sample point X gives a value at least as large as Y, where the collection of such sample points has probability 1. If X gives the higher value on every sample point, do you expect the same ordering to hold for the expectations? Yes: then E[X] >= E[Y]. Basically, when I wrote the sum for the expectation, I can replace x_i by X(omega_i): each x_i is one of the realizations, taken at some sample point omega_i. If Y is such that on the same sample point it assigns a smaller or equal value, then you expect the whole weighted sum, the expectation, to be smaller too. I have also assumed here that E[Y] is strictly greater than minus infinity; if E[Y] were minus infinity the statement would hold trivially, since anything is larger than minus infinity. And yes, all of this is on the same probability space.
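Here is a small sketch checking both properties on the two-dice example above. The two dice are taken as independent only so that the joint PMF is concrete; linearity of expectation itself does not rely on independence. For the ordering property, max(X, Y) >= X on every sample point, so its expectation should be at least E[X].

```python
from fractions import Fraction
from itertools import product

# Joint PMF of two (independent, for concreteness) fair dice.
die = {k: Fraction(1, 6) for k in range(1, 7)}
joint = {(x, y): px * py for (x, px), (y, py) in product(die.items(), repeat=2)}

E = lambda g: sum(g(x, y) * p for (x, y), p in joint.items())

E_X = E(lambda x, y: x)                # 7/2
E_sum = E(lambda x, y: x + y)          # expectation of Z = X + Y, computed from Z's own PMF
E_max = E(lambda x, y: max(x, y))      # max(X, Y) >= X on every sample point

print(E_sum == E_X + E_X)              # True: E[X + Y] = E[X] + E[Y]
print(E_max >= E_X)                    # True: the ordering carries over to expectations
```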
And by the way, this linearity property asks for less than the previous one: we are not imposing any condition relating X and Y. We are simply saying that if X and Y are two random variables, the expectation of the sum is the sum of the expectations (we have not actually shown it here, we are just stating it), and you can extend this argument not just to two random variables but to any n random variables. Third, suppose you have a random variable X and you apply a function g to it. What does that mean? Whatever the outcome of the random variable, you further apply another function to it: for example, you do not want the outcome of the die itself, you want twice that value; then g is the function that doubles it. Now you want to compute the expectation not of X but of g(X). How does E[g(X)] look? X has some underlying pdf, say, if it is a continuous random variable. If I apply g to X, the resulting random variable may take different values than X itself, so its pdf may change and its cdf may change. Is it necessary that I first compute this new cdf or pdf and then find the expectation with respect to that? One natural route is: call it Y = g(X); if I can find the pdf or cdf of Y, I already know how to compute the expectation, since I know the possible values of Y and its associated pdf or cdf. But without computing the new cdf or pdf, can I write the answer using the original pdf of X? It turns out that is possible, and it is simply E[g(X)] = integral of g(x) f_X(x) dx, where f_X is the pdf of X. Earlier, if I had written just x instead of g(x) here, it would have been the expectation of X; to get the expectation of g(X), all I need to do is replace x by g(x) while still retaining the pdf of the original X. In a way, when I do this computation, the underlying X still behaves according to its own pdf; only the values I am looking at have changed. The outcome of the random variable is transformed, but the underlying X's are still generated from the same pdf, and that is why we can use this formula. This formula is often called LOTUS, the law of the unconscious statistician: it is called "unconscious" because even though you are computing the expected value of g(X), you do not bother to change the pdf; you keep working with the original one.
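A tiny sketch of LOTUS in the discrete case, using the doubling function g from the example above: the expectation of g(X) is computed directly with the original PMF of X, without ever working out the distribution of g(X).

```python
from fractions import Fraction

die = {k: Fraction(1, 6) for k in range(1, 7)}   # original PMF of X (fair die)

def g(x):
    return 2 * x                                 # the doubling function from the example

# LOTUS: weight g(x) by the ORIGINAL PMF of X; no PMF of g(X) is computed.
print(sum(g(x) * p for x, p in die.items()))     # 7, i.e. twice E[X] = 3.5
```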
One more formula which comes in pretty handy in some cases: the expectation of X can also be written purely in terms of the cdf, as an area related to the cdf, instead of in terms of the pdf as we wrote it here. It looks like this: E[X] = integral from 0 to infinity of (1 - F(x)) dx minus integral from minus infinity to 0 of F(x) dx. How do we get this formula? We just need to apply integration by parts: we know that E[X] = integral of x f(x) dx, with f(x) = dF(x)/dx; use the integration by parts formula and manipulate, and you should be able to get this (a sketch of that step is given below). What is the formula saying? Suppose I have some cdf, and the level 1 is its horizontal asymptote. On the interval from 0 to infinity, I integrate the area under 1 - F(x), which is the region between the cdf and the level 1. From this I subtract the area under F(x) itself, but on the negative half of the axis. Take the total area of the first region, subtract the total area of the second, and that is still the expected value of your random variable X. Now suppose your random variable X takes only nonnegative values. Then the second part is not there, and your expectation is simply the first part, which is a much simpler expression: you just integrate the complementary area of your cdf. And what is 1 - F(x)? F(x) is the probability that X is less than or equal to x, so 1 - F(x) is the probability that X is greater than x. So this simplifies to E[X] = integral from 0 to infinity of P(X > x) dx, provided X takes only nonnegative values, so that we can ignore the negative part. And if X is a continuous random variable, nothing changes even if I include equality and write P(X >= x), because the probability that X takes exactly the value x is 0. Now, to your question: can we write a similar formula in the discrete case? Based on this, you should be able to write the expectation in the discrete case in terms of the CDF as well. We have the PMF in the definition, and we want to remove the PMF and bring in the CDF; but we have already seen the relation between the PMF and the CDF: the probability that X equals x_i is nothing but the jump of the CDF at that point. So you can write the expectation in the discrete case directly in terms of the CDF in this fashion; there is nothing fancy about it. Okay, fine, then let us stop here.
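For completeness, here is a sketch of the integration-by-parts step referred to above, for the case where X is nonnegative with pdf f and cdf F; the vanishing of the boundary term is an assumption here (it holds whenever E[X] is finite, since x(1 - F(x)) = x P(X > x) tends to 0 in that case):

\[
\int_0^\infty \bigl(1 - F(x)\bigr)\,dx
= \Bigl[\,x\bigl(1 - F(x)\bigr)\Bigr]_0^\infty - \int_0^\infty x\,\bigl(-f(x)\bigr)\,dx
= 0 + \int_0^\infty x\,f(x)\,dx
= \mathbb{E}[X],
\]

using \(\frac{d}{dx}\bigl(1 - F(x)\bigr) = -f(x)\) and the vanishing of the boundary term \(x\bigl(1 - F(x)\bigr)\) at infinity.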