Welcome back. Yesterday we discussed the monotone convergence theorem, and we also explicitly approximated a measurable function g from below using simple functions and wrote down a limiting expression for the integral of g dμ. So we gave an explicit expression, rather than the supremum definition we gave earlier. In some developments, the limit of that approximation is taken as the definition of the integral of a non-negative function. You can develop the theory that way too: linearity then becomes very easy to prove, but certain other properties become hard. The supremum definition makes those properties easy to prove but linearity difficult. They are two equivalent ways of developing the abstract integral.

Today we will talk about expectations of discrete random variables, then discuss computing expectations over different spaces, and then derive a formula for the expectation of continuous random variables. Suppose X is a discrete random variable, taking only countably many values, say a_1, a_2, ... (finite or countably infinite). If you look at all the ω's that map to a particular value, say a_1, that is some set of ω's; similarly a_2 has some other set of ω's, and so on. So what a discrete random variable does is partition the sample space into a countable collection of events.
Corresponding to each value a_i, the pre-image, call it A_i, is clearly an event, and Ω gets partitioned by these A_i's. In fact, a discrete random variable can be written as a countable sum of weighted indicators: X = sum_{i=1}^∞ a_i 1_{A_i}. If the random variable takes a finite set of values, it is a simple random variable; if it takes a countably infinite set of values, the sum is an infinite sum. To begin, let us say X is non-negative, so that we first derive an expression for the expectation of a non-negative discrete random variable. Then the a_i's are non-negative, the A_i's are the sets of ω's that map to a_i, and the A_i's partition Ω.

Now the task is to compute E[X]. If this were a simple random variable we already know the answer: it is sum_{i=1}^n a_i P(A_i), straight from the definition. But here we have an infinite sum, so we have to be a little more careful. What we will do is use the monotone convergence theorem to derive an expression for E[X]. Let X_n(ω) = sum_{i=1}^n a_i 1_{A_i}(ω), for n ≥ 1; that is, I keep only the first n terms. This is clearly a sequence of simple random variables, and I know how to compute the expectation of each. It is also easy to show that X_n(ω) ≤ X(ω) and that X_n(ω) is monotonically increasing in n. So I am approximating the discrete random variable X from below by simple random variables, so that I can invoke the monotone convergence theorem and obtain an expression for the expectation.
Let me just do that. Note that X_n(ω) ≤ X_{n+1}(ω) for every ω. Why? Because X_{n+1} just adds one more term, a_{n+1} 1_{A_{n+1}}. If ω lies in one of A_1, ..., A_n, the two are equal; if ω lies in A_{n+1}, then X_{n+1}(ω) = a_{n+1} whereas X_n(ω) = 0. So it is a monotonically increasing sequence; that part is easy.

Next, fix some ω in Ω. We want to show that X_n(ω) converges to X(ω) for all ω; combined with monotonicity, that lets me invoke the monotone convergence theorem. Since the A_i's partition Ω, there exists a k ≥ 1 such that ω ∈ A_k. (I am ignoring measure-zero sets here; on a measure-zero set the random variable could do something else, but let us set aside that technicality and say it takes only countably many values and the pre-images form a partition.) So whichever ω you take, it belongs to some A_k, and for that ω, X(ω) = a_k. Now consider what X_n(ω) is for n < k and for n ≥ k.
Here k is the index of the set that contains my ω. If n < k, then X_n(ω) = 0, because I sum only up to n and k is bigger than n, so the term for A_k never appears in the sum. But if n ≥ k, then X_n(ω) = a_k. Thus lim_{n→∞} X_n(ω) = X(ω), and this holds for every ω, because I fixed an arbitrary ω and proved it. To repeat the chain of reasoning: fix any ω; it must belong to some A_k, so X(ω) = a_k; then X_n(ω) is 0 for n < k and a_k for n ≥ k, so X_n(ω) → X(ω) as n → ∞.

So now I have proved that the X_n are a sequence of simple random variables converging to X monotonically from below. By the monotone convergence theorem, E[X] = lim_{n→∞} E[X_n]. And I know E[X_n], because the X_n are simple: E[X_n] = sum_{i=1}^n a_i P(A_i). Therefore E[X] = lim_{n→∞} sum_{i=1}^n a_i P(A_i) = sum_{i=1}^∞ a_i P(X = a_i). I am skipping one step here: the limit of these partial sums always exists because the a_i's are non-negative; otherwise the infinite sum might not be well defined.
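The limiting formula can be illustrated numerically. Here is a small Python sketch (my own example, not from the lecture): take the PMF P(X = i) = 2^(-i) for i ≥ 1, for which E[X] = sum of i·2^(-i) = 2, and watch the partial sums E[X_n] = sum_{i=1}^n a_i P(A_i) increase toward it.

```python
def partial_expectation(n):
    """E[X_n] = sum_{i=1}^{n} a_i * P(A_i) for the simple random variable
    X_n that keeps only the first n terms.  Here a_i = i and P(A_i) = 2**-i
    (an illustrative choice of PMF)."""
    return sum(i * 2.0 ** -i for i in range(1, n + 1))

# The sequence E[X_1], E[X_2], ... increases monotonically toward E[X] = 2.
approximations = [partial_expectation(n) for n in (1, 5, 10, 50)]
```

Monotone convergence is visible directly: each extra term is non-negative, so the partial expectations can only go up.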
We assumed X is a non-negative random variable to begin with, so this sum always converges in the extended sense: it either converges to a real number or diverges to +∞, but in either case the limit is well defined, and it is what the infinite summation denotes. And after all, A_i is nothing but the set of all ω's for which X equals a_i, so P(A_i) = P(X = a_i) is nothing but the PMF. If you want, you can write the expectation as sum_{i=1}^∞ a_i p_X(a_i). But that formula is only for non-negative discrete random variables.

What if you have a discrete random variable that possibly takes negative values? You decompose it into X⁺ and X⁻. If X is a discrete random variable possibly taking negative values, write X = X⁺ − X⁻, where X⁺ = max(X, 0) and X⁻ = −min(X, 0). These are the positive and negative parts, just like f⁺ and f⁻ for functions; I am writing X⁺ and X⁻ for random variables. Both X⁺ and X⁻ are non-negative random variables: min(X, 0) is possibly negative, which is why I put the minus sign, to make X⁻ non-negative. Then define E[X] = E[X⁺] − E[X⁻], as we have already done before; this is well defined as long as we do not have ∞ − ∞, that is, as long as at least one of the two is finite.
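A minimal sketch of the positive/negative-part decomposition just described (the function names are my own):

```python
def positive_part(x):
    """X_plus = max(x, 0): keeps positive values, zero elsewhere."""
    return max(x, 0)

def negative_part(x):
    """X_minus = -min(x, 0): the minus sign makes the result non-negative."""
    return -min(x, 0)

# Pointwise, x = positive_part(x) - negative_part(x), and both parts are >= 0.
```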
If both are infinite, the expectation of the random variable is not defined. For a random variable taking only non-negative values, the expectation is always defined: it might be +∞, but it is still well defined. For a random variable taking both positive and negative values, there are cases: E[X⁺] may be infinite with E[X⁻] finite, in which case E[X] = +∞; E[X⁻] may be infinite with E[X⁺] finite, in which case E[X] = −∞; if both are finite, no problem; and if both are infinite, E[X] is undefined, it does not exist. I want to make sure this is clear: "the expectation is not defined" is different from "the expectation is infinite". The expectation being infinite is perfectly well defined; it just means the summation diverges to +∞.

Now, some examples. Let us compute the mean of a geometric random variable: P(X = i) = (1 − p)^(i−1) p for i ≥ 1. This is a non-negative random variable, so you can directly apply the formula: E[X] = sum_{i=1}^∞ i (1 − p)^(i−1) p. Pulling p out, you have an arithmetico-geometric series, which you can sum: you get p / (1 − (1 − p))² = p / p² = 1/p. Do the summation yourself as an exercise. Recall that X is the number of coin tosses until you see the first head.
If p is the probability of heads, X denotes the number of coin tosses until you see the first head, and the expected number of tosses is 1 over the probability of heads, which makes sense: if the probability of heads is 1/3, the expected number of trials until you see a head is 3.

A question came up about random variables taking both signs. To be mathematically sound, you first express X as X⁺ − X⁻. These are both non-negative random variables, and you approximate each monotonically, which is why I set things up this way. You compute the positive part and the negative part separately: E[X⁺] with this formula, E[X⁻] with a similar formula, and then subtract one from the other, provided the subtraction is well defined. If both are infinite, it is not well defined. We will get to an example of that.

The geometric case was very benign: a non-negative random variable whose summation worked out to something finite, namely 1/p. Now let us give an example where the expectation is +∞. I think I put this PMF down before: P(X = k) = (6/π²)(1/k²) for k ≥ 1. This is also a non-negative random variable, so you can use the same formula.
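A quick numerical sanity check of E[X] = 1/p (my own sketch): truncate the series sum i·(1−p)^(i−1)·p at a large index; the tail decays geometrically, so the truncation error is negligible.

```python
def geometric_mean_truncated(p, terms=5000):
    """Truncation of E[X] = sum_{i>=1} i * (1 - p)**(i - 1) * p."""
    return sum(i * (1 - p) ** (i - 1) * p for i in range(1, terms + 1))
```

For p = 1/3 this comes out essentially equal to 3, matching the 1/p formula.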
E[X] = sum_{k=1}^∞ k · P(X = k) = (6/π²) sum_{k=1}^∞ k · (1/k²) = (6/π²) sum_{k=1}^∞ 1/k, and this is the harmonic series, which diverges to +∞. So here is an example of a random variable whose mean is +∞: the expectation is well defined, and it equals +∞.

Now I want to tweak this example to make it two-sided, as was just mentioned, and give a third example where the expectation is actually not defined. This PMF has 1/k²-type mass on the positive axis; I am going to make it symmetric. Consider P(X = k) = (3/π²)(1/k²) for all integers k ≠ 0. So this is a PMF defined over ±1, ±2, ±3, and so on, over the non-zero integers. You can verify that it sums to 1, so it is clearly a valid PMF. Now you have to be very careful: you must compute E[X⁺] and E[X⁻] separately. Each gives a harmonic series, so E[X⁺] = +∞ and E[X⁻] = +∞.
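The divergence is easy to see numerically: the partial sums of (6/π²)·sum 1/k grow like log n without ever settling. A small sketch of my own:

```python
import math

def partial_mean(n):
    """Partial sum of E[X] = (6/pi**2) * sum_{k=1}^{n} 1/k."""
    return (6 / math.pi ** 2) * sum(1.0 / k for k in range(1, n + 1))

# Each tenfold increase in n adds roughly (6/pi**2) * ln(10) ~ 1.4 -- no limit.
```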
To repeat: E[X⁺] = +∞ and E[X⁻] = +∞, and therefore E[X] is not defined. One caution I want to give you: the following is wrong. You cannot write the expectation as (3/π²) [sum_{k=1}^∞ k/k² + sum_{k=1}^∞ (−k)/k²], cancel corresponding terms, and say the expectation is 0. That is not allowed. Whenever you have a summation containing both positive and negative terms, you cannot group the terms any way you like; you might have studied this with sequences and series. If you have a series sum a_n with some positive and some negative terms, you can get essentially any answer by grouping the terms in some particular way. In particular, you are not allowed to group so that the answer comes out to 0. You have to compute the positive and negative parts separately and say that ∞ − ∞ is undefined. If sum a_n is absolutely convergent, then you can group however you want and get the same answer; but this series is not absolutely convergent, so you cannot. That is something you already know, I think. So here the expectation is undefined; it is not 0, and saying it is 0 is wrong. Note that this is a different example from the previous one: there the mean was +∞; here it is undefined. They are different cases. Any questions so far?
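Here is a numerical sketch of exactly this trap (the truncation levels are my own choices): pairing +k with −k in a symmetric truncation cancels everything to essentially 0, while each one-sided sum grows without bound, so the honest computation is ∞ − ∞, which is undefined.

```python
import math

C = 3 / math.pi ** 2  # normalizing constant of the two-sided PMF

def one_sided(n):
    """Partial sum of E[X_plus] = C * sum_{k=1}^{n} k / k**2  (diverges)."""
    return C * sum(k * (1.0 / k ** 2) for k in range(1, n + 1))

def symmetric_truncation(n):
    """Summing k * P(X = k) over k = -n..n (k != 0): terms pair off and
    cancel, misleadingly suggesting an expectation of 0."""
    return C * sum(k * (1.0 / k ** 2) for k in range(-n, n + 1) if k != 0)
```

The symmetric truncation returns (numerically) 0 for every n, yet E[X⁺] and E[X⁻] are both infinite, so no value, 0 included, can legitimately be assigned.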
If you have a Bernoulli random variable, that is actually a simple random variable, and the mean of a Bernoulli random variable with parameter p is simply p. That is E[X]; that is the end of it.

So far we have not given any physical interpretation of E[X]; we have just called it the expected value, or expectation, or mean, and said it is simply the integral ∫ X dP, just a number. The interpretation of the expectation is given by the law of large numbers: we will give it a frequentist interpretation when we do the law of large numbers. As far as you are concerned right now, it is just a number. That concludes expectations of discrete random variables.

Next I will move on to expectations over different spaces. Suppose I have a random variable X on Ω and a measurable function g, and I set Y = g(X); a measurable function of a random variable is also a random variable. Say I want to compute E[Y] = E[g(X)]. What you would expect is this: if I know the probability law of X, or the probability law of Y, I should be able to compute E[Y] in any of these spaces, and the answers should be equal. To fix ideas, let us say X is discrete, taking values x_1, x_2, ..., and let us also say everything is non-negative.
So I am saying that I want to be able to compute E[Y] as sum_i y_i P(Y = y_i), say for i = 1 to ∞; that is one way. Another way is to simply use the values of g: I want E[Y] = sum_i g(x_i) P(X = x_i). I hope, rather I claim, this is true. The random variable takes values x_1, x_2, ... here, and after the mapping it takes values y_1 = g(x_1), y_2 = g(x_2), and so on. This is something you have been doing in your elementary probability courses. If I tell you the PMF of X, you can do your transformation-of-random-variables computation, figure out the PMF of Y, and compute sum_i y_i P(Y = y_i); that is one way, and it gives E[Y]. But if I ask you to compute E[X²] or E[log X] or something, you also just compute sum_i g(x_i) P(X = x_i) directly. You probably never stopped to ask whether the two are actually equal; you just know you get the same answer whichever way you do it. If g(x) = x², you would either compute the PMF of Y = X² and take its expectation, or simply compute sum_i x_i² P(X = x_i), and you always get the same answer. That is what you have been using, and what you hope is always true, and that is the theorem we are going to prove in full generality now.
What I am doing here is computing the expectation over different spaces, if you think about it: in one case I am integrating y with respect to the probability law of Y, and in the other I am integrating g(x) with respect to the probability law of X. And I want to say they are all equal. I just wrote it down for discrete random variables, but it is true for all random variables, and that is what I am going to prove in full generality. This is an important theorem.

So you have this setting: Y = g(X), where X is a random variable on (Ω, F, P). The theorem says ∫ Y dP = ∫ g dP_X = ∫ y dP_Y. The first expression is just the definition: Y is the overall map ω ↦ g(X(ω)), and obviously E[Y] = ∫ Y dP by definition. What the theorem says is that, alternatively, you can take the function g together with the measure P_X (the law of X) sitting on the real line and integrate g with respect to P_X, or you can integrate the identity function y with respect to the law of Y, and all three are equal to E[Y]. Is the statement clear? Note that you are integrating over different spaces: in ∫ Y dP there is an implicit Ω, in ∫ g dP_X you integrate over ℝ (the domain of g), and in ∫ y dP_Y you again integrate over ℝ (the range of g). And the measures are all different: P is a measure on Ω, P_X is the law of X, P_Y is the law of Y. And this is always true.
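A sketch of the theorem in the discrete case (the distribution and g are my own choices): compute E[Y] once via the law of Y, by pushing the PMF of X forward through g, and once by integrating g directly against the law of X. The two routes agree.

```python
from collections import Counter

# X uniform on {-2, -1, 0, 1, 2}; g(x) = x**2, so Y = X**2.
pmf_x = {x: 0.2 for x in (-2, -1, 0, 1, 2)}

def g(x):
    return x * x

# Route 1: integral of y dP_Y -- build the law of Y, then sum y * P(Y = y).
pmf_y = Counter()
for x, p in pmf_x.items():
    pmf_y[g(x)] += p
e_via_law_of_y = sum(y * p for y, p in pmf_y.items())

# Route 2: integral of g dP_X -- integrate g against the law of X directly,
# no PMF of Y needed.
e_via_law_of_x = sum(g(x) * p for x, p in pmf_x.items())
```

Both routes give E[Y] = 2: the law of Y is P(Y=0) = 0.2, P(Y=1) = 0.4, P(Y=4) = 0.4, so route 1 gives 0.4 + 1.6 = 2, the same as route 2's (4+1+0+1+4)/5.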
So what you have been doing is right, and it holds in much greater generality. Now, how would you prove this? There is only one way to prove anything in integration: start with simple functions, then generalize using either the supremum definition or, now that we have it, the monotone convergence theorem, approximating any non-negative function from below by simple functions. I do not know if I will finish today; I have five minutes, so let me see how far I get.

Let g be simple, taking values y_1, y_2, ..., y_n. Then Y = g(X) is a simple random variable, so I know its expectation: ∫ Y dP = sum_{i=1}^n y_i P({ω : Y(ω) = y_i}). Each set {ω : Y(ω) = y_i} plays the role of my A_i, if you like. Since Y(ω) is nothing but g(X(ω)), I can write this as sum_{i=1}^n y_i P({ω : g(X(ω)) = y_i}). That is the first integral written out.

Now for the second integral, ∫ g dP_X. I am looking at g as a function from ℝ to ℝ and integrating it with respect to the measure P_X on the real line. But g is a simple function taking only the values y_i, so ∫ g dP_X = sum_{i=1}^n y_i P_X({x : g(x) = y_i}), where x now, not ω, ranges over the real line. That is the same as sum_{i=1}^n y_i P_X(g^{-1}(y_i)).
Here g^{-1}(y_i) is nothing but the inverse image of y_i under g, and P_X(g^{-1}(y_i)) is nothing but P({ω : X(ω) ∈ g^{-1}(y_i)}); I am going back from the measure P_X to the measure P. Now am I done? This equals sum_{i=1}^n y_i P({ω : g(X(ω)) = y_i}), which is exactly what we got for the first integral. So the two integrals are equal, and we have proved the theorem for simple functions. Next, for any non-negative function g, you would approximate g from below using simple functions and then use the monotone convergence theorem. I am out of time now, so we will stop here and resume next class with the more general case, for non-simple functions.