Let us get started again, with covariance. We talked about the expectation of a random variable, and we talked about the variance of a random variable; at that time we were dealing with one random variable at a time. Now we have extended our scope to multiple random variables, and we have talked about the joint CDF and the joint PDF. When we deal with multiple random variables, we said that they could be dependent, and we talked about independence; if we talk about independence, there has to be some corresponding notion of dependency as well. Now, how do we capture this notion of dependency? We need a notion that captures to what extent things are dependent: if they are not independent, there is some dependency, and we need to characterize how much they depend on each other. For that we use the notion of covariance, which for simplicity we will define between two random variables:

Cov(X1, X2) = E[(X1 - E[X1]) (X2 - E[X2])].

Notice the two pieces here. I am essentially defining a new random variable Y which is the product of two random variables: one is X1 with its mean subtracted, and the other is X2 with its mean subtracted. The covariance of X1 and X2 is then the expectation of Y, the quantity written above. So what the covariance is doing is taking the product of the two random variables after centralizing them. What I mean by centralizing is subtracting the mean: whenever we subtract the mean from a random variable, we say we have centralized it. So we centralize the two random variables, take their product, and look at its expectation, and we will now argue that this in a way captures how much they depend on each other, if at all.

One thing that comes out of simple algebraic manipulation: if you expand the product and apply the definition of expectation, you get

Cov(X1, X2) = E[X1 X2] - E[X1] E[X2].

By the way, how many of you know that the expectation operator is a linear operator? This was discussed in IE 621, right? What do I mean: say I have two random variables X1 and X2 and I take their sum Z = X1 + X2; then E[Z] = E[X1] + E[X2]. Does this require X1 and X2 to be independent? No, it does not matter what the joint distribution is; that is why we say expectation is a linear operator. So you use that linearity property here: multiply out (X1 - E[X1])(X2 - E[X2]) to get four terms, take expectations term by term, and after you simplify you get the expression above.

Now, one immediate observation: suppose X1 and X2 are independent. Then E[X1 X2] is nothing but E[X1] E[X2], so right away Cov(X1, X2) = 0. So if two random variables are independent, their covariance is 0. Then what does it mean if the covariance is not 0? It means they are not independent; there is some relation between them. So let us try to understand what this means.
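To make this concrete, here is a minimal numerical sketch of my own (not from the lecture slides), assuming NumPy is available. It estimates the covariance from samples using the centralized-product definition, once for an independent pair and once for a deliberately dependent pair; the variable names and the 0.5 noise factor are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Two independent standard normals: covariance should be near 0.
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
cov_indep = np.mean((x1 - x1.mean()) * (x2 - x2.mean()))

# A dependent pair: X3 = X1 + noise, so Cov(X1, X3) = Var(X1) = 1.
x3 = x1 + 0.5 * rng.normal(size=n)
cov_dep = np.mean((x1 - x1.mean()) * (x3 - x3.mean()))

print(f"independent pair: {cov_indep:+.4f}")  # close to 0
print(f"dependent pair:   {cov_dep:+.4f}")    # close to 1
```

For the independent pair the estimate is close to 0, while for the dependent pair it is close to Var(X1) = 1, which is exactly the behaviour the definition is supposed to capture.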
Let us take two events A and B on some sample space, and I am going to define two random variables. Random variable X1 takes the value 1 whenever event A occurs, and otherwise takes the value 0. Similarly, X2 is based on event B: if B occurs I take X2 to be 1, otherwise 0. Now let us try to understand what the covariance of these means. If you apply the definition of expectation, a quick calculation will verify that the covariance simplifies to

Cov(X1, X2) = P(X1 = 1, X2 = 1) - P(X1 = 1) P(X2 = 1).

Now suppose the covariance of X1 and X2 is positive; what does this indicate? If the left side is positive, the right side must also be positive, which implies P(X1 = 1, X2 = 1) > P(X1 = 1) P(X2 = 1); everybody agree? Now I will simplify further: I bring P(X2 = 1) into the denominator, so the ratio P(X1 = 1, X2 = 1) / P(X2 = 1) must be larger than P(X1 = 1). By the definition of conditional probability, that ratio is exactly P(X1 = 1 | X2 = 1); everybody agree with this definition of conditional probability? So the conditional probability is greater than P(X1 = 1), and note that every step here is an if-and-only-if: the covariance of X1 and X2 is positive if and only if, given that X2 = 1 has happened, the conditional probability that X1 = 1 is higher than the unconditional probability of X1 taking the value 1. That means if X2 has taken the value 1, then X1 is also more likely to take the value 1. So this is already telling us that if I know something about X2, I am able to infer something about X1; and not only that, but also in which direction they move: if X2 is 1, it is likely that X1 is also 1. There are only two values, 0 and 1, and this says X1 is pulled towards 1, not towards 0.

So a strictly positive covariance indicates that the occurrence or non-occurrence of one random variable provides some knowledge about the other; there is some dependency. If the covariance had been 0, as it is when the variables are independent, then by this same calculation the conditional probability would equal the unconditional one; but since we are assuming it is strictly greater than 0, the event X2 = 1 raises the probability that X1 = 1. In general the covariance can take positive or negative values, and whenever it is positive it is an indication that when one random variable increases, the other also increases in an average sense. Everything we are saying is in an average sense across all realizations, nothing is in terms of a single realization.
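Here is a small sketch of this indicator calculation (my own illustration, assuming NumPy; the specific events A = {U < 0.5} and B = {U < 0.3}, driven by one uniform draw, are hypothetical choices made just to create dependence):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

# One uniform draw drives both events, so they are dependent:
# A = {U < 0.5}, B = {U < 0.3}; here B occurring forces A to occur.
u = rng.uniform(size=n)
x1 = (u < 0.5).astype(float)  # indicator of event A
x2 = (u < 0.3).astype(float)  # indicator of event B

cov = np.mean(x1 * x2) - x1.mean() * x2.mean()
p_cond = x1[x2 == 1].mean()   # P(X1 = 1 | X2 = 1)

print(f"Cov(X1, X2)    = {cov:.3f}")     # 0.3 - 0.5*0.3 = 0.15 > 0
print(f"P(X1=1 | X2=1) = {p_cond:.3f}")  # 1.0, above P(X1=1) = 0.5
print(f"P(X1=1)        = {x1.mean():.3f}")
```

The covariance is positive and, just as the derivation says, the conditional probability P(X1 = 1 | X2 = 1) exceeds the unconditional P(X1 = 1).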
And similarly, whenever the covariance is less than 0, it is an indication that when X1 increases, X2 decreases; they do not move in the same direction. Again, this is in an average sense, not in the sense of any single realization.

Now, this definition of covariance has some straightforward properties. First, you can always ask for the covariance of a random variable with itself, and by definition Cov(X1, X1) is exactly Var(X1): if you make X2 equal to X1 in the definition, you get E[(X1 - E[X1])^2], which is exactly the definition of the variance of X1. Second, covariance does not depend on the ordering of the random variables: whenever you talk about two random variables, Cov(X1, X2) and Cov(X2, X1) are the same, because it does not matter which centralized term you multiply first. Third, if you scale one of the random variables, say X1, by a factor a, then the whole covariance gets scaled by that factor: Cov(aX1, X2) = a Cov(X1, X2). Fourth, if I have three random variables and one random variable Y which is the sum of two of them, Y = X1 + X2, and I look at the covariance between Y and X3, it can be written as Cov(Y, X3) = Cov(X1, X3) + Cov(X2, X3). Again, these follow from the definition of covariance; please check all of these yourself (a quick numerical check of the four properties is sketched below). Now I am going to switch to the next slide; meanwhile, if you have any questions about covariance or independence of random variables, you should ask now. OK, let us continue.
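The promised check: a minimal sketch of my own (not from the slides), assuming NumPy. It builds three mildly dependent random variables from independent normals and verifies the four properties on samples; the mixing coefficients 0.3 and 0.5 are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
z = rng.normal(size=(3, 1_000_000))
# Three mildly dependent random variables built from independent normals.
x1, x2, x3 = z[0], z[0] + 0.3 * z[1], z[1] + 0.5 * z[2]

def cov(a, b):
    """Sample version of E[(A - E[A]) * (B - E[B])]."""
    return np.mean((a - a.mean()) * (b - b.mean()))

print(np.isclose(cov(x1, x1), x1.var()))             # 1. Cov(X,X) = Var(X)
print(np.isclose(cov(x1, x2), cov(x2, x1)))          # 2. symmetry
print(np.isclose(cov(5 * x1, x2), 5 * cov(x1, x2)))  # 3. scaling by a = 5
print(np.isclose(cov(x1 + x2, x3),
                 cov(x1, x3) + cov(x2, x3)))         # 4. additivity
```

All four checks print True, since each identity holds exactly and the sample versions differ only by floating-point rounding.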
Someone asked whether moment generating functions were covered in IE 621; OK, we will go through them anyway. So far we have covered all of this: independence of random variables and correlation of random variables. What we will do now is quickly start migrating towards statistics, and the two limit theorems, the law of large numbers and the central limit theorem, are our bridges between probability and statistics. Once we conclude those two theorems, we are ready to get into statistics. But before that, let us understand a few more concepts of probability.

Say you have two random variables and you are trying to get new random variables out of them, and you may have multiple of them. One of them I get by applying some function g1, giving a new random variable Y1 = g1(X1, X2) which depends on both X1 and X2; I have another random variable Y2 = g2(X1, X2), again obtained from X1 and X2 by applying the function g2. So we are expanding: earlier I had just one function of one random variable, and now I have multiple random variables and functions applied to both of them. The question is how to derive the joint distribution of these two random variables Y1 and Y2.

Where might you end up in such a situation? Say I have two random variables X1 and X2 and you are interested in their sum and their difference: Y1 is the sum and Y2 is the difference, and you want to understand the joint distribution of Y1 and Y2. Another example is converting Cartesian to polar coordinates. Take some point (X1, X2) in Cartesian form; one thing you will be interested in is the magnitude, so you have the polar form (r, theta): the radius r, the distance from the origin, is given in terms of X1 and X2, and the angle theta also depends on X1 and X2. So based on these two random variables you have two new random variables, and you may want to know how the radius and the angle jointly behave; that is, you are interested in the joint distribution of Y1 and Y2.

How do we do this, and what are we given? Well, you know X1 and X2 and their joint distribution; say the joint distribution of X1 and X2 is available to you, and from that you need to find the joint distribution of Y1 and Y2. One obvious way: if I want the CDF of my new pair at a point (y1, y2), I simply integrate over all (x1, x2) in the region of interest,

F_Y(y1, y2) = integral of f_X(x1, x2) dx1 dx2 over {(x1, x2) : g1(x1, x2) <= y1, g2(x1, x2) <= y2},

where f_X, the joint PDF (not the CDF, sorry) of X1 and X2, is given to you. But this is often tedious, so what we will look into is a simpler method to compute this.

For that, we assume we have two equations: for given values x1 and x2 you get y1 = g1(x1, x2), and for the same x1 and x2, depending on your function g2, you get y2 = g2(x1, x2). Assume these are such that for a given value of (y1, y2) you can uniquely recover the corresponding (x1, x2) that results in that value. Also assume that your functions g1 and g2 have continuous partial derivatives; I say partial derivatives because each g is a function of multiple variables. Now we define the Jacobian at the point (x1, x2); actually I am taking the determinant of the Jacobian matrix, which is computed like this:

J(x1, x2) = det [ dg1/dx1   dg1/dx2
                  dg2/dx1   dg2/dx2 ],

where the first row corresponds to the function g1 and the second row to the function g2, and the derivatives are partial. Assume this Jacobian is not 0 at any point (x1, x2). Now, it so happens that the joint PDF of your new random variables at the point (y1, y2) is obtained simply by dividing the joint PDF of X1 and X2 by the absolute value of this determinant,

f_Y(y1, y2) = f_X(x1, x2) / |J(x1, x2)|,

evaluated at the (x1, x2) that solves y1 = g1(x1, x2) and y2 = g2(x1, x2). You do not need to do the complex integration above; and anyway that was only giving you the CDF, so to get the PDF you would then need to differentiate it, whereas here everything is expressed directly in terms of PDFs. How this comes about we will not prove; it is just for our use, and you are going to use it later, so I want you to be aware of this relation. The method, then, is: given two random variables and two functions, with Y1 and Y2 as the new random variables, to compute their joint distribution I first find the Jacobian, find the (x1, x2) that uniquely solves the equations for the given (y1, y2), and then apply this relation.
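Here is a small symbolic sketch of the Jacobian computation (my own illustration, assuming SymPy; the explicit polar formulas r = sqrt(x1^2 + x2^2) and theta = atan(x2/x1), valid in the first quadrant, are standard choices not spelled out on the slides):

```python
import sympy as sp

x1, x2 = sp.symbols("x1 x2", positive=True)

# Sum/difference transformation: g1 = x1 + x2, g2 = x1 - x2.
g1, g2 = x1 + x2, x1 - x2
J = sp.Matrix([[sp.diff(g1, x1), sp.diff(g1, x2)],
               [sp.diff(g2, x1), sp.diff(g2, x2)]])
print(J.det())  # -2, so the formula divides by |J| = 2

# Cartesian to polar (first quadrant, so atan(x2/x1) is the angle).
r = sp.sqrt(x1**2 + x2**2)
theta = sp.atan(x2 / x1)
J_polar = sp.Matrix([[sp.diff(r, x1), sp.diff(r, x2)],
                     [sp.diff(theta, x1), sp.diff(theta, x2)]])
print(sp.simplify(J_polar.det()))  # 1/sqrt(x1**2 + x2**2), i.e. 1/r
```

For the polar case the determinant is 1/r, so dividing by its absolute value gives the familiar f_{R,Theta}(r, theta) = r f_X(x1, x2).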
Here is the example I mentioned: take the function g1 to be the sum of the two random variables and g2 to be their difference, and assume that X1 is exponential with parameter lambda1 and X2 is exponential with parameter lambda2, and also assume that they are independent. So what is the joint density of X1 and X2? It is the product of the two exponential densities,

f_X(x1, x2) = lambda1 e^(-lambda1 x1) * lambda2 e^(-lambda2 x2), for x1, x2 >= 0.

Now, another thing we said is that our method applies if, for a given (y1, y2), we can uniquely obtain the (x1, x2) that solves the equations. In this case, given y1 and y2, you will see that x1 = (y1 + y2)/2 and x2 = (y1 - y2)/2; all you need to do is solve the equations. What I have is y1 = x1 + x2 and y2 = x1 - x2; for a given y1 and y2 I solve, and x1 and x2 come out as above. Now, if you go and compute the determinant of the Jacobian matrix, it is exactly -2. So now I simply apply the formula we discussed. The denominator is 2; but why did I take 2 when the determinant came out as -2? Notice that it is the absolute value of the determinant that goes in the denominator, and that is why it is 2 here. The numerator, the joint PDF, factors into the product of the marginals because X1 and X2 are independent, and all I have done is use that expression and replace x1 by (y1 + y2)/2 and x2 by (y1 - y2)/2:

f_Y(y1, y2) = (lambda1 lambda2 / 2) e^(-lambda1 (y1 + y2)/2) e^(-lambda2 (y1 - y2)/2),

which is now entirely in terms of y1 and y2 (on the region where both (y1 + y2)/2 and (y1 - y2)/2 are nonnegative, that is, y1 >= |y2|). There is another example on the slides, based on the Cartesian-to-polar transformation. One more thing about tomorrow's quiz: this question may or may not come in the quiz, but you had better work it out; if it comes, you will already have the solution. As for the moment generating function, that is a new topic, so maybe we will take it up in the next class. We will stop here.
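To convince yourself the formula is right, here is a quick Monte Carlo check of my own (not from the lecture, assuming NumPy; the parameter values lambda1 = 1, lambda2 = 2 and the test point (1.5, 0.5) are arbitrary choices). It simulates the two exponentials, forms the sum and difference, and compares the empirical density near the test point with the closed-form f_Y.

```python
import numpy as np

rng = np.random.default_rng(2)
lam1, lam2, n = 1.0, 2.0, 2_000_000

x1 = rng.exponential(scale=1 / lam1, size=n)  # Exp(lambda1)
x2 = rng.exponential(scale=1 / lam2, size=n)  # Exp(lambda2)
y1, y2 = x1 + x2, x1 - x2

# Closed-form joint pdf from the Jacobian method (valid for y1 >= |y2|).
def f_y(a, b):
    return lam1 * lam2 / 2 * np.exp(-lam1 * (a + b) / 2 - lam2 * (a - b) / 2)

# Empirical density: fraction of samples landing in a small box around
# the point (a, b), divided by the box area.
a, b, h = 1.5, 0.5, 0.05
in_box = (np.abs(y1 - a) < h / 2) & (np.abs(y2 - b) < h / 2)
print(f"empirical: {in_box.mean() / h**2:.4f}")  # roughly 0.135
print(f"formula  : {f_y(a, b):.4f}")             # exp(-2), about 0.1353
```

The two numbers agree up to Monte Carlo noise, which is a useful sanity check to run whenever you apply the Jacobian formula to a new transformation.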