Hi, I'm Zor. Welcome to Unizor Education. I would like to talk about dependency between random variables, and about independence as well. This is part of a course in advanced mathematics for teenagers and high school students, presented on Unizor.com. I suggest you watch from that website, because every lecture has comments there, and the functionality of the site can inspire you to get a little deeper into studying mathematics: it includes exams which you can take under the supervision of your parents or teachers, you can enroll in courses and see your progress, and the site is free, by the way.

All right, so we'll talk about correlation between random variables. The word correlation means that they are dependent in some way, and the purpose of this lecture is to establish a numerical characteristic of this dependency.

Let me start by presenting the problem. We have two random variables, and to set them up, let me start from the very foundations. We have one probability space which contains elementary events e₁, …, eₘ with corresponding probabilities p₁, …, pₘ. And then there is another probability space with elementary events f₁, …, fₙ and corresponding probabilities q₁, …, qₙ. So that's what we have: one set of elementary events with their probabilities, and another one with other elementary events and their probabilities. On these two probability spaces we define two different random variables: ξ takes the value xᵢ on the elementary event eᵢ, and η takes the value yⱼ on fⱼ. So we can say that P(ξ = xᵢ) = pᵢ and P(η = yⱼ) = qⱼ. Those are our two random variables.

Now, first of all, let's think about their independence. What does it mean that they are independent of each other? It means that the conditional probability of one of them taking some value, say ξ taking the value xᵢ, under the condition that η has already taken the value yⱼ, is equal to the unconditional probability:

P(ξ = xᵢ | η = yⱼ) = P(ξ = xᵢ).

If this is true for all values xᵢ and yⱼ, then we are talking about ξ being independent of η. Notice that, as written, this definition is not symmetrical: it only says that ξ is independent of η.

Independent variables have certain properties, which we have already discussed; I'll just very quickly mention them. The first one is symmetry: if ξ is independent of η, then η is independent of ξ. I have proven this in the corresponding lecture of this course, and it means that

P(η = yⱼ | ξ = xᵢ) = P(η = yⱼ).

The second property which I would like to mention is the product of probabilities. The probability of the combined event, that ξ has taken the value xᵢ and η has taken the value yⱼ, is equal to the product of the individual probabilities:

P(ξ = xᵢ and η = yⱼ) = P(ξ = xᵢ)·P(η = yⱼ).

This is true for independent variables; we have already discussed it when we were talking about independence.
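To make the product rule concrete, here is a minimal numerical sketch, assuming two small made-up discrete distributions (the values and probabilities below are illustrative, not taken from the lecture). It builds the joint distribution of two independent variables as a product and checks that the conditional probabilities coincide with the unconditional ones:

```python
import numpy as np

# Hypothetical distributions, chosen only for illustration
x_vals = np.array([1.0, 2.0, 3.0])   # values x_i of xi
p = np.array([0.2, 0.5, 0.3])        # probabilities p_i
y_vals = np.array([-1.0, 1.0])       # values y_j of eta
q = np.array([0.4, 0.6])             # probabilities q_j

# For independent variables the joint probability is the product:
# P(xi = x_i and eta = y_j) = p_i * q_j
joint = np.outer(p, q)

# Conditional probability P(xi = x_i | eta = y_j) = joint[i, j] / q_j
conditional = joint / q              # divides each column by q_j

# For every j the conditional column equals the unconditional p
for j in range(len(y_vals)):
    assert np.allclose(conditional[:, j], p)
print("conditional equals unconditional for every y_j")
```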
And the third one, which I recently added to the course in the lecture just before this one: the mathematical expectation of the product of two independent random variables is equal to the product of their mathematical expectations,

E[ξη] = E[ξ]·E[η].

So just remember these three properties: symmetry (if ξ is independent of η, then η is independent of ξ), the theorem that the probability of the combined event for two independent variables equals the product of the probabilities, and the multiplication of the mathematical expectations.

So much for independent random variables. But what if they are not independent? How can we measure the dependency between two variables when independence does not hold? The whole lecture is about introducing this measure; it's a number, basically. I will introduce it in two steps: first I will talk about covariance, and then about correlation. My purpose is to establish a numerical characteristic which, when I get to correlation, will be equal to zero for independent variables and equal to one for very rigidly dependent random variables. What is rigidly dependent? For instance, η = a·ξ, where a is some non-zero constant. This is a very, very rigid dependency: the value of ξ determines the value of η with probability one, which means definitely. For this type of case I would like my numerical measure of dependency to be at its maximum, and I have chosen that maximum so that it equals one or minus one: one for positive a and minus one for negative a. Why is that appropriate? Because if ξ is increasing and a is negative, then η is decreasing, so it's more convenient to have the coefficient be negative in that case. And for all other, partial, dependencies it should lie between minus one and one. That's my purpose, that's my goal. So let's see how I can accomplish it.

The first step is to introduce the concept of covariance between two random variables. It's just a definition, and let me express it the way I consider important. First, I centralize my variables, that is, subtract their expectations; then I take their product; and then I take the mathematical expectation of this product:

Cov(ξ, η) = E[(ξ − E[ξ])·(η − E[η])].

I claim that this is a good measure (not my final measure, but a good one) of whether there is a dependency between the two random variables ξ and η. And here is why. First of all, it makes sense to open the parentheses, and you will see that the expression simplifies a little:

(ξ − E[ξ])·(η − E[η]) = ξη − ξ·E[η] − η·E[ξ] + E[ξ]·E[η].

Now, the mathematical expectation of a sum of random variables is always the sum of their mathematical expectations, and a constant factor (note that E[ξ] and E[η] are constants) can be taken outside of the expectation; you remember this. So, taking the expectation term by term:

Cov(ξ, η) = E[ξη] − E[ξ]·E[η] − E[ξ]·E[η] + E[ξ]·E[η].
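If you would like to verify this identity numerically, here is a small sketch. The normal distributions, the sample size, and the dependent pair η = 2ξ + noise are arbitrary illustrative assumptions, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# An arbitrary dependent pair, just to exercise the identity
xi = rng.normal(size=n)
eta = 2.0 * xi + rng.normal(size=n)

# Definition: expectation of the product of centralized variables
cov_def = np.mean((xi - xi.mean()) * (eta - eta.mean()))

# Simplified formula after opening the parentheses
cov_simple = np.mean(xi * eta) - xi.mean() * eta.mean()

print(cov_def, cov_simple)   # the two agree up to float rounding
assert np.isclose(cov_def, cov_simple)
```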
And as you see, the last term cancels one of the two middle terms, and I have a simpler formula for covariance:

Cov(ξ, η) = E[ξη] − E[ξ]·E[η].

Now you see that for independent ξ and η this is equal to zero, because, if you remember, I just mentioned that for independent variables the mathematical expectation of the product equals the product of the mathematical expectations. So for independent variables the covariance is exactly zero, which is good.

Now let's consider dependent variables, taking this simpler expression as the definition of covariance. Suppose η = a·ξ. What will the covariance be in this particular case? Well, ξ·η = a·ξ², and the constant a can be taken out of the expectation, so

Cov(ξ, aξ) = E[a·ξ²] − E[ξ]·E[a·ξ] = a·(E[ξ²] − (E[ξ])²).

And what is the expression in parentheses, the expectation of the square of the variable minus the square of its expectation? That's the variance. So, as we see, my covariance is equal to the multiplier a times Var(ξ). Now we see that if a is positive, the covariance is positive, because variance is always non-negative; if a is negative, the covariance is negative. And this coefficient a is basically a measure of how fast one variable changes with a change in the value of the other.

One more example, which is really a special case of this: what is the covariance of a variable with itself? Well, in this case a = 1, so Cov(ξ, ξ) = Var(ξ).

And finally, I would like to consider a slightly more complicated case. I call it half dependency. It's not an official term which people use in textbooks; I just invented it right now. Here is what I mean by it. Suppose you have two random variables ξ and ξ′ which are identically distributed and independent of each other. I would like to know the covariance between ξ and the average of ξ with ξ′, that is, between ξ and (ξ + ξ′)/2. Well, let's just do it; it's a simple computation. It's the expectation of the product minus the product of the expectations:

Cov(ξ, (ξ + ξ′)/2) = E[ξ·(ξ + ξ′)/2] − E[ξ]·E[(ξ + ξ′)/2].

What is the expectation of (ξ + ξ′)/2? The expectation of a sum is the sum of the expectations, and since ξ and ξ′ are identically distributed, E[ξ′] = E[ξ], so it is twice E[ξ] divided by 2, that is, simply E[ξ]. The first term is also a sum, so it splits into ½·E[ξ²] + ½·E[ξξ′]. Now, ξ and ξ′ are independent (I said that in the very beginning), which means the expectation of their product equals the product of their expectations, and those are the same, so E[ξξ′] = (E[ξ])². Putting it all together:

Cov(ξ, (ξ + ξ′)/2) = ½·E[ξ²] + ½·(E[ξ])² − (E[ξ])² = ½·(E[ξ²] − (E[ξ])²) = ½·Var(ξ).

If you remember the previous case, the covariance of ξ with itself was Var(ξ). Now, when I try to establish the dependency between ξ and this half-sum of ξ with another variable, independent of ξ and identically distributed, I get half of that. So what does it mean?
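Both of these results can be checked with a quick Monte Carlo sketch. The exponential distribution for ξ, the constant a = -3, and the helper function cov below are my own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

# Arbitrary distribution for xi, plus an independent iid copy xi'
xi = rng.exponential(scale=2.0, size=n)
xi_prime = rng.exponential(scale=2.0, size=n)
a = -3.0

def cov(u, v):
    """Cov(u, v) = E[uv] - E[u]*E[v]."""
    return np.mean(u * v) - np.mean(u) * np.mean(v)

# Rigid dependency eta = a*xi: covariance should be a*Var(xi)
print(cov(xi, a * xi), a * np.var(xi))

# Half dependency eta = (xi + xi')/2: covariance should be Var(xi)/2
print(cov(xi, (xi + xi_prime) / 2), np.var(xi) / 2)
```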
Well, it means that covariance seems to be a well-defined measure of the dependency between variables. In the previous case the two variables were literally identical, which is the most rigid connection possible. Here it's not quite that: half of this new variable is still rigidly dependent on ξ, and the other half is completely independent of it. So it's half dependency; that's exactly why I use that term. And as we see, the covariance has been reduced by half because of that.

So these are examples of how covariance represents dependency. But this is not all. You see that it depends on the variance, and I would like a measure of dependency which is just a number, without any relationship to other characteristics of the variables. And this is called correlation. That's the new definition which I'm going to introduce: the correlation between two variables is their covariance divided by the square root of the product of their variances,

ρ(ξ, η) = Cov(ξ, η) / √(Var(ξ)·Var(η)).

Why is this better? Well, because covariance, as you see, depends on the variances of the components. If I divide by this square root, you will see that the result is always between minus one and one, with one being the most dependent, minus one being the most dependent in the negative sense, and zero meaning independent.

So let's just think about it again and go through all my three examples. The first example is independence. If ξ and η are independent, then as we know their covariance equals zero, which means the correlation equals zero.

Second, let's check the very simple, very strict, rigid dependency η = ξ. What happens in this case? Well, as we saw, Cov(ξ, ξ) equals Var(ξ), and it is divided by √(Var(ξ)·Var(ξ)). Under the square root we have the variance squared, and the arithmetic root of the variance squared is the variance itself. So the ratio is one.

Now the third case: η = a·ξ. What happens here? Well, we saw that the covariance equals a·Var(ξ). In the denominator I have √(Var(ξ)·Var(aξ)) = √(a²·Var(ξ)²) = |a|·Var(ξ), since the arithmetic square root of a² is the absolute value of a. The variance is positive, so we can cancel it out, and I am left with a divided by |a|. So if a is positive, I have one; if a is negative, I have minus one. Exactly the way I wanted it to be: if the rigid dependency comes with a positive coefficient a, the correlation is one; if a is negative, the correlation is minus one.
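Here is a sketch that checks these three cases by simulation. The standard normal distribution, the sample size, and the helper corr are my own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

def corr(u, v):
    """rho(u, v) = Cov(u, v) / sqrt(Var(u)*Var(v))."""
    c = np.mean(u * v) - np.mean(u) * np.mean(v)
    return c / np.sqrt(np.var(u) * np.var(v))

xi = rng.normal(size=n)
other = rng.normal(size=n)      # independent of xi

print(corr(xi, other))          # ~ 0:  independent variables
print(corr(xi, xi))             # ~ 1:  eta = xi
print(corr(xi, 5.0 * xi))       # ~ 1:  eta = a*xi, a > 0
print(corr(xi, -5.0 * xi))      # ~ -1: eta = a*xi, a < 0
```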
And the last example, just to show you how sensitive this correlation coefficient is: let's consider the half-dependent variables, so we don't forget them. I want the correlation between ξ and η = (ξ + ξ′)/2, where ξ′ is identically distributed as ξ and independent of it. You remember that the covariance equals ½·Var(ξ). In the denominator I have the square root of Var(ξ) times Var(η). Now, the constant ½ comes out of the variance squared, as ¼, and the variance of a sum of two independent variables is the sum of their variances; we know that property of independent variables. And their variances are the same, because they are identically distributed, so:

Var((ξ + ξ′)/2) = ¼·(Var(ξ) + Var(ξ′)) = ¼·2·Var(ξ) = ½·Var(ξ).

So the correlation is

ρ = ½·Var(ξ) / √(Var(ξ)·½·Var(ξ)) = ½ / √½ = √½ = √2/2 ≈ 0.707.

The variance cancels out, and I have one half divided by the square root of one half, which is the square root of two over two. Honestly, at first I thought I was missing something here: in my notes I had put one half, for whatever reason, expecting that half dependency would give a correlation of exactly one half. Apparently it does not, and I'll probably have to revise my notes. But in any case, the result is obviously less than one, and I just wanted to illustrate that we get a reasonable number which corresponds to the degree of dependency. (A quick simulation of this value, and of the caveat below, is sketched at the end.)

And this correlation coefficient is exactly what is being used for this purpose. If you would like to know whether two random variables, whose distributions you know, are correlated or not, or dependent or not, you calculate the correlation coefficient. If it's zero, that's a good indication that they are independent. Now, it does not mean that if the correlation is zero, then the variables are independent. It means only that they are probably independent, because the theorem goes only one way: if the variables are independent, then the correlation is zero. There is no converse theorem in this particular case. So if you've got a correlation of zero, independence is suspected, but it's not necessarily true, because by sheer accident the probabilities can be distributed in such a way that this calculation gives you zero.

Basically, that's it for today. Most importantly, this particular property of the correlation coefficient is used in statistics, obviously, because in statistics you don't know the probability distribution, so you cannot exactly calculate the covariance or the variance; you can only estimate them from statistical data. And there, a correlation coefficient which is close to one, or significantly different from zero, is a strong indication that there is a dependency between the random variables you are examining by statistical methods. But that would be a completely different lecture. Okay, that's it for today. Thank you very much, and good luck.
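Here is the closing sketch mentioned above. It checks by simulation that the half-dependent correlation comes out as √2/2 rather than ½, and it illustrates the caveat that zero correlation does not imply independence, using η = ξ² with a symmetric ξ (that particular counterexample is my own illustration, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000

def corr(u, v):
    """Sample version of Cov(u, v) / sqrt(Var(u)*Var(v))."""
    c = np.mean(u * v) - np.mean(u) * np.mean(v)
    return c / np.sqrt(np.var(u) * np.var(v))

xi = rng.normal(size=n)
xi_prime = rng.normal(size=n)    # iid copy of xi, independent of it

# Half dependency: the correlation is sqrt(2)/2 ~ 0.707, not 1/2
print(corr(xi, (xi + xi_prime) / 2), np.sqrt(2) / 2)

# Zero correlation does NOT imply independence: eta = xi**2 is
# rigidly determined by xi, yet for a symmetric xi its correlation
# with xi is zero
print(corr(xi, xi ** 2))         # ~ 0, although the pair is dependent
```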