Hi, I'm Zor. Welcome to Unisor Education. We will talk about the correlation between two random variables. We will just do some calculations based on what we know about these random variables. In the previous lecture, which was called Problems No. 4, I introduced these two random variables and did some preliminary calculations. Now we will go into the covariance and correlation for these random variables.

This lecture is part of the course of advanced mathematics for teenagers and high school students presented on Unisor.com. I do recommend you watch it from that website, because it has notes, and registered students can follow an educational process that includes enrollment and taking exams.

All right. First of all, let me repeat a little bit of what I did in the previous lecture. I introduced two random variables: $\xi$, which takes value $x_1$ or $x_2$, only two values, with corresponding probabilities $p$ and $1-p$; and $\eta$, which takes value $y_1$ or $y_2$, also only two values, with corresponding probabilities $q$ and $1-q$. I also specified the combined probability of $\xi$ taking $x_1$ and $\eta$ taking $y_1$ as some number $r$. Now, why do I need this probability? It describes their joint distribution, and that's exactly what I need if I would like to, let's say, average their product, which is part of the covariance calculation, right?

Now, from this, I can derive all the other combinations. If $P(\xi=x_1,\ \eta=y_1)=r$, then this probability combined with $P(\xi=x_1,\ \eta=y_2)$ should give the probability of $\xi$ taking $x_1$ regardless of what value $\eta$ takes, right? These are two different events. They're non-intersecting, obviously, because $y_1$ and $y_2$ are two different values. But the sum of these two is supposed to cover all the events where $\xi$ takes $x_1$ and it doesn't really matter what $\eta$ takes.
So, it's supposed to be equal to $p$, right? So, by necessity, $P(\xi=x_1,\ \eta=y_2)=p-r$.

Very similarly, take the probability of $\xi$ taking $x_2$ and $\eta$ taking $y_1$. Let's just think about it. The first probability is $r$. Now, the first and the third combined mean that $\eta$ takes $y_1$ and $\xi$ is irrelevant, because it takes either $x_1$ or $x_2$, we don't care. Which means the sum of these is supposed to be equal to $q$. That's why $P(\xi=x_2,\ \eta=y_1)=q-r$.

And finally, the probability of $\xi$ taking $x_2$ and $\eta$ taking $y_2$. This one plus the third one is the probability of $\xi$ taking $x_2$ irrespective of $\eta$, which is $1-p$. So, this is supposed to be $1-p-(q-r)=1-p-q+r$.

So, these are the four combined probabilities. I can write them a little bit shorter, so it doesn't take as much time. Let's call them this way:

$$p_{11}=r,\qquad p_{12}=p-r,\qquad p_{21}=q-r,\qquad p_{22}=1-p-q+r.$$

So, those are all the combined probabilities, and now we have enough information to calculate the covariance. The covariance is equal to the expectation of the product minus the product of the expectations:

$$\mathrm{Cov}(\xi,\eta)=E(\xi\eta)-E(\xi)\,E(\eta).$$

In the lecture where I introduced covariance and correlation, I derived this formula. If you don't remember it, please refer to the corresponding lecture on correlation, all right?

So, let's just calculate it. That's kind of easy. What is the expectation of the product? Well, the product can take all the different values that arise when one random variable takes one of its values and the other takes one of its own. So, what kind of values do we have? $x_1y_1$, $x_1y_2$, $x_2y_1$, and $x_2y_2$. There are only four combined values for the product $\xi\cdot\eta$. And each one of those products has a corresponding probability, a weight, which is supposed to be used when we are calculating this mathematical expectation.
So, it's

$$E(\xi\eta)=x_1y_1\,r+x_1y_2\,(p-r)+x_2y_1\,(q-r)+x_2y_2\,(1-p-q+r).$$

Now, minus the product of expectations. What is the expectation of $\xi$? Well, it's $x_1p+x_2(1-p)$, multiplied by the expectation of $\eta$, which is $y_1q+y_2(1-q)$. So, this is the formula for our covariance:

$$\mathrm{Cov}(\xi,\eta)=x_1y_1\,r+x_1y_2\,(p-r)+x_2y_1\,(q-r)+x_2y_2\,(1-p-q+r)-\bigl[x_1p+x_2(1-p)\bigr]\bigl[y_1q+y_2(1-q)\bigr].$$

The only thing is, it's a little bit long, so let's try to simplify it. I will group together the corresponding $x$ and $y$.

So, $x_1y_1$ has $r$ from the first part, and from the product of expectations it has $-pq$, correct? That's $r-pq$.

Now, $x_1y_2$. We have $p-r$, and from the product we subtract $p(1-q)=p-pq$, so it's $p-r-p+pq=-r+pq$.

Next, $x_2y_1$. We have $q-r$, and from the product we subtract $(1-p)q=q-pq$, so it's $q-r-q+pq=-r+pq$.

And the last one, $x_2y_2$. We have $1-p-q+r$ from here, and from the product we subtract $(1-p)(1-q)=1-p-q+pq$, which gives $-1+p+q-pq$. The $1$, the $p$ and the $q$ cancel, and what remains is $r-pq$.

So, that's quite convenient. We have $r-pq$, then $-r+pq$, then again $-r+pq$, and here $r-pq$. It's very symmetrical. Basically, we can factor out $r-pq$. What will be inside? It will be $x_1y_1$, then $x_1y_2$ with a minus sign, then $x_2y_1$ also with a minus sign, because it's $-r+pq$, right? And $x_2y_2$ will be with a plus sign:

$$\mathrm{Cov}(\xi,\eta)=(r-pq)\,(x_1y_1-x_1y_2-x_2y_1+x_2y_2).$$

Now, look, this is also $x_1(y_1-y_2)-x_2(y_1-y_2)$, so it's basically $(x_1-x_2)(y_1-y_2)$. So, it's a relatively symmetrical and compact expression for the covariance. Let's just remember this. This is our covariance:

$$\mathrm{Cov}(\xi,\eta)=(r-pq)\,(x_1-x_2)\,(y_1-y_2).$$
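The simplification above can be checked numerically. The following is a minimal sketch, not part of the lecture: the function names and the sample numbers (`x`, `y`, `p`, `q`, `r`) are made up for illustration. It computes the covariance once directly from the four combined probabilities and once from the compact formula, so the two can be compared.

```python
def joint_table(p, q, r):
    """Four combined probabilities p11, p12, p21, p22, keyed by (i, j)."""
    return {
        (1, 1): r,
        (1, 2): p - r,
        (2, 1): q - r,
        (2, 2): 1 - p - q + r,
    }


def covariance_direct(x, y, p, q, r):
    """Cov = E(xi*eta) - E(xi)*E(eta), summed over the four outcomes."""
    table = joint_table(p, q, r)
    e_product = sum(x[i - 1] * y[j - 1] * prob for (i, j), prob in table.items())
    e_xi = x[0] * p + x[1] * (1 - p)
    e_eta = y[0] * q + y[1] * (1 - q)
    return e_product - e_xi * e_eta


def covariance_closed_form(x, y, p, q, r):
    """Compact expression (r - pq)(x1 - x2)(y1 - y2)."""
    return (r - p * q) * (x[0] - x[1]) * (y[0] - y[1])


# Illustrative values only (assumed, not from the lecture).
x = (3.0, -1.0)   # values of xi
y = (2.0, 5.0)    # values of eta
p, q, r = 0.4, 0.7, 0.25

direct = covariance_direct(x, y, p, q, r)
closed = covariance_closed_form(x, y, p, q, r)
print(direct, closed)  # the two numbers should agree up to rounding
```

The chosen `p`, `q`, `r` keep all four combined probabilities non-negative; any other choice should respect that constraint as well.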
That's quite an interesting covariance, by the way, because already from this we see a confirmation of what happens if our variables are independent of each other. Independent means that the probability of them taking some combined value, one equal to one value and the other equal to another value, equals the product of their individual probabilities. Right? Remember this. So, for independent variables, the probability of $\xi$ taking $x_1$ and $\eta$ taking $y_1$ is supposed to be equal to the probability of $\xi$ taking $x_1$ times the probability of $\eta$ taking $y_1$, which is what? This is $p$ and this is $q$. And the combined probability is what we decided at the very beginning to call $r$, right? So, if $r=pq$ for independent variables, then $r-pq$ is equal to 0, and the covariance is equal to 0.

So, with this formula we just have a confirmation. I mean, we did have it in the general case, but this confirms that our calculations are not incorrect. I'm saying "not incorrect" rather than "correct", because if it were not equal to 0 in this particular case, that would mean we made a mistake. So, we didn't make a mistake.

All right. So, that's our covariance. Now, if you remember, our purpose was actually to introduce the correlation coefficient, which is between $-1$ and $1$ and signifies the level of dependency: it is zero for independent variables, and for completely dependent variables it's $1$ or $-1$, depending on whether they move in the same direction or in opposite directions. Everything else, partial dependency, should be somewhere between $-1$ and $1$. And if you remember, our correlation coefficient, which we marked with the letter $R$, was the covariance between them divided by the square root of the product of their variances:

$$R(\xi,\eta)=\frac{\mathrm{Cov}(\xi,\eta)}{\sqrt{\mathrm{Var}(\xi)\,\mathrm{Var}(\eta)}}.$$

Okay? Now, we did calculate the variance in the previous lecture. It's relatively easy, and I will do it again right now. The variance of $\xi$ is equal to the expectation of its square minus the square of its expectation, $\mathrm{Var}(\xi)=E(\xi^2)-[E(\xi)]^2$. So, the expectation of the square is what?
It's $x_1^2\,p+x_2^2\,(1-p)$, minus the square of the expectation, which is $[x_1p+x_2(1-p)]^2$. That's what it is, right?

$$\mathrm{Var}(\xi)=x_1^2\,p+x_2^2\,(1-p)-\bigl[x_1p+x_2(1-p)\bigr]^2.$$

Let's just open the parentheses:

$$\mathrm{Var}(\xi)=x_1^2\,p+x_2^2-x_2^2\,p-x_1^2\,p^2-2x_1x_2\,p(1-p)-x_2^2\,(1-p)^2.$$

Okay. I hope I didn't make a mistake. All right, $x_2^2$: I see $+x_2^2$ here, and from $-x_2^2(1-p)^2=-x_2^2(1-2p+p^2)$ I see $-x_2^2$ times one, so those cancel. Okay, that's good. What else? For $x_1^2$ we have $p-p^2$, and that's it. Now, $x_2^2$ again: from the last term there is $+2p\,x_2^2$, and we had $-x_2^2\,p$, so I will have $+x_2^2\,p$, and then $-x_2^2\,p^2$. So $x_2^2$ also comes with $p-p^2$. We have covered this, and what else? The only thing left is $-2x_1x_2\,p(1-p)$.

Well, $p-p^2$ is the same as $p(1-p)$, right? So, it's $p(1-p)$ factored out, and inside I will have $x_1^2-2x_1x_2+x_2^2$, which is $(x_1-x_2)^2$, with the square outside of the parentheses:

$$\mathrm{Var}(\xi)=p(1-p)\,(x_1-x_2)^2.$$

So, that's what we have to put under the square root: $(x_1-x_2)^2\,p(1-p)$. And the same thing for $\eta$, which is $\mathrm{Var}(\eta)=q(1-q)\,(y_1-y_2)^2$.

So, all we need right now is to simplify this formula as much as possible. In the numerator I will put what we have for the covariance:

$$R(\xi,\eta)=\frac{(r-pq)\,(x_1-x_2)\,(y_1-y_2)}{\sqrt{(x_1-x_2)^2\,p(1-p)\,(y_1-y_2)^2\,q(1-q)}}.$$

Now, what's interesting, by the way: you see $x_1-x_2$ on top, and below there is the square root of $(x_1-x_2)^2$. And the same thing with $y$. So, it's tempting just to cancel them out, right? Because this is the square root of a square. Now, that's not exactly the right thing to do, because $x_1-x_2$ can be positive or negative, depending on which of the two values is bigger.
And $\sqrt{(x_1-x_2)^2}$ is always positive, because the square root of $a^2$ is the absolute value of $a$, right? The square root without a plus-minus sign is assumed to be the arithmetic value, which is only the positive one. So, it's not as simple as just cancelling it out; we will do it differently. We will introduce a number which I designate as $k$. Now, $k$ is either $+1$ or $-1$, depending on the sign of the product $(x_1-x_2)(y_1-y_2)$. That's what it is, right? So, let's just remember this.

Then we can get rid of the $x$'s and $y$'s. What remains is only

$$R(\xi,\eta)=k\,\frac{r-pq}{\sqrt{p(1-p)\,q(1-q)}},\qquad k=\pm1.$$

We can't really simplify it any further. It's simple enough. Here we see how the correlation coefficient depends on the individual probabilities $p$ and $q$ and on the parameter $r$, which characterizes the variables taking certain values simultaneously, the joint distribution of the values of $\xi$ and $\eta$. So, this is the coefficient of correlation.

Let's just think about this again. Obviously, if the variables are independent, then it's equal to zero, because $r$ is equal to $pq$. $r$ is the probability of $\xi$ taking $x_1$ and $\eta$ taking $y_1$ simultaneously. But if they're independent, it's equal to the probability of one times the probability of the other, which is $p$ times $q$. So, for independent variables, that seems to be correct.

Now, let's think about what happens in the case when $\xi$ is equal to $\eta$. What does it mean? Well, first of all, it means that $x_1=y_1$ and $x_2=y_2$. They're exactly the same, right? But it also means that whenever $\xi$ takes $x_1$, $\eta$ inevitably takes the same value $x_1$, which is equal to $y_1$, right? So, what does that mean in terms of $r$? Well, let's think again.
What's the probability of $\xi$ taking $x_1$ and simultaneously $\eta$ also taking the value $x_1$? Well, if they are completely dependent, then it's exactly the same as the probability of $\xi$ taking $x_1$, right? Which is $p$. So, my $p_{11}$ is equal to $p$. My $p_{12}$ is equal to zero, right? The probability of $\xi$ taking $x_1$ and $\eta$ taking $x_2$, which is equal to $y_2$, is zero, because they never take different values. They're always the same. And $p_{21}$ is equal to zero for the same reason, and $p_{22}$ is equal to $1-p$, right?

So, what do we have in this case? Well, $p$ and $q$ are the same now, and $r=p$. So, in the numerator I will have $r-pq=p-p^2$, which is $p(1-p)$. In the denominator, $q(1-q)$ is the same as $p(1-p)$, so the square root of $p(1-p)\cdot p(1-p)$ is also $p(1-p)$. And $k$ is $+1$, because $(x_1-x_2)(y_1-y_2)=(x_1-x_2)^2$ is positive. So, this is equal to one. The correlation is equal to one in this case, as it's supposed to be. So, basically, this formula does make sense. By the way, if I do it differently, with the variables going in opposite directions, then I will have minus one here, for obvious reasons. That's why $k$ is either plus one or minus one.

So, that's basically the whole problem. Our correlation coefficient has been calculated, and you have a feeling for how it's calculated.

Now, in this case, we are within the theory of probabilities, where all these numbers are given to us. Practically, it's all used for statistical purposes, because we don't really know the probabilities. What we do know is statistics: we have the sample data, etc. That's why what's important is to calculate our correlation coefficient based on statistical data and then see how close it is to zero, or to one, or to something in between, or to minus one, because that gives us a very good picture of how two random variables are related to each other. And the perfect example, which I mentioned in the previous lecture, is a drug and some illness which this drug is supposed to treat.
So, the effect of the drug on the effectiveness of the treatment can be statistically evaluated, because we basically have statistics: we give so much of this particular drug, and we have these particular results. The question is whether they are related or not. If the correlation between giving the drug and the effectiveness of the treatment is very close to zero, it means, well, the drug is basically useless. If the correlation is closer to one, obviously it means that the drug is effective. So, that's where we have to make a judgment about how much positive correlation we consider sufficient to recognize this drug as an effective treatment for that particular illness. But that will be the purpose of our statistical discussions in a different lecture. Okay, that's it. Thanks very much and good luck.
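For readers who want to try the correlation formula from this lecture on numbers, here is a minimal sketch. It is an illustration under stated assumptions, not part of the lecture: the function names, the sample values and the simulated sample are all made up. It evaluates the formula $k\,(r-pq)/\sqrt{p(1-p)\,q(1-q)}$, checks the special cases discussed above, and then estimates the same coefficient from simulated sample data, in the spirit of the closing remarks about statistics.

```python
import math
import random


def correlation(x, y, p, q, r):
    """Correlation of two two-valued variables from the lecture's formula.

    k is +1 or -1 according to the sign of (x1 - x2)(y1 - y2).
    """
    k = 1.0 if (x[0] - x[1]) * (y[0] - y[1]) > 0 else -1.0
    return k * (r - p * q) / math.sqrt(p * (1 - p) * q * (1 - q))


def sample_correlation(pairs):
    """Plain Pearson correlation computed from a list of (xi, eta) pairs."""
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs) / n
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs) / n
    var_y = sum((b - mean_y) ** 2 for _, b in pairs) / n
    return cov / math.sqrt(var_x * var_y)


# Illustrative values only (assumed, not from the lecture).
x = (3.0, -1.0)
y = (2.0, 5.0)
p, q, r = 0.4, 0.7, 0.25

theoretical = correlation(x, y, p, q, r)

# Special cases discussed in the lecture.
independent = correlation(x, y, p, q, p * q)            # r = pq
identical = correlation(x, x, p, p, p)                  # eta equals xi
opposite = correlation(x, (-x[0], -x[1]), p, p, p)      # eta equals -xi

# Simulated sample drawn from the four combined probabilities.
rng = random.Random(0)  # fixed seed so the run is repeatable
outcomes = [(x[0], y[0]), (x[0], y[1]), (x[1], y[0]), (x[1], y[1])]
weights = [r, p - r, q - r, 1 - p - q + r]
sample = rng.choices(outcomes, weights=weights, k=20000)
estimated = sample_correlation(sample)

print(theoretical, estimated)
print(independent, identical, opposite)
```

The estimate from the simulated sample should land near the theoretical value, with the gap shrinking as the sample grows; how close is "close enough" is exactly the statistical question the lecture leaves for later.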