So, what are the characteristics of the correlation coefficient? A correlation is a numerical value that describes and measures three characteristics of the relationship between the x and y variables. It is denoted by r, and it ranges from -1 to +1, so its magnitude ranges from 0 to 1. The first thing it describes is the direction of the relationship. A correlation can be positive or negative. A positive correlation means that if one variable is increasing, the other is increasing as well. For example, if I invest more hours in studying, my GPA will increase: one is increasing, the other is increasing, so it is a positive correlation. In the same way, if one is decreasing and the other is also decreasing, it is again a positive correlation, because the two variables are moving in the same direction.
This is an example of a positive correlation. You can see that this axis is temperature in degrees and this one is the amount of beer sold. Of course, this is from the Gravetter book. In all the slides I am using your textbook, Gravetter, which I showed you in the first lecture, so all the examples I use in these lectures have been taken from Gravetter. Here you can see that as temperature increases, the consumption of beer also increases. When the temperature is 20, the amount sold is low, but when the temperature is around 80, the amount sold is higher. So, this is called a positive correlation: one is increasing and the other is also increasing.
Here is an example of a negative correlation. A negative correlation means that if one variable is increasing, the other is decreasing. As you can see, this is 20 and this is 60, and along the continuum, as this variable increases, the other one decreases. If x is increasing and y is decreasing, that is a negative correlation. The example here is temperature and the amount of coffee sold.
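The direction of a relationship shows up in the sign of r. The sketch below is only an illustration: the temperature, beer, and coffee figures are made-up numbers, not the values from the textbook figure, and `pearson_r` is a small helper written for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: SP / sqrt(SS_x * SS_y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical data, invented for illustration only.
temperature = [20, 30, 40, 50, 60, 70, 80]
beer_sold = [10, 14, 21, 25, 33, 38, 46]    # rises with temperature
coffee_sold = [48, 41, 37, 30, 22, 17, 9]   # falls with temperature

r_beer = pearson_r(temperature, beer_sold)      # positive sign
r_coffee = pearson_r(temperature, coffee_sold)  # negative sign
print(f"temperature vs beer:   r = {r_beer:+.2f}")
print(f"temperature vs coffee: r = {r_coffee:+.2f}")
```

With these invented numbers the beer correlation comes out positive and the coffee correlation negative, matching the two scatter plots described above.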
As the temperature increases, it is getting hot, so coffee consumption decreases. That is an example of a negative correlation: when one variable increases, the other decreases. You can look around yourself and find many examples like this. For example, smoking and health have a negative correlation: as the amount you smoke increases, your health goes down.
The second characteristic of a correlation is the form of the relationship. The most common use of correlation is to measure a straight-line relationship. With the Pearson correlation we are talking about a linear correlation, where the points form a line, whether negative or positive. A relationship can also be curvilinear. Curvilinear means that one variable increases and then goes down as the other keeps increasing. Take age and running speed: while you are growing up, your age is increasing and your running speed is increasing as well. But from the point when you are 40-plus or 50-plus, your age is still increasing while your running speed is decreasing. That is an example of a curvilinear relationship, where we have both a positive part and a negative part. So, the second characteristic is the form of the relationship, whether it is linear or non-linear.
The third characteristic of a correlation is the strength or consistency of the relationship, and that is the main thing the correlation coefficient tells you about. The correlation measures the consistency of the relationship. For a linear relationship, for example, the data points could fit perfectly on a straight line: every time x increases by one point, the value of y changes by a consistent and predictable amount.
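To see why the form of the relationship matters, here is a minimal sketch of the age and running speed idea. The numbers are invented and deliberately symmetric, and `pearson_r` is a helper defined just for this example. The point is that the Pearson r, which measures straight-line relationships, can come out at zero even though speed clearly depends on age.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: SP / sqrt(SS_x * SS_y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical, symmetric data: speed rises with age, peaks, then falls.
age = [10, 20, 30, 40, 50]
running_speed = [4, 7, 8, 7, 4]

r_curved = pearson_r(age, running_speed)
print(f"age vs running speed: r = {r_curved:.2f}")

# The rising half and the falling half each show a clear linear trend.
r_rising = pearson_r(age[:3], running_speed[:3])
r_falling = pearson_r(age[2:], running_speed[2:])
print(f"rising part:  r = {r_rising:+.2f}")
print(f"falling part: r = {r_falling:+.2f}")
```

The positive half and the negative half cancel out over the full age range, which is why it helps to look at the scatter plot before trusting a single r value.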
The strength or magnitude of the relationship is also a characteristic described by the correlation coefficient. However, relationships are usually not perfect. There are very few examples where we can have a perfect correlation. One is salary and months: if the salary is 1 lakh rupees in the first month, then the total you have received is 2 lakhs in the second month, 3 lakhs in the third, 4 lakhs in the fourth, and so on. So, here are the months and here is the salary. But there are very few examples of a perfect relationship, where every time x increases by one point, y changes by exactly the same amount.
Although there may be a tendency for the value of y to increase whenever x increases, the amount that y changes is not always the same, and occasionally y decreases when x increases. So, there can be a low correlation like this one: it is not the case that every time x increases y also increases, because sometimes x is increasing but y is decreasing. When there is a low correlation, the data are not consistently going in the same direction, and there is scatter in the data.
If you look at this one, it is a perfect correlation, which is the ideal, with a value of 1. So, if there is a perfect correlation, the magnitude or strength of the relationship is 1. The magnitude of the correlation ranges from 0 to 1. 0 means that there is no correlation between the two variables, which means they show no linear relationship with each other. And 1 means that there is a perfect correlation: whenever x increases, y changes by exactly the same amount each time. So, this is the perfect correlation. As we move along, we could have dots like this. This is a strong positive correlation, because the dots tend to form a line.
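The contrast between a perfect and an imperfect relationship can be sketched numerically. The salary series follows the lecture's example (1 lakh more each month); the scattered series is invented for illustration, and `pearson_r` is a helper written for this sketch.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: SP / sqrt(SS_x * SS_y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

months = [1, 2, 3, 4, 5, 6]

# Total salary received, in lakhs: grows by exactly 1 lakh every month.
total_salary = [1, 2, 3, 4, 5, 6]

# Hypothetical y values with the same general upward tendency,
# but y sometimes goes down when x goes up.
scattered_y = [3, 1, 4, 2, 6, 5]

r_perfect = pearson_r(months, total_salary)
r_scattered = pearson_r(months, scattered_y)
print(f"perfect line:   r = {r_perfect:.2f}")
print(f"scattered data: r = {r_scattered:.2f}")
```

The more the points stray from a single line, the further the magnitude of r falls below 1.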
Again, if we make the dots more scattered, there is still a relationship, but it is a low correlation: some points are moving in the same direction, but there is a lot of scatter. So, we can get an idea of the strength and magnitude of the relationship from this plot, which is called a scatter plot or scatter diagram, which we talked about. The strength or consistency of the relationship is measured by the numerical value of the correlation, whose magnitude, as I just told you, lies between 0 and 1. 0 means no correlation and 1 means perfect correlation.
This is an example of a zero correlation. As you can see, the dots are just scattered and are not forming any line. If the dots were forming a direction or a line, we could say that there is a relationship, but in this example there is a zero correlation because the dots do not tend to form any pattern or line. So, for a correlation of 0, the data points are scattered randomly with no clear trend.
The sign and the strength of the correlation are independent. Take, for example, a correlation of 0.6 and a correlation of minus 0.6. The magnitude, or strength of the relationship, is the same regardless of the sign. The negative sign means that there is a negative correlation: if the x variable is increasing, y is decreasing. For the plus 0.6 value, both variables are moving in the same direction. A correlation of 1 indicates a perfectly consistent relationship, whether it is positive or negative. Similarly, correlations of 0.8 and minus 0.8 are equally consistent relationships. The sign, negative or positive, only tells you the direction. Either way, minus 0.8 or plus 0.8 means that the correlation is very strong.
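A quick way to convince yourself that sign and strength are separate is to reverse the direction of one variable and recompute r. This is a minimal sketch with invented numbers; `pearson_r` is a helper written for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: SP / sqrt(SS_x * SS_y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical data with a fairly strong upward trend.
x = [1, 2, 3, 4, 5, 6]
y_up = [2, 1, 4, 3, 6, 5]

# Mirror the y values so the same pattern now runs downward.
y_down = [-value for value in y_up]

r_pos = pearson_r(x, y_up)
r_neg = pearson_r(x, y_down)
print(f"upward pattern:   r = {r_pos:+.2f}")
print(f"downward pattern: r = {r_neg:+.2f}")
print(f"same strength: {math.isclose(abs(r_pos), abs(r_neg))}")
```

Flipping the direction changes only the sign of r; the magnitude, and so the consistency of the relationship, stays the same.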
One of the most common errors in interpreting a correlation is to assume that a correlation necessarily implies a cause-and-effect relationship. This is very important. When reading a correlation, pay maximum attention to this point: correlation does not mean causation. Correlation does not imply a cause-and-effect relationship. For example, even a strong negative correlation between smoking and health does not by itself mean that smoking is causing the bad health, because we cannot claim a causal relationship until we do an experimental study where we control all extraneous variables. Another example is the release of a movie and road accidents. Two variables can have a strong relationship because they are both moving in the same direction, but that does not imply that one is the cause of the other. It just says that there is a relationship between the two.
Although there may be a causal relationship between two variables, the simple existence of a correlation does not prove it. Even if the poor health may be because of smoking, unless we run a controlled experimental study, we cannot claim a cause-and-effect relationship. To establish a cause-and-effect relationship, it is necessary to conduct a true experiment in which one variable is manipulated by the researcher and other variables are rigorously controlled. If you want to find out the cause and effect between smoking and health, you have to control for all the individual variables, environmental variables, lifestyle variables, genetic factors, and other things, and only after controlling all extraneous variables in a lab experiment can you conclude that there is a cause-and-effect relationship.
A study shows a positive correlation between the number of churches and the number of crimes in towns and cities. Again, it is reasonable that small towns have fewer churches and also fewer crimes. The churches are not causing the crime; rather, the real reason is that the town is small.
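The churches and crime example can be sketched with numbers. Everything here is an assumption made for illustration: the town figures are invented and constructed so the effect is easy to see, `pearson_r` is a helper written for this sketch, and dividing by population is just one simple way to take town size into account, not a method from the lecture or the textbook.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: SP / sqrt(SS_x * SS_y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical towns; population is in thousands of people.
population = [5, 10, 20, 40, 80, 160]
churches = [2, 6, 8, 24, 32, 96]
crimes = [10, 20, 60, 120, 160, 320]

# Raw counts: bigger towns have more of both, so r is strongly positive.
r_raw = pearson_r(churches, crimes)

# Rates per thousand people remove the effect of town size.
church_rate = [c / p for c, p in zip(churches, population)]
crime_rate = [k / p for k, p in zip(crimes, population)]
r_rate = pearson_r(church_rate, crime_rate)

print(f"churches vs crimes (raw counts): r = {r_raw:.2f}")
print(f"churches vs crimes (per 1,000):  r = {r_rate:.2f}")
```

In this constructed data the strong raw correlation disappears once town size is taken out, which is the pattern you would expect when a third variable drives both counts. It is still only a description of the data, not a demonstration of cause and effect.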
So, two things can relate very well without one causing the other. If you ran the causal study, you would find that the size of the town was the real cause of the relationship. So, that is the correlation coefficient: its magnitude runs from zero to one, it can be positive or negative, the relationship can be linear or non-linear, and it does not imply causation. It just tells us about the relationship; it does not imply that one variable is the cause and the other is the effect.