Statistics and Excel. Correlation calculation with strange result. Got data? Let's get stuck into it with statistics and Excel.
You're not required to, but if you have access to OneNote, we're in the icon on the left-hand side: OneNote presentation 1740, Correlation Calculation with Strange Result tab. We're also uploading transcripts to OneNote so that you can go into the View tab, Immersive Reader tool, and change the language if you so choose, being able to read or listen to the transcript in multiple different languages, using the timestamps to tie into the video presentations.
OneNote desktop version here, thinking about correlation, where we have different datasets and want to see if there's a mathematical relation or correlation between them. In other words, are the different dots in the datasets moving together in some way, shape, or form? If there is a correlation or mathematical relation between the datasets, the next logical question would of course be: is there a cause-and-effect relationship that's causing that correlation or mathematical relation? And if there is a causal relationship, the next logical question would be: what's the causal factor in the causal relation which is causing the mathematical correlation?
In prior presentations we thought about a perfect positive and a perfect negative correlation, which are things that you don't often see when working practice problems or in practice, because usually we're looking at two different datasets that might trend together for some reason but not with a perfect correlation. So it's useful to think about the perfect situation in theory, but in practice it's not usually going to be a perfect correlation. We then looked at an example with a few data points so that we could see an imperfect correlation and analyze it fairly easily.
We then looked at a correlation where we had random datasets that we generated, so we could see how we generated the datasets and what the correlation between them was. Now we're looking at a situation where we're going to get an unusual result with the correlation. This is a reminder that, like with any statistic, we can't simply rely on one number all the time. We still have to use our intuition. We still have to think about what is actually happening here and what it is telling us, and oftentimes we have to look at things from multiple angles if we want to get a proper perspective on what the data is actually telling us.
So we're going to construct our data this way. We're going to have an x and a y, these being our two different datasets, and we're just going to pick an x and a y and see whether or not they're correlated. The points we're going to be plotting are: x and y are 1, x and y are 2, x and y are 3, x and y are 4, x and y are 5. Wow, they seem very correlated. But then for the last one you've got x is 0 and y is 7. So if we were to consider this data, let's do our mathematical calculations. Obviously, looking at it, we would say, you know, if I looked at that dataset I'd be like, yeah, it looks like there's some kind of relationship going on there. One looks to be tied to the other, possibly, in some way, shape, or form.
Let's do the math on it. So if we do the calculation of the mean, the mean is going to be the average. We can actually do it now, since we don't have that many data points. For x, this is going to be 1 plus 2 plus 3 plus 4 plus 5 plus 0, divided by how many? 1, 2, 3, 4, 5, divided by 5... let's do that one more time, I think I messed up. 1 plus 2 plus 3 plus 4 plus 5 plus 0, divided by 1, 2, 3, 4, 5, 6 data points, so divided by 6, and that gives the 2.5. If I do the same thing for the other one, I get about 3.67.
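As a quick check of the mean arithmetic above, here is a minimal Python sketch. It is only an illustration alongside the Excel walkthrough; the variable names `x`, `y`, `mean_x`, and `mean_y` are assumptions introduced here.

```python
from statistics import mean

# The six (x, y) points from the example: five matching pairs plus (0, 7).
x = [1, 2, 3, 4, 5, 0]
y = [1, 2, 3, 4, 5, 7]

mean_x = mean(x)  # sum of x is 15, divided by 6 data points
mean_y = mean(y)  # sum of y is 22, divided by 6 data points

print(f"mean of x: {mean_x:.2f}")
print(f"mean of y: {mean_y:.2f}")
```

Note that the divisor is 6, not 5: the point with x equal to 0 still counts as a data point even though it adds nothing to the sum.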
And then if I take the standard deviation, we're taking the standard deviation of the sample. This is a measure of the spread of the data, you will recall: 1.87 for x and 2.16 for y.
Now let's do our mathematical calculation, taking in essence the z-score of each point in the first dataset (each point minus the mean, divided by the standard deviation), times the z-score of the matching point in the second dataset, summing those products, and dividing by n minus 1. So if I take my first data point of x, which is 1, calculating the z is going to be 1 minus 2.5, divided by the standard deviation, 1.87, which gives us about negative 0.8. And the second one would be 2 minus 2.5, divided by the standard deviation, 1.87. That's going to be about negative 0.27.
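The z-score steps described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the Excel worksheet itself: the names `sd_x`, `sd_y`, `z_x`, `z_y`, and `r` are introduced here, and `statistics.stdev` is used because it computes the sample standard deviation (dividing by n minus 1).

```python
from statistics import mean, stdev

# Same six points as in the example.
x = [1, 2, 3, 4, 5, 0]
y = [1, 2, 3, 4, 5, 7]
n = len(x)

# Sample standard deviations (n - 1 in the denominator).
sd_x = stdev(x)
sd_y = stdev(y)

# z-score: each point minus its dataset's mean, divided by its standard deviation.
z_x = [(xi - mean(x)) / sd_x for xi in x]
z_y = [(yi - mean(y)) / sd_y for yi in y]

# Correlation: sum of the paired z-score products, divided by n - 1.
r = sum(zx * zy for zx, zy in zip(z_x, z_y)) / (n - 1)

print(f"sd_x = {sd_x:.2f}, sd_y = {sd_y:.2f}")
print(f"first two z-scores of x: {z_x[0]:.2f}, {z_x[1]:.2f}")
print(f"r = {r:.4f}")
```

Running it lets you compare the printed standard deviations and the first two z-scores of x with the figures quoted above. The z-scores for the points below the mean come out negative, which matters once the products are summed.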