Statistics and Excel. Correlation, large data sets. Focus on the Z-score relationship, part number two. Get ready, taking a deep breath, holding it in for 10 seconds, looking forward to a smooth, soothing Excel. Here we are in Excel. If you don't have access to this workbook, that's okay, because we basically built this from a blank worksheet, but we started in a prior presentation. So if you're using a blank worksheet, you may want to begin back there. However, if you do have access to this workbook, there are three tabs down below: Example, Practice, Blank. The Example tab is, in essence, an answer key. The Practice tab has pre-formatted cells, so you can get to the heart of the practice problem. The Blank tab is where we started with a blank worksheet that had just our starting data sets on it, and it's where we will be continuing this time. Quick recap of what we're looking at. We're thinking about the correlation between two different data sets, to see if there's a mathematical relationship, or correlation, between them. If there is a mathematical relationship, the data points are moving together in some kind of pattern. Then the next logical question would be, is there a cause and effect relationship? And if there's a cause and effect relationship, the next logical question would be, what is the causal factor? This time, we've been looking at height and weight data. We had more data than we've seen in prior practice problems. So there are 25,000 lines of data in our practice set for both the heights and the weights. We calculated the mean and the standard deviation. We noted that both of them seem to conform to a bell-shaped curve, which doesn't necessarily mean there's a correlation, but might give us some indication of what's going on between them. That would make sense given that we're looking at things that are kind of nature-related: heights, lengths of things in nature, for example, often have a bell-shaped type of curve.
We then got our mathematical formula. We did our table that we've seen in the past to calculate the correlation manually, or with Excel, but in a manual format. And then we also double-checked it with the data tool set. And then we also used the data tool set to give us the general data for the two data sets. Now, noting that both of these seem to conform to a bell-shaped curve, we're going to say, let's plot this thing out as a bell curve and then look at the Z-scores related to the two bell curves to see if that could give us a better understanding of the Z-scores. So I'm going to select these two data sets again. We're going to select from A to B, and then Control C so we can copy. And then we'll put that in the AA cell. So I'll put them in AA and Control V, paste it down, and make a skinny column Z. And that's going to be our starting data. Now I'm just going to copy over the same calculation that we did for the mean and standard deviation. I'm just going to copy this stuff. I'll put it in the same position relative to these cells, and because there are no absolute references, it should paste and pull in the right information. So I'm going to say Control V. And then if I double click, it's pulling in the right data. That looks good. Let's make a skinny column AC here. And then we're going to plot this out as a bell curve, using our NORM.DIST. So I'm going to use X, H, and then P of X. And we'll say this is for H. I should probably say H and W, but I'll just do it that way. And so I'm going to plot these out as we did in the prior section when we looked at bell curves. So I'm going to go to the Home tab. Let's go to the Font group. Let's make this black, make the font white, and center it. And then let's decide how many standard deviations we're going to take. I'm looking at the heights here. So I could start at like zero and then go up in inches from zero up to the highest height.
But that's probably too much data. We don't need that much data in it. So let's just take four standard deviations up and four standard deviations below, as has been our custom. So I'm going to say four standard deviations. This is going to be for the height and this is going to be for the weight. These are my headers. Let's make that a header: Home tab, Font group, black, white, and we'll center it. And then we're going to say that we have a lower X and an upper X. That should probably be using H and W, but we see what we're doing here. So let's say lower and upper. Maybe I should do that. I should just say this is H and this is W. And so we'll just say this is H and this is going to be P of H, let's say. And then I'll say, OK, so four standard deviations. This equals the mean minus the standard deviation, 1.9, times four. So that's going to be my lower point. This equals the mean plus the standard deviation, 1.9, times four. That's going to be our upper point. And when I do the weights, I'll do the same thing over here. The lower point is going to be equal to the mean minus the standard deviation times four. And the upper point is going to be the mean plus the standard deviation times four. So four standard deviations up and below for the weights and the heights. So I'm on the heights right now. That means I don't need to go to zero inches. I'm going to start at just 60. We'll round it down to 60 inches. And then I'm going to go up to 75 inches. So I'm going to go 60, 61. I'll go inch by inch here. We're going to inch our way up. Actually, let's go to 76 inches, inching our way up to 76 inches. And so there we have it. And then the P of H is going to be our NORM.DIST. So this equals the NORM.DIST function. We saw it in a prior presentation. We're using it because we're going to approximate the data with a smooth curve, a bell shape.
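The four-standard-deviation bounds and the inch-by-inch grid just described can also be sketched outside Excel. This is only a minimal Python sketch, not the workbook itself: the mean of 67.99 and the standard deviation of 1.9 are the rounded height figures as spoken in this walkthrough, and the names `four_sd_bounds` and `inch_grid` are made up for illustration.

```python
import math


def four_sd_bounds(mean, sd, k=4):
    """Return (lower, upper) as the mean minus/plus k standard deviations."""
    return mean - k * sd, mean + k * sd


def inch_grid(lower, upper):
    """Whole-inch x values, from lower rounded down to upper rounded up."""
    return list(range(math.floor(lower), math.ceil(upper) + 1))


# Rounded height figures as spoken in the walkthrough
# (assumed here; the actual worksheet cells hold more decimals).
HEIGHT_MEAN = 67.99
HEIGHT_SD = 1.9

lower, upper = four_sd_bounds(HEIGHT_MEAN, HEIGHT_SD)
heights = inch_grid(lower, upper)

print(round(lower, 2), round(upper, 2))  # 60.39 75.59
print(heights[0], heights[-1])           # 60 76
```

Rounding the lower point down and the upper point up gives the same 60 to 76 inch range used for the X column here. The weights would follow the same pattern with their own mean and standard deviation.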
So we're going to say this is going to be the X, which is that right there. Comma. The next argument is the mean, which is going to be that 67.99. F4 on the keyboard, dollar sign before the letter and number. Comma, standard deviation here. F4 on the keyboard, dollar sign before the letter and number. Comma, should it be cumulative? I'm going to say no, because we're just using that one point. Closing it up and Enter. And then let's percentify it: Home tab, Number group.
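For anyone who wants to check the non-cumulative NORM.DIST values outside Excel, here is a minimal Python sketch. It assumes the rounded height mean of 67.99 and the standard deviation of 1.9 as spoken in this walkthrough. The helper name `norm_dist` is hypothetical and only mirrors the argument order of the Excel function. It relies on `statistics.NormalDist` from the Python standard library, where `pdf` gives the height of the curve at one point and `cdf` gives the cumulative probability.

```python
from statistics import NormalDist


def norm_dist(x, mean, sd, cumulative=False):
    """Mirror the argument order of Excel's NORM.DIST (hypothetical helper).

    cumulative=False returns the density at x (the height of the bell curve);
    cumulative=True returns the probability of a value at or below x.
    """
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(x) if cumulative else dist.pdf(x)


# Rounded height figures as spoken in the walkthrough (assumed values).
HEIGHT_MEAN = 67.99
HEIGHT_SD = 1.9

# One row per inch from 60 to 76, like the X and P of H columns.
for h in range(60, 77):
    p = norm_dist(h, HEIGHT_MEAN, HEIGHT_SD)
    print(f"{h:>3}  {p:.2%}")
```

Because the grid moves in one-inch steps, each density value is roughly the share of heights within about half an inch of that point, which is why showing it as a percentage reads sensibly.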