 Statistics and Excel. Correlation, large data sets. Focus on Z-score relationship part number two. Get ready, taking a deep breath, holding it in for 10 seconds, looking forward to a smooth soothing Excel. First, a word from our sponsor. Yeah, actually we're sponsoring ourselves on this one because apparently the merchandisers, they don't want to be seen with us. But that's okay whatever because our merchandise is better than their stupid stuff anyways. Like our, trust me, I'm an accountant product line. Yeah, it's paramount that you let people know that you're an accountant because apparently we're among the only ones equipped with the number crunching skills to answer society's current deep complex and nuanced questions. If you would like a commercial free experience, consider subscribing to our website at accountinginstruction.com or accountinginstruction.thinkific.com. Here we are in Excel. If you don't have access to this workbook, that's okay because we basically built this from a blank worksheet but we started in a prior presentation. So if you're using a blank worksheet, you may want to begin back there. However, if you do have access to this workbook, there's three tabs down below. Example, practice blank. Example, in essence, answer key practice tab having pre-formatted sales. So you can get to the heart of the practice problem. The blank tab is where we started with a blank worksheet that had just our starting data sets on it and where we will be continuing this time. Quick recap of what we're looking at. We're thinking about the correlation relationship to different data sets to see if there's a mathematical relationship or correlation between them. If there is a mathematical relationship, the data points are moving in some kind of format together. In other words, the next logical question would be, is there a cause and effect relationship? And if there's a cause and effect relationship, the next logical question would be, what is the causal factor? In this time, we've been looking at height and weight data. We had more data than we've seen in prior practice problems. 25,000 lines of data in our practice set for both the heights and the weights. We calculated the mean standard deviation. We noted that both of them seem like they conform to a bell-shaped curve type of scenario, which doesn't necessarily mean there's a correlation but might give us some indication of what's going on between them. That would make sense given the fact that we're looking at things that are kind of nature-related, heights, lengths of things in nature, for example, often having a bell-shaped type of curve. We then got our mathematical formula. We did our table that we've seen in the past to calculate the correlation manually or with Excel in a manual format. And then we also did it to double check it with the data tool set. And then we also used the data tool set to give us the general data for the two data sets. Now, noting that both of these seem to conform to a bell-shaped curve, we now are going to say, let's plot this thing out to the bell curve and then look at the z-scores related to the two bell curves to see if that could give us a better understanding of the z-scores. So I'm going to select these two data sets again. We're going to select from A to B so we can say control C, A to B, and then control C so we can copy. And then we'll put that in the AA cell. So I'll put them in AA and control V, paste it down and make a skinny z. Let's make a skinny z and that's going to be our starting data. Now I'm just going to copy over the same calculation as well that we did for the mean and standard deviation. I'm just going to copy this stuff. I'll put it in the same relationship to these cells because there's no absolute references. It should paste and pull in the right information. So I'm going to say control V and then if I double click it's pulling in the right data. That looks good. Let's make a skinny AC here and then we're going to plot this out as a bell curve. So using our norm.dist. So I'm going to use xh and then p of x and we'll say this is for h. I should probably say h and w but I'll just do it that way. And so I'm going to plot these out. As we did in this prior section when we looked at bell curves. So I'm going to go a home tab. Let's go to the font group. Let's make this black. Let's make it white and let's center it. And then let's say that we're going to take it standard deviations number standard deviations. I'm looking at the heights here so I could start at like zero and then go up in inches from zero up to the highest height. But that's probably too much data. We don't need that much data in it. So let's just take it for standard deviations up and for standard deviations below as has been our custom. So I'm going to say for standard deviations. This is going to be for the height and this is going to be for the weight. These are my headers. Let's make that a header tab home tab font group black white will center it. And then we're going to say that we have a lower X and the upper X. Now should probably be using H and W but we see what we're doing here. We're going to say lower and upper. Maybe I should do that. Just say this is H and this is W. And so we'll just say this is H and this is going to be P of H. Let's say and then I'll say OK. So for Dan standard deviations this equals the mean minus the standard deviation one point nine times four. So I'm going to say OK. And so that's going to be my lower point. This equals the mean times one point. Actually the mean plus one point nine times four. That's going to be our upper point. And when I do the weights, I'll do the same thing over here. The lower point is going to be equal to the mean minus the standard deviation times four. And the upper point is going to be the mean times the standard deviation. I'm sorry. Plus the standard deviation times four. So four standard deviations up and below for the weights and the heights. So now let's so I'm on the heights right now so that means I don't need to go to zero inches. I'm going to start at just 60. We'll round it down to 60 inches and then I'm going to go up to 75 inches. So I'm going to go 60 61. I'll go inch by inch here. We're going to inch our way up inching our way up to 70. Let's go to 76 inches inching our way up to 76 inches. And so there we have it. And then the P of H is going to be our norm dot this. So this equals norm dot this function. We saw it in a prior presentation because we're going to approximate the data with like a smooth curve like with a bell shape. So we're going to say this is going to be the X, which is that right there. We're going to say comma the next argument is the mean, which is going to be that 6799 F four in the keyboard dollar sign before the letter and number comma standard deviation here. F four in the keyboard dollar sign before the letter and number comma. Should it be cumulative? I'm going to say no, because we're just using that one point closing it up and enter. And then let's percentify it. Home tab number group, percentify, decimal eyes, double click on the fill handle, dropping it down. So there it is. Then we can have our Z score. This is the Z score of H. And so now same format, we're going to say black, white, center, similar calculated calculation of the Z score we did before. But now we're we're using the kind of the smoother curve of the actual approximation of the bell curve based on the data set information. This is the equal brackets. Each data point minus the mean F four on the keyboard dollar sign before the letter number closing it up divided by the standard deviation F four in the keyboard dollar sign before the letter and number and enter double click in the fill handle to drop it down. So there we have it. Let's select all three of these. Make it a little bit thinner. Let's make a thinner a G as well. And so there we have it. Now if we were to plot this out, I can plot this out and we should get our bell curves. I'm going to say control shift down control backspace. And if I was to say insert charts, I won't do an area chart. Let's do a just a normal kind of bar chart just to see the shape. So there it is. Now we've got a smoother bell shaped curve, although it's it's doesn't have as much, you know, detail in it here. So we're going to say, OK, so there is that. Let's do the same thing for our weights. So I'm going to say now let's do the weights. So it's going to be WP of W. And this is going to be Z of W. Selecting those three home tab font group black, white, center, wrap. We don't really need to wrap it, but we'll do it anyway. And then we're going to say this one lower. We're going to have in pounds. Don't let's go to 79 79 pounds and then we're pound our way up. We inched our way up last time. Now we're going to pound our way up up to 174. I'm pounding my way up to 174. I'm sick of inching my way up. Now I'm pounding my way up, man, because I'm that's how it has to be because people wasn't wasn't happening with the engine. This is going to be equal to the norm dot dist. And the X is now 79 comma the mean is now the 12708 F4 in the keyboard dollar sign before the letter number comma. And then the standard D is now 1166 F4 in the keyboard and comma the cumulative no zero or false closing up the brackets. Enter percentifying it home tab number group percentify adding some decimals double click in the drop it down. Let's do the Z score. The Z score equals brackets. Each of our data points 79 in this case minus the average weight 127, which seems light to me. But that's again, because I'm obviously so yoked up that I'm heavier than that. So it's not like I'm it's for health. I'm healthier by being heavier. Just kidding. So then what's divided by the standard deviation here and then we'll double we'll double click to drag it down. So there's our Z's. Okay, so now that we have that we can then plot our bell curve for this one. Selecting the middle column control shift down control backspace to get back up. I'm going to put that over here somewhere home tab or insert tab charts. Let's just do a chart like this. We get our bell shaped curve. So there it is. All right. So now let's just kind of think about these Z scores the relationships between the Z scores. So I'm going to copy this. I'm going to pull this to the right. Pull this to the right. And let's just try to match up our our Z scores as best we can. I'm going to make a skinny AK first. And then I'm going to select all these columns from a H to a N. I'm going to say control C. I want to paste that here, but I'm going to paste just the values only. I'm not going to pull over the formulas right click and paste it 123. I'd also like to paste the formatting just not the formulas. So I'm going to right click and paste the formatting only. So now we have it looking the same, but there's just hard coded numbers. In other words, no formula. I'm going to move this stuff to the right so I can have some room. And so these pictures need to be moved to the right. So we have our space to work here. And then I think it's easier. This one has less data. So I'm actually would like to move this onto the right side. So to do that, I could select all of this. And then I can say I could try to move it this way, but it's useful to note that it's the same to do a cut. If I cut that and paste it over here, control V, pasted it over there. That's kind of the same thing. Then I'm going to delete these columns, which I don't need from AO to AR right click. And let's delete those. We'll make a skinny AS column here. So there we have it. So now what I'd like to do is for the best I can is match up the Z score. So not going to match up all the, you know, exactly, but we can match up the Z scores basically as best we can. Because that's the tool that we can use generally to match up the data. So for example, the second one here, I can take this and move it down and say this is close as close, close pretty close to this one. So it's actually close to this one 3.6 a 3.69. This one, I'm going to grab this and move it down. It's pretty close to the 3.15 like that one. It's pretty close between between those two. Two point. This is going to be 2.63. 2.63 is pretty close to that one. And then we're going to say the next one is 2.1. It's pretty close to that one. It's kind of in between these two. I'll put it there. And then we've got the 1.57. 1.57 is pretty close to that one. And then we've got the 1.05 is pretty close to that one. And then we've got the 0.52, which is exactly that one. And then we've got the 0, which is pretty close to that one. And then we've got the 0.53, which is pretty close to that one. And then we've got the 1.06, which is pretty close to that one. And then 1.58, 1.58. And then 2.11, which is around there. And then this is 2.63. And then this is 3.616. This is 3.68. And this is 4.21. So something like that. And then here, for each Z that's pretty close to each other, I'm going to take the difference between the height and the weight. So I'm going to put the difference over here. I'm going to say this is the difference. So I'm going to take this as the W minus, let's just call it W minus H. And I'll make that black and white. Home tab font group black, white. And let's center it. And then I'm just going to subtract out for each of these points the weight, 79 in this case, minus the height for each point where the Z score is as close as we can get. And I'm going to copy that down. Copy, paste, paste, paste, paste, paste, paste, paste. Paste, paste, paste, paste, paste, paste, and paste. And then I'm going to copy all of this information and try to just squish it together so that I can put it in one so that we can graph it. So I'm going to copy this. I'm going to paste it over here, but paste it just the values only. And then I'm going to just move all this stuff up so it's in a nice column here. So I'm just going to delete these columns right here or these rows. I'm going to delete, shift up, and there's the next one. I'll move that one up and then I'll just select these and say delete and shift up. And so there's the next couple. So I'll put this one here, this one here, scroll down a bit, and then this one can move up. And then I can delete all of this. Right click, delete, shift up, and then I'll move this one up and then I'll move this one up. And then I can move this one up and then I can delete all of this. Delete, shift up. And so then let's move this one up. Let's move this one up. And so we still have more. Delete all of this. Right click and delete, shift up. And then I'll move this one up and then I'll move this one up. And then there's more here. So I'll say right click, delete, shift up. And then I'll move this one up and I'll move this one up. Is that everything? There's still another one here. I'll move that one up. And so I think we have everything. So now we've got those differences in a column that again correspond to the similar Z scores. And the point here is that that if I plot this, you're going to you're going to see a trend, right? And that's the relationship of the Z scores that we're looking at. So if I say insert here and the charts and I insert a chart on this, you could see basically that relationship. So you can, and that's, so we'll call this difference, difference and Z difference. So X, this is going to be W minus H for related Z score. Score. And so the idea here is that we can, I'm losing it. That's not what I'm trying to do. Excel, you know what I'm trying to do. We see that relationship between the differences of the data points that are related to the same Z score. So just to recap that, what we did here is we plotted out the bell curves and then we calculated the Z scores. And then we tried to map out or compare the Z scores that were the same or as close to the same as possible for both of our two different data sets. And then with those similar Z scores, we took the difference between the weight and the height are two data sets. We put them together so that we can them plot them and we see this kind of linear type of relationship between them. Just to give us a kind of an intuitive sense of how the Z scores are being used to do this correlation type calculations. All right, let's go ahead and just make these format these like we typically do. Let's go to the home tab over here and make this blue and bordered. I'll make this blue and bordered blue bordered. I'll make this whole thing control shift down border blue. I'll make this whole thing control shift down border blue make this whole thing control shift down. Let's make it border blue and then I'll make this whole thing. It's going to be a little bit difficult to do to control shift down. I'll say control shift down down down down down down down down down down down down down down up and then we'll make that border blue. then this whole thing controls shift down. Let's make that border blue. Let's make the same skinny size. I'm going to select the AX, the AS, this one, and let's go to the left and pick up all of our skinnies and we'll try to uniformize the skinny eyes, the skinninesses, and so we'll select all the skinnies all the way across. All the skinnies need to be uniform, a uniform skinniness, and so there we have it. The fat ones could be all different variants of fat, but the skinnies have to be uniform. So then we're going to go to the review, spell check it. Correlation. You did it again. You did it again. I'll ignore that one. Ignore, ignore, ignore. Correlation's wrong. Ignore, ignore, ignore, ignore, ignore, ignore. Okay, looks good.