 Statistics and Excel. Perfect positive correlation. Get ready, taking a deep breath, holding it in for 10 seconds, looking forward to a smooth, soothing Excel. Here we are in Excel. If you don't have access to this workbook, that's okay because we'll basically build this from a blank worksheet. But if you do have access, there's three tabs down below. Example, practice, blank. Example, in essence, answer key, the practice tab, having pre-formatted cells so you can get to the heart of the practice problem. The blank tab, blank worksheet so we can practice formatting cells within Excel as we work through the practice problem. Let's go to the example tab to get an idea of what we will be doing, where we will be going. First, a word from our sponsor. Yeah, actually we're sponsoring ourselves on this one because apparently the merchandisers, they don't want to be seen with us. But that's okay whatever because our merchandise is better than their stupid stuff anyways. Like our trust me, I'm an accountant product line. Yeah, it's paramount that you let people know that you're an accountant because apparently we're among the only ones equipped with the number crunching skills to answer society's current deep complex and nuanced questions. If you would like a commercial free experience, consider subscribing to our website at accountinginstruction.com or accountinginstruction.thinkific.com. We're going to be thinking about correlations. In other words, two different data sets to see if there's a relationship, mathematical relationship or correlation between them. Are they moving together in some way? If there is a correlation, the next logical question would be, is there a cause and effect relationship that is resulting in that correlation? And then the next logical question would be, what's the driving factor which of the two sets of data are driving the correlated relationship? And what of course is the cause of that? All right, we want to first look at a perfect correlation, which is something we don't always find in nature because usually when we're thinking about correlations, it may not be a perfect correlation. They tend to move in alignment, but not perfectly aligned. But let's take a look at the perfect scenario first, which happens with things like conversions, for example. So if you have a set of data that are in inches, and then you look at another set of data that's in feet, then they're going to be correlated, perfectly correlated together because of the conversion. So what we'll do is we'll actually make our inches data from the randomly generated norm.dist function just to practice that. So we'll have a bell curve kind of information. We will round it, and then we'll do our feet conversion. We'll do our mean, our standard deviation. We'll plot it out in terms of the norm.dist, just so you can see the relationship that way as well. If you had a set of data that was conforming to one of our distributions we've talked about in the past, such as the normal distribution or bell curve, and then we'll plot out our data set and look at the Z scores and we'll calculate the actual correlation. We'll do it both in kind of a mathematical format within Excel, and then we'll use some Excel tools to calculate that correlation and also look at our data set over here that'll give us some standard statistics just to practice with that as well. Alright, let's go to the blank tab to start it out. I'm going to select the entire worksheet with the triangle, right click on the selected area and format the cells. Within the cells we're going to say we want currency, negative numbers bracketed and red. Let's remove the dollar sign. I'm going to start with no decimals to start out with and okay. Let's go to the home tab, font, recording, when you're on camera, you've got to be bold. You've got to be bold, so that's what I'm doing here. Mean, so we're going to say I'm going to start to create the generated data, so we're going to randomly generate data that we can play with here. So I'm going to say the mean is 36. To do this, I'm going to conform it to a bell curve. Later on we'll do some just random data, but I'm going to conform it to a bell curve, randomly generated bell curve data, which still has randomness in it. Standard deviation is 15. These are the some conditions we need to have met in order for us to generate this data that conforms to like a bell curve situation. Let's make a skinny C and then I'm going to say this is going to be an inches. So I'm going to imagine this in inches. Let's make this our header format, home tab, font group, bucket, dropdown, black and white on here and let's center it alignment and center and then we're going to go to the data tab. We're using the analytics analysis. If you don't have the analysis tool pack, then go to the file tab over here. You want to go to the options down below and then you want to be on general. I'm sorry, not general add ins, add ins, and then you want the Excel add ins and then go and then tick off the toolkit, the tool pack, powerful. It's a powerful pack of tools and then we'll have the data, the analysis, and then the data analysis and we're looking for the random number generation, random number generation and okay. And then let's say it's going to be a one here for the columns, number of random numbers. Let's make some kind of a more unusual, let's make like 310. I don't like just putting 100 all the time because then sometimes it comes out that we always think of as a sample of being like 100. But in any case, let's hit the dropdown and say that we're going to say this is a normal, oh, hold on, k-passo, normal distribution right there. All right, then what do we need in order to generate this? We need the mean. So you need to be mean in order to generate this. So go out and be mean. You don't have to be mean. You need the mean like the average, the average. And then down here, I want to put this somewhere. Where do I want to put it? I want to put it right there. That's where it needs to go, yo. Did you just say yo? What is wrong with you? Okay. What do you mean, yo? I was okay. So there we have it. And then let's round this inches rounded. So I'm going to I'm going to round it now to a whole number. So now I'm going to say inches rounded. Now we could have generated this any other way you could use random data and you'll still have a correlation that will be perfectly correlated because we're going to be converting inches to feet. But I just want to practice the random generation because we could possibly be looking at a data set that has a bell curve related to it. Say we're talking about inches of something in nature, how long a worm is or something, or a snake, I guess, of this guy. I don't know. And then we can say it will have some kind of bell curve. And then of course, when we convert it to feet, the feet you would think would have a similar kind of relationship. Alright, so inches rounded. Let's go to the home tab font group black white. Let's wrap it. Let's center it. And I'm going to round this thing equals round it down. So round. I'll just pick the one to the left. And then I'm going to say comma two digits to two digits. That's to the whole numbers. Close it up. And boom, there it is 16 on the whole numbers. Now my format appears in a number generation, because when I generate these, it puts it back into the general. So I'd kind of so well, let's keep it like that. We'll keep it like that. I'm just going to double click it down. And boom, brings it on down. So what happened to this one? Oh, it's a negative interesting. Because it was a bell curve. So we had a negative number. Okay, so there we have it. Now if I convert to feet, let's convert it to feet. And we're going to say, okay, font group black, white center. So feet would be inches divided by 12. That's how we do things over here. In America, we use this weird, you divide it by 12, like 12. That's so sloppy. Like, why don't you use a unit of measure based on 10? You know, that would make sense. Whatever. Whatever. This is how we do things. Home tab, number group. Tell me don't tell me to get the king of England over here trying to tell me that I have to or we measure actually we got our measurements from the king of the king of England's foot, I think is how we anyways, whatever that's how it is. So I'm going to close this up. And then we're going to say all right, then all right, let's make a skinny G skinny G skinny, skinny G. And then we're going to say inches, let's say I'm going to say this is going to be equal the inches again, rounded, and then the feet. And I'll just copy the format painter this time. Home tab clipboard format painter. Boom. All right, let's make it let's make the mean mean is going to be equal to we'll say average tab going on over to the inches and control shift down, taking the average of it, we can add some decimals decimalizing it home tab, number group decimalize. And then we're going to say this is the feet equals average decimalize isn't a word by the way. But I'm making it into a word control shift down. I think it will be a word once I have shown its utility in practice, decimalized. All right, then we're going to say this is going to be standard deviation equals the standard deviation we're look we're moving to the sample this time so s not the population standard deviation of the sample tab. I'm going to control shift down picking that up. Boom. Decimalize and that one home tab number decimalize. So we could really recognize you can't recognize how they really are unless you can see the decimals. You need to see the decimals you need to get under the hood before you know who they really are. Home tab number group decimalize. Okay. So so so now when I compare these two obviously I can look at the difference now the difference and I could say all right well what if I was to like look at these two this way. I could look at the difference in the mean and the difference and the standard deviation but that doesn't tell me too much right because home tab number group because they're in different units so it doesn't that doesn't really help me but I could compare like their z scores so that's one thing I can do I can compare and and that's how our correlation will basically work before we do that however let's first let's first imagine that we're going to say hey that this set of data looks like it might look like a bell curve like if I select this data I'm going to go hmm is there a relationship with this data if I select all this data control backspace and then go insert and make a histogram from it here's my histogram based on inches so this is the inches histogram and why is it not when I type you do something that's how computers work computer don't you start protesting on me you know I was typing something so and if I do the same thing for the feet control shift down backspace control backspace and insert and then boom and then this is going to be feet okay so I can see there's kind of a relationship right I can say yeah those look kind of similar but I can't really compare that center point of course because they're in different units but it's like okay well they both look similar they both look like they conform possibly to a bell curve relationship so maybe I would maybe I'd compare the bell curve just to play with the bell curve before we get into the correlation here and then so I'll do that because that'll be fun muy divertido muy divertido okay so this is going to be then let's say we'll take the standard deviation let's take four standard deviation so if I'm going to do this bell curve thing what I want to do is plot out the bell curve x for inches and then p of x and then we'll do the z as well and then we'll do the same thing for x for feet and then p of x and then we'll do the z for feet this will be the z for inches all right but I need to know how far I need to go where do I need to start at with my x's so I can say we'll do the four standard deviations as we saw in a prior unit because that'll encompass the vast vast majority of the data so I'm going to say four standard deviations and let's say that we go upper x and lower x so I'm going to put them I'll keep the same headers here inches and feet this time so the upper x is going to be equal to the mean plus the standard deviation times four that's as high as we that's how far up we're going to go and the lower I wait I usually do the lower first don't I let's do the lower lower x and then the upper x so let's be consistent if you would otherwise you confuse people times this no it's minus minus this times four okay so that makes sense so we're taking the 16.29 times four standard deviations minus the center point the mean all right so negative 30 and then this one is going to be equal to the upper which is going to be the center point of the mean plus the standard deviation times four standard deviations okay that is making more sense and then this one if we do it in feet is going to be equal to the feet middle point the mean times the standard deviation times four I'm not times minus the standard deviation times four so middle point minus four standard deviations will get us to the lowest x and then we'll take the upper x which will be the middle point plus the standard deviation times four okay let's add some decimals here home tab number decimalizing all four of them at one time boom decimalize and then we'll close let's shorten this up and so there's that and then what we will do then is let's make these black and white making them our headers home tab font group black white alignment center the inches let's make this smaller two let's wrap them as well home tab alignment wrap them wrapping wrap them like Santa Claus on Christmas Eve okay so then we're going to say we're going to start with the inches at the lower which is going to be negative 30 so I'm going to say negative 30 and then I'll say third and we'll say then 30 this is going to be negative 29 and so on and so forth I'm going to copy it down till we get to 100 I'm going to say let's drag it down that's not too many I should be able to get there pretty fast it's still kind of far but it's not too bad I could have done my sequence formula to make it faster but let's go back up to the top and then we're going to say okay let's do our norm dot dist equals norm dot dist and then we're going to take our x here comma we're going to take the mean which is going to be this one that's outside my data set therefore want to make it absolute f4 on the keyboard dollar sign before the i and the 2 comma and then the standard deviation picking up this one f4 on the keyboard dollar sign before the i and the 3 comma do we want it cumulative no therefore zero or false close it up and let's decimal let's let's percentize it home tab number percentize and decimalized percentize and decimalize and then we'll copy it down and there's our normal distribution and then we can look at the z score which would be equal to brackets the negative number minus the middle point the mean and then i'm going to f4 on that one to make it absolute dollar sign before the i and the 2 closing up the brackets and divided by the standard deviation f4 in the keyboard dollar sign before the i and the 3 and enter there's the z score let's decimalize it home tab number group decimalize and double click the fill handle so there we have our z score and then i'm going to make the p smaller and let's do the same thing and say the feet also looks like it's conforming to a bell curve of course because it's converted so i'm going to say let's do that here too let's just copy this format home tab clipboard format painter pasting that here and let's just play with this one we're going to say all right the negative the feet needs to go down how low does it need to go we're going to say it goes down to negative three and then negative two and then we'll bring it up to eight eight boom all right and then let's do our norm dot dist equals norm dot dist we're picking up the mean f4 in the keyboard making it absolute comma standard deviation and f4 in the keyboard making it absolute comma oh hold on first was the x i have to pick up the x i got ahead of myself and then comma then the mean then the mean so then the mean and then f4 in the keyboard making it absolute comma then the standard deviation standard d f4 in the keyboard making it absolute and comma and then we want a zero or false closing up the brackets and enter let's percentize and decimalize home tab number group percentize decimalize double click in the fill handle to bring it on down so there we have it let's take the z score which is going to be equal to brackets the three minus the the mean f4 in the keyboard to make it absolute closing up the brackets dividing by the standard d f4 in the keyboard and enter let's decimalize it home tab number group decimalize and double click to bring it down so there we have it now if i so if i looked at this bell curve kind of relationship i can start to compare like the z score so remember i can't really compare it the the units themselves but i can start to look at the z score and i can say okay well this this one like this four i don't see anyone a four over here but this 3.62 is similar to this 3.62 right so if i like huh that's interesting right we can kind of see patterns happening here 3.62 3.62 and then i can be like well this 2.8 uh is there there's a 2.88 here and so let's make this a different color and say hmm 2.88 2.88 and if we looked at uh if we looked at one uh foot one foot 1.41 i'm going to say 1.41 and compare that to the z score over here you're like all right 1.41 is 12 inches right so we kind of see that that sounds 12 inches is one foot z score is the same you see how you we see a relationship happen if there's if i go to 24 inches say i think i see a pattern here 24 inches that's 0.67 i would think that would be like 2 feet 0.67 so you see see how these things are you could say you could start to see where the patterns are lined up because of the z scores could kind of help you to line things up and say okay i think there's there seems to be a relationship here and that's in essence what the the z score is kind of doing so what we'll do uh we're going kind of long here so i'm going to copy the formula on the z score and it's going to look something like this so next time we'll we'll consolidate right now we've kind of come very we're like there's got to be a relationship between these data sets right we already know there's a relationship but we can say well i i noticed that it conforms to a bell curve and i noticed that both the inches and the feet conform to a bell curve and then when i plotted out the actual like a bell curve based on the data for inches and feet i can kind of see a relationship between the z scores and so now i could say well can i use those z scores to kind of define the relationship mathematically and here's the formula that we'll look into we're going to sum up in essence the first z score minus the second z score over the n minus one and the n minus one uh is the number of items minus one which usually corresponds or ties into when we're using the sample data as to like the population data and so and so this will give us our formal kind of z score calculation and we'll do it a couple we'll calculate that a couple different ways uh next time i said formal z score it'll give us we already calculated the z scores but the z scores will be used within it to get the formal correlation calculation and we'll do that next time