 Statistics and Excel. Perfect positive correlation part number two. Get ready, taking a deep breath, holding it in for 10 seconds, looking forward to a smooth soothing. Excel. First, a word from our sponsor. Yeah, actually we're sponsoring ourselves on this one because apparently the merchandisers, they don't want to be seen with us. But, but that's okay, whatever, because our merchandise is better than their stupid stuff anyways. Like our CPA six-pack shirts, a must-have for any pool or beach time. Mixing money with muscle, always sure to attract attention. Yeah, even if you're not a CPA, you need this shirt so you can like pull in that iconic CPA six-pack stomach muscle vibe, man. You know, that CPA six-pack everyone envisions in their mind when they think CPA. Yeah, as a CPA, I actually and unusually don't have tremendous abs. However, I was blessed with a whole lot of belly hair. Yeah, allowing me to sculpt the hair into a nice CPA six-pack-like shape, which is highly attractive. Yeah, maybe the shirt will help you generate some belly hair too. And if it does, make sure to let me know. Maybe I'll try wearing it on my head. And yes, I know six-pack isn't spelled right, but three letters is more efficient than four, so I trimmed it down a bit, okay? It's an improvement. If you would like a commercial-free experience, consider subscribing to our website at accountinginstruction.com or accountinginstruction.thinkific.com. Here we are in Excel. If you don't have access to this workbook, that's okay because we basically built this from a blank worksheet, but we started in a prior presentation. So if you're using a blank worksheet, you may want to begin back there. If you do have access to this workbook, three tabs down below. Example, practice, blank example. In essence, answer key, practice tab, having pre-formatted cells so you can get to the heart of the practice problem. And the blank tab is where we started with a blank worksheet and are continuing at this point in time. Quick recap of what we have done thus far. We're thinking about correlations, different or two data sets in seeing if they have a mathematical relationship or correlation between them, meaning are they moving together in some way, shape or form? If there is a mathematical relationship or correlation, the next logical question is, is there a cause-and-effect relationship? And if there is a cause-and-effect relationship, the next logical question is which is the causing variable in the relationship and how is that working? So we're thinking here first about a perfect positive correlation, which you don't find so often in nature because usually when looking at two different data sets, they might move in somewhat alignment, but not perfectly in alignment. We want to start with a perfect alignment situation just to check that out. So we're using a conversion scenario going from inches to feet. So what we first did is look at our data set, construct our data set according to randomly generated numbers in accordance with a bell curve or normal distribution by using the data analysis. And just remember that it's not always the case that you're going to have your first data set that conforms to a bell curve. So it might, if we're measuring like the inches of something like a snake or something, and then you would think that it might correlate to something like a bell curve, but you might have random numbers or some other distribution of data. It doesn't really affect the correlation, but it's useful to kind of see that we could still see those relationships that we can approximate with the curves like a bell curve. Then we converted it to feet so that we can see the relationship. There would be the conversion factor, but if we didn't know that, then we might then go through this analysis to try to look at this data. We first looked at it with the mean and the standard deviation. We can look at the difference between the inches and the feet, but that's not going to help us too much because they're in different units, of course. So we then said, let's take a look at the actual inches and feet in terms of a histogram. And they look like a pretty similar type of distribution, but of course the numbers are different. So that looks like there might be a relationship there. We then said, since this happens to be in a bell curve format, let's make a bell curve distribution between the two sets of data. So now we made our bell curve as we've seen in prior sections for the feet and the inches. And we did the Z scores. And we noticed that when I look at the Z scores, I could see, for example, one, let's go to one foot corresponds to the 12 inches over here. And the Z score is the same, 141 and 141. Two feet corresponds to 24 inches over here at .67 and .67. So the Z scores are something that we can use for comparison, possibly to see if there's kind of a relationship. So then, of course, we have our formula for our correlation formula, which takes into consideration the Z scores. So what this formula is basically saying is when we have our two sets of data, we're going to be taking the first X's minus the mean over the standard deviation and then do the same for the Y's. That's basically the Z score. We're summing all that up for all of the I's and then we're multiplying that times the Z scores for the X's and the Y's and then divided by N minus one. So when we look at our correlation formula, it takes into consideration the Z scores. All right. So now let's start building this out. I'm going to bring this over to the right. I'm just going to copy my data over here. I'm just going to go over here and copy these two cells. Control C and bring our data over to the right. So we have it here so we can work on it and say right click and paste. Let's paste it one, two, three. So I'll paste it one, two, three. So we have it there just the numbers and then I'm going to say format up top home tab number group black and white and then let's center it and wrap it. I wrapped it and then centered it. Okay. So there's going to be our information. So now maybe I don't need rounded anymore. I'm just going to call it inches and feet. Okay. So then let's make this a little bit smaller. So the next thing we could do we might just graph these together. This would be the next lot another logical thing to see if there's a correlation. I could select these two. I'm going to say control shift down and then I'm going to say control backspace to get back to the top. And I'm going to go into the insert tab now and I want to go into my charts for scatter, scatter chart and typically we'll use I'll use this first one here. And I'm going to say wow the scatter chart will pick up that scatter chart looks like it's a straight line which is a pot which is like a perfect positive correlation. So obviously that would be an indication that there's something going on here with the with the correlation. So let's say this is going to be I'm going to delete the title here and what I really want to see is what's on the y and the x axes. So I'm going to hit the plus button here going to my trend line not my trend line. I'm going to go into my axis titles. And then whenever I have this kind of chart, it will pick up by default whatever's on the left and put that plot that on the x axis. So the easiest way to plot these charts is to put whatever you want on the x axis, which is usually going to be the independent we often think of it as like the independent variable. Although, you know, you can plot it either way, which we'll see in a second, but that'll be the x. So I'm going to say this is equal to on the title, the inches and then this one I'm going to click on it and say that's equal to the feet. So there we have it here. And if I look at the trend line, of course, it's the same as a straight line. So we'll add the trend lines, which will have an impact later. But just to check out the trend line, if I go into the more options with it, then we have our line. If I go to my bucket down here, I can make the line a straight line. I wanted to make it just a line and then I can make it orange if I want to make it orange. There it goes. And there's our trend line, which is of course right in the middle of all the dots because it's a perfect correlation. Now, just also just realize you could if I wanted to switch. Maybe I don't maybe I don't know what's the you know, we can get into questions as to the dependent and independent variable here, which doesn't really this is just a conversion. So we might not know what the dependent or independent variable is, but in any case, we'll get into that in a second. Let's say that we want to plot it the other way so that we have the inches on the other axis. So I'm going to select the same data and then let's insert it. And I'm going to insert again for a scatter. And so here we have it again to make it a little smaller. I'm going to delete the header and what I really want are the axis title. So hit the plus button and axis titles this time. What I want to have, which isn't there yet is I want to have let's do that. I always think about the X first. I want to have on the X the feet, which is not there right now. And then on the Y, I want to have the inches. So I have to flip these two. Now the easiest way to do that it would be for me when I copied and paste them to put the feet on the left and then it would have done it automatically. But if I don't want to do that or if there's multiple columns I'm pulling from, I could then go into my chart design, my data. And when I click on this one, I'm going to say edit this data. And because it's a scatter, I've got the X and the Y. Notice it's picking the series name up from the feet. I'm not too worried about that. I just want to switch the X and the Y's. So the X I want to be feet. So I'm going to click on this one, delete what's in there. And this gets a little finicky. So be careful with it. Put my cursor in here, control shift down and control backspace. So I think that's picking it up. I'm going to say OK. And then on the Y, I'm going to say this one and delete this stuff equals the inches. Control shift down, control backspace. And then OK. So I think it flipped them now. So it's picking up. Right. Right. OK. So say OK. And obviously you have a very similar looking chart, but it's not exactly the same, right? The feet are over here now. You can see the feet going from up to 8 and the inches going here up to 100. Again, you can add the trend line, which isn't much helpful here because you can see it's exactly a straight line. But if we add it just for practice purposes, we'll say bucket. I like to make it a straight line and then bucket drop down and orange. Sometimes it kind of lags when you are working in the charts. So be patient with Excel. Don't get mad at Excel because then if you get mad at it, it just closes up and stops. Responding altogether. And you don't want Excel to get like that. Don't make it get like that. OK. So then what we'll do then, let's actually pull these over here to get them out of the way. Pull them under here, Talves. Mui B to the end. That's perfect location for those charts because they can really be pretty large and express themselves in a meaningful way without interfering with other areas that also need to have their own space for self-expression. OK. So then let's add then another column between the inches. So and I'm going to put the Z scores related to it. So this is our actual data, not what we did before. What we did before was then plot the normal distribution. Now we're going to do it the same similar kind of process with the actual data. So I'm going to put my cursor in column V and select the entire thing, right click and insert, which will insert a column in the middle. And then I'm going to call this Z for the inches. So these are my Z scores for my actual data. So same calculation of the Z, it's going to be equal to brackets. I'm going to pick up that 16 minus the mean for X, which is over here. So the mean for X was this one. And then I'm going to say F4 on the keyboard, making it absolute, closing up the brackets and divided by the standard D. The standard D, which is this one, F4. And I think that's it. We can say OK. So let's go ahead and decimalize it, home tab number group. If you really want to know it, you have to look at the decimals. You don't really know it until you get to see their decimals and then you can understand the Z. All right. Anyways, we're going to say there's that, the feet. And then this is the Z feet. The Z feet, that sounds like a funny name. The Z feet sounds like the name of a punk band or a rock band or something. Here come the Z feet. Like a zebra foot is not a hoof. That would be a hoof, not a foot at all. If it was a zebra foot. Anyways, this is going to be equal to, before I do that, we're going to say, oh no, that's right. Here we go. Here we go. Stop talking about zebra hooves. Get to business. This is going to be one minus each data set minus the mean for the feet now. And then F4 in the keyboard making an absolute closing it up divided by, and then we're looking for standard deviation. Standard deviation is that one, F4 in the keyboard. And that is it. So I'm going to say enter. Let's add some decimals, home tab number, decimalize it. Okay. So there we have it. And then I can copy that down, double click and copy that down. So now we've taken, we're basically looking at our formula. We took each set of data for X and Y, subtracted it from its perspective, its perspective mean, and then divided it by the standard deviation, which is the Z score for the X's and Y's. So now we need to basically sum all that up, multiply it together, you know, and divide by the N minus one. So let's do that this way. I'm going to do it, sum it up here. I'm going to say this is Z inches times, I'll do that with the asterisks Z feet. It sounds like a French guy is saying it. It's Z inches times Z feet. I don't know. I don't know many, I don't know French too well, but I would like to learn it anyway. So we're going to multiply this out. This is going to be Z inches times Z feet. And then we're going to say that's the best I can do for a French accent. It's probably horrible. I have, I don't have much reference here. Home tab. I watched the pink panther when I was a child, the detective where the accent. Anyways, let's add some decimals here. And then we'll, we'll copy this down. And then now, so now we have that. So let's make a skinny Z, a skinny Z for Z column. All right, stop that. Stop with the accent. It's horrible. People have turned off the video. They went to the next lecture and they're not even going to know what to do because they just wanted to hear you shut up. Okay. Here we go. We're going to sum up, sum up the inches times. Actually, I could do this with a formula just to make it fancy. If I want to put this in here so I can change it, I can say equals. And then I'm going to say brackets. I'm going to say it's the sum in the brackets. That's the text and then tie it to what's in this cell. And then say it's going to be the sum of Z inches and Z feet. I probably should put a space in there too. Or maybe like another bracket and a space bracket. And I got to tie that in with a and. So you can get kind of fancy with that if you want to kind of copy your worksheets over. But I'll put that in the outer column equals the sum of this column. Control shift down. So we just summed that up. And then I can, I'm going to take the rest of my formula is I divided. So I did this whole top bit now divided by n minus one. So I'm going to say here in minus one colon. Let's make a small n minus one colon. So I'm going to say this is going to be in just n just n not minus not minus one. Stop trying to figure out what I'm trying to do Excel. You don't know what I'm doing. I'm too complex. My mind is too complex for you to try to read it. So we're going to say this is count just because you got some AI going here. Control shift down. I think you can read my mind. You ain't no human being. You're not. You're not a human being less. You don't know what I'm thinking. Not even human beings know what I'm thinking. Less one. Okay, calm down. Excel wasn't trying to be offensive. In minus one. So this is going to be a one. Let's put an underlying font group and underline. And this will be in minus one without the colon. I'm going to pull this to the outside. 310 minus one. Hold on a second. Equals 310 minus the one. And so let's indent this home tab alignment indent and then indent. And then this will be the R or the correlation. And so this is going to be of course division now because now we have this whole top bit and now we have this this bottom part the numerator and the denominators. So this is going to be equal to the 309 divided by the 309 which comes out to one. If we look at the decimals home tab number, it should be exactly one. So just a quick recap here. We basically to reformat this formula. We're summing up the Z scores. We took the Z scores of X, the Z scores of Y for each data point here. We multiplied them together. Then we summed up all of the multiplied together Z scores giving us in essence the numerator. Then we got to the denominator which is n or the count of the data which we did with this count formula. You could have counted any of these columns. We're not summing up the data here. We're just counting how many rows there are. And you'll recall when we first spit out the data set, we made 310 as our random number generator. That's how many numbers we made. And then we subtract one and minus one. That gives us the 309. And if we divide the numerator by the denominator, we get to one. Alright, let's underline this. Now, there's some neat stuff in Excel that can kind of analyze this data faster for us. It's often useful to do this kind of process because you can then see the Z scores and the relationships with the Z scores. But let's try it this way now. I'm going to make a skinny D over here, skinny AD. And then let's say this is going to be my automatic calculation. So I'm going to go to the data tab. We're going to go to this analysis again and go into the data analysis. We showed how to turn this on in the prior presentation. This time I want to go to the correlation. That's what we're looking at, correlation here. Let's go into that and the analysis. I'm going to find my data set. Now I have to have my data next to each other. So I'm going to select this item here. And so I'm going to pick my data. I have to pick my data so it's next to each other here. So I'll start with the inches and the feet of our original data set that we created. Control shift down and then control backspace. And there's our data set. So I'm going to say, okay, and then columns. I want to in columns and I did include the labels. That's what this is doing here. So I included the labels. So I want to make sure to check that off and then the output range. I'm going to change this to an output range. And the place I want to put it is I'll just say right there in a one. Let's say, okay, and then it spits this little worksheet out for us nice and neatly. So I can then highlight this. I like to make this my format header format again font group black, white. We can round. We can round this home tab alignment. Let's wrap this. And so there we have it. And so now this this shows us nice and quickly the the correlation, right? So here's what we're really looking at. So this is this box with this is looking at here. This number one here is looking at the feet and the inches correlation. So that's the one that we're kind of looking at and it's got a one and that's a perfect correlation. So when we're looking at these correlations, the one would be the perfectly correlated. That's why we have a straight line now just to just to note this kind of data generator also let's just play with another one that we that we might commonly use that will spit out some of the calculations we've looked at in the past, which would be into the data. Let's go into the analysis and data analysis and this time. Let's look at the descriptive statistics one just to get familiar with that. So I'm going to put the input range. I'm going to go back on over again. I'm looking at the two original columns that we had. So I'll pick up these two columns, control shift down control backspace and then grouped by column. I'm going to include the labels again because we had the labels up top and then the output range. I want to put it in is going to be let's put it over here now. We'll put it right there. Boom. And then down here. I want to have the summary statistics. That's what I'm really looking for later on. You might have these confidence level for the mean. We might talk about that later 95. We'll keep that as the confident level. We'll talk more about that possibly in a future presentation, but we'll say, okay. And then it gives us our data. So this should give us some of that familiar data that we've we've been working with and and you could see, you know, the mean, the standard error, which, you know, might not always be applicable depending on the type of data, the median, the mode, the standard deviation, the sample variance, the kurtosis. We might talk about later skewness. We might talk about range minimum and then the maximum and then the sum and then there's the count. There's that 310 that we came up with. Now, this is a great tool, but it's also it's also a little bit limited because it spits it out a one time. You can see that there's no formulas in these and the when it spits out this data, so you can't really see exactly what it's doing in terms of functionality. It's not it's not giving us a function that will change. It's not going to be dynamic. This isn't going to change when we change cells in Excel. So it's a it's a good tool to give to give a kind of a glossary data, but when you're building a complex model or something like that, you might want to do your own calculations, of course, so that you can change and see how these statistics would change as you as you kind of play with your different models. But great, you know, quick, quick spit out of information. Just want to have there so we can check it out. So if we're analyzing data, then of course we might look at the data here. We might we might start if we knew nothing about it. We might do that correlation count format to give a quick look at to see if they're correlated and then we might do our statistical analysis like this. And then once we have that information, once we have these things in practice, that might then lead us to do to build our model in a way that would be more dynamic so that instead of using these tools, we would do our recalculation kind of method so that we can dig into it a little bit deeper might be in depending on what you're doing that might be something that a workflow. So I'm just going to add some formatting here. This is my data inputs. I'm going to say a home tab. I usually put some brackets around this and I'll make it blue because that's what I usually do. If you don't have that blue, it's in the more colors over here, standard color wheel right there. And there it is. I think this is I use that blue because that's what the Excel is fun guy used to do. But I think he's abandoned it for the green now home tab blue, which is fine except that the green gives me traumatic memories of the time the dark times before Excel or when there was Excel, but no one used it and then they still wanted you to do stuff with like a 10 key calculator and you're like why? Why are you doing this? There are tools. There are tools out there these days and you're ignoring them. It's like forcing someone to dig a hole and then fill it back in again for no reason for no reason. Hold on a second. I think I don't want to delete the let's let's do it this way. I'm going to say control shift down put the brackets around it. So I don't use the green because I don't like to be reminded of the of the dark times with the with the spreadsheets, mainly because my handwriting is terrible and you want me to put numbers you want me to put numbers in those little tiny boxes. The boxes are tiny. I can't even see the boxes. Isn't there a way to zoom in? Can I zoom in to the worksheet for trying out loud? I don't see how people lived before the before Excel was widespread, but apparently they did. So there was life on the planet and they had numbers even before the spreadsheets went on to the digital platforms. I know it's tough to believe tough to take in, but it makes us thankful for what we have when we contemplate these things. And so there we have I'll do it one more here. Control shift down down and boom boom and say home tab. Let's make this black and white. All right, there it is. Should I spell check it? I should probably correlation. That's the key word that we're working on and you spelled it wrong. That's the whole point of the prep. Okay, it's fixed now. Get over it. Get over it.