 Statistics and Excel. Correlation calculation with strange result. Get ready, taking a deep breath, holding it in for 10 seconds, looking forward to a smooth, soothing Excel. Here we are in Excel. If you don't have access to this workbook, that's okay because we'll basically build this from a blank worksheet. But if you do have access, there's first a word from our sponsor. Yeah, actually, we're sponsoring ourselves on this one because apparently the merchandisers, they don't want to be seen with us. But that's okay whatever because our merchandise is better than their stupid stuff anyways. Like this CPA thinking cap, for example. CPA thinking, CAP, you see what we did with like with the letters, and this CPA thinking cap is not just for CPAs either. Anyone can and should have at least one possibly multiple CPA thinking caps. Why? Because based on our scientific survey of five people, all of whom directly profit from the sale of these CPA thinking caps, wearing this CPA thinking cap without a doubt, according to the survey, increases accounting productivity 10 fold. Yeah, at least. Apparently the hat actually channels like accounting energy from the quantum field ether directly into your head, allowing you to navigate spreadsheets faster. It's kind of like how in like the matrix when Neo learns kung fu, or at least that's what the scientific survey saying. So get one because the scientific survey participants could really use some extra cash. If you would like a commercial free experience consider subscribing to our website at accountinginstruction.com or accountinginstruction.thinkific.com. Three tabs down below example practice blank example in essence answer key practice tab having pre-formatted sales so you can get to the heart of the practice problem the blank tab it's a blank worksheet so we can practice formatting these cells within Excel as we work through the practice problem. Let's go to the example tab to get an idea of where we will be going what we will be doing we're looking at correlations once again two different data sets to see if there's a mathematical relationship or correlation between them are the data points moving in alignment with each other to some degree and if there is a mathematical relationship the next logical question would be is there a cause and effect relationship if there is a cause and effect relationship the next logical question would be what is the causal factor in that relationship. This time we're looking at some two generic data sets with few data points so we can focus in on the heart of what we're looking at which is the unusualness of the results of this particular scenario and we'll do our normal kind of calculations for the correlation and we'll get a result that's a little bit less than what we would expect from it the point being that we can't depend totally on any kind of statistics or one statistical calculation generally we still have to analyze what is happening so that we can get the most out of whatever data that we are looking at. So let's go back to the blank tab and build this out. So I'm going to first format the entire worksheet by selecting the triangle up top right clicking on the selected cells we're going to format those cells as we do every time I like to make it currency negative numbers bracketed and red no dollar signs no decimals to start off with we're doing our generic data I'm going to say control up and scroll in a bit generic data so I'm just going to call it x and y x and y let's make the whole thing bold too because I like to work I like to be bold not all the time but just when home tab font group bold when you're recording that's with it that's what my producer tells me gotta be you gotta be bold when you're when you're doing your screen cast so we're going to go to the we're going to go to the home tab let's go to the home tab alignment center this let's also hit the drop down and make this black and white all right and then so there we have it and I can make them a skinnier let's make these skinnier too we don't need it that wide because we're just going to have some numbers in there and then we'll list this thing out so the first data set is just going to be one two three four five and then zero and the second data set is going to be one two three four five and seven so if you were to look at these two data sets and you're trying to say is there a mathematical relationship between them just looking at the two data sets you would say well it looks like kind of they are moving together for a much of the much of the process here so that would be our our general sense if we were just to like take a look at the data let's do the calculations with it let's make a skinny c over here skinny c there it is and let's copy over the x and the y headers x tab this equals the y tab format paint let's just format paint this one over home tab clipboard format painter paint that on over here and do our mean calculation and our standard d and so this is first one's going to be equal the average just like we've seen before so i'll do it fairly quickly because this is not new stuff if you've seen some of our prior practice problems on this and if you haven't then you should go take a look because they are excellent and then we go to the home tab we go to the number group let's decimalize this thing let's just copy it to the right this time with the fill handle dragging that fill handle to the right which will format it and give us the relative formula let's do the standard d now standard d for the sample standard d.s so this is going to be tab picking up the x data control shift down enter decimalizing it home tab number group decimalize fill handle grabbing that fill handle dragging the fill handle to the right and the standard d for y calculating as you can see let's go ahead and select these two cells up top and skinnerize them skinnerizing the cell and then we will copy over our correlation formula so we'll do our correlation calculation right here so we'll bring this down and put it right there so we can see what we are doing and that's too big let's make it a little bit smaller okay i'm going to make a skinny g now skinny g skinny g this is going to be x well let's just copy over the x and y data do our normal calculation on this i'm just going to copy it over paste it right here and let's insert another column i'm going to select the entire column y or i right click and insert and then we're going to say this is going to be z of x so we're going to calculate the z scores calculate the z scores for all the x's all the y's sum them up divide by the count n minus one the count being one two three four five six minus one all right so we're going to say this equals the z scores calculated brackets this number the first x minus the the mean i'm going to say f four on the keyboard dollar sign before the letter the number the e the two so that when we copy it down that cell doesn't move down closing up the brackets and dividing it by the standard d which is the one point a seven f four on the keyboard once again dollar sign before the e and the three so when we copy it down that don't move down let's hit enter let's decimalize home tab number group decimalize double click in the fill handle to drop it down let's do the same for the y's y because that's how you do stuff this is going to be the z of y let's say and then let's format paint it home tab clipboard format paint brush format painting calculating the z scores for all of the y's equals brackets the first one minus the mean for the y f four on the keyboard dollar sign before the f and the two so that you don't lose that number when you move it down let's divide it then by the standard d for the y f four on the keyboard dollar sign before the f and the three don't you see let's hit enter and put some dots with some decimals here home tab number group decimalize double click in the fill handle to drop it down okay now we've done the z's for the x and the y's we'll now multiply each of those z's together so we're going to say z of x times z of y and i'll format paint brush it again home tab clipboard paint brush painting it down and then we're going to say this equals z of x times z of y decimalize it home tab number group decimalize and once again dropping it down with the fill handle by double clicking and then let's make all of these a little bit more skinnerized so i'm going to go from h to l h to l so we're going to go there we go and so let's make them a little bit thinner and so okay so then uh we have the whole top bit but i need to sum up this column that'll be the numerator and then the denominators n minus one so let's make a skinny m and do that last calculation in the end in the end so i'm going to call this calculation the r or correlation with an e even though it sounds like an a when you do it there's this r re even though it's sound when i say it i say it with an a sound but that's english spelling for you whatever sum of z x z x times z y you just don't say it right you don't say things good that's why whatever i say things good this is going to be the sum let's put in the outer column equals the sum brackets of these i know how to speak and i speak american here anyways home tab let's add some decimals did i sum this sum in this up home tab number group decimalizing it and well it comes out to zero that's the point of our practice problem i got distracted so it sums up to zero and then we're going to say this is going to be the n minus one let's put some a colon for our sub calculation and then this is going to be in space space this is going to be uh this is going to be less one and this is going to be n minus one so n is going to be the count so we'll say equals count brackets and then i'll count these oh hold on a second so we'll just count those meaning there's one two three four five six rows less one let's put an underline here for formatting sakes for formatting sake can't for formatting sake this is going to be six minus one or five let's put an underline under it for formatting sake and then let's indent these for fancy formatting sake home tab alignment indent home tab alignment indent and this is going to be the r correlation which will be equal to the zero over the five which is of course zero even if we decibelize it so that's the point that we want to get to we're like okay wait a second it comes out to a correlation of zero which would mean zero correlation but when you look at it intuitively we probably want to say well it looks like these two things are moving together quite consistently until of course that last point and that's where we have to we can't that's where we have to say we can't just rely on any one statistics oftentimes uh because we have to we have to then look at it from different angles so if we were to look at this in different angles we would say i don't know if that's telling me the whole story about the correlation or relationship between these data points given the fact that this outliers throwing everything off obviously if i remove the outlier they would be perfectly correlated right so and that's just the the the beware story let's go ahead and and plot this thing out if i grab this and i go into my insert up top and we go into the charts and we do a scatter plot you can see what's happening here so we can see uh what is happening uh hold on a sec i think i messed it up you messed it up okay i didn't want to grab that excel you know what i'm doing stop being stupid oh okay so let's get rid of this and then let's label this thing plus button here access titles starting with the x which is always the left data by default so there's the x and then the y which is equal to the y now remember that here we know i don't know if there's a causal factor or anything because we have random data that is happening here so we're just looking if there's a mathematical relationship between them and we wouldn't really know you know which data set should be on the x versus the y you can plot it either way but usually the independent variable we think about on the x but if i plot this out if you look at it we're now saying that looks like a straight line that looks like a perfect kind of correlation but if you add the uh this one out here mathematically our correlation calculation comes out to zero correlation even though there's just a clearly a fundamental uh relationship between these lines so if this was a real data set in real life would we really want to say that there's no relationship between these two data sets probably not there's probably something going on with this one data point that messed everything up there's you know so that's why again anything we're looking at with statistics we can't really look at it just from one angle typically we want to look at it from pictorial standpoints uh data standpoint analysis and other things and this is what often people frustrate it frustrates people with statistics because they say well that makes it confusing and you can't really nail anything down that way and whatnot because people can lie about it and stuff but again that's the same with words like anything that you communicate with it's the same people say stuff and you can't just listen to what they say because a lot of people don't say what the what the truth or or even if they're trying to say the truth that may not be right so you have to kind of dig down and that's just the way to so any case if i add a trend line over here you can see it's a perfectly straight line it's like what because it should you would think it would be uh there'd be some kind of relationship but no and then i'm gonna hit the drop down straight line let's make it orange like i normally do i like to have the trend line be orange because that's the trendy thing for lines these days if you're going to a the party a line party then you should be dressed up in orange because that's what trend lines that's trendy okay we get it so there's that let's go ahead and do this calculation then uh uh with our data tab set so we can do it this way so here's our data analysis if you don't have that it's in the file tab it's in the options it's in the add ends i'm doing this fast because we've seen this before the excel add ends and then there's the tool pack if you want to run with the cool cats you need the analysis tool pack all right so then we're gonna say then uh we're gonna say data and add the tool pack and we want to calculate the correlation data points are gonna be these data points and i'm gonna say okay i pick the label so i'm going to pick the label and check that off and then where i want to put it not there i'm going to put it where does it need to be it's going to be right there in r r1 say okay and then boom so you can see there's a zero correlation between the x and the y here now if i did the same one less the last data point let's see what happens right i'll get a much different result so i'm gonna say data output data analysis correlation and let's say that i take my data but i pick it up from here to here i don't include that last point what do you think's gonna happen let's see labels output where do i want this one needs to be in a different spot you can't put it on top of the other one excel put it over here and then okay so now you get a one which is a perfect correlation so this is this is the the the without the last point so total data set without with without last point maybe i should put this down here and or maybe i should put this right up here that's what i should do and then i'll make this a little bit wider and there we have it okay let's make this smaller okay so that's the point that's the point so we need to we need to we need to look into it a little bit uh deeper oftentimes still can't just rely on so if i for example if i was to use this thing and i just spit out this number and that's all i did then uh then i would say that was zero correlation but if i just look at the data with this small bit of data say well that like sounds kind of funny if i grabbed the data i'd be like huh that looks like it's straight line but it's got a zero coral that seems kind of weird and then if i actually did the z scores like this i'm much more likely to say it looks maybe there's something more going on that we should be looking at all right let's make it nice let's put some blue borders around this home tab font group i'm going to go to the blue drop down if you don't have that blue drop down it's in the standard blue and let's put some borders around it and then we'll do that all the way across blue bordered border blue border blue border blue and let's do it here too for the border blue border blue and here too for the border blue border blue and then here too for the border blue let's make this top bit black and white let's go to the black drop down and the white for the the header and this one's going to be black, white for the header. All these could probably be thinner because you just have a one in there that doesn't, ones don't take a lot of space. And then I can select all the skinnies. I'm holding down control and then letting go of control, holding down control so we can even out the skinnies. So we have uniform skinniness. All the skinnies should be uniform. Okay, and then I can check the spelling because I think I got correlation correct this time. Look at that. I know how to spell correlation now. You spell it stupidly because that's the way that English does it with an E there when even though you say it with an A because that's how that's how you say it in American. I speak American. Anyway, you speak English. I speak American, okay whatever.