 statistics and excel perfect negative correlation get ready taking a deep breath holding it in for 10 seconds looking forward to a smooth soothing excel first a word from our sponsor yeah actually we're sponsoring ourselves on this one because apparently the merchandisers they don't want to be seen with us but but that's okay whatever because our merchandise is is better than their stupid stuff anyways like our cpa six pack shirts a must have for any pool or beach time mixing money with muscle always sure to attract attention yeah even if you're not a cpa you need this shirt so you can like pull in that iconic cpa six pack stomach muscle vibe man you know that cpa six pack everyone envisions in their mind when they think cpa you know as a cpa i actually and unusually don't have tremendous abs however i was blessed with a whole lot of belly hair yeah allowing me to sculpt the hair into a nice cpa six pack like shape which is highly attractive yeah maybe the shirt will help you generate some belly hair too and if it does make sure to let me know maybe i'll try wearing it on my head and yes i know six pack isn't spelled right but three letters is more efficient than four so i trimmed it down a bit okay it's an improvement if you would like a commercial free experience consider subscribing to our website at accounting instruction dot com or accounting instruction dot think of it dot com here we are in excel if you don't have access to this workbook that's okay because we'll basically build this from a blank worksheet but if you do have access three tabs down below example practice blank example in essence answer key practice tab having pre-formatted cells so you can get to the heart of the practice problem the blank tab it's a blank worksheet so we can practice formatting the cells within excel as we work through the practice problem let's take a look at the example tab to get an idea of where we will be going what we will be doing we are once again thinking about correlations between different data sets to see if there's a mathematical relationship or correlation between them if there is then the next logical question would be is there a cause and effect relationship if there is the next question would be what's going to be the causing factor uh in the relationship so right now we're going to be focusing in on the negative correlation a perfect negative correlation last time we looked at a perfect positive correlation just like with a perfect positive correlation the perfect negative correlation is not something we often see in nature because we're usually looking at two variables that might be moving together but not be perfectly moving together but we want to look at that perfect situation first so some things where that will be the the case are things such as you're you're talking about the distance traveled versus the distance remaining obviously if you're traveling somewhere and you're trying to get to an end location as you get closer and closer to the location the distance that you're traveling is going up and the distance that is remaining for you to get there is going to be going down in a negatively correlated relationship so that's what we'll use with our example to see it we're going to first create the distance traveled and this time we'll just use a random generator to to generate it instead of what we did last time with the positive correlation where we used a normal distribution random number generator and then and notice that the random number generators will not give us really a difference between the correlations but we might be looking at data as we saw last time which tends to conform to a bell shaped curve if we were measuring something in nature like a snake or something like that which also could have a positive correlation to a second data set and this time we're looking at something where if you were looking at a curve related to it or trying to approximate the line because we're looking at just randomly generated data it might conform more to a uniform distribution but what we're looking at is to see the the relationship between the distance traveled and the remaining distance so we'll map that out the distance traveled and then we'll calculate the remaining distance we're going to imagine that our distance is 100 uh miles let's say to get somewhere so 100 minus whatever the distance traveled would be the distance remaining and we'll imagine that we don't know that these two data sets have that relationship right so then our question would be if I had these two data sets could we see the relationship could we figure out the relationship or how can we analyze whether there is one we'll take our mean we'll take our standard deviation we'll look at the charts noting that these charts will be going more towards a uniform distribution instead of a bell curve that we saw last time and then we will plot out our correlation calculation both kind of working the math as well as using excel to spit it out automatically and we're going to get to a negative one this time because it's negatively correlated exactly all right let's go to the blank tab and let's do it enough talk man that's enough talk let's get down to the brass tax but I don't want the brass tax or sharp don't step on them let's click let's take the whole sheet here we're going to right click on the sheet and format the cells and we're going to go to the currency and we're going to go to negative numbers and let's make this uh no dollar sign and start out with no decimals I'm going to go to the home tab font group make the whole thing bold you don't have to but when you're on when you're on camera with a screencast you have to be bold that's what you have to be bold that's what all the greats say in the theater of screencast excel you must be bold so it's like I'm in the great tradition of whatever here we go home tab let's go to the alignment center this thing let's wrap it and let's go to the font group black and white and then let's just generate some random numbers between let's do it between one and 100 we'll just have the random numbers so this is going to be equal to and I'm going to say then we'll have rand between rand between there's our formula we have seen in prior sections the bottom number is going to be we'll say uh should we go to zero let's go just go to let's go to zero comma and the top is going to be 100 so it's going to give us random numbers between zero and 100 we'll say okay I'm going to copy this one down now so I'm going to copy it down and let's go down to something that's not exactly 100 let's like go to 200 something to 206 and it's going to be 205 because there's a header so it should that'd be like 205 data points now the random number generator going back up is always going to keep recalculating it's keep reshuffling so if I want it to be solid and not reshuffle now that I have some random numbers generated I'm going to copy the whole thing select in the column control c or right click and copy put my cursor in c and right click and paste it 123 don't you see and there it is and now I can format it let's just copy the formatting for this one home tab clipboard format and then paint that formatting right there with my paintbrush I'll make a skinny bee skinny bee that's what they call me sometimes some skinny bee uh over here skinny bee all right so then we're going to say that the distance remaining distance remaining I'm going to copy the formatting over here in in this in c uh home tab format painter put that into d so if we're saying the total distance is 100 miles the distance remaining is always going to be 100 minus the distance traveled so if I travel 94 miles the distance remaining is going to be 100 minus 94 inverse relationship happening double clicking on the fill handle to pull that down so if I went 65 miles the inverse relationship would be 35 if we went 14 miles the inverse would be 86 and so on so we can see that relationship of course because we made the data set but if we had these two data sets and didn't know the relationship then we might do our analysis here to see that they're perfectly negative correlated which might give us some indication of what is happening with it if we didn't know what was happening because we're imagining that we didn't actually just make it so that we could make it do what we want so we're going to say let's make a skinny e column and let's say this is going to be let's copy the same this is going to be equal to the distance traveled and this will be equal to the distance remaining let's format paint it here selecting these two home tap clipboard paint brush make the same labels up top all right let's do our standard mean calculation this is going to be equal to the average average and of the distance traveled control shift down and enter so there's our average we normally do let's decimalize it to see if we can get a little bit more understanding about it you got to think about it with a little bit you're just covering the surface and we need to get a little bit in depth understanding about the psychological the psychology of the mean right there and get into the mean's head this is going to be the standard deviation s for the sample remember it's dot s this time we're working on the dot s is not the dot p's and this is going to be for the distance traveled control shift down enter so there's our standard d let's decimalize it home tab number group decimalize and let's do the same thing for the for the distance remaining equals i could copy that over the other cell but i'm going to do it again just so we can do it again average just because it's good times average tab picking up the distant remaining control shift down and enter decimalizing it home tab number group decimalize and then equals the standard deviation s for the sample as opposed to p for the population distance remaining control shift down and enter and decimalize home tab number group decimalizing it all right now we might we might then create a chart from it like we did before we might say well what does this thing look like if i just looked at them one at a time control shift down out control backspace we get back out to where we're looking insert tab let's just look at a histogram and see what that looks like let's make a histogram boom what does that do what's that looking like like and this is going to be a distance traveled okay and notice it should it's if we had a whole lot of data sets this will conform not to what we saw before which looked more like a bell curve but it's going to conform more to a uniform uh distribution because if we had a bunch of random numbers it's they should all be coming up somewhat evenly if we did it on to infinity right so even though we have a different kind of uh it's gravitating towards a nor a different kind of standard distribution or a curve if a line was to represent that it would be the uniform distribution as opposed to like a bell curve it's still going to have the correlation here which is going to be a perfect negative correlation in this case between that data set and this data set which will also have kind of a uniform distribution so i'm going to say control shift down and say insert and say let's go to the charts and make well before i do that let's go here and control shift or control backspace so i can put it where i want to put it and then insert charts make a histogram again and then this one is going to be for did okay now i can't see what i'm typing distance remaining okay so it also looks like somewhat of a uniform distribution they're not exactly the same but they both look like they're going to do this that they might conform to a similar type of curve which again would be an indication that there might be some you know relationship between them if this one looked like a bell curve you know it was more in the middle uh and tapering off to the sides and this one looked more like a uniform distribution that might be an indication you're not going to find uh uh a real relationship in terms of the correlations for example so we'll then say okay well let's just copy this i'm not going to make like a uniform distribution like we did last time with the bell curve i'm just going to say let's do the correlation calculation so we'll say here's our formula that we'll be working with now also before i do that just note that we might then just plot this and say if i select these two and say control shift down and control backspace i might say well let's plot these together which would be the next logical thing to do and i can say let's insert and we're going to go to the charts and we want to do this dots so scatter plot and it's like wow that's correlated all right it's a perfect negative correlation it looks like now i'm going to do the same thing we did before and that i'm going to remove the title because what i really want to know is where the x and y's are on the chart so the whenever we do these charts the one on the left is always going to be the x so i always think of x first so if i was to say plus over here add the axis titles this i can click on the axis title and say equals and i can point to the first one which is going to be the default x and enter and it'll do it with a formula there it is in the formula bar up top i can do it over here equals and this is going to be the distance remaining and enter so there we have that one obviously we don't really need a trend line because they all the dots are falling on a line but if i later on will enter the trend lines here so let's just practice doing that and more options on the trend line i'm going to go to the bucket and i like to make the line like solid like i like things to be solid man it's a solid line and then there we have it okay and then so there we have it now notice that normally we have the independent variable on the x but we might not know which one is the independent variable it might be useful to plot it the other way so we have the distance remaining on the x so you could see it either way the relationship is still there either way so so we so you could do it the other way let's just practice that now the easiest way to do that would be to actually reverse the columns to have the distance remaining on the left but you might have this in a larger data set or something like that you don't want to do that so let's just do the same thing and then we'll then we'll change we'll change the relationship afterwards so we'll select the data insert and then we're going to go into charts scatter plot and there it is there's the same thing but this time the leading the title adding the data labels i want the axes this time the x axis i want it to be equal to the distance remaining and i want this axis to be equal to the distance traveled now it's currently reversed so we need so in order to flip them i select the data chart design up top data area or group select the data and then i'm going to go into my distance remaining edit it and then here we have it we're going to i'm going to leave the name alone x series i want to change that it's currently picking up the one on the left i want to pick up the one on the right so i'm going to delete it put my cursor in the six control shift down control backspace back up to the top close it and i think that's good now the second one selecting it delete it be careful because it gets a little finicky excel gets a little tricky here sometimes and then this is going to be the one that we want on the y which is going to be the one we want on the y is going to be this one now control shift down and control backspace and there we have it so now we've got they're flipped i think i hopefully i got them flipped okay it looks quite similar because of the nature of our graph here and so i'll say okay so you can see it's still negatively trending so it looks it looks very similar but we flipped the axis so it's not like you might think at first that it would flip to be like if you flip the axes that it would flip to be like a perfectly positive correlated no it's still perfectly negatively correlated a downward sloping line and if i added my trend line i could say more options and let's put a line in here let's make it a solid line and let's make it orange orange orange you glad that i added the trend line because i would have no idea how close that was to a line without the trend line that's sour chasm because it is they all land on the line and obviously if we think about this this makes sense because if we're looking at the distance the distance traveled as or the distance remaining as the distance remaining goes up then the distance traveled is going down you can look at it the other way as well if you're looking at the distance traveled as the distance traveled goes up the distance remaining goes down right distance travels goes up to 80 distance remaining goes down to 20 distance traveled goes up to 100 distance remaining goes down to zero so you have that negative perfect negative relationship all right so now let's let's let's see this mathematically with our formula so i'm going to pull this to the side get get it out of the way get out of the way we're done with you we don't need you here and i don't want to delete you though because you're still we might you're still it's still good but uh we'll put you over on the side here all right so then what we want to do is let's make like a large eye over here i'll pull this to the side and make a large eye and then i'll squish these up into the area so they're in there and so they have their space everything has to have their space everything has just like when you're eating you have to put everything needs its own area you can't just you can't just squish everything together you know it has to the plate is there to space out the stuff on it okay so now we're going to pick out i'm going to pick the same data so we'll take this data control shift down and copy it right click and copy and i'm just going to put that same data over here and right paste it i think i could just paste it normal because it's just data no well this one has a formula in it will that mess us up no that's fine that's fine that's okay so i'm going to make this a little bit larger let's let's i'm going to insert a column between the case so i'm going to i'm working my math here which just means we're going to take the z score of x all the x's and the z score of all the y's multiply them together sum them up and then divide by the count in minus one and then so to do that i'm going to do another column selecting column k right click and insert and i'm going to call this is the z the z of distance traveled traveled and this is going to be equal to the way we calculate our z brackets the 94 miles traveled in this case minus the mean which is 53 92 f4 on the keyboard making a dollar sign before the g and the two so it doesn't move down when i pull it down divided by the standard deviation for the sample which is over here and we're going to say f4 on the keyboard dollar sign before the g and the three so it doesn't move down when i copy it down and enter let's decimalize it so we can recognize it home tab number group and decimalize a couple decimals and then we'll just fill handle double click drag it down there it is muy b to the n bn as they say in espanol and so we're going to say this is the z distance remaining and home tab uh clipboard let's do the paint brush and paint brush it so that's good a little wider and so then we're going to do our z score z score z score for the distance remaining this is going to be equal to brackets uh the six so that's six miles remaining minus the mean which is this one f4 on the keyboard dollar sign before the h and the two closing it up divided by the standard deviation of a sample and it's going to be f4 on the keyboard dollar sign before the h and the three and enter and decimalize it home tab number group decimalize and then double click the fill handle to drop it down dropping it down like dropping a beat when i'm making my music video uh and i'm trying to trying to put down the bass beat so then i can play like the guitars on it or something dropping dropping the bass anyway i don't know what i'm talking about so then i'll multiply these two together so now we can sum these two columns up and we've got the distance between each data point and the middle and uh uh divided by so we've got the z right we've got the z's i can sum them up but now i need to multiply them together so this is the z i'm just going to abbreviate the travel times asterix the z remaining i'll just call it rem remaining and then we'll say home tab clipboard format paint to here boom and multiply it out this equals the z distance travel times the z distance remaining enter let's add some decimals home tab number decimalize and double click on the handle to drag it down fill it on down handle and we get the negative numbers given the nature of what we're looking at here all right so that so that is good so now let's make a skinny o column and then i'm going to say we'll do our correlation calculation correlation we have all the tools necessary to get the job done so now let's just take those tools home tab font group uh here and do the job black and white so we're going to say then this is going to be the sum of the z the z travel times the z remaining the sum of that column so we'll say that's going to be equal to the sum and control shift down enter let's decimalize it home tab number group decimalize and so so now we've done the whole summing the top bit the top part now we just need to make the numerator so the numerator i'll just put a subtitle here n minus one colon to indicate that i'm going to do a little sub calculation in the middle and by the way turning these formulas into like i would call this like a tax worksheet kind of format just a worksheet is actually quite useful so it's too because then that lets you see all the steps that are being made in the in the problem and it could be useful in practice when you're if you're trying to put some kind of worksheet together that you might use multiple times and run different things but in case here's an an n the n is just the count so i can count any of these that we want i'm just going to use our count formula equals count tab and i'm just counting the rows so i'll choose the first one control shift down it's just counting them so how many rows were there 205 you'll recall that 205 when we first started the practice problem and generated 205 data points so then less one because that's what it has here in the numerator minus one so there's the one and that's going to give us the n minus one without the colon putting that in the outer column this equals 205 minus one let's do some fancy indentations for the sub calculation selecting these three home tab alignment indent and then i double indent in here home tab alignment and indent again so that's our standard sub calculation colon representing there's going to be a sub calculation we pull that into the center column and then we indented and then when we finished the calculation to get to what we had with the subtitle we put it on the outside so now we have the numerator and the denominator and the outer column which is a standard tax return kind of format which may kind of turn you off with the tax return format but it's actually a good a good like strategy or format when you're trying to work through practice problems even though taxes are not enjoyable for most people so this is going to be the r or a correlation maybe i should have put the r up first here r correlation and so we just divide that out because we got the numerator and the denominator home tab font group underline and let's just divide it out obviously 204 divided by the 204 is one but now it's negative one which means now it's perfectly negative core negatively correlated and so that's what we would have that's what we expect so i was like okay perfect negative correlation that makes sense now let's do that same thing with the with the with with excel just telling us it's negatively correlated so to do that i can go to the ins to the data to the analysis now if you don't have that analysis tool remember you just go to the file tab you go to the options you go to the add ins you have the excel add ins go to that and then add the analysis tool pack the super cool pack of tool so then here we are and if you don't have if you don't have the tool pack you are a tool i don't even know what i mean anyway it's correlation here's the correlation we're going to say okay and we'll find this one and let's put this down and we'll say we're just going to take these two i'm going to i'm going to take the headers in control shift down and our data so we're going we have to have them side by this side you can see i'm going to say backspace i think i've got the right data so it's still summon that up yes all right and then i'm going to say okay and then labels checked off because i had the labels in there so you have to have that if you included the labels when you summed it up or or drew the location of where the thing is okay that wasn't very well said but i will leave it in okay and then where the thing is when you did the other thing let's put it here and say okay and then okay so there we have it so then it tells us it's perfectly negative correlated i'll format this let's make some uh black and white home tab font group black white let's uh center it uh and wrap the text okay so that looks good and then of course i'll just going to do that last analysis too let's put let's pull this underneath now so we have room for our charts can nice go there nice and cozy like underneath so they're they're like not too far away from everything else and they feel like they're part they're part of everything that's going on you know they're not like out in the distance in the cold let's also then do the other data analysis data analysis and do that uh description descriptive statistics just to practice with that because it's a cool tool it's a cool tool let's go same thing same set of data selecting the data and then i'm going to say where do we want to put it i'll tell you where to put it excel you put it right there that's where you put it that's where it goes that's why you put it there summary statistics and then i'll do the confidence intervals although we're not going to focus in on them now say okay so then and then let's make this a little wider boom and it gives you the the mean the standard error the median the mode standard deviation and that of course can give you some indications about the relationships between the data sets but remember that this is kind of static data so it does it's not dynamic so if you're you know you can't change it as easily so it's any case let's do let's just format our stuff now like we normally do i'm going to go all the way back to the beginning control shift down i'm going to make this blue home tab font group make it border blue if you don't have that blue by the way it's in the more color you don't have to do this blue by the way it's that blue right there you can color it whatever color you think is fine home tab font group i like the blue i've grown up with the blue the blue has has in i've find it it has endeared my heart and i uh so i stick with it home tab border blue but that's just me you know you can do whatever you want i do enjoy other colors as well home tab font group blue borders and then let's do this one i tend to like to use the other colors for other standardized areas see and so i have it all worked out in my my logic system in my mind which isn't always that logical it has a it has this plan of which colors are supposed to be where and uh uh that's just so i just that's why but you don't have to follow that and here we go and then this one home tab font group and black and white color coding is not my strong suit it's not my strong suit that's why i use the same colors because then i don't have to think about it let's put some borders around that let's do a spell check on it if we would check the spelling correlation i can't i can't spell that word i can't spell it it's a hard word to spell that's why you have spell check though so that's cool no worries all right