 Statistics and Excel. Statter plots with car related data. Get ready, taking a deep breath, holding it in for 10 seconds, looking forward to a smooth, soothing Excel. Here we are in, first, a word from our sponsor. Yeah, actually we're sponsoring ourselves on this one because apparently the merchandisers, they don't wanna be seen with us. But that's okay, whatever, because our merchandise is better than their stupid stuff anyways. Like our Accounting Rocks product line. If you're not crunching cords using Excel, you're doing it wrong. A must have product because the fact as everyone knows of accounting being one of the highest forms of artistic expression means accountants have a requirement, the obligation, a duty to share the tools necessary to properly channel the creative muse. And the muse, she rarely speaks more clearly than through the beautiful symmetry of spreadsheets. So get the shirt, because the creative muse, she could use a new pair of shoes. If you would like a commercial free experience, consider subscribing to our website at accountinginstruction.com or accountinginstruction.thinkific.com. Excel, if you don't have access to this workbook, that's okay because we'll basically build this from a blank worksheet. So you could start with a blank worksheet. But if you do have access to this workbook, three tabs down below. Example, practice blank. Example, in essence, answer key. Practice tab, having pre-formatted cells in it so you can get right to the heart of the practice problem. Blank tab, just having the data within it so that we can practice formatting as we work through the practice problem. Let's go to the example tab to get an idea of the end result. We're gonna take our data. We're gonna sort our data in a table. Then we're gonna be making scatter plots from the data. Now the scatter plot will typically be taking two sets of data and trying to determine whether there'll be a relationship between the sets of data, the independent variable typically on the X or horizontal, the dependent on the Y or the vertical. We'll create the scatter plot a couple different ways. One way, we will just take the data and copy and paste it and that way we can easily put in the scatter plot. And the second way, we'll try to take our full data table over here and select the cells we want for the scatter plot. We'll also do another one comparing in a similar fashion, miles per gallon and acceleration and have a similar scatter plot. All right, let's go to the blank tab to get into it. Now we have a lot of data over here. So if you don't have access to this data, I believe we got this data from Kaggle so you can go into there and search for the data related to automobiles and possibly find this one there or you can copy the data set from here if you wanna type in the data set. It's quite a long data set or you could do your own research for data set from a similar set of statistics here for this type of data set or you could try to create the data set although this would be a more complex one to do that with. All right, so I'm gonna scroll up top and I'm gonna delete this item. So we have our data on the left. So I pull this data in, it's formatted nicely because it's all in a vertical table for us already. So what I would like to do is put a table around it now and then I can sort my data more easily. So before I do that, however, let me format the cells like I do basically every time. I'm gonna put my cursor on the triangle up top, selecting the entire sheet, right click the selected area, formatting the cells. I usually go to currency and then I go to negative numbers, bracketed and red, get rid of the dollar sign and then you gotta look and say, do I need decimals? And for most of these, we don't need decimals except for possibly the acceleration. So what I'm gonna do is take the decimals off by default and then maybe go into this column and add them back in for this particular column. And so I'm gonna say, okay, let's do that. That's by underlying format. I can then select this data set, G, the entire column, if I wanna add decimals, home tab, number, possibly add a couple decimals there. So now we have it nice and number formatted. Then I'm gonna put my cursor somewhere in the table. And by the way, let's make it bold too. I'm gonna select the entire thing again, home tab, font, bold it. And then I'm gonna center the header rows so that Excel knows that these are headers, home tab, alignment, and center them. You might also wanna wrap the text, alignment, wrapping the text so that now you can see the whole thing and even though you've got it squished up, even though they're kinda squished up here, so you got it wrapped. All right, let's put our cursor in somewhere in the data set. I'm gonna go to the insert tab, tables, and add a table as we have seen in the past. Now the dancing ants are going all the way from A1 to H39. So notice only two cells are necessary to grab this whole table. Also note that when there's missing data, like this blank piece right there, sometimes you might wanna go in and put a zero there or something so that you don't have nothing in the cell because when you have nothing in the cell, that could mess up your tables and whatnot when you create the tables. So I'm gonna say, okay, so there is our table. All right, so we wanna make a scatter plot now and the scatter plots are found in the insert tab and then the charts group and then we've got our dots here, here's the scatters. So now the question is, well, we need two sets of data. So when we were looking at the histograms, we were usually looking at one set of data and thinking about the midpoint of the histogram and the spread. Now we wanna take a look at the relationship between two sets of data. So first let's think about horsepower and the miles per gallon. And so then the question would be, well, do you want the horsepower on the X axis or do we want it on the Y? Do we want it on the horizontal or on the vertical? The horizontal is usually gonna be what we would call the dependent and the vertical, the independent. Although you can kind of see it the other way, too, we're plotting the two against each other. But so the easiest way to do this, if I want the horsepower on the horizontal, making it the dependent variable traditionally, I'm gonna copy that and I'm gonna put that over here in J. So I copied the whole column, I could select the whole column like this and control C or right click and copy. I'm gonna put that in J1 and right click and paste. So there we have it, I can paste it like that. So there we have our horsepower and then I wanna pick up the miles per gallon. So I'm gonna pick up this one, I'm gonna select the entire column, put it in my cursor on B, selecting the entire column. You could say control C on the keyboard or right click and copy. And then you could say control V on the keyboard or right click and paste. So there we have it, so there's our variables we want. Now again, if I wanna put a table around just these variables, I can put my cursor here and insert tab and in the tables group table. And hopefully again, the dancing ants go all the way down to the bottom of the table. They do, I'm gonna hit okay. So there is our table. So now we can sort our table. They're kind of together now. These two data sets are kind of together. So I can then sort it this way. I can say from A to Z on the horsepower or from Z to A on the horsepower. And I can take the miles per gallon and take it from A to Z or Z to A in this fashion as well. So we can sort it in that way. And now if they have them side by side here, it's really easy for me to enter the scatter plot because now I can just simply select these two items. Now, of course, there's a couple of ways you could do that. You could select the entire thing, scrolling down like this. And then if you do that, once you enter the scatter, you don't wanna be down here because then your scatter plot will be input down here and you'll have to drag it up. So you wanna go all the way back up to the top. But you might want, if you have a table, you can kind of, you can click, if you click where you have the dropdown, it's only selecting the data. So I could click that way and it selects the data, but really I want the whole table. So if I double click here, it selects the whole table and then I can drag that to the right, which is what I want, right? I want the whole thing. So I'll include the headers in it. And as I input the table, it should be able to determine what is the header versus the data. All right, so then we're just gonna go to the insert tab. We're gonna go to the charts group. We're gonna go to the dots here and then I'm just gonna do the standard scatter. So there's the scatter. I'll pull that over to the right and let's pull this cell to the right a bit and I'm gonna go over a bit more and we'll make it a bit larger. And so there we have it. So let's check this thing out. So I'm gonna remove the title. And so now it's gonna be important to kind of list. You kind of, the axis these titles are very important here because we've got to be determining what did it plot on the X down here versus the Y. Now you can get an idea for this because if I go to the horsepower and I sort it from Z to A, it's going to 230. So 230, clearly this side of the X is the horsepower. That's typically how it will be if you sort your data with the X axis on the left and the Y on the right in the columns, it'll basically automatically then create the scatter plot in the format that you want. So then we can then go in here and say let's add the axis titles. So these are gonna be important. So down here we have the horsepower. Notice if I click on it, I'm not really inside it. It doesn't really remove the title here. But if I start typing, I can see it's showing up up here. So if I type in horse power, then it types in. And I believe you can actually use a formula too. I could just say this equals the horsepower from there. And then now you have a nice link. So if I changed the title name or something, it would change. So that's probably a more efficient way to do this. And so I could say this one equals, you can see the equal sign up top, the miles per gallon. And so now we've got the miles per gallon. So there we have it, pretty straightforward. Now if I wanted to look at or change the dataset, I can go into here. Now if I'm off of this, just like all of our other charts, I don't have my added tabs up top. If I go into the chart, now I've got the design and format tabs. If I go into the design tab, I've got the adding of the elements, the access, the access titles, many of these also being shown in that plus button. I've got the quick layouts here. So we've got the adjustments of the layouts if we wanted to test out these kind of custom layouts. We've got the color changes that we can put into play. Again, we have different kind of formats of the layout here if we wanted to pick those. If we want to switch the columns, switch the columns in row, and then the select data. So I want to select the data. And so this one might be a fast way sometimes to swap the data over the accesses. But if I go to the data here, we can see the data. So now we've got the miles per gallon. And over here, if I go to the edit and go into the edit. So we have the series name, if this is the X and this is the Y. So the X down here, this axis is here. So if I select this item, I could see what is in the X. If I wanted to swap them, this is one place I can go to switch them. If I wanted to switch them, I can say this would be the miles per gallon on the X and then the horsepower on the Y. But because we put it in this format, it pulled it in properly in the proper format. So that looks good. The X and the Y looks like what we would expect. And then you have your formatting tools over here as well. All right, so let's take a look at this side. If I go into my plus button, I've got the axis. I've got the axis titles. I've got the chart title. And if you have the accesses, the chart title might not be as important. I mean, you could have a chart title, but the most important thing here that you wanna make sure that you're putting the access titles oftentimes, because that's gonna tell people what's actually happening. Whereas in the histogram, and you only have one set of data, you might not need the access titles. You could just put the chart title. So if you need the data labels you can put here. Obviously, if you have a whole lot of data, this is gonna be quite tedious. I'm gonna add it for now just so we can kind of tie it in over here just to check it out. The error bars, so you could see again, they put the little bar and whiskers here. That could be useful if you don't have a whole lot of data. It would be not as useful if you have a whole lot grid lines. The legend, if you have multiple sets of data, then you could put a legend in, but we don't need one, because we only have one thing that we're comparing to at miles per gallon. We have one, I should say, independent variable. And then the trend line. Now the trend line is something that's quite common, right? So usually when we do something like this, we're gonna wanna add the trend line. So that puts a line, an approximation through the data. Let's get rid of the numbers again. Those numbers, that's too much. Data labels, grid line, get rid of those, okay. So now we've got this line through the data. Now oftentimes we're gonna want that line to be a little bit more defined. So if I could double click on that line, or I could go on into the more detail from the plus button, and that gives us our information over here. So notice we have a linear line. That's usually what we want. You can test out the different line shapes here that might fit your data more precisely. And one of the ideas here, of course, is to say, what I'd like to have is a line that I can have a function for, right? I wanna have, it'd be nice to have a line that I can create a function for that fits the data, so that then it's easier to do mathematical kind of approximations if you have a line. A straight line would be the easiest one, but some kind of line that you can have a mathematical equation for would be useful sometimes. Now sometimes it's nice to actually have the equation. So if down here it says display equation, so if I go into the equation, that gives us our equation for the line. So I'll make that a little bit larger so you can kinda check it out. Let's go up top and say that I'm gonna increase the size of that thing. So, and obviously, again, if we have the equation for the line, then we can kinda try to use that equation to make approximations and so on, even though the line is just an approximated line, just basically going through the middle of the data to try to look at a trend through it. So if I go into that line again, the other way you can get into that line is hit the plus button and then in the trends I got that linear line. Now I wanna open it back up again. So let's actually go down here to more options. So now I've got that in place. Now if I go to this bucket on the left, sometimes I would like to see that line as a different color quite often. Let's make it red, which will make it stand out. And then is that the color of the line didn't turn red? Let's go to red here on the line and then here we've got the styles of the line. I'd like to make it an actual line. So now I've got a line instead of the dots. So there we have it. That's some of the main items that we would be adding. Now, just to get an idea of this, if we have the horsepower here, remember this is kind of the independent variable and what you would expect then as the horsepower goes up the miles per gallon goes down, which is kind of what we see here, right? So we're gonna say that as the horsepower if I sort this from A to Z is low. So lower horsepower between 40 and like 60 over here. So horsepower is going up from 40 to 60. If I look at the miles per gallon related to that, we start to go down, right? So it's starting to trend down as the horsepower goes up, which is kind of what you would expect in general. And then as you get the horsepower up to like 100, then the miles per gallon are going down. And then as you increase the horsepower, the miles per gallon are going down. And then when you get way out here to really high horsepower, so if I reverse this from Z to A, and I'm looking at like the 230 plot over here, then you can see for some reason, the miles per gallon is actually a little bit higher than some of the ones prior to that. And you can start to look at something like this and say, well, why would that be the case? For example, I don't really know, right? Because then you'd have to get into the analysis part of it. This is just looking at the relationship of the data. But if I was to guess, if you come up with a hypothesis, you might say, well, maybe these cars are being driven on the road where there's a lot of stop and go on the freeway and whatnot, where higher horsepower cars are not going to be quite as efficient. So if you're driving a really high horsepower car around here in this range, not on the freeway, but stop and go traffic, you're probably going to be much less efficient on how far you're going to go. But maybe if people have really super sports power, very cars, they only drive them when they can actually go at a decent speed and not stop and go. And in that case, maybe the horsepower wouldn't be such a bad issue. It's usually the stop and go. But that's just a guess. Then you can get into the analysis on what's the cause. Now note that if you wanted to make the same graph and say, hey, look, I don't want to pull the data over. I want to just make it from this data set. So I can select the data this way. I can go to the horsepower and I can go to the miles per gallon. I could select these two data sets. And so I've got them selected all the way down. I noticed this one is selected. I just want the data in the table. So sometimes that could be a little bit difficult to do. I could say I want then... Let's try the keyboard method. One way you could do this, by the way, you could select the whole thing and drag all the way down like this just to make sure you're getting the whole data set and not going to the whole column. And then you hold control down and then I can put my cursor here and I can select these non-adjacent columns and select them. So that's the most intuitive way. You still need the keyboard to do it and then you can insert that way. Another way you might try is you could put your cursor. Let's say I put my cursor here and I use the keyboard holding down control and shift and then down arrow. This usually takes you to the bottom of the data set but because I'm scrolling down, that you have these blank spots here. It didn't take me all the way to the bottom. It got stopped at the blank. So I'm going to do it again, control shift down, control shift down until I get to the bottom of the data. And that's why those blank spots are problematic in a data set. And then to get back up to the top, I'm going to hold down control and the back space. Now I'm back up at the top. I'm going to hold down control to get back to the miles per gallon. So now I have two non-adjacent cells selected. Now I'm holding control shift again and the down arrow. So now it's selected the whole data down to here. Now I don't want to be at the bottom when I insert the scatter plot or else it's going to insert it way down here at 395. So I want to hold down control back space taking me back up to the top. All right. So now let's do it again. I'm going to go to the insert tab. We're going to go to the charts and let's do another scatter plot. Not that one, this one. Inserting it. And then so there we have created it again. So boom and there we have it. And it actually selected, I believe, so we'll know it's opposite now. See now you've got the horsepower over here and I'd like to switch the rows, right? And why did it do that? Because I had the miles per gallon on the left. So I want to switch now the X and Ys. So there's a shortcut kind of thing up top. So if I go into here and I go into the data it's got this switch right there but it's not letting me do it because of the maximum number of data series. So I'm just going to do it manually. I'm going to go in and say, all right, let's go to the data and I'm going to look at the data in the edit. And now you can see your X and Y data. So here's your X data. If I click in it, it's the miles per gallon. I want it to be the horsepower. So I'm going to delete what's in there. You could put in equals and then select the item or you can select this thing and then click over here where you want it to go. I'm going to try to select the data, putting my cursor where I have the arrow drop down. So it selects that data set and then I'm going to select this again and that'll allow me to see the whole box. And then in the Y, I'm going to do the same thing, delete the Y and then click this item to select the series I want, which is going to be the miles per gallon. I'm going to put my cursor where I hear see that arrow select and then there we have it. I'll click this again. So now I've switched the X and the Y. I can say okay and okay. So now we have the same basic scatter plot and what we would typically want to do is hit the plus button and we're going to say title accesses. I'll do this quicker because we've seen this before. This is going to be equal to the horsepower. So I'm going to do it this way this time. I'm going to say equals horsepower. You can see the table up top. So now if I change the headers, it'll change automatically access title down here is, I'm sorry, that was the wrong one. Wait a second. This one should be the miles per gallon and this one should be the horsepower. Now I lost that one. This one should be equal to the power of the horse. The power of the horse. And then of course we can add our regression line in the same fashion, adding the trend line. And then I want to look at linear, more options. But I can go to, I can add the equation if I want. Boom. The bucket and then I'd like to make it a line and red. You also have this glow thing over here that I think I put on the example. If you wanted to go into here, they've got some neat kind of shade and then the glow is kind of interesting. So they've got these default glow settings. So then if you wanted that dotted line that's not so pronounced so that you could still see the other dots in there maybe instead of a linear, but then you add that glow to it, then that lets it stand out or maybe you have just the blue line with a bit of a glow to it. So it kind of stands out but doesn't cover anything else up. All right, let's do the same thing comparing now miles per gallon as I'm gonna make a skinny J, put in my curse between J, I and J or skinny I. And so there we have it, a skinny I because I want to be skinny. And so we're gonna say, and then we're gonna say, let's do the same for the miles per gallon as the independent variable, the x-axis and the acceleration. So I'm gonna do this one faster. So I'm just gonna take this whole column, right click and copy. And the easy way to do this is we're just gonna put our columns of data side by side over here. I'm just gonna right click and paste. So there's our miles per gallon and then acceleration. So I'm gonna put my cursor here, right click and copy and put my cursor in Z, right click and paste. Okay, and then I'm gonna select the entire thing, putting my cursor in here, insert. I'm gonna go to the tables tab, insert a table and say okay. So it looks like it got the table all the way down, looks like properly allocated table. I can sort now by miles per gallon, Z to A or A to Z, Z to A, Z to A or now I can sort this way as well. And now if I want to insert a scatter, I could select this whole thing. Maybe I want the titles in there too. If I scroll down, you gotta be careful that it doesn't select the entire thing. So sometimes it might be easier to like select these two, control shift and then down and then control backspace so that we've got the data properly selected and then insert tab and charts and we want to go to the scatter and scatter, boom. So now we've got our trend. So I'm gonna delete this and then we gotta say okay which way are the X's and Y's going? Because I lined them up this way, I would expect this to be the X, right? This would be the X side and you kinda check by looking at the bottom and top so it goes from zero up to 50. So if I sort this from Z to A, like all right, it goes up to 50, that kinda makes sense. This one only goes up to 30. If I sort this from Z to A, that kinda makes sense. So it looks like this one is the miles per gallon which is what we would expect. I'm gonna hit the plus button here and say that I want access titles. Put my cursor in the access title equals the miles per gallon. No, this one is the acceleration and then this one equals the miles per gallon. We can add a trend line. So now this one is trending upwards. You can see that was what I was trying to get a trend line going the other way. This one trending down, this one's trending up. We're gonna say plus button, trend line. I want a linear trend line, more options and then I could say that I would like to see a formula for it. So let me check out the formula. There's our formula. Can make that a little bit larger. Home tab on the formula, increase in the size of it, pulling it up somewhere so we can actually see it and then back to this trend line and let's say that we want to make it, let's say we just want to make it glow, right? If I leave it at that same like color but then I go down here and say, let's go to the glow and give it that yellow glow. So now you've got that trend line that's not really covering anything up but you can still kind of see it and maybe I'll do a different glow. We saw yellow before, let's do an orange. So there we have that. And so there's that relationship. Now let's build this one more time but this time taking the information from the table on the left. So now I'm gonna go to the table and I'm gonna select miles per gallon. So I'm gonna try to do the easiest way to select these two columns at the same time. The most intuitive way is to put your cursor here and drag all the way down because you want to make sure that you select all of the data and not the entire column. But I think the faster way when you have a lot of data, I'm gonna close this out, is to put your cursor here, control shift down, taking you to the bottom of the data, holding control backspace, taking you back up, then I'm holding control down to select the other cell. So I have non-adjacent cells, cells that are not next to each other selected at the same time and then holding control shift and down to get the two columns. You might also try, by the way, I'm gonna go control backspace, putting your cursor just on this one and then this holding control and this one. So now I've got these two non-adjacent cells by holding control and then hold down control shift and down, but then it really only hits the second one. So you can't even control backspace. I'm gonna put my cursor here. And then control shift down to get both of them holding control backspace. All right, so you can experiment with different ways to do that, but going back up, we're gonna go to the insert tab, the charts and make the good old scatter. And before you enter the scatter, by the way, you might wanna position where you want it to go, right? So I might try to kind of move over here, even though I still have those other ones selected, so that the scatter hopefully will be generated somewhere over here, right? So I can go insert and then charts and then give me the scatter. And so now it's, I can just drag it right here. So I kind of put it in the middle of the screen where I'm currently viewing, you know, which is nice. And so then I can just do my standard thing here. I'll do it fast cause we've done it a few times and we're running long on time. So we're gonna say that let's add some axes. So the chart title, the axis titles. So this is gonna be equal to, and I'm gonna say the miles per gallon. And then this axis, scrolling to the right, this axis is equal to, we're gonna say the acceleration, acceleration going back to the right. And then we can add a trend line. If we so choose trend line, linear, more options, we want this to be, we wanna see the formulae down here. Give me the formulae. There it is. Let's make it larger. Home tab. I gotta be selected on it. Home tab, font group, increasing the font size, grabbing it, pulling it up a bit, clicking on that trend line. So now I have the trend line information on the right and I'll go to the bucket. And this time let's change the color of the trend line to like orange maybe. And then make it glow, middle tab, glowing. It's nice when stuff glows and it's the green glow, green and blue. I think that's not good for color blindness. Is it? I don't know. I don't know what I'm talking about. And let's make it, I did orange before. Well, we'll just do the same. So I'm gonna close that out. So that's the general idea of the scatters.