 statistics and excel histogram versus bar chart got data let's get stuck into it with statistics and excel first a word from our sponsor yeah actually we're sponsoring ourselves on this one because apparently the merchandisers they don't want to be seen with us but but that's okay whatever because our merchandise is is better than their stupid anyways like our trust me I'm an accountant product line yeah it's paramount that you let people know that you're an accountant because apparently we're among the only ones equipped with the number crunching skills to answer society's current deep complex and nuanced questions if you would like a commercial free experience consider subscribing to our website at accounting instruction dot com or accounting instruction dot think of it dot com you're not required to but if you have access to one note we're in the icon left hand side one note and excel presentation tab 10 30 histogram versus a bar chart we will also try to put the transcripts in one note as well and if you go to the view tab the immersive reader tool you can actually change the language on the transcripts if you so choose and either read or listen to the transcripts in multiple languages desktop version of one note here we have the information on the left hand side our data imagining it to be yearly salary or income of employees at a corporation we've already ordered the data from smallest to largest in other words instead of being in order by like alphabetical order of the employees we've taken the data and put it from smallest to largest which is a typical thing to do as a first step in sorting the data when we're trying to pull information from it now we want to think about the relationship between a histogram and a bar chart so we've taken a look at some histograms and prior presentation here's one that we built from this data set in a prior presentation so we want to actually use the bar chart and think about like a bar chart and how we can use excel's function of a bar chart to create in essence a histogram and that will give us a better sense of kind of the relationship between what the similarities and differences are so up top we just have some of our standard data up top so you'll recall that the mean or the average is can be done in a manual method or it can be done with a function so if we used an excel function in order to take the mean or the average we can use the average function so equals average and then we would just select the data so here's the data in the table and that would give us the average if we did the manual calculation we can then say that it's going to be the sum meaning we add up all the data to do that we can use the trustee sum function equals the sum of all the data and then we can count the data meaning counting one two three four all of the data and in excel we can use the trustee equals count function to do that and then select all the data which is nice and then of course we're going to divide that out if we used excel we would take this cell which is referenced by the d5 divided by this cell referenced by d6 3630 400 divided by 51 gives us the average 71 184 which matches what we got with the function and then if we take the median we can do that in excel by saying equals the median or the second quartile we could use quartiles if we want but the median is more common and if we were going to do it in a manual method remember it's just like rocky's trainer uh the boxer rocky's trainer if it's the old movie but you hit the one in the middle when you see three of them out there you hit the one in the middle that's the advice of so so rocky's trainer was a was a fan of using the median hit the one in the middle so then if we look at our our histogram the major components of the histogram are the buckets down below that we've discussed in prior presentation so you can see here the definition of the buckets are now 2000 apart and we have 15 of them so one two three four five six seven eight nine 10 11 12 13 14 15 buckets and they're separate and they're spanning 2000 55000 to 57000 so you could see that in this range we only have the one data point in it and then we have nothing from 57 all the way up to uh to what 67 and then at 67 to 69 we have seven 69 to 71 19 71 to 73 15 73 to 75 seven now note that if you were to try to select this data in excel and create a bar chart instead of a histogram then you're going to run into problems because the bar chart what it wants to do is try to put all of these data sets kind of on the x axis and then and then basically say how often they each one of them appear but you can't have all of these separately listed uh because because uh they might you might not have two numbers that are the same right like if you if you had a small set of data where you had a lot of repetition and whatever it is that was coming up then you could use the bar chart and it would basically give you the proper x axis but here what we want to do is say a range we have to take this information and make a range from it like 55000 to 57000 so that we have the appropriate bucket and then we can have a something like a hit this histogram looks a lot like a bar chart so you might say hey I could do that myself like I could just make my own buckets I can make these ranges and I can count how much are in the range and and then I can use the bar chart function to create a histogram and that if so let's see what that would look like that will give you an idea of the similarities between again a bar chart and a histogram and also sometimes it's useful to do that process to actually make your own buckets create your own histogram because if you wanted to put like two histograms on top of each other sometimes that's easier to do with like a bar chart than a histogram so let's see how we can construct this this histogram using like a bar chart method if we take our information on the left hand side we could start with the smallest point which would be 55 so the 55 is the first bucket I'm going to bring it down to 49 however just for that first bucket and then I'm going to go to to the 57 and I'll show you why when we do the function and then from there we're just going to keep on going up by 2000 so you can see these this range goes from 57 to 59 to 61 to 63 so all I do with a formula here is just take the prior one starting at this point 57000 plus 2000 to 59000 plus 2000 to 61000 plus 2000 to 63000 and if I had a formula here that just said equals the one above it plus 2000 I can drag that formula down and it'll give us uh it'll be able to to give us all of our numbers so these are the beginning points except the start the first one at 55 I started it at 49 so I should have said these ones 55 then at 57 to 59 to 61 to 63 now I'm going to take the ending points so the ending points end at 57 so there's the 57 and then again I'm just going to add 2057 plus 2059 plus 2000 to 61 in excel I would do that by saying equals the cell above it plus 2000 and then I can simply copy that down that's not the only way you could do it you can also type in a couple of these numbers it'll make a trend and then you should be able to copy it down and excel will be able to see the see the relationship the pattern so in any case there's the two now now in excel if I'm going to if I'm going to use excel to do this I would like to then get a bucket that's labeled like this 55000 comma 57000 or 55000 to 57000 something that looks like this you can type that into excel one way to do that is simply of course to type into the cell 49000 to 57000 note however you have to be careful when typing numbers or when you're trying to type something in that has like an equal or a plus or something like that where you don't want it to actually do a function a calculation or change it to a number format but simply put in what you typed so sometimes you need to put an apostrophe before you type something in to tell excel hey look I just want you to put what I typed in there I don't want you to try to make a formula from it however even if you type it in there it's still kind of tedious to do so we've put these two in here pretty quickly these two columns we were able to copy the columns down even if we have large columns what we would like to do is make a formula so I can copy the formula down so when I have formulas that have like text in it they can look pretty complicated but once you do them a few times they're not that bad so if you wanted to do this with a formula instead of typing it it would look something like this equals we're going to pick what's in cell in this cell which is represented by c17 here and then when I want to put a text field I have to put quotations around it so this is going to say I want you to take that what's in this cell I don't want you excel to do a formula I don't want you to add to it or subtract or anything like that divide but instead I want you to just combine it that's why we have and I want you to and take that cell and and then in quotations type the text of just a dash that's not a minus sign I don't want you to subtract the two cells excel I want you to just put a dash between them and then and the other the next piece which is d17 that represents this cell so it looks a little complicated you could just type it in there you can see in theory what we're doing but in excel it would be great if you can get faster at doing these things then of course you can do these conceptual concepts quickly in excel which helps you to really understand things faster because you're able to practice with them faster so then we can then do a formula to try to say what's in that bucket now clearly I could just go into my data and I could say well the 55 thousands in that bucket and then I can go to the second bucket and say it's 57 to 59 and I can go in here to 57 and count the ones up till I get to 59 and so on and so forth but that's tedious so clearly we want to be able to do that in excel so this is another complicated looking formula but if if you're able to get the formulas down then it's quite useful so you can see what we're doing in concept and then if you if you were to do this in excel you you can do a count ifs formula so what we're telling excel I want you to select this entire range and I want you to count the numbers meaning I don't want you to add them or subtract it just count them if and there's an if with an s because we want you to have multiple conditions there's going to be multiple conditions what are those conditions going to be well I want you to count everything that's greater than 14 49 000 up to and including 57 000 so so those are the two conditions now note that when I start to add this formula down you might say well wait a second I want stuff from 49 000 to 57 000 57 000 to 59 000 59 000 to 61 000 what if for example the number is exactly 59 000 which bucket is it going to be in is it going to be in this bucket or is it going to be in the bucket down below and the answer we typically going to want here is I want to include it in the upper bucket so I'm going to say this bucket is 57 000 to 59 000 this bucket is actually everything above 59 000 up to and including 61 000 and by the way that's why I put 49 000 here instead of 55 000 as the lower range because I want it to be below the lower the lower range so I can say that I want you to pick up everything that's above this number and if I put 55 000 then it would be right on the line equal to so I want it to be 49 000 on that first one and that's why we have that difference and I'll have that difference in this first bucket than I have up here okay so let's do that then I'm going to say I'm going to say count if's and then we selected the range now this is my argument so we said count if's I selected the range with this argument there's the range and then comma to the next argument so now because we're we want to have greater than we have to use the the quotations again so we've got to use the quote greater than quote and then we're going to combine it with and therefore we need the and the uh c 17 and so that's going to be saying I want everything that is greater than what's in c 17 the c 17 representing this 49 000 so this is the table being selected comma you want to count everything that's greater than uh what's in c 17 which is this 49 000 now if I just hit enter there it would take everything it would basically count the whole thing I think it's 51 numbers because it's all greater than that lower limit but we have the second condition comma condition number two which is going to select the entire table again so we're just redoing the thing entire table comma the condition here quotation marks needs to be less than or equal to so notice that that first condition was I just needed to be greater than this number I'm not saying or equal to I just greater than this one I needed to be less than or equal to so we have the two symbols and then and and then the d 17 which represents the 57 000 so that formula although complex looking gives us our pop our calculation quickly and then we can copy that formula down and it will do this it will do the same uh calculation pulling the relative references this number in this number in other words if I go down to here for example we have the same formula that we just copied down and it says table count the same range if it's greater than c 23 instead of c 17 because it it took the relative reference down here so that's the 67 000 and then comma the table second condition less than or equal to d 23 instead of d 17 and d 23 is the 69 so it's saying take anything that's above the 67 000 not uh and then up and two and including 69 000 so you can see by doing that we've created our own buckets so now we've got our buckets in table format so 1 7 19 15 7 1 and 1 1 7 19 15 7 1 and 1 clearly in table format even if I understand exactly what's happening it's not as visually easy to see than when I make it into of course the histogram so here's a histogram that we made with basically the bar the bar charts so now this one is you can see we just basically selected this data here selected these two datas and made this the the x axis being this list of numbers and then and then you've got our histogram that was created with the bar chart and you can see if I compare that to the prior histogram this is just a copy of the histogram we had up top it's basically the same kind of calculation now when you do the bar chart normally like the the these it's not as wide so these are skinnier oftentimes so you can but you can customize it and make them wider eliminating the space in between the bars if you so choose and you have a little bit more flexibility with the bars it's easier to kind of flip the x and y axis and stuff like that and again with the bar charts if you had a second set of data and you want to put two kind of histograms on top of each other and try to color them differently and make one color like striped or something like that so you can see one histogram on top of the other then those are things that sometimes it's easier to do when you convert the data so that you can make a bar chart out of it now note that this whole process up here of making this table you could you could make that like this bit a little bit easier using an array function so we could do a similar process let's do it well let's just reveal this so this is just the end number so instead of having the beginning and end of my buckets of these buckets we just have the end number so I just said 57 000 plus 2000 is 59 000 I copied that down 2000 2000 increase all the way down and then in order to get my numbers over here instead of using this more complex formula we can use an array formula which is a it's kind of a newer a newer kind of technique which has that spill factor so in this cell we said equals the frequency and then we picked up our table this refers to the table and then and then uh basically we uh we then referred to our d 33 to d 47 representing our data set and then when you hit enter in the first cell it'll basically spill out the numbers down below it so so whenever so if I was to click in any of these numbers below it it'll be referring to the lead cell the cell that I basically put the the array function in so these are kind of fancier newer ways to do it I still think it's useful to know multiple ways to do it but really neat really neat quick ways to be able to put these together and sometimes they can do it a lot faster but the point is that the histogram and the bar chart are kind of related in some ways and like I say it could be useful to actually create your own buckets sometimes uh and then create your own bar chart type uh histograms in order to increase the flexibility of what you might be able to do with the histograms