Statistics and Excel. Misleading Histogram. Got data? Let's get stuck into it with statistics and Excel.

You're not required to, but if you have access to OneNote, we're in the icon on the left-hand side, OneNote and Excel presentations tab, 1050 Histogram Misleading Data. We're also attempting to upload our transcripts, so if you go to the Immersive Reader tool you can change the language to whatever language you choose and either read or listen to the transcript in multiple languages, using the timestamps to tie them into the actual video presentations.

In the desktop version of OneNote, we have our data on the left-hand side, where we are imagining we took a completely random sample of the population and either tested for or asked them how many ovaries they have. You can imagine a similar situation where we take a random sample of the population and either test for or ask them how many testicles they have, for example.

Now, this information on the left-hand side, we are imagining, is in the order of the people that we asked in the random sample, so it's not sorted in any way that's particularly useful to us. We could probably derive some information just from looking through this list of data, but let's go through our normal procedure. When we have our information, the next thing we will typically do is sort it. If we sort it from lowest to highest, we end up with a lot of zeros, we get a one here, and then we have a lot of twos. Just sorting it like that could in and of itself give us some information about what is going on here.

The next thing we might do is, of course, take our statistics, the normal one being the average. You'll recall that the average takes the sum of the entire thing and divides it by the number of items that are there. We can use an AVERAGE formula in Excel, which would simply be the average of this
series of data, but what it's actually doing is adding up all the data and then dividing by the number of data items that we had, and that gives us 1.06. So it's an average somewhere close to one.

Now clearly, if someone gave you only that data point and said that they knew something about human beings because they took a random sample of human beings, tested how many ovaries they have, and came to the conclusion that human beings have around one ovary, 1.06 ovaries, you can imagine that might be a little bit misleading if you're only looking at that one piece of data. And unfortunately, given the state of the United States medical field (there are a lot of good doctors out there, but there are a few of them), you can almost imagine a doctor saying, "Hey, look, we've got to implant an ovary into you, because you're short an ovary. You should have an average of one ovary." Unfortunately, I can kind of imagine that happening.

If you were to then plot this in a histogram, though, it would look something like this, and this might give you a bigger picture of what has happened. If you take the average, it's like, okay, one ovary. But if I plot it, now I'm saying, okay, in this bucket I've got zero to one, in this bucket two to three, and then one to two in the middle.

You'll note that most people, when they see histograms, start to imagine that if you take a larger sample size, the histogram will get more and more like a bell-shaped curve. But that's not always the case. That's only for certain types of data, where it will sometimes start to mirror a bell-shaped curve, which we'll talk about later. The data might take any number of shapes, and that's why you need to look at the spread of the data. In this case, we don't have the center point here, and everything's
happening out on the sides of the graph, which makes sense, of course, because what we're looking at is a test that's really determining a man or a woman, right? That's what the test is really looking at. So we're basically getting a spread between men and women; that would be the general idea.

So clearly, if there were a medical procedure or something recommended based on having around one ovary, that we need to give you an ovary or remove an ovary or something like that, that would be a problematic conclusion from a misuse of looking at just one angle of the data. This is clearly a very extreme, obvious example of this kind of thing, but that's the point.
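
For anyone following along outside of Excel, here is a minimal Python sketch of the same idea. The actual sample values are not listed in this transcript, so the data below is hypothetical: 23 zeros, a single one, and 26 twos, chosen only because that mix reproduces the 1.06 average mentioned above. The sketch computes the average the same way described earlier (sum divided by count) and then prints a rough text histogram with one bar per value.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample (the real data is not given in the transcript):
# 23 zeros, a single one, and 26 twos, picked so the mean is 1.06.
sample = [0] * 23 + [1] + [2] * 26

# The average: sum of all values divided by the number of values.
average = mean(sample)

# Count how many times each value occurs (one "bucket" per value).
counts = Counter(sample)

print(f"average = {average:.2f}")
for value in sorted(counts):
    # One '#' per observation gives a rough text histogram.
    print(f"{value}: {'#' * counts[value]}")
```

The average lands near one, but the bars show almost nothing at one and nearly everything at zero and two, which is the mismatch the histogram is meant to reveal.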