Hello there, we're now going to discuss histograms. A histogram is a graph, in xy form, of the distribution of data in a data set. Basically, we're going to take frequency tables, or frequency distributions, and create histograms from them. The graph consists of bars of equal width drawn adjacent to each other. So it's kind of like a bar graph, except the bars touch each other. The bars still represent the frequency of various categories; in this case the categories are numeric. The x-axis represents your data and the y-axis represents the frequency. Histograms are used to analyze the shape of the distribution of data.
So for instance, I have a frequency table that has exam scores, frequencies, and relative frequencies, and I have a histogram that shows these exam scores. Notice the y-axis has the frequency and the x-axis has the score. If you look between 50 and 60, which is your first class, 50 to 59, notice the rectangle goes up to 1. That's because the first class has a frequency of 1. The next class goes from 60 to 70. They labeled the x-axis using lower class limits here. The way you label your x-axis can vary based on, once again, textbook preference or teacher preference. So 60 to 70 has a frequency of 4, notice that's what we had in our table, and so forth. So you see an example of a histogram on your screen.
A relative frequency histogram is a histogram where the y-axis represents the relative frequency. Remember, that's the percentage of occurrence of a specific data value or a group of data values in the data set. So for instance, class 1 is 60 to 61.9, and we're talking about athlete heights. The relative frequency is 0.05, or 5%. We're going to use decimals in this case for our y-axis, so 0.05. If you look at your first class, in this case they used class boundaries to label the x-axis.
Remember, if your data go to one decimal place, your first class boundary would be 60 minus 0.05, which is where the 59.95 came from. That part's not really important. What's important is that for the first class you draw a rectangle going up to 0.05. For the second class, draw a rectangle going up to 0.03, third class 0.15, fourth class 0.4, and so forth. So the rectangles represent the relative frequencies in decimal form.
Now consider the frequency distribution of days missed by students in their statistics class during a semester. Create a histogram. So I have a title, I'm just going to call this days missed. And I only have six different data values, so there's no need to have classes. In other words, I can label my x-axis from 0 to 6. So number of days, that's my x-axis. And then frequency, let's do a little bit of sideways action here. Frequency goes as high as six, so I'll go all the way up to six for my frequency labeling on my y-axis, one all the way up to six.
So if you look at zero days missed, there's a frequency of four. That means over zero you need to draw a rectangle that goes up to four. The rectangle should be centered over the data value zero. Over one, draw a rectangle that goes up to five, so the rectangle should be centered over one and go all the way up to five. Over two, draw a rectangle that goes up to six, centered over two. Over three, draw a rectangle going up to five, centered over three. Over four days missed, draw a rectangle going up to three. And then over five, draw a rectangle going up to two. You don't need a six here, there's no one who missed six days, so it doesn't need to be there, it's gone now.
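If you'd like to check the days-missed histogram without drawing it by hand, here is a minimal sketch in Python. The frequencies are the ones read off in the example above; the function name `text_histogram` and the sideways text layout (one `#` per student) are just illustrative choices, not something from the lesson.

```python
# Frequency table from the days-missed example: days missed -> number of students.
days_missed = {0: 4, 1: 5, 2: 6, 3: 5, 4: 3, 5: 2}


def text_histogram(freq_table):
    """Return one text row per data value, with one '#' per observation.

    This is a sideways histogram: each row plays the role of one rectangle.
    """
    rows = []
    for value in sorted(freq_table):
        count = freq_table[value]
        rows.append(f"{value} | {'#' * count} ({count})")
    return rows


rows = text_histogram(days_missed)
for row in rows:
    print(row)

# Adding up the bar heights gives the number of students in the table.
total_students = sum(days_missed.values())
print("total students:", total_students)
```

Each row is centered on a single data value, zero through five, which mirrors the idea that no classes are needed when there are only six distinct values.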
So here's a histogram, and this is a histogram where you did not have classes, you literally just had individual data values, which in this case would be number of days, zero through five.
Now consider the frequency distribution of hours worked in a week by 50 college students with jobs. Create a relative frequency histogram. In this specific case, notice that we have classes anywhere from five all the way up to 44 hours a week, that's how much these students are working. So what I'm going to do is label my x-axis hours worked, or just hours. My title will be hours worked. So hours is on the x-axis, and then we'll put relative frequency along the y-axis, and we'll do it as a percentage for this one, because that's how they gave us relative frequency, as a percentage.
All right, I'm going to label my x-axis using my lower class limits, so I'll have all the way from five on up: five, 10, 15, 20, 25, 30, 35, 40. That's still not enough to encompass all of the classes, so I need to go up to 45. My relative frequencies go as high as 30%, so I'll do my y-axis in fives as well. All right, so I have five, 10, 15, 20, 25, and then 30, and 30 is where I can stop.
My first class is five to nine, so from five to 10, because that's how my x-axis is labeled, I'm going to draw a rectangle that goes up to 4%. You can even shade the rectangle if you want, if you like to be fancy and artistic. The second class goes up to 6%, so from 10 to 15, draw a rectangle going up to 6%. Then from 15 to 19, or 15 to 20, because that's how our x-axis is labeled, our rectangle should go up to 14%. Then from 20 to 24, or 20 to 25, the rectangle should go all the way up to 28%. The next class goes up to 30%, that's our peak. This is relative frequency we're doing now. The next class goes all the way down to 10%, and then 4% and 4% for the last two classes.
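The hours-worked bars can be laid out in a short Python sketch as well. The lower class limits and percentages are the ones read out above. The student counts are not stated in the example; they are implied by taking each percentage of the 50 students, so treat `implied_counts` as a derived check rather than given data. All variable names here are illustrative.

```python
# Lower class limits and relative frequencies (in percent) from the example.
lower_limits = [5, 10, 15, 20, 25, 30, 35, 40]
percents = [4, 6, 14, 28, 30, 10, 4, 4]
n_students = 50  # stated in the problem

# Class width is the gap between consecutive lower class limits.
class_width = lower_limits[1] - lower_limits[0]

# Each bar is drawn from its lower limit to the next lower limit,
# e.g. 5 to 10 for the 5-to-9 class, because of how the x-axis is labeled.
bars = [(lo, lo + class_width, pct) for lo, pct in zip(lower_limits, percents)]

# Derived, not given: the count each percentage implies for 50 students.
implied_counts = [pct * n_students // 100 for pct in percents]

# The tallest bar is the peak of the histogram.
peak = max(bars, key=lambda bar: bar[2])

for left, right, pct in bars:
    print(f"{left:>2} to {right:<2} | {'#' * pct} {pct}%")
print("percent total:", sum(percents))
print("implied counts:", implied_counts)
print("peak bar:", peak)
```

A quick sanity check on any relative frequency histogram is that the bar heights add up to 100%, or to 1 if you're using decimals.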
Look at that, you got yourself a fresh, nice, lovely relative frequency histogram right there. The rectangles represent relative frequency in this case. Once again, I labeled the x-axis using my lower class limits, so that's why the first rectangle goes from 5 to 10, even though the class is really 5 to 9. There's really no perfect way to do it, honestly.
So you can construct a histogram, but understanding how to interpret it is just as important. When graphed, a normal distribution has a bell shape. The characteristics of the bell shape are as follows, as we just discussed in our frequency table video. The frequencies increase to a max and then decrease; remember, they start low, they get high, they get low again. The graph is symmetric with respect to the bar with maximum frequency. So pictured here I have a nice, perfect bell shaped histogram. You start low, you get high, you get low again, and notice everything's kind of symmetric with respect to the bar with maximum frequency.
A distribution of data is skewed if it is not symmetric and extends more to one side than the other. So we have three different types of histograms that are not normal. First, there's skewed to the right, or positively skewed, or as I learned it, right tailed. These histograms have a longer right tail, so literally there's a little tail that hangs off to the right with the way the histogram is shaped. Second, there's skewed to the left, or negatively skewed, also known as left tailed, and that's when the histogram has a longer left tail. And third, uniform means all the bars are roughly the same height.
So let's play a little game that's called name the shape of that histogram. Which of the histograms would be uniform, A, B, C, or D? I would say B, because all the bars are about the same height. Next, which of the histograms is normal, or bell shaped? Which one starts low, gets high, gets low again?
That would be A. Normal, or most often called bell shaped, means the same thing. Part C would be what? Would it be skewed right or skewed left? Would it be right tailed or left tailed? Notice the little tail trickling off to the right, that means it's right tailed. Statistically, the proper term is skewed right. Part D, notice this little tail to the left, that means it's left tailed, that means it's skewed left. So there's some logic to why they like to use the words right tailed or left tailed. You've got the little tail hanging off to one side.
Lastly, let's interpret a histogram. I'm going to give you a histogram and you have to find out some information about it. The study was trying to find the hours spent playing video games on weekends. So I have number of hours on the x-axis and number of students on the y-axis. How many students were in the study? We'll add up the frequencies of each of the bars. Remember, the bars represent frequencies, the number of students. So you had two, three, four, seven, nine. Add all of these together and you're going to get twenty-five.
All right, what is the class width? How wide is each rectangle? Well, zero to five, five to ten, ten to fifteen, fifteen to twenty, there's a difference of five between each of those. The class width is five.
Give the lower and upper class limits of the first class. So that's my first rectangle. Like I said, depending on how the x-axis is labeled, it might be a little confusing what the class limits are. Well first, there's no debate that the lower class limit is zero for the first class. It's the upper class limit that causes perhaps a little bit of difficulty. Because the rectangle ends at five, you might say five, which may or may not be right, because that might be where the second class starts, and you can't end the first class where the second class starts.
So some people may also say four, because if you're dealing with nice, pretty whole numbers, the first class would be zero through four and the second class would be five through nine. It's just the way the histogram was developed. So for the upper class limit, there's a little flexibility, and your homework will reflect that. You could put four or you could put five, because sometimes with histograms it's a little confusing what they do on the x-axis. But anyway, that's all I have for you for now. So thanks for watching.
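To recap the interpretation steps, here is a small Python sketch of the arithmetic from the video game example. The five frequencies and the x-axis labels are the ones read out above; the order of the frequencies across the bars is an assumption, since the graph itself isn't shown here, and the variable names are just for illustration. None of the answers below depend on that order.

```python
# Bar edges as labeled on the x-axis, and the frequencies read off the bars.
# The bar order of the frequencies is assumed.
lower_edges = [0, 5, 10, 15, 20]
frequencies = [2, 3, 4, 7, 9]

# Number of students in the study: add up the heights of all the bars.
total_students = sum(frequencies)

# Class width: the difference between consecutive x-axis labels.
class_width = lower_edges[1] - lower_edges[0]

# Lower class limit of the first class is where the first bar starts.
first_lower_limit = lower_edges[0]

# Two acceptable readings of the upper class limit of the first class:
# where the bar ends on the graph, or one less for whole-number data
# (so the classes run 0 through 4, 5 through 9, and so on).
upper_limit_as_drawn = lower_edges[1]
upper_limit_whole_numbers = lower_edges[1] - 1

print("students in study:", total_students)
print("class width:", class_width)
print("first class:", first_lower_limit, "to",
      upper_limit_whole_numbers, "or", upper_limit_as_drawn)
```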