Hello, in this video we will construct frequency tables. So why do we need to construct frequency tables, given that it does take a little bit of time? Well, it allows us to summarize large sets of data, it allows us to analyze the nature of the data and see trends that are possibly occurring, and it gives us a basis for constructing important graphs such as histograms.

So when you construct a frequency table, you first need to determine the number of classes. If they don't tell you how many classes you need, then use a number anywhere between 5 and 20. Your second step is to calculate the class width, and you will always want to be sure you round your class width up. The formula for class width is maximum value minus minimum value, divided by how many classes you want to have or how many classes they want you to have. Third, look at your minimum data value, then choose a starting point just below that minimum that is a nice pretty number. So if my minimum data value is 42, I would pick 40 as the starting point for my first class. Fourth, using the first lower class limit and the class width, proceed to list the other lower class limits. So if I picked 40 as my lower class limit and my class width is 10, 40 plus 10 gives me 50, plus 10 again gives me 60, then 70, 80, and those are going to be my lower class limits. Then you enter your upper class limits, and you actually go through your data, cross each value out, and put a tally mark in the appropriate class to help you calculate the frequency. So this is the process you will use to construct a frequency table.

So data was collected for the heights, in inches, of 60 people in an elementary school. Create a relative frequency distribution, include a cumulative frequency column, and please use 8 classes. So I have here all the different heights in inches. There are 60 heights total for 60 people.
So the first step is to identify how many classes you want, how many groups of data. In this case I'm going to use 8 classes because that's what they told me. Second, you need to calculate your class width. This is literally the only serious calculation you have to do for this: maximum value minus minimum value, divided by number of classes. So in this case my maximum data value appears to be 72 and my minimum data value appears to be 37. Sometimes you have to do a double take and look at the data a few times to make sure you definitely pick the maximum value and you definitely pick the minimum value. I have 8 classes, so this is basically giving me 35 divided by 8. So 35 divided by 8 is 4.375, and my class width needs to be a nice pretty whole number because my data values are nice pretty whole numbers. So it's very important in this step that whatever value you get for your class width calculation, you round it up. And I'm not talking about the weed killer, I'm talking about rounding up numerically to the next value. In this case that would be 5. Since my data are whole numbers, I'm going to keep my class width as a nice whole number.

So we now have enough information to construct the classes in our table. We said the minimum data value was 37 and the maximum value was 72. We said my class width would be 5. So in this case 37 is my lowest value. Pick a nice number that's just below that; I would go with 35. You could start right at 37, but it kind of bugs the OCD a little bit, honestly. Then you identify your other lower class limits by adding 5. 35 plus 5 is 40. 40 plus 5 is 45. You're literally adding 5 each time to get your lower class limits: 50, 55, 60, 65, 70. Then your upper class limits. Every class has a width of 5.
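The class-width and lower-limit steps for the heights example can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above; the function names `class_width` and `lower_class_limits` are made up for this sketch and are not from the video.

```python
import math

def class_width(minimum, maximum, num_classes):
    """Hypothetical helper: (max - min) / classes, rounded up to a whole number."""
    raw = (maximum - minimum) / num_classes
    return math.ceil(raw)  # always round up, never down

def lower_class_limits(start, width, num_classes):
    """Hypothetical helper: start at a 'nice' value and add the width each time."""
    return [start + i * width for i in range(num_classes)]

# Values from the heights example: min 37, max 72, 8 classes, starting point 35.
width = class_width(37, 72, 8)          # 35 / 8 = 4.375, rounded up
lowers = lower_class_limits(35, width, 8)

print(width)   # 5
print(lowers)  # [35, 40, 45, 50, 55, 60, 65, 70]
```

Rounding with `math.ceil` rather than `round` matters here: a width of 4 would give 8 classes that stop short of the maximum value of 72.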
So 35, 36, 37, 38, 39. 39 is the fifth value, so cut off the first class at 39. Add 5 and you get 44. Notice 44 cuts off just in time for 45 to take over in the third class. Add 5 to get 49. Add 5 to get 54. Add 5 to 54 to get 59. Add 5 to 59 to get 64. Add 5 to 64 to get 69. Add 5 to 69 to give you 74. It's really important that you add 5 each time to the previous class's limit.

You could also write your classes in interval notation, which is how your homework likes for it to be done. You could write [35, 39], then [40, 44], [45, 49], [50, 54], and so forth. So your homework really wants you to use the bracket notation, the interval notation, here.

So what about the rest of the table? My recommendation is to look at each of your data values, 39 for instance, cross it out, and put a tally mark in the class where 39 belongs. 59, cross it out, put a tally mark in the class where 59 belongs. At the end of the day, after you go through all 60 data values, you'll end up with two tally marks in class one, which is a frequency of 2. And how do you find the relative frequency? Remember, it's 2 over 60: the frequency divided by the total number of data values in the entire data set. 2 over 60 is about 0.03; move the decimal two spots to the right to give you 3%. My cumulative frequency is currently 3%. So I want you to take a moment, pause the video, and fill out the rest of this table: go through the data values, tally up the frequencies for each height class, and then fill out the remaining columns.
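As a rough sketch of the steps just described, the upper class limits, the class a value falls into, and the relative frequency of the first class could be computed like this in Python. The helper names are hypothetical, and the "width minus one" rule assumes whole-number data, as in the heights example.

```python
def upper_class_limits(lower_limits, width):
    """Hypothetical helper: for whole-number data, each class ends
    one unit before the next class begins."""
    return [low + width - 1 for low in lower_limits]

def class_index(value, start, width):
    """Hypothetical helper: 0-based position of the class containing value."""
    return (value - start) // width

def relative_frequency_percent(frequency, total):
    """Frequency divided by the total count, as a whole-number percent."""
    return round(100 * frequency / total)

lowers = [35, 40, 45, 50, 55, 60, 65, 70]
uppers = upper_class_limits(lowers, 5)
classes = list(zip(lowers, uppers))

print(classes[:3])                        # [(35, 39), (40, 44), (45, 49)]
print(uppers[-1])                         # 74
print(class_index(39, 35, 5))             # 0, i.e. the first class, 35 to 39
print(class_index(59, 35, 5))             # 4, i.e. the fifth class, 55 to 59
print(relative_frequency_percent(2, 60))  # 3
```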
And then in just a minute, after you unpause the video, the answer will appear for you to compare your results. So I have my completed table displayed now, so make sure you check what you have versus what's currently displayed for the frequencies, the relative frequencies, and then the cumulative frequencies, or relative frequencies, I should say. And remember I told you in a previous video that 100% is ideally what you should get for the total cumulative frequency, but because of rounding you could occasionally get 99% or 101%. So please be aware of that; you won't always get exactly 100%. Alright, so that's one example, and this is a data set that has nice pretty whole numbers.

So, using frequency tables to understand data. In later modules there will be references to data with a normal distribution. One key characteristic of a normal distribution is that it has what is known as a bell shape, and I will now define what that is for you. A bell shape, or normal distribution, means the frequencies start low, they get high, and then they get low again. The distribution is also approximately symmetric, meaning the frequencies of the classes before the one with the maximum frequency are about the same as, or mirror, the frequencies of the classes after it.

So for instance, I have displayed here a completed frequency table; this is the frequency table we made for the heights of the students. Use the relative frequency distribution from the previous example to determine whether the data have a normal distribution. So in this specific case, look at your frequencies. Do they start low, do they get high, and do they get low again? Yes: 2, 7, 12, 16, 12, 5, 3, 3. Next, look at your peak, your highest frequency, which is 16. Do the frequencies for the classes below the 16 mirror the frequencies for the classes above it?
Well, you've got 12 and 12, 7 and 5, 3 and 2, and then you have this extra 3. I would say so; in a roundabout way, it looks like our data do have a normal distribution. So the answer here is yes: the frequencies start low, get high, and then get low again. That's what we mean by normal, and there's plenty more discussion about normal distributions a little bit later in the course.

So now, because making frequency distributions is so fun, let's do another example that doesn't have nice pretty numbers, because in your homework you might be faced with one that has decimals. So create a frequency table for the following data. Well, how many classes will I use? Let's use 10 classes, so that's how many rows I have in my table. Next order of business: where is your minimum value? It looks like the minimum value appears to be, double check me here, 4.51. And then the maximum data value appears to be 6.97. Looks good to me. So my class width will be maximum value minus minimum value, divided by 10. I got 0.246. Remember we said we need to round this up. Since I'm dealing with decimals, it's OK for my class width to be a decimal. So I would say in this case, use a quarter, or 0.25, as your class width. I guess with decimal data it should probably be written as a decimal, honestly.

So if you look at your lowest value, it's 4.51. Pick a number that's just below it that's nice and pretty. I would go with 4.5. Then you add 0.25, and that'll get you to 4.75. Then you add the class width again, plus 0.25, and that'll give you 5. Add the class width again, add 0.25, and that'll give you the lower class limit of the next class, 5.25. Then I'll just fill in the rest: 5.5, 5.75, 6, 6.25, 6.5, and then 6.75. I will use interval notation for these classes, or these bins, as they're sometimes called. So I'll put a bracket in front of each of these lower class limits and a comma after each one, and now I need to do my upper class limits.
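The decimal class width and lower class limits can be checked with a short Python sketch. To sidestep floating-point surprises, this version works in hundredths (so 4.51 becomes 451); that choice, and the name `class_width_hundredths`, are assumptions made for this illustration rather than anything stated in the video.

```python
import math

def class_width_hundredths(min_h, max_h, num_classes):
    """Hypothetical helper: class width in hundredths, rounded up.
    Inputs are the data values multiplied by 100 (4.51 -> 451)."""
    return math.ceil((max_h - min_h) / num_classes)

# Values from the decimal example: min 4.51, max 6.97, 10 classes, start at 4.5.
width_h = class_width_hundredths(451, 697, 10)   # 246 / 10 = 24.6, rounded up
lowers = [(450 + i * width_h) / 100 for i in range(10)]

print(width_h / 100)  # 0.25
print(lowers)         # [4.5, 4.75, 5.0, 5.25, 5.5, 5.75, 6.0, 6.25, 6.5, 6.75]
```

Working in whole hundredths means rounding 0.246 up to two decimal places becomes an ordinary round-up of 24.6 to 25.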
So the second class starts at 4.75, which means my first class had better cut off before then. You should use a value with the same number of decimal places as the data, so I'm going to use 4.74. That's going to be my cutoff for the first class. So basically, if you take 4.5 and add 0.25 to it, that gives 4.75, and I just need to go a smidge lower than that and cut off the first class there. Then I'll go to my second class. If you add 0.25 to the first upper class limit, 4.74, that's going to give you 4.99. Then you go to your third class, add 0.25, and that's going to give you 5.24. And you continue going down the line: you'll get 5.49, 5.74, 5.99, 6.24, 6.49, 6.74, and then 6.99 as your last upper class limit.

So these are the classes. Like I said, it's easy to find your lower class limits: start at 4.5 and add 0.25 every single time. And the upper class limit of each class should be just below the lower class limit of the class that follows. So now that you have your classes, or bins, designated, I want you to take the time to tally up the data values in each class and list their frequencies. So please do that now. You can pause the video and work on it.

So here's the solution. I have my classes labeled, I have my tally marks, and I have my frequencies. Notice my classes encompass every single data value. Notice also that the first class and the last class each have at least one data value in them. There's one class in the middle that doesn't, but that's fine. So in my first class, my lowest values, there's at least one data value. And my last class, my highest values, has a frequency of two; there are two data values there. So you don't want to sit there and spread your classes out so much that you have tons of classes without data values in them.
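Here is a small Python sketch of the upper class limits for the decimal example, again working in hundredths. It assumes the data are recorded to two decimal places, so "a smidge lower" means 0.01 lower; the helper `class_index` is hypothetical and is shown only for the minimum and maximum values mentioned in the video.

```python
WIDTH_H = 25    # class width of 0.25, in hundredths
START_H = 450   # first lower class limit of 4.5, in hundredths

# Lower limits step by the class width; each upper limit sits 0.01
# below the next class's lower limit.
lowers_h = [START_H + i * WIDTH_H for i in range(10)]
uppers = [(low + WIDTH_H - 1) / 100 for low in lowers_h]

def class_index(value):
    """Hypothetical helper: 0-based class for a two-decimal data value."""
    return (round(value * 100) - START_H) // WIDTH_H

print(uppers)
# [4.74, 4.99, 5.24, 5.49, 5.74, 5.99, 6.24, 6.49, 6.74, 6.99]
print(class_index(4.51))  # 0, the first class
print(class_index(6.97))  # 9, the tenth and last class
```

The minimum lands in the first class and the maximum in the last, which matches the check that the classes cover every data value.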
You usually want your first and your last class to each have a frequency of at least one. So that's a nicely created frequency table. There are a couple of other ways you could have done it, but I feel this is the best way to calculate everything in the display. So that's constructing frequency tables, or frequency distributions, whatever you want to call them. Thanks for watching.