I will start by asking a question, actually. Is there anyone in the class? Danny, Mr. Singler, please make sure that you are muted. Okay. So we are covering study unit one up to study unit three.
Study unit one is the introduction to statistics. We describe what statistics is all about and its two branches, descriptive and inferential. We speak about the population, and the sample, which is a subset of the population. We also speak about variables and data, and the two types of variables, numerical and categorical. You need to be able to define all of those things in study unit one. We also speak about the levels of measurement, so you need to know every section in study unit one with regard to those concepts. And when we talk about the scales or levels of measurement, you need to know their order: which one is the lowest and which one is the highest.
Then in study unit two we talk about how to visualize categorical and numerical data. Remember, categorical variables are those that you put into categories; we also call them qualitative variables or data. Numerical variables are quantitative: numbers that we can measure or count. You need to know what type of chart or graph you can use to visualize each kind of data. For categorical data there are three, remember: the bar chart, the pie chart and a table. You need to know the properties of those. When we talk about numerical variables, you need to know that you can summarize them in a frequency distribution, and you need to know the properties of a frequency distribution and how to build one.
And you need to know how to plot the stem-and-leaf plot, where the stem and the leaf depend on whether you are using the tens as the stem, the hundreds as the stem, and so forth. You also need to know how to visualize numerical data by means of a histogram, a frequency polygon, or a cumulative frequency polygon, which is also called an ogive.
Then you go to study unit three, where we summarize the data in terms of measures. There are three measures of central location: the mean, the median and the mode. You need to know how to define them and how to calculate or find them. For the median, for example, you need to find the position first, and when the position ends in .5, you take the average of the two numbers that the position lies between. You need to know how to calculate the measures of variation, which are the range, the variance and the standard deviation. You need to be able to calculate the variance and the standard deviation for the population and also for the sample, because there are two distinct formulas. With the measures of variation you also need to be able to calculate the coefficient of variation. And then there are the quartiles, which are not measures of variation; we call them measures of location. You need to know how to find quartiles one, two and three, and you need to know their properties. Quartile two is the same as the median, but you still need to know how to find the position of each quartile. Remember, when the position ends in .25, you round down; when it ends in .75, you round up; you find the position and then you find the value. You also need to know how to calculate the interquartile range, which is quartile three minus quartile one, and so on.
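The study unit three measures listed above can be tried out in a short Python sketch. The data values are made up for illustration, and the position formulas (n + 1)/2, (n + 1)/4 and 3(n + 1)/4 are an assumption, since the session only states the rounding rules (.5 averages, .25 rounds down, .75 rounds up) and not how the position itself is computed. The function name `value_at_position` is also hypothetical.

```python
import math
import statistics

# Hypothetical, already sorted data (not taken from the session).
data = [12, 15, 15, 18, 20, 22, 25, 27, 30, 36]
n = len(data)


def value_at_position(sorted_values, position):
    """Return the value at a 1-based position using the rounding rules above."""
    whole = math.floor(position)
    fraction = position - whole
    if fraction == 0:
        return sorted_values[whole - 1]
    if fraction == 0.5:
        # Position ends in .5: average the two values it lies between.
        return (sorted_values[whole - 1] + sorted_values[whole]) / 2
    # Position ends in .25: round down. Position ends in .75: round up.
    nearest = whole if fraction < 0.5 else whole + 1
    return sorted_values[nearest - 1]


# Measures of central location.
mean = statistics.mean(data)
median = value_at_position(data, (n + 1) / 2)  # assumed position formula
mode = statistics.mode(data)

# Measures of variation: the population formulas divide by N,
# the sample formulas divide by n - 1.
data_range = max(data) - min(data)
pop_variance = statistics.pvariance(data)
sample_variance = statistics.variance(data)
pop_std = statistics.pstdev(data)
sample_std = statistics.stdev(data)
cv_percent = sample_std / mean * 100  # coefficient of variation, in percent

# Quartiles and interquartile range (assumed position formulas).
q1 = value_at_position(data, (n + 1) / 4)
q3 = value_at_position(data, 3 * (n + 1) / 4)
iqr = q3 - q1

print("mean", mean, "median", median, "mode", mode)
print("range", data_range, "Q1", q1, "Q3", q3, "IQR", iqr)
print("population variance", pop_variance, "sample variance", sample_variance)
```

For the same data the sample variance always comes out larger than the population variance, because it divides by n - 1 instead of N. That is an easy check on which formula was used.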
Then, based on those measures, you also need to know the shape of the distribution. Remember, you can tell the shape of the distribution of your data from the mean and the median. If the mean and the median are equal, the distribution is symmetrical. Otherwise, you need to know when the data is left skewed or right skewed, including when you are using the quartiles. So those are all the things you need to know before you start tackling any questions related to study units one, two and three.
Right. So let's see if you know those things by looking at the questions relating to study units one, two and three. I hope this one hour 30 minutes will be enough to cover all the questions. You will have to talk to me here, because some of you are not seeing this for the first time. This will be the third time that you are seeing some of these questions, and if you watched the videos, it will be the fourth time.
Okay. The first question: which one of the following statements is correct regarding some of the key terms in statistics? A, a statistic is a summary measure calculated from a population. B, a population is the set of all elements in a particular study. C, a statistic is a property of a population. D, a sample mean is a population parameter. E, a sample standard deviation is a population parameter. Which one of the following statements is correct? B, do we all agree? And everybody went quiet. When I have discussions with grade 12 learners, I get more energy than I do with university students. Joking, guys. So you all agree with B? Yes.
For those who are not sure, let me help you understand. A statistic is a summary measure, right? And then we also have what we call a sample mean and a sample standard deviation. In study unit one, we talk about the population.
We say if you have a population and you take a subset of this population, you create what we call a sample. Then we calculate measures; let's use the measures of central location and the measures of variation. In the population you have your mean, which is mu, your standard deviation, which is sigma, your variance, which is sigma squared, and your population proportion. These are the measures you can calculate in the population, and these are the symbols representing them. In the sample, you get the mean, which is x bar, your standard deviation, which is s, your variance, which is s squared, and your sample proportion, which is p.
And how do we define those measures? If they come from a population, we call them parameters. If they come from a sample, we call them statistics: a statistic if it is one, statistics if there are many. So the population mean is a parameter; the population mean and the population standard deviation are parameters.
Coming back to the statements. A says a statistic is a summary measure calculated from a population. We know that a measure from a population is a parameter, not a statistic. C says a statistic is a property of a population, but we have already defined that a property of a population is a parameter as well. D: is a sample mean a population parameter or a sample statistic? It is a sample statistic, not a population parameter. Or you can just say it is a statistic. E: a sample standard deviation is a statistic, so that one is also incorrect. You need to know these properties in order to be able to answer the question.
So we have eliminated A, C, D and E, but we haven't discussed B. B says a population is the set of all elements. We know that a population is big. Always think about it as everybody in South Africa; it is too huge to count. So the population is the set of all elements in the particular study that you are interested in. And from a population, you can take out a sample. Those are the things that can catch you if you don't know the definitions or the properties of a population and a sample. I'm not sure if you are going to write the same exam as October/November; they might set new questions. Sometimes they do ask you to identify which of these symbols belongs to a population parameter, or something like that. So you need to know the symbols as well, and what they mean, because you use these symbols every now and then in your work. You need to know how to write the population mean, how to write the sample mean, and so forth.
Okay, we spent so much time on question one. Let's move to question two, unless there is a question or a comment. No questions.
Question two: which one of the following statements is correct regarding some key terms in statistics? I think this is a repeat of the question that we had. They look exactly the same. Not really. Isn't it E that's the correct one on this one? Yes, they look exactly the same, but they are not the same. A says a statistic is a summary measure calculated from a population, and that is incorrect, right? B, a sample is the set of all elements in a particular study. Is that correct or incorrect? Incorrect. C, a statistic is a property of a population. Incorrect. D, a sample parameter is a population parameter. Incorrect.
E, a sample standard deviation is a statistic. And that is correct. So you can see it is almost the same question, but with a few small changes they can trick you like that. You just need to know the definitions. The properties that you drew on that first slide really helped a lot. Yeah. Was it simplified? This one. Yeah.
Now we move into variables. You need to know how to define your variables. Remember, you have two types of variables. I should probably just write it down. We have what we call qualitative, which is categorical, and we also have quantitative, which is numerical. Categorical data you can put into categories. Numerical data you can either count or measure. For numerical, if it is something that you can count, we call that discrete, right? And if it is something that you measure, we call it continuous. Those are the things that you need to know and remember.
The other thing you need to know is the levels of measurement. There are four levels of measurement. Under categorical variables, there is nominal and there is ordinal. Under quantitative, there is ratio and there is interval. You need to know what a nominal variable looks like and what its properties are. For nominal, there is no order. For ordinal, there is order or rank; you are able to rank the categories. Nominal is like male and female. Ordinal is like a satisfaction level: low, medium, high.
What else? Ratio: what are the properties of the ratio? That's a question; let's see if you still remember. Which one can never be below zero, is it a ratio or an interval? A ratio can never be below zero. So a ratio starts at zero, and an interval can go into negatives.
For an interval there is what we call no true zero, because you can have a temperature of minus four degrees, and that is an interval. Right? A ratio not only gives you order, because you can take your numbers and order them from zero, one, two, three, up to ten, and you can take the difference between any two numbers; it also always has a true zero, because zero means the thing does not exist. So for ratio there is a true zero, because zero means it does not exist. And for interval there is no true zero, because it can go into the negatives. Okay, so those are the things that you need to remember about study unit one.
So let's go and answer this question. Which one of the following statements is correct regarding variables and descriptive statistics? We can go statement by statement and do a process of elimination. You will tell me whether A is correct or incorrect, and then we move to B.
A, qualitative variables use labels or names to describe attributes of an element. A is true. That is true. In the exam, when you get to that, you can even stop, because that's the correct answer we are looking for. But let's look at the other statements; we know that they are incorrect. B, quantitative variables have either a nominal or an ordinal scale of measurement. Incorrect. C, measuring attributes of an element results in a quantitative discrete variable. So when we measure attributes, is it discrete? True, that's what the statement says. That is incorrect.
That is incorrect because, remember, when we measure, it's continuous; when we count, it's discrete. D, quantitative variables are also referred to as categorical variables. That is incorrect. And you can see from my notes there, I put them close to one another. My handwriting has gotten worse during December. I'm blaming it on all the wines from Stellenbosch. Sorry.
Okay, let's continue. E, descriptive statistics is the process of using data from a sample to make estimates and test hypotheses about the characteristics of a population. That is drawing conclusions about a population based on the sample. What are we defining there? Remember, there are two branches of statistics. We have descriptive, which, like the name says, describes, and we have inferential. Descriptive summarizes the data. Inferential makes inferences: it makes estimates or tests hypotheses. Like the name says, it infers the results of your sample to your population. Descriptive describes your data: how do we visualize the data, how do we summarize it in tables and charts, and how do we calculate measures of central location, measures of variation and so on. So this statement is incorrect, because it is talking about inferential statistics: inferential statistics is the process of using data from a sample to make estimates and test hypotheses about the characteristics of a population.
So remember, we use descriptive statistics from study unit one up to, I'm going to say, almost study unit five, because there we are still describing the data. Even with the probabilities in study unit four, where it might look like we are making inferences, there are no inferences there.
We are just summarizing the data by calculating proportions. Inferential statistics, I can say we do from study unit six going forward, but it really starts at study unit eight, where we look at confidence intervals. That's where we start making inferences, because study units six and seven are more about the basic understanding of probabilities for quantitative, that is numerical, variables. The concepts from the normal distribution and the sampling distribution are what we then use in study units eight, nine and ten, for confidence intervals and hypothesis testing, because in hypothesis testing we use the sampling distribution and the z formula to calculate the p-values and all that. So from study unit six onwards, you can say it is almost all based on inferential statistics. Okay, are there more questions?
Yeah, just on statement C, measuring attributes of an element results in a quantitative discrete variable. If we look at what you've just written at the top there, discrete would be if it's a whole number?
Yes. Let me give you an example of how they would have said it. They would have said counting attributes of an element results in a quantitative discrete variable. So the discrete values are your whole numbers.
But if you are measuring height or weight, it doesn't need to be a whole number?
Nope. Height you are measuring, right? The minute you talk about measuring something, it's continuous. And also remember how we measure height: in meters, right? A person's height we measure in meters. The minute we convert it to something else, the units change, but height is measured in meters.
So you are one point something, is it centimeters or meters? I don't know how these things are working now. When you take a measuring tape and measure your height, you get a continuous variable. Let me give you another example: age. Is age discrete or continuous? It is discrete. No. If they ask you whether age is discrete or continuous, you always need to know that age is continuous, because we measure age in terms of the days, hours, minutes and seconds since you were born. In the hospital, when a child is born, we say they were born at a certain time on a certain day. When a person dies, we put a date and a time to it, or an estimate of it. So we measure age. It's just that, for simplicity and ease of communication, we use age in years. Nobody says, today I am 23 years, two hours and 30 seconds old, right? We always round it to a year. So sometimes a variable can change from continuous to discrete. If they say age in years, the unit has changed: age is continuous, but age in years is recorded as a whole number, because in years you are 24 years old, 30 years old, three years old, things like that. You need to be very careful about that.
I think what's confusing me then is the level of measurement for ratio sitting under discrete, because you say there's no true zero.
Sorry, my bad. I wrote the levels of measurement there, but I'm not putting them here to say they are linked to one another. Nope. Thank you. Okay, the other thing: anything that can take a decimal. Oh, sorry, not here. I must not write it here.
I must not write underneath here, because then you get confused as well, right? So, this one will be a whole number, and this one is anything that takes a decimal, like money. We measure money in rand and cents, right? But if they say just in rand, it can convert to discrete. This also depends on who is teaching you statistics and what their understanding is; I've seen so many things before. Some books say that as long as it's money, it remains continuous even when you convert the units to rand, because money is money. At the end you will still put .00 or .01 or .20 or something like that. It's rounding off. It's the same as what we said about age: it doesn't matter whether it's in years, age is continuous. So think about it in that way.
Okay, moving on. Which one of the following statements is correct regarding variables and descriptive statistics? We're going to go through this again; it looks almost exactly the same as the previous one. A, quantitative variables use labels or names to describe attributes of an element. We're looking for the correct answer. Is that correct? Incorrect. B, qualitative variables have either a nominal or an ordinal scale of measurement. Correct. That is the correct answer. C, measuring attributes of an element results in a quantitative discrete variable. Since they say measuring and discrete, that makes it incorrect. I'm not going to continue with the rest, because they are similar to what we looked at previously. D, quantitative variables are also referred to as categorical variables; we know that they are numerical variables. And E should say inferential statistics is the process of using data from a sample to make estimates and test hypotheses, not descriptive. I'm just going to skip that one as well.
Okay, so now we're talking about levels of measurement. Which one of the following statements is correct regarding variables and their scales of measurement? Here you need to think about which one is the weakest and which one is the strongest. Right. A, the nominal scale of measurement is the strongest form while the ratio is the weakest form. We did discuss this in the notes. It starts here at the bottom and ends right there at the top. At the bottom are all the categorical scales and at the top are all the numerical ones. So if the bottom is weak and the top is strong, and we know that the bottom is all qualitative measures and the top is all quantitative measures, then the weakest is nominal. I've already given you the answer. And which one is the strongest? Your ratio. Then you can also place the in-betweens, which are the ordinal and the interval. So we know that A is incorrect, because it says nominal is the strongest and ratio is the weakest.
B, the mean cannot be determined for an ordinal scale of measurement. It's the same: we cannot calculate the mean for ordinal data. C, the interval scale of measurement is a higher level of measurement than the ratio. Incorrect.
D, the nominal and ordinal scales of measurement have similar properties; in addition, the order or ranking of the data in the nominal scale is meaningful. If you look at nominal and ordinal, you cannot add or subtract, right? Because those are labels. How do you add and subtract labels? You can't. So they do share those properties. What is not right with the statement is the second part, that the order or ranking of the data in the nominal scale is meaningful. We cannot rank data in the nominal scale, right? Nominal means none of the categories has a higher importance, rank or order.
We can only rank the counts in each category. If it was a questionnaire asking how many males, females and others there are, we can order those counts, but that doesn't mean that males are more important than the other categories. So that statement is incorrect. They don't have all the same properties: with ordinal you can rank the data, with nominal you can't. We discussed that, right? Nominal means no order, meaning no rank. Ordinal data is ranked, so their properties cannot be the same.
Let me also help with the other properties in terms of summarizing nominal and ordinal data. You can build a frequency distribution for both nominal and ordinal, because you can just count how many people responded in each category. For nominal, you cannot find the median. For ordinal you can, because the median is the value in the middle: if you have low, medium and high, medium will be your median, because it's in the middle. You cannot add or subtract either of them; that, together with the frequency distribution, is what they share. You cannot calculate the mean or the standard deviation of nominal or ordinal data, because these are categorical data, and you cannot calculate ratios. You can sometimes calculate measures from the counts, but not from the categories themselves. So those are the things that you always need to be aware of.
E, quantitative discrete variables result from measuring attributes. Is that correct or incorrect? Incorrect. It should be continuous, because it is measuring. You see how they ask the questions; they are exactly the same. In the other question, statement C said measuring attributes of an element results in a quantitative discrete variable.
In this question they say quantitative discrete variables result from measuring attributes of an element. It's the same question, asked differently. All you need to do, especially when you read the question, is identify the key words, like I did by underlining the words discrete and measuring. Discrete is counting; continuous is measuring.
How different was last year's paper compared to 2021?
I have never seen last year's paper, but all I can tell you is that it is the same question paper as your assignment. Okay, no, wait, let me not say that, because it's not exactly the same. All your assignment questions for this year are from last year's exam papers and last year's assignments. So students who are doing stats this year are probably using last year's exam paper. We will see how different the exam paper is. And do not bet on getting the same exam paper as last year's. It might be different, because it's a supplementary exam and because you have already been exposed to this exam. So the questions might change a little bit.
Moving on to question six. I can't believe that we are still on question six and it is already seven twenty-seven. Which one of the following statements is correct with regard to variables and their scales of measurement? A, the nominal scale of measurement is the weakest form while ratio is the strongest. Correct. That is correct. Do I need to go through all of them again? We've touched on all of them, so we can move on, because I can see the questions didn't change much. You will get these notes anyway and you can go through them as well.
So let's move to question seven, which is totally different from the ones we were looking at. A majority of the rural schools in South Africa are in the Eastern Cape, KwaZulu-Natal and Limpopo provinces.
The Department of Basic Education requires your assistance in gathering some information to effectively manage their school transport system in some of the rural areas. Which one of the following statements is correct with regard to the types of variables? We discussed the types of variables, so I don't have to summarize this. Let's look at A up until E, and you tell me which one is correct, right?
A, the number of schools requiring learner transport is a quantitative continuous variable. Think about it. Correct? Look at the key words: number of schools. That is 1, 2, 3, 20, 140, 50. Is it continuous? We are counting there, not measuring. So that is incorrect.
B, whether or not a school is classified as a special needs school is a qualitative nominal variable. Correct. That is correct, because we are talking about classification: you are a special needs school or you are not. There is no rank in how you define which one is which.
C, the travel time, that is the key word, from the learner's home to school is a quantitative discrete variable. Incorrect. It is continuous, because we measure time.
D, the size of the bus required to transport learners is a qualitative nominal variable. Now, the question does not end at the size. You could say this is a 220 kilogram bus, right? But the question says the sizes are minibus, small coach and articulated bus. So they took the numerical size of a bus and converted it into a categorical variable, because once they use things like minibus, coach and articulated bus, it becomes categorical. So is it nominal or ordinal? You ask yourself: is there an order? Yes, there is an order. So this will be incorrect, because it says it is nominal. Okay, and that's how you identify these. They are very easy to check and validate.
E, the age at which the learner first enrolled at the school is a quantitative discrete variable.
Incorrect. Remember, it's age. Age is continuous.
Next question: refer to the scenario in the previous question, where we were talking about the scholar transport. Which one of the following statements is correct with regard to the types of variables? A, the list of schools requiring learner transport is a qualitative nominal variable. A list of schools means they have named or labeled every school. So this will be correct. B, the distance from the learner's home to school is a quantitative discrete variable. Distance we measure, so it will be incorrect. C, the number of learners, that is your key word, at each school is a quantitative continuous variable. The number of learners means you are going to count every learner, so that will be discrete, which makes this statement incorrect. D, the travel time from the learner's home to school is a quantitative discrete variable. Travel time: we measure the time, and if we measure, it's continuous. Incorrect. E, the level of school, which is primary, junior high and high, requiring learner transport is a qualitative nominal variable. Now, you ask yourself whether there is an order in the level of school. There is an order, right? So if there is an order, it should be ordinal. Therefore E is incorrect. It's easy, right?
Now we summarize the data. This is data from a recruitment company. I'm too lazy to read the entire statement. So here is the data for a sample of 70 job seekers. What you need to do is construct a pie chart and choose which pie chart represents this data. The data looks like it is sorted, which makes it easy to count and divide. So I'll give you a hint on how to answer questions like this. The first category is office, which is the gray one. You can see that office has a different percentage in each chart, except that B does not show a value. Okay, we need to change. Not all of them have values.
Okay, let's take office first, because I can see where office is. We count how many office there are: one, two, three, four, five, six, seven, eight, nine, 10. So office is equal to 10 divided by 70, multiplied by 100. Are there 10? Did I count correctly? One, two, three, four, five, six, seven, eight, nine, 10. So it is 10 divided by 70, because they said there are 70 job seekers, which is about 14 percent, right? So already I can eliminate A, because A is incorrect. I cannot eliminate B, because the chart doesn't show a number for office there. So let's go to the next category. Which one? Four-day week doesn't have a value on this one either. Let's take remote. Remote is shown on all three of them, with different values. So let's count remote: there are 22. So remote is 22 over 70, multiplied by 100. I'm not sure if you all get the same, but I get 31 percent. And there's no 31 percent here; the options are 32, 29 and eight. I don't know which rounding they used.
Okay, let's take four-day week. I just want to double check between 32 and eight. Counting them, there are 17. So it's 17 divided by 70, which is 24 percent. So if it's 24, do you think this one will be okay? Then we'll have to calculate blended as well. I thought we would save time, but B does not show a value, so how am I going to know? Does that mean the bigger slice is different to 24? It might be more. Okay, we did calculate remote; we got 31. Let's calculate blended: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 21. Blended is 21 divided by 70, which is 30 percent.
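The counting done above can be checked with a few lines of Python. This is only a sketch: the four counts are the ones read out in the session (10, 22, 17 and 21, which add up to the 70 job seekers), and the category labels are shortened versions of the ones on the slide.

```python
# Counts per work arrangement, as counted in the session.
counts = {"office": 10, "remote": 22, "four-day week": 17, "blended": 21}

total = sum(counts.values())

# Pie chart share for each category: count / total * 100.
percentages = {
    category: round(count / total * 100, 2)
    for category, count in counts.items()
}

for category, share in percentages.items():
    print(f"{category}: {share}%")
```

Rounded to whole numbers these come to about 14, 31, 24 and 30 percent. That is why a chart showing 32 for remote can only be matched by elimination against the other categories.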
That's blended, 30 percent. So B is the correct one. If we look at this, because it's 24, I'm going to assume that the 31 is their 32, and I'm going to assume that this one is 14, because I don't know. But that's how you will calculate. Okay, that's how you will calculate. You just count how many there are and divide by the total, and that will give you your answer. And you can choose two of the categories and do a process of elimination. You don't have to calculate the whole entire pie chart. Okay.
The data below shows the daily electricity consumption in kilowatt-hours for 50 randomly selected households. Use 15 as a lower class limit and construct a histogram for the daily consumption with six classes. So we need to first find the range of the data. The highest value is 46 and the lowest is 16, so the range is 46 minus 16, which is equal to 30, right? That is correct. And divide that by the number of classes, which is six. It's five. So the width will be five. So if we take 15 plus five, that is 20, so the upper limit here will be 20. And then it will be 20 to 25, and so on. Okay. But here is the challenge. It says 27 to x, with x as the upper class limit of the third class. What is the relative frequency? Let me just double check, because the logic used is different from what they want you to do. Let's say it's 15 plus six. That will give you 21. So if we start from 15 to 21, the next one will be 21 to 27. Then they shouldn't have said six classes. Six is the width, because the size from here to there is what we call the width. The classes are these categories that we are creating. Okay. So the third class will be 27 to 27 plus six, which is 33. It's 33 because they say the third class, right? Then the next one would have been 33 to 33 plus six, which is 39. So you have your class intervals, and you have your frequencies, or your count. Let's use that as a count. So we need to count.
So they say count all of those that fall between 27 and 33. And this is very confusing, because the interval needs to be closed on one side and open on the other. Sometimes I don't understand it. So this will mean it includes x. This will mean it does not include 27. So I don't know whether we include 27 or not in this. And do we include 33, because there is a 33 in the data as well, or do we not include 33? Okay. So first let's calculate without including 27, but including 33. We don't include 27, so we start counting from 28. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21. So if there are 21, we're looking for the relative frequency. Relative frequency means 21 divided by how many there are in total, which they said is 50. So 21 divided by 50 is 0.42. So it's not that. So let's include 27. If we include 27 in the count, then it becomes 23, because there are only two of them. So 23 divided by 50 is 0.46. Okay. If we don't include 33 in the mix, how many are there? And I really don't know why they left the open bracket like that. Otherwise we'll have to do it without both 33 and 27 and count. But for now, let's exclude 33. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18. There are 18. 18 divided by 50 is 0.36. But I'm not sure if that is the answer. So let's exclude both 33 and 27. Okay, I included 27 in this one, so if I take away 2, I'm left with 16. 16 divided by 50 is 0.32. It's not among the options. So therefore it was from 27: we include 27. Something like this, I can't remember now. Do we use the bracket to mean closed, and "up to" to mean open? Something like that. So it should not include the last one. If we include 27, we should not include 33. So that's your answer. Option B. We have 15 minutes.
Okay. The data below is the same information.
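The convention settled on above (include the lower class limit, exclude the upper one) can be sketched as follows. The class width of 6 is the one used in the session; the list `readings` is made up purely for illustration and is not the 50 households from the question, and both function names are hypothetical.

```python
def class_limits(lower, width, n_classes):
    """Build (lower, upper) pairs for consecutive classes of equal width."""
    return [(lower + i * width, lower + (i + 1) * width) for i in range(n_classes)]


def relative_frequency(data, lower, upper):
    """Share of values with lower <= value < upper (upper limit excluded)."""
    inside = sum(1 for value in data if lower <= value < upper)
    return inside / len(data)


# Start at 15 with a width of 6: 15-21, 21-27, 27-33, 33-39, ...
classes = class_limits(15, 6, 6)

# Made-up readings for illustration only.
readings = [16, 19, 21, 24, 27, 27, 28, 30, 32, 33, 33, 36, 40, 46]

third_lower, third_upper = classes[2]  # the third class, 27 to 33
rel = relative_frequency(readings, third_lower, third_upper)
print(f"{third_lower} to under {third_upper}: {rel:.2f}")
```

With this convention both 27s are counted in the third class, while the 33s fall into the fourth class.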
Now they say, with x as the lower class limit of the fourth class. What did we get there? 39. But they have 38. How did they do the calculation? If they use 38 as the upper limit of the fourth class, then the lower limit is 33. Okay. So let's see. Going back to the data, it's 33. So we include 33, but we don't include 38. We do the same thing as we did previously. We're not going to include 38, so we're going to count from here up to here, starting at 33. So that will be 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13. 13 divided by 50. 0.26, which is not on there. So do they include 38? Me, I don't know how they calculate these things, because it's very confusing. Do they now include 38? If they include 38, it's 14 divided by 50. It's 0.28. Okay. So this is question 10 and question 11. I just want to skip quickly to the answers, because we do have the answers at the end. Question 10 and 11. As I remember, we did it this way in class, right? And that's what I did. We divided, and we did the same. Okay. And I see it was the same thing that we did. So we said it's B and C. What did we say now? Did we say B and C? B and C, yes. Okay. I'll give you the notes. Don't worry. We're just doing a refresher of what we discussed.
Okay. So question 12. Consider a sample with a size of 13 observations in an ordered array. Which one of the following statements best describes how you determine the quartiles? Remember the quartiles? You have your first quartile, which is Q1. You need to first find the position, n plus 1 divided by 4. Then you have quartile 2, which is n plus 1 divided by 2, and quartile 3, which is 3 times n plus 1, divided by 4. So these are positions. Once you have the position, you need to go and find the value, and so on. But now they didn't give you the data on this one.
They just give you your n, and they are asking you questions based on what you know about how to calculate quartiles. So A: the first quartile is given by observation 3. They're asking you about the position here. B: the third quartile is given by observation 11. So you will have to use your quartile positions to find out whether those are correct. You know what your n is. You just put your n into quartile 1 and quartile 3. C: the first quartile is calculated as the average between observations 3 and 4. So what did you get? Did you get a position ending in .5? If you did, then C will be the correct one. D: the third quartile is calculated as an average as well, so you do the same check with D. E: the second quartile is given by observation 8. So let's see if any of these is correct.
Let's calculate quartile 1. We have 13 plus 1, divided by 4. What is the answer? 14 divided by 4? 3.5. So it will be at position 3.5. Therefore, in order for us to find quartile 1, we take the average of value 3 and value 4, right? That is how you find your quartile 1. And then let's check what quartile 2 is. Quartile 2 is 13 plus 1, divided by 2. 14 divided by 2 is 7. So the position is a whole number, and it will be at the seventh position, observation number 7. Quartile 3 is 3 times 13 plus 1, divided by 4. What is 14 times 3 divided by 4? 10.5. So if it's 10.5, we find quartile 3 by taking the average of positions 10 and 11, right? Because 10.5 is between 10 and 11. So now let's answer the question. It says: how do you determine the quartiles of the sample? We're looking for the correct one. So A, the first quartile is given by observation 3.
It's not, right? You will not find it on observation 3, because it is the average of two numbers: the position is 3.5. So that will not be correct. B, the third quartile is given by observation 11. So let's go to the third quartile. It's also not given by position 11, because its position ends in .5. If a position was 3.25, it would round down to 3. Let's say this one was 10.25, we would have said it's on position 10. If it was 10.75, we would say it's on position 11. Those two options would have been correct if the positions ended with .25 or .75. But they end with .5, and if it ends with .5, we take an average. So that will also be incorrect. C, the first quartile is calculated as the average between observations 3 and 4. And that is what we said, right? So this is the correct one. D, the third quartile is calculated as the average between 11 and 12. It should be 10 and 11. Yes, it should be 10 and 11. E, the second quartile is given by observation 8. We know that it will be on observation 7, not 8. So that is incorrect as well.
We are left with 7 minutes and I think we have more questions. We will finish them off on Sunday. Don't worry. I will give you this. You can also work on your own as well. So these are the data that you are given. You need to remember that the mean is the sum of all values divided by how many there are: the sum of the x values divided by n. n is counting how many there are. They are 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. There are 10. So you just have to add all of them and divide by 10. Or you can use your calculator and summarize the data. So that is the mean. For the median, you're going to find the position by n plus 1 divided by 2, but first you need to order your data from lowest to highest. If I look at this data, it's not sorted.
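Going back to question 12, the position rule (n plus 1 over 4, and then round down on .25, round up on .75, average on .5) can be sketched like this. The function names are made up for illustration; the only input taken from the question is n = 13.

```python
def quartile_position(n, k):
    """Position of quartile k (1, 2 or 3) in an ordered sample of size n."""
    return k * (n + 1) / 4


def observations_to_use(position):
    """Apply the rounding rule: which observation number(s) give the quartile."""
    whole = int(position)
    fraction = position - whole  # always 0, 0.25, 0.5 or 0.75 here
    if fraction == 0.5:
        return [whole, whole + 1]  # average the two neighbouring observations
    if fraction == 0.75:
        return [whole + 1]  # round up
    return [whole]  # whole number, or .25 rounded down


n = 13
for k in (1, 2, 3):
    position = quartile_position(n, k)
    print(f"Q{k}: position {position}, use observation(s) {observations_to_use(position)}")
```

For n = 13 this gives positions 3.5, 7 and 10.5, which matches option C being the correct statement.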
So in order for you to calculate the median, you first need to sort your data from lowest to highest. And then you find the position. And after finding the position, that will give you your median. The mode is the most frequently appearing number. Which number appears more often than the other numbers? Which number is repeated the most? So the mode should be this one here. If it's 33, then that option is eliminated, that one is eliminated and that one is eliminated. You are left with two. Just calculate one of the two, either the median or the mean. I think the mean will be the easiest to calculate. The correct answer is A. If you add all of them and divide by 10, what is the mean? That's 32.9. Also, when you are writing your exams, don't spend time calculating everything. Just do your process of elimination. Save time, because you have other questions to worry about.
So let's look at the last one and then we are done for the day. I will give you my number that you can contact me on. Okay. Consider the following data. This one is also one of those complex calculations that you need to do. But it's not that complex. You can do this on your calculator. So probably next week, when we start on Sunday, we will do this kind of question again, and I will show you different methods of calculating it. So what this means is that you need to take 29 times 29, because it says the sum of your x squared. The sum of x squared means 29 times 29 plus 30 times 30 plus 31 times 31, and so on for all of them. Otherwise, you can use your calculator. Your calculator is very clever. You can put your calculator into stat mode and calculate using the stat mode function. Let me see if I can get the stat mode function. Otherwise, you can use Excel. We will do both on Sunday, because it saves time.
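For the mean, median and mode steps described above, here is a small sketch using Python's standard `statistics` module. The ten values are made up for illustration; they are not the data set from the question.

```python
from statistics import median, mode

# Made-up sample of 10 values, deliberately left unsorted.
values = [12, 15, 11, 15, 18, 14, 15, 17, 13, 16]
n = len(values)

# Mean: add all the values and divide by how many there are.
data_mean = sum(values) / n

# Median: sort first, then find the position (n + 1) / 2.
ordered = sorted(values)
position = (n + 1) / 2  # 5.5, so average the 5th and 6th ordered values
manual_median = (ordered[4] + ordered[5]) / 2

# Mode: the value that appears most often.
data_mode = mode(values)

print(data_mean, manual_median, data_mode)
```

The manual median agrees with `statistics.median`, which sorts the data and averages the two middle values when n is even.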
Also, this one we can do on Sunday, because this says you need to calculate the mean, then take the difference between every value and the mean, square it, and add them up. So you calculate the mean of this data. We found it was, oh no, this is different. You calculate the mean of this data, which is the second part, and then you take the difference between each observation and the mean, you square the answer, and then you add. So here are the steps, because I'm giving you the hints right now. The first step: you calculate the mean, which is the sum of the observations divided by n. So let's say our mean is a, right? You will find your mean. Number two: you're going to say 29 minus a, and you square the answer, plus 30 minus a, and you square the answer, plus, until you do all of them, you get to 34 minus a, and you square the answer. And when you are done, you add them up to find the answer. That is the summation. Otherwise, you can use Excel, and I will show you on Sunday how to do that if you are still struggling. But you have the whole of this week to ask on any WhatsApp group that you form or are in, for people to help you answer this question. Yes. Okay. And then this is the variance; you can learn how to calculate those. We will do them on Sunday and we will continue with assignment two after that. So all these questions, we'll do them on Sunday.
Remember, you do have the answers. I would suggest you do the questions first and only then follow the answers. Once you are done and satisfied, you can come and check, because knowing the answer and then going to answer the question doesn't help. It doesn't help you study. I'm even tempted to remove the options, because I don't like giving options: you need to know how to calculate certain things. You need to know how to do things right. You cannot memorize the options, because you might get a different question in the exam and not know how to do it. So bear with all that.
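The two sums discussed in these last questions (the sum of x squared, and the sum of squared differences from the mean) are easy to mix up, so here is a sketch that computes both. The list of values is an assumption: the session only names 29, 30, 31 and 34, so the six values below stand in for the real data set.

```python
from math import sqrt
from statistics import stdev, variance

# Assumed values; the actual data set in the question may differ.
values = [29, 30, 31, 32, 33, 34]
n = len(values)

# Sum of x squared: square each value first, then add.
sum_x_squared = sum(x * x for x in values)

# Step 1: the mean, called a in the session.
a = sum(values) / n

# Step 2: subtract the mean from each value, square, then add.
sum_sq_dev = sum((x - a) ** 2 for x in values)

# Sample variance divides that sum by n - 1; the standard deviation is its square root.
sample_variance = sum_sq_dev / (n - 1)
sample_sd = sqrt(sample_variance)

print(sum_x_squared, a, sum_sq_dev, sample_variance, round(sample_sd, 4))
```

The standard library functions `statistics.variance` and `statistics.stdev` use the same n minus 1 formula, so they are a convenient cross-check on a manual calculation.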
And also, when you calculate, the other things that you need to be careful of are things like the sample variance and the sample standard deviation. Remember to use the right formula. When you are using your calculator, remember the sample standard deviation is sx on your calculator, and the sample variance is its square. When you are doing manual calculations, remember that the sample standard deviation is given by the square root of the sum of each x value minus the mean, squared, divided by n minus 1. That sum is the answer that you would have gotten here, because this is the same data set. You just divide it by n minus 1 and take the square root. You can calculate this manually. But note that the other question asked you to calculate the sum of x squared, not the sum of squared differences from the mean. Okay, so this is different. This one is the sum of x squared. It's not the same as this. So you can take this one to calculate your sample standard deviation as well. The coefficient of variation, which is CV, is calculated by the sample standard deviation divided by the mean, multiplied by 100. We will do this on Sunday. Don't worry. I'm just giving you a hint.
For this one you apply the empirical rule. Remember one standard deviation, two standard deviations, and three standard deviations, which cover about 68 percent, 95 percent and 99.7 percent of the data. So it is the mean plus or minus your s, the mean plus or minus two times your standard deviation, which is s, and the last one is the mean plus or minus three times your s, because they say use the empirical rule to calculate this. You test the shape too, by comparing the mean and the median. If the mean is equal to the median, then you can say it is symmetric. If the mean is less than the median, it is skewed to the left, and if it is greater than the median, it is skewed to the right. So it will be one of the two to answer those questions. And remember, don't get confused with the thousands, because they told you at the beginning.
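As a rough sketch of these last hints, the empirical-rule intervals, the coefficient of variation and the mean-versus-median check could be coded as below. All three function names are made up, and the numbers (mean 105, s 12, median 102, read as thousands) are hypothetical values chosen for illustration, not results from the assignment.

```python
def empirical_intervals(mean, s):
    """Mean +/- 1, 2 and 3 standard deviations (about 68%, 95% and 99.7% of the data)."""
    return {k: (mean - k * s, mean + k * s) for k in (1, 2, 3)}


def coefficient_of_variation(mean, s):
    """CV as a percentage: standard deviation relative to the mean."""
    return s / mean * 100


def shape(mean, median):
    """Rule-of-thumb check of symmetry by comparing the mean with the median."""
    if mean == median:
        return "symmetric"
    return "skewed left" if mean < median else "skewed right"


# Hypothetical summary values, recorded in thousands.
mean, s, median = 105, 12, 102

intervals = empirical_intervals(mean, s)
cv = coefficient_of_variation(mean, s)
for k, (low, high) in intervals.items():
    # Multiply by 1000 to report the limits in the original units.
    print(f"{k} standard deviation(s): {low * 1000} to {high * 1000}")
print(f"CV: {cv:.1f}%, shape: {shape(mean, median)}")
```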
These values that you are looking at are in thousands, so this is 105,000. That's why, if you get an answer of 212, you just multiply that by a thousand to get two hundred and twelve thousand, and things like that. Yeah. So that will be the same. You repeat the same thing that I just gave you: one standard deviation, two standard deviations, and three standard deviations. You apply the same concepts. Yeah. That's the same thing.
And then the upper limit of the box plot and all that. Remember, the upper limit of the box is Q3, quartile three. You just always remember that. The lower limit is quartile one. A box plot, remember, is the plot of your five-number summary: your smallest, Q1, Q2, Q3 and highest. That's your box plot. You always need to remember that if they talk about the box plot limits, they are not talking about the smallest and the highest values; really, they're talking about the box. Okay. And these are the things that you also need to go and study.
And that completes today's session, the one hour and 30 minutes. And I'm going to stop the recording. Thank you for coming. Don't go anywhere. Just wait for me. Just going to end the show. Did I stop the recording? Did I stop there?