So I'm about to record. I just wanted to start recording before I continue. Okay, it is recording.

So today we're going to concentrate on activities, and I will first do a recap of what we discussed on Wednesday. I will run through it because we need to remember all those concepts, and what we discussed in the data visualization study unit, in order for you to be able to answer the questions. Remember also that if you have a question from your assignment and you need clarity on it, feel free to ask. And as I go along, if things don't make sense or you get lost and want clarity on something, you can raise your hand. I should be able to see it; if I don't see it and you've been raising it for a long time, just unmute yourself and call out my name, and I will stop so that you can ask your question.

Okay, so let us continue. Please also remember, whenever you join the meeting, to mute yourself, because an open microphone creates an echo and disturbs what other people can hear.

So, what we discussed on Wednesday: we looked at what statistics is, we looked at the types of variables, and then the different levels of measurement for those variables.
We also spoke about what statistics is. We said statistics is about the transformation of data into meaningful, useful information that you can use to make decisions. I'm not going to go into why we study statistics: we know we study statistics to solve problems, and we also know that a lot of businesses and governments use statistics to make those decisions. We also looked at some examples of where statistics is used in this day and age: the coronavirus, elections, and the weather.

We also spoke about the two branches of statistics. We said there is descriptive statistics and there is inferential statistics. Descriptive statistics is about describing the data by means of tables, charts, and summaries of that information. Inferential statistics is where we infer from the information we collected from the sample to the population that we are interested in, and we said that with inferential statistics we do estimation and we can also do hypothesis testing.

We explained some of the concepts that we use in statistics, like the population. We said the population is all the elements of study that you are interested in, and if you calculate the mean, the standard deviation, the mode, the median, and so forth from the population data, we call those measures parameters. We also said that because the population is often too big and you cannot reach everyone in it, we have to create a sample by using one of the sampling methods. A sample is just a subset of your population, and once you have selected your sample and you calculate things like your mean, your median, your standard deviation, and your variance, we call those measures statistics.

I'm not going to go into the exercise, but then we continued. We said that when we talk about the population or the
sample, it represents individuals or items, and those items have characteristics that define them. For example, we discussed what a variable is: a variable is a characteristic that describes the population of interest or your sample. We also said a variable can be something that is observed or something that is measured. If it is observed, it means you can see it; if it is measured, you need to take something and measure it.

Then we spoke about the values that we get from those variables, because the data are the values that are associated with a variable. We used gender as an example: we said that for gender we have male and female as values, and those values we call data.

We also looked at the types of variables, and we said there are two types. There is the qualitative variable, which is also called the categorical variable, and there is the numerical variable, which is also called the quantitative variable. Categorical (qualitative) variables are variables that you can put into categories; you can group them. Numerical data is data that you can either count or measure, and we gave examples of those kinds of data.

Then we defined the levels of measurement, or the scales of measurement. They classify the types of variables that you have in terms of four scales. We said categorical variables have the lower levels and numerical variables have the higher levels, because you can do a lot more manipulation of that information. So nominal is the lowest and ratio is the highest level of measurement.

Then we described what each level of measurement is. We said nominal is for categorical data where there is no order, no logical
order or natural order, and you cannot use it for any comparison. Then we looked at ordinal data, which is also categorical data, and with ordinal data there is a natural or logical order.

We also defined what interval is. We said interval comes from quantitative or numerical data, and you are able to see the difference between the measures because they are numerical values, but it does not have a true zero point, because zero is just another number which refers to another level. We gave the example of temperature: in temperature, zero is a cold temperature, not the absence of temperature, because there are negative temperatures that it can reach. So, as we said, any numerical variable that can take a negative number will be an interval variable.

Then we described what ratio is. Ratio also comes from the quantitative variables; you are also able to check the difference between two values, and it has a true zero, because zero means nothing: that thing does not exist.

We then described the types of operations that you can do at those levels of measurement. For nominal data you can do a count and you can find the mode. For ordinal data you can do a count, you can find the mode, and you can find the median (not the mean), because the data is in ranked order, so the median of ordinal data is the value in the middle. For interval data you can do everything except calculating a ratio, because interval data can be negative and does not have an absolute zero. And for ratio data, everything applies. You need to know this when you do your assignments as well.

Then we moved on and we
looked at how we summarize data in tables and charts, for categorical data and for numerical (quantitative) data. We said that for categorical data we use a frequency table, which is just a table, and three graphs to visualize it: a bar chart, a pie chart, and a Pareto diagram.

We described what a frequency table looks like, and I must repeat this as well. In a frequency table, the frequency is your count: how many values fall within each category. The relative frequency is your frequency divided by the total. In this example they didn't give you a total, so you have to calculate it, but sometimes they do tell you that this is out of 100, and then you know that your total, which is your sample size n, is equal to 100. To calculate the relative frequency, you divide the frequency by n. When you add up all the values under the relative frequency, the total should be 1, because relative frequency is in decimal form. If we calculate a percentage frequency, we take the relative frequency and multiply it by 100, so the answer is in percentage form and the total of the percentage frequencies is equal to 100. That's how you complete the frequency table.

From a frequency table you can summarize and speak about the things that you see in it. You can say that there are 44 people who frequently go to Capital Bank, or that a proportion of 0.22 of the people go to Capital Bank, or that 22 percent of the people frequently go to Capital Bank. That's how you interpret the values that you see in the frequency table.

Okay, so we also looked at the bar
chart, and we said that in a bar chart the bars represent the categories and the height represents the frequency or the percentage. Also, with the bar chart there is space between the bars: the bars are separated by gaps.

Then we spoke about the pie chart. We said that in a pie chart the slices of the pie represent the categories, and the size of each slice represents the frequency or the percentage of that category.

Then the Pareto chart: we said it is your bar chart together with a cumulative polygon of the frequencies or percentages. So it is your normal bar chart, and if we add up the percentages of the bars cumulatively, we can draw a cumulative percentage graph on top of it as a line chart, and this makes it a Pareto diagram.

Then we looked at the frequency distribution, which is how we summarize quantitative data. Because it's numeric data, we said we need to order the data from lowest to highest as an ordered array. Once we have the ordered array, we can do a stem-and-leaf plot from that data, but we can also create what we call a frequency distribution, because when you build a frequency distribution table you also need to sort your data from lowest to highest. After you have created that frequency distribution table, you can visualize the data as a graph: a histogram, a polygon, or an ogive. We ended up doing only the stem-and-leaf plot and the frequency distribution table, and today we will continue from where we left off.

So, just to summarize what we did: we said an ordered array means ordering the data from the lowest to the highest value. We also added that when you order your data you are able to see outliers, and we defined what an outlier is. We said an outlier is
a value that is way outside the normal range: it is much bigger or much smaller than the rest of the values, far apart from them. It's an unusual observation that you will notice.

Then we said that from the ordered array you can create a stem-and-leaf plot, and I showed you some examples of different types of stem-and-leaf plots. We discussed that there is always one stem with many leaves, and we said that when you read the stem-and-leaf plot, the value, for example, is not 6 but 16, then 17, 17, 18, 18, 18, and we repeat all the values from the table. I also said that later on, when we do measures of central tendency and measures of variation, we will use examples where you get a stem-and-leaf plot, because in your exam or assignment they might give you a stem-and-leaf plot and ask you to calculate the mean, the median, or the mode. They might also ask you for the smallest value, the highest value, the fifth-largest value, and so forth. So you need to know how to do that from a stem-and-leaf plot, but we will look at those examples later on. We did this example just to show you how to take a stem-and-leaf plot and recreate the normal data, like the data shown in the table.

Then we also looked at the frequency distribution table and how we create it. We said that for the frequency distribution table we need to sort the data from lowest to highest, we need to calculate the range, which tells us the spread of the data, and you need to select how many classes you want to create. But remember that in your module they will not expect you to create a frequency distribution table; you just need to know the concept and the logic of how we build it. We're almost at the end of the recap. After you have selected the number of classes, you divide the range by that
number of classes, and you will know the class width, which tells you how big the classes, the categories that you're going to create from this numerical data, should be. Here we made the class interval, or width, 10, which means that the distance from the lower limit to the upper limit of each class interval is equal to 10.

Then we defined the classes. You go back to your data, you look at your lowest number, and you select a convenient number to start from. Here we started with 10, so we said anything from 10 but not up to 20, anything from 20 but not up to 30, and so on, and you define all your five classes, which are now the categories that you're going to use to create your table.

Once you have defined your classes, you assign the observations that fall within them. You assign them by counting how many there are, which gives you the frequency: how many of the values fall between 10 and less than 20? You start counting, one, two, three, and you say your frequency is three, and you do the same for the rest of the classes. When you add all of them, they should give you 20, because we know that we're using a sample of 20 days. The total of the frequencies should be the same as the sample size they have given you.

To calculate the percentage frequency, we take the frequency divided by 20 and multiply by 100: 3 divided by 20 gives us 15 percent, 6 divided by 20 gives us 30 percent, and when we add all the percentage frequencies they should give us 100, because it's 100%. If they had asked you to calculate the relative frequency, remember that relative frequency is your frequency divided by 20 without multiplying by 100, so it will be a decimal.

To calculate the cumulative frequency: for the first class interval, we take the frequency of the
first interval, and for the second interval we take the frequency of the first and add the frequency of the second to give us the cumulative frequency. With this value of 9, we are saying we want to know all the winter days that had a temperature of less than 30, regardless of whether they were in the first class or the second class. There were nine of them, because there are three that had 10 to less than 20 and six that had 20 to less than 30. You go on and complete the whole column, so, for example, when you get to 14 it is 9 plus 5, which gives us 14. When you get to the last class interval, the cumulative frequency should be equal to your total frequency count.

Then the cumulative percentage frequency is your cumulative frequency divided by your total frequency, multiplied by 100. So 3 divided by 20 gives us 15, 9 divided by 20 gives us 45, 14 divided by 20 gives us 70, and the last one gives you 100, because it's 20 divided by 20, which is 100%. You can then interpret the values that you see, and when we do the other exercises we will be completing some of these tables so that you get used to how we complete the table and how we answer the questions.

Okay, so now we move on from where we left off. Once we have created a frequency distribution table, we can create what we call a histogram, which is just a bar chart that takes the frequency distribution and puts it into bars. Because we use the class intervals, the bars of your histogram are represented by the class intervals, and the height can be represented by either the frequency, the relative frequency, or the percentage of a
histogram. So, 10 but less than 20: that point is 10, because this one is the midpoint, 5, so this one will be 10, this one will be 20, this one will be 30, then 40 and 50. With the histogram you will notice that there are no gaps, because where one interval ends the next one starts. On the horizontal axis we can show the midpoints or we can show the exact class intervals; you don't have to show only the midpoints, so it can be 0, 10, 20, 30, 40, 50, 60, and so on. And the height, as I've already explained, represents either the frequency, the relative frequency, or the percentage.

I'm just going to mention this next part here, but you must keep it at the back of your mind; we are going to discuss it in more detail when we do the measures of variation. With a histogram you are able to tell the shape of the data. The data may be symmetrical, like a normal distribution, which means that the mean, the median, and the mode of the data set are the same. For now you just need to know that a histogram shaped like this, where it's balanced, we call a symmetrical or normally distributed histogram. A symmetrical histogram that looks like this one is what we call a uniform distribution, because everything is the same: there is uniformity in the bars. With the third one, you can see that the tail goes to the right, so we say the data of this histogram is skewed to the right, or right-skewed. With this one the tail goes to the left, so we say it is left-skewed. There are other types of histogram, but
we will discuss this in more detail; you don't have to worry too much about it yet. We're not going to do this exercise, because we're going to do a lot of other exercises later on.

Then, from the frequency distribution table, remember that you can create a midpoint, which is the middle value between the lower boundary and the upper boundary of a class. You can say 10 plus 20 is 30, divided by 2, and that gives you the midpoint, 15. Between 20 and 30 the midpoint is 25, between 30 and 40 the midpoint is 35, and so forth. We know the frequency for each class, so when we create a polygon we use the class midpoint and the frequency: on the horizontal axis we put the midpoints and on the vertical axis we put the frequencies. The first midpoint is 15. At the beginning the line is at zero, because there is nothing there, so it starts at zero and then goes to the point 15 and 3, and we carry on like that. This is what we call a frequency polygon: we use the class midpoints and the frequencies to create it.

To create an ogive, which is a cumulative percentage polygon, we use the lower class boundaries. Remember, for the first one, the polygon, we used the midpoints; for a cumulative polygon, or what we call an ogive, we use the lower class boundaries of the class intervals, which are 10, 20, 30, 40, 50, 60. We also take the cumulative percentage below each boundary. Remember that at this point it would have been 15, but we need the percentage lower than the boundary, so we bring it down: we start there and say it will be zero, and then we move everything to the next class boundary. That's how you create a cumulative percentage polygon. It is useful when you compare two or more groups, because then you will have two lines. Let's say we need to compare the ages of students who do night school or
who attend at night, and those who attend day school: you can use the ogive to do that, because it will show you the distribution of those two groups.

A scatter plot is a graph that shows the relationship between two numerical variables. With the scatter plot you need to have one numerical variable that you want to compare to another numerical variable, and we're going to use this in the last study unit, study unit 11, when we do regression analysis. That's when we use the scatter plot to check the relationship between x and y. So with a scatter plot we compare two numerical variables together, and you just plot them. This one is 23 and 125, so let's say 23 is here and 125 is there: that will be the plotted point. For 26 and 140, we can assume that 26 is here and 140 is somewhere there, and that is the point. You plot all of them, and that shows you the data as a scatter plot. Later, when we do regression, we can describe what that data tells us. For now you just need to know that the scatter plot is another graph that displays numerical data and it's used to compare two numerical data sets.

Okay, that concludes what we were supposed to do last week. If you have any questions: I haven't been checking whether there are any hands. There are no hands. Any questions? Remember, if you are able to type in the chat, you can type in the chat, and if you have any questions feel free to stop me and ask. If there are no questions, then we can go to today's class, because then I am going to be quiet and let you speak, because I have been speaking a lot.

Okay, so let's go to today's session. I need to share my entire screen for this purpose. Please, I need to know how many people are able to type in the chat as well, so please try and put as many comments there in the chat so that I can see how many people are able to comment. And when we do the exercise, uh, Sviso, oh
sorry, sorry. Okay, so when we do the exercise, you are able to like an answer in the chat. Let's say, for example, Hendrik (I'm going to use Hendrik because I'm not sure I'm going to be able to pronounce that; Hermeias, I don't know if I'm pronouncing it correctly) gives this as the answer to the question. You are able to just like that answer, and the more people like the answer, the more you can see whether that is the right answer. So please, let's use the chat function to interact more, and I will also check the chat as often as possible when we do the activities, because we need to be engaging.

Okay, so let's do the activities for today. I have about 30 activities that we're going to do. Let me just explain the process. We might finish early, because this is content where we don't do a lot of calculations, so it might go quickly. I'm going to give you about a minute to think about your answers, and then we're going to discuss the answers together. How I prefer us to do it is to answer the question step by step. We explain every line, even though we know what the answer is; whether a line is correct or wrong, we're going to explain it. This way, one, you will be learning, because there might be things that I have not discussed with you which are true and which appear in the question, since there is only so much that someone can say in two hours. And then there will be those that are not true, but we need to say why they are not true and what the correct version is, because you need to know the correct one as well.

So we will discuss the questions in detail. I'm not going to ask you which option it is and have you just say option one. The options you can put in the chat: you can say the answer is option one or option two, and then when we discuss we go into the details. I will show you how with the first two questions that we do, and then the rest of them you're going to do on your
own, by yourselves. I will just be writing down what you are saying. I'm not going to be talking a lot; I will come in when I need to clarify things that were not said, or add more to what somebody else has said. Are we in agreement? This is where everybody unmutes and says yes. Yes? Thank you. Yes, yes. This is the kind of session I want, where people talk too. Okay, so let's do this.

Your first exercise: complete the following sentence. "[blank] is a statistical method that draws conclusions about the [blank] based on the [blank] computed from the [blank]." Think about it. If you are able to write in the chat, type the answer there, for example option a, option b, or option c; you don't have to unmute yet. Then when we come to discussing this question, we're going to answer it together.

Have we thought about it? So let's try and complete the sentence. Who wants to try? Uh, hi, tell me. Yes, let's complete the sentence. "Descriptive statistics is a statistical method that draws conclusions about the population based on the statistics computed from the sample." Based on the statistics computed from the sample, is that what you are saying? Okay, anybody else? Anybody else see the answer? Complete the sentence.

If you say it's c, then it is inferential: inferential statistics is a statistical method that draws conclusions about the population based on statistics computed from a sample. That is because we use inferential statistics to infer the information we collected and summarized from the sample back to the population, whereas with descriptive statistics we describe either the population or the sample; we don't draw conclusions about the population based on the sample when we use descriptive statistics. So a and b would have been incorrect, and the answer is c.

Okay, another moment. Which of the following is not a goal of descriptive statistics? Okay, that moment has come and gone. Who can describe what descriptive statistics is? I just gave
you a description of what descriptive statistics is. So what is descriptive statistics? Am I alone in this room? "Collecting, summarizing, and presenting data." Yes. So if descriptive statistics is about summarizing and presenting data, look at all five of those statements, and let's go one by one and say why they are or are not goals of descriptive statistics.

Is a a goal of descriptive statistics? Yes, that is a goal, because we are summarizing the data. b? Yes, b is also a goal, because it's to display aspects of the data that you have collected. c? Yes, c also, because we're reporting on the numerical values that we have collected. d? No, d is incorrect; it's inferential. It says we are estimating, and therefore d will be the answer that we are looking for, because we're looking for what is not a goal of descriptive statistics. Because we are estimating, this is a goal of inferential statistics. e, presentation of the data in the form of tables, charts, and summary statistics, is a goal of descriptive statistics.

Why am I doing it this way? The reason is so that you can understand more of the things that we discussed on Wednesday. The other thing is that since you're not writing the exam yet, you are still in a studying process. When you answer questions, try to go through each and every statement and ask yourself whether that statement is true or false, and why, based on the knowledge that you have. If you are unsure about a statement, put a question mark on it and come back to it later, once you have exhausted all your options. You can use your study guide, you can use Google, you can use anything to check that answer. There is nothing that stops you from using any other resources, because this is not an exam: you are doing your assignments, you are studying, so use
as many resources as you have at your disposal. Do not rely on people giving you answers when you don't know how they got them. Try to work out the answers yourself.

Okay, let's move on. Which one of the following statements is incorrect with regard to a statistic? Take a moment to think about it. Remember, those who are able to type in the chat can type their answer, and everybody else who is able to interact in the chat can like it if they agree with your answer; if they disagree, they can type their own answer, and people can agree with that other answer. Let's be interactive.

That one minute has come and gone. Okay, which one of the following statements is incorrect with regard to a statistic? A statistic is a measure that comes from where? "From the population." Anybody else? "From a sample." A statistic is a measure that comes from a sample, remember that. And remember the measures are things like your mean, your median, your standard deviation, your mode, and so forth.

So, a: a sample standard deviation is a statistic. Is that correct? "Correct." That is correct.

b: a statistic is an estimate of a population parameter. Okay, why are you saying it is incorrect? Let's go back here. We said inferential statistics is about drawing conclusions about the population based on statistics that are computed from a sample, and we said inferential statistics has two methods that we can use: estimation and hypothesis testing. Remember, that's what we discussed in the first class. So, coming back here, based on what I just said, "a statistic is an estimate of a population parameter" is correct, because we use the statistic to estimate what the population parameter will be. With estimation, we look at the confidence interval and whether the parameter will fall within it, and with hypothesis testing we also make an inference from the data: once we calculate the test statistic, we can say
whether we reject or accept the null hypothesis. Using the information from the statistic, which comes from the sample, we can infer the results and say with a certain level of confidence that, for example, the population is obese. So this is correct: a statistic is an estimate of a population parameter, because a population parameter is also something like your standard deviation or your mean, and we use a statistic to estimate it.

c: a statistic is a summary measure calculated from a sample. Is that correct? "Yes." Correct, that is correct.

d: a population mean is a statistic. "Yes." "Yes, it's correct." A population mean is a statistic? No, that is incorrect, because measures that come from a population, like your population mean, your population standard deviation, and so forth, we call parameters. So a population mean is not a statistic.

e: a statistic represents a property of a sample. That is the last one we have, and it looks correct. So the answer that we are looking for here is d. When you answer the questions, please go back to your notes as well and check whether your understanding still matches; ask yourself those questions.

Okay, so we move on to the next one. Which one of the following statements is incorrect? "A variable is a characteristic of an item..." Oh, sorry, I'm jumping the gun. I'm giving you a minute to look at each and every statement. In the meantime, you can ignore number five if you are not sure, because we never spoke about that, but if you know how to answer it you can still use it in your answer. I will explain it just now.

Okay, so your one minute is gone. Let's go. Option one: a variable is a characteristic of an item or individual being measured. Is that correct? "Yes, correct." Yes, that is correct, because a variable is a characteristic that defines an item, whether observed or measured, and in this instance they're telling us
about the measured one. Number two: a sample is a portion of a population selected for analysis. Is this correct? Yes, it is correct, because it is a portion, or a subset. Number three: in a pie chart the size of a segment varies according to the percentage of each category. So the size of your slices, or segments, varies according to the percentage of each category. Is that correct? Yes, that is correct. Number four: a histogram describes qualitative data better than a pie chart. Is that correct? No. Okay, since I can hear the mumbling, we have two charts here. What kind of data do we present in a pie chart? There are three charts in that category, and we use them for categorical data, which we also call qualitative data. And remember, your bar chart has gaps between the bars, whereas in a histogram, where one bar finishes the next one starts. So what kind of data do we display with a histogram? Qualitative? No, we just did that one now; only numeric was left. We use numerical, or quantitative, data. So the statement reads as follows: a histogram describes qualitative data better than a pie chart. Is that statement correct? No, it is not correct, because a histogram only describes quantitative data. The reason I said we can ignore number five is that these questions come from your past exam papers, and in the exam we assume that you have already done all the sections, so for whatever questions we include you should already have the knowledge. We didn't discuss what a mode is on Wednesday; we're going to discuss that in full detail. But a mode is the most frequently appearing value, not the highest value. There are two ways to describe it. In terms of
numerical data, we say the mode is the number that appears most often, more often than the other numbers. So when we have one, two, two, two, three, four and five, two appears more often than the other numbers. When we talk about qualitative data, which is categorical data, or about a histogram, where we have already created class intervals that act as categories, the mode on a histogram or a bar chart is the highest bar. So if I look at these two charts that I have, I can see that this bar is the highest and this bar is the highest, so the mode of this histogram will be this bar and the mode of this bar chart will be this bar. That applies only to categorical information, or to a histogram with class intervals. For numerical data, which I will explain more on Wednesday, the mode is the most frequent number: not the highest number, but the one that appears more often than the others. We will discuss that later; I just wanted to clarify that one line. Okay, your next question is exercise five. What is a summary measure that is calculated from a sample to describe a characteristic of a population? You have one minute on this. Actually, you don't even need one minute; half a minute is enough, because there are no sentences to read. The key words here are "summary measure from a sample". Okay, let's start at the top, since someone is already giving us an answer. What is a box plot? A box plot is a graph. On Wednesday, if we get to it when we do the quantitative measures, we will discuss what a box plot is; it is a graph that displays numerical data. Data: we know what data is. Data are the values associated with a variable, remember that; it's those values that come from a variable. A parameter is a
summary measure from a population. A population is all the elements or items of study, so all the individuals of interest that need to be included in your study. A statistic is a summary measure that is calculated from a sample to describe a characteristic of a population, and that is number five. Question six: which one of the following statements is incorrect with regard to qualitative or categorical data? You have a minute to think about your answer. Remember the last slide on the scales of measurement, the one that describes the order; it will be relevant here. So let's start with number one: categorical data is measured on an ordinal or nominal scale. Is that correct? Yes, that is correct. Next: the frequency or count of each element can be determined. Is that correct? Yes, it is correct, because with qualitative or categorical data we can count the elements: once they are in a category we can say how many fall into that category. Can we find the mode? Yes. I explained it not so long ago using the bar chart; remember, a bar chart is just a graph that shows categorical data. So can we determine the mode of categorical data? Yes, we can, because if we look at this bar chart, the highest bar is what we call the mode. If you have a graph where all four categories have the same percentage, can you say that you can determine the mode in that case? You mean like this? Yes. Yes, you can still determine it; in this instance the answer is that there is no mode. When we do measures of central location we will discuss what the mode is. Even with numerical data you can find one mode, no mode, two modes,
three modes or multiple modes. Here there is no mode, because all of them are the same, so there is no bar that is taller than the others, let's put it that way. But for categorical data, remember, you can find the mode. Let's open our notes again. All these notes are posted; the ones we are using today in class I emailed just before the class. Okay, remember when we were doing the levels of measurement; we're just going to go to that one slide. Remember this: categorical data is nominal or ordinal. For both we can find the mode, for ordinal we can also find the median, but we cannot find the mean. Remember this. Right, going back. I just gave you the answer to the last one. So we know that this is correct, this is correct and this is correct. Can we find the mean? We can't find the mean of qualitative or categorical data, so that is the answer we are looking for. This question is ambiguous, because "none of the above" is also an option, and in this instance it is also incorrect. If you get statements like this in an exam, I feel for you, but you can apply your logic: ignore number five and only use the other statements. "None of the above" would only hold if none of the other statements were incorrect, but we know that this one is incorrect, so the answer cannot be "none of the above". Statements like this are very confusing at times, so you can ignore it and only use the others. You will find those kinds of statements in your past exam papers as well, when you do your practice exam exercises and so forth. Okay, exercise seven: which one of the following statements is incorrect? I will give you a moment here as well, because it seems like a lot of information. Okay, let's go through the statements. Number one: a population is the complete set of objects in the study, while a sample is a subset of a population. Is that correct? Yes, it's correct.
That is correct. Number two: a statistic is a property of a population, while a parameter is a property of a sample. Is that correct? Incorrect. That is incorrect, because a statistic is a property of a sample and a parameter is a property of a population. Since we're looking for the incorrect one and we have found it, the rest of the statements will be correct. Data from a sample are either numerical or categorical; we know that a variable can be either one of those, so that is correct. Quantitative data are numeric and can be either discrete or continuous: discrete when they are counted and continuous when they are measured. And qualitative data are categorical data that use labels to identify the attributes; that is correct. Okay, this should be easy: which one of the following variables is not a categorical variable? Number three, option c. The height of a person can be measured, so it is not a categorical variable; it's a numerical variable. Okay, which one of the following statements is incorrect? Easy. Are you asking which one is correct or incorrect? Oh sorry, which one is correct. I've been talking about incorrect all the time, so I assumed this one also says incorrect. Which one is correct? Number one is correct? Number one: gender, marital status and religion are examples of qualitative ordinal variables. What are qualitative ordinal variables? They are categorical variables that can be placed in order. Are gender, marital status and religion ordinal? Ordinal means there is a natural or logical order. So are they? No, they're not, so that's not correct. Next: the amount of money a person spends in a shopping mall is a discrete variable. Money is in cents, and anything that takes a decimal we call continuous. So they
should be nominal in the first statement and continuous in the second, so this is also not correct. Sorry, what is meant by a discrete variable? A discrete variable has values that you can count. Number three: the number of girls with blue eyes is a discrete variable. That is correct. Now here is another one: the position one finishes in a race is a discrete variable. Remember, with position you come first, second, third; it cannot be a numerical discrete variable, so this should be a categorical variable. Next: the number of times a mouse makes a wrong turn in a laboratory is a continuous variable. Is it continuous or is it discrete? Discrete. It should be discrete, not continuous. So the only answer which is correct is number three. Which one of the following statements is incorrect? We can go through the statements; we don't have to think about them first. A quantitative discrete variable results from counting an attribute. We described this just now when someone asked. Is that correct? Remember we're looking for the incorrect one. Discrete variables are the variables that can be counted or measured? Counted. So this is correct. Next: quantitative continuous variables are variables that can be measured. Is that correct? This one is also correct. Number three: the mean and the median cannot be determined from a nominal scale. We just looked at this. Is that correct? Nominal was the first scale; it's nominal, then ordinal. What can we determine from a nominal scale? We can determine the frequency and we can only determine the mode, so that means we cannot determine the mean and the median. So this one is also correct. If you are still lost you can keep that slide and go back to it to refresh your mind: for nominal data you cannot find the mean and the median. The last one: an ordinal scale is a higher level of measurement than an interval scale. Incorrect? That is correct.
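The scale rules used in this question can be written down as a small lookup. This is only a sketch for revision purposes: the names `SCALES`, `PERMITTED`, `is_higher_level` and `can_determine` are made up for illustration, and the ratio scale is added as the standard fourth level even though it is not mentioned in this part of the session.

```python
# Levels of measurement, ordered from lowest to highest.
# Assumption: "ratio" is included as the usual fourth level; only nominal,
# ordinal and interval are named in this part of the session.
SCALES = ["nominal", "ordinal", "interval", "ratio"]

# Measures of central location that can be determined at each level,
# following the rule from the slide: nominal -> mode only,
# ordinal -> mode and median, numerical scales -> mean as well.
PERMITTED = {
    "nominal": {"mode"},
    "ordinal": {"mode", "median"},
    "interval": {"mode", "median", "mean"},
    "ratio": {"mode", "median", "mean"},
}


def is_higher_level(scale_a, scale_b):
    """Return True if scale_a is a higher level of measurement than scale_b."""
    return SCALES.index(scale_a) > SCALES.index(scale_b)


def can_determine(measure, scale):
    """Return True if the measure is meaningful for data on the given scale."""
    return measure in PERMITTED[scale]


# "An ordinal scale is a higher level of measurement than an interval scale"
print(is_higher_level("ordinal", "interval"))
# "The mean and the median cannot be determined from a nominal scale"
print(can_determine("mean", "nominal"), can_determine("median", "nominal"))
```

Checking a statement against the lookup gives the same verdicts as in the discussion: ordinal is not higher than interval, and only the mode is available for nominal data.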
That one is the answer we are looking for; it is the incorrect statement. Number 11: consider the following variables. The question is which of the variables above are quantitative and which are not, so let's label each one of them. Height as either tall or short: is this quantitative or qualitative? Quantitative? Height given as either tall or short is qualitative. They are not using numbers; it's categories, tall or short, so it's categorical. Your status as either full-time or part-time: is it qualitative or quantitative? Qualitative. Condition as either poor, fair, good or excellent: it's qualitative. The size of a ring as small, medium or big: qualitative. The size of a screen in inches: quantitative. So the question here is which of the variables above are quantitative. Lizzy? Yes? Can I check something on option a? When we say your height is either tall or short, it confuses me, because I thought we were looking at height, which we can measure, while you said it's either tall or short. Yes, it's the same thing with the size of a ring. They could have used a measurement as well and said 3.5 centimetres, 4.5 centimetres, 15 centimetres in diameter, whatever, but the minute they put categories onto it they convert that numerical value into a category. And also, to explain something else... oh, let's give an answer to this question first. Remember we said age is continuous. Age on its own is continuous, but if they say age in years, what will it be? It's not going to be continuous; it's going to be discrete. Money can also change from being a continuous variable to being a discrete one, depending on how the question is put. If they say money in rand, rand does not have decimals: it's one rand, two rand, three,
then you can count that. The minute you add cents to it, it becomes measured. So you need to be very careful when you read the sentence: don't just take the first word you see, but also look at what else they extend the sentence with. So you're saying in this case rand is discrete and cents is continuous? Yes. With the earlier question, they said the amount of money you spend in a shopping mall; they never said the amount of money in rand that you spend in a shopping mall. If they had said that, the variable would have changed from continuous to discrete. Okay. Which one of the following statements is incorrect? Let's go through each one of them. The age "if" the researcher... oh, it should actually say the age of the researcher. We just spoke about age. Remember you're looking for the incorrect one. Is age a quantitative continuous variable? Yes, it is; that is correct. The salary of the researcher: is it a quantitative continuous variable? Yes, it is. The gender of the researcher: is it qualitative nominal? Nominal means there is no logical or natural order. Is that correct? Yes, that is correct. The position of a researcher as junior, mid-level or senior: incorrect, it is ordinal. This statement is incorrect because there is an order, so it is ordinal. I'm not going to go to "none of the above" here either, because the same thing we discussed applies, and for all the questions that are still to come you can just ignore "none of the above". Next: the appropriate graphical form to summarize categorical data. Histogram: what kind of data does it summarize? Quantitative. Yes, it summarizes quantitative or numerical data. Scatterplot: this is one of the last charts that we did. A scatterplot looks at two numerical variables, so this also looks at numerical values. Remember, the scatterplot is the one where I have age here and income here. That is a scatterplot.
A bar chart, if I can remind you, looks like this. Qualitative? Yes, a bar chart is for categorical data. The frequency distribution here summarizes numerical data. So the answer we are looking for is c. Which one of the following statements is incorrect with regard to variables? Quantitative variables use labels or names to describe attributes of elements: that is incorrect. Qualitative variables are either nominal or ordinal: that is correct. Variables can be classified as categorical or numerical: that is correct. Counting attributes of an element results in a quantitative discrete variable: yes, because we're counting, it is discrete. Measuring results in a continuous variable: that is correct. So the only statement that is incorrect is the first one, number a. Okay, I'm not going to read the whole paragraph. Which one of the following statements is incorrect with regard to types of variables? The name of the school requiring learner transport is a qualitative nominal variable. Remember, nominal means no order. Names of schools... I think that one is correct. That will be correct, because there is no order in terms of the schools that require transport. The distance from the learner's home to school: is it continuous or discrete? Do we count or do we measure? We measure the distance, so it is continuous and this is correct. Remember we're looking for the incorrect one. The number of learners at each school: are we counting or measuring? Counting, so it is discrete. The travel time from the learner's home to school: do we count time or do we measure time, using a clock? We measure time, so it should be continuous, and this is the incorrect one. The levels of school as primary, junior and high are ordinal, which is correct. Okay, given a table with a sample of 500 people living with autism in South
Africa, which one of the following statements is incorrect? The sample refers to the 500 people: is that correct? Yes, correct, because it is given. We're looking for the incorrect one. The population refers to everyone living with ASD in South Africa: that's correct, because the population is everybody that the sample is taken from. The sample is of 500 people living with autism spectrum disorder, or ASD, in South Africa, so everybody with ASD in South Africa is the population of interest. The age at diagnosis is a variable. A variable is a characteristic that describes the population or the sample. Is age a variable? Think about gender: gender is a variable, which has the data female and male. Now think about age. Age is a variable because you are either 21 years of age, 18 years of age or 14, so age is the variable and the data will be 18, 14, 21, 31, things like that. Here the ages are in months. So all these headings at the top here we call variables, and all the values inside here we call data. So age is a variable. The mean age at diagnosis calculated from the sample is a parameter: is that correct? No, incorrect. It should be a statistic, not a parameter. Okay, we are left with almost 12 minutes. Consider the graphical methods a to e below. The question we're going to answer is which graphical methods are the most appropriate for quantitative data, so let's look at each one of them before we go to the options. A bar chart: is it for quantitative or qualitative data? Qualitative. A histogram: quantitative. A pie chart: qualitative. The fourth one as well: is it qualitative? No, it's quantitative. And a scatter plot: quantitative. Sorry, Lizzie? Yes? With a pie
chart, can't you show percentages? On a pie chart you can show percentages, yes, but a pie chart is a visualization for categorical data. We can count how many fall within each category, calculate the frequency and then calculate the percentage, but the visualization is for categories. So to answer the question, we need the charts for quantitative data, so it's only b, d and e, which is option three. Which one of the following statements is incorrect with regard to tabular methods of summarizing data? Here you need to think about frequency tables and frequency distribution tables, because they're not specifying categorical or numerical; they say tabular methods for summarizing data. Remember, a frequency table is made up of the frequency distribution, the relative frequency distribution, the percentage frequency distribution and so forth. We're looking for the incorrect answer. Frequency distribution, relative frequency distribution and percentage frequency distribution are tabular methods suitable for summarizing both qualitative and quantitative data: that one is correct, because for both of them you can calculate the frequency, the relative frequency and the percentage frequency. Cumulative frequency distribution is a tabular method suitable for summarizing qualitative data: that's incorrect; it is for quantitative data. So this one is the one we are looking for. To create a frequency distribution for a quantitative... I'm going to mute someone quickly, because there is an echo. Okay, it's fine. That is correct. The percentage frequency is equal to the relative frequency multiplied by 100: that is correct. Cross tabulations are suitable
for summarizing the relationship between two variables, and that is correct. All right, we didn't talk about cross tabulations. We're going to do them when we do probabilities, I think in study unit four, and we're also going to talk more about them when we do the chi-square tests. Okay, which one of the following statements is incorrect? A histogram shows how the data set is distributed. Remember, a histogram can show the shape, like this one, so a histogram shows the distribution of the data; that's correct. The next statement: a bimodal histogram is one with two peaks. Yes, it's correct. If a histogram has a long tail extending to the left, it is positively skewed. I didn't talk about positively and negatively skewed; I spoke only about right and left. So let's assume zero is here: any value on this side will be negative and any value on this side will be positive. Right? Going back to our question: if a histogram has a tail extending to the left, it is positively skewed. Looking at our diagram, a tail to the left is negatively skewed, so this statement is incorrect. When constructing an ogive we need to make use of the cumulative frequency or relative frequency; that is correct. Consider the following frequency distribution table with a sample of 50 grade 7 learners. Which one of the following statements is incorrect? Here we are given the cumulative frequency. What you need to do is calculate your frequencies and your percentage frequencies, because that's what the statements need, and you can also find your relative frequencies. Once you have your frequencies you will know what is happening. What is the frequency for the first class? Because this is a cumulative frequency, the frequency here will be four. How do I get the frequency for the
second one? I must say 17 minus four; you need to have your calculators. That is 13. What is the frequency for the next one? 42 minus 17: you always subtract the cumulative frequency one above. 25. Then 48 minus 42 is 6, and 50 minus 48 is 2. If you add all of them they give you 50, because the last cumulative frequency here is 50, so the total must also be 50. So what will the relative frequency be? Let's do that; let's say we split this column into two. The first relative frequency is four divided by 50, which is 0.08. We're going to finish just now. I'm not going to do all of them, but 13 divided by 50 is 0.26. I'm just going to do only those ones; we will add the others as we go along, but you get the understanding behind what I wanted to do here. So let's look at the statements. What is the frequency of the class six to eight? You go to six to eight and look for the frequency; it's four, so this is correct. The relative frequency of nine to eleven: you come to the relative frequency column and go to nine to eleven, and the relative frequency is 0.26. The percentage frequency of class 12 to 14: you go to 12 to 14, which has a frequency of 25, and say 25 divided by 50, which is 0.5. Remember we're looking for the percentage frequency, so you take 0.5 and multiply it by 100, and that gives you 50, which is correct. The midpoint of class 15 to 17: the midpoint is the lower limit plus the upper limit, divided by two. So for 15 to 17 you say 15 plus 17 and divide that by two, and it gives you 16, so that is correct. The width of class 18 to 20: you
take 20 and subtract 18. The statement says the width is four, but 20 minus 18 is equal to two, so this is the incorrect one. That's how you answer your frequency distribution table questions. Apart from these, there are many other questions, but you will see that they look almost exactly the same. I was hoping we would get to this one, where we complete the table and then answer the question, but you can do it on your own and we can discuss it on WhatsApp. Don't worry about it; we have the WhatsApp group to discuss all the other questions, from question 21 until question 30. There is question 23; these look almost exactly the same as what we have been doing. I just removed the options and made them open questions, so that you think more and don't just choose from the options as you see them. Then there are also questions 24, 25, 26, 27 and 28 that you can go through. All those eight questions that are left we can discuss on WhatsApp or on myUnisa. Before you go, I just wanted to show you on myUnisa
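The frequency-table arithmetic from the last exercise can also be checked with a few lines of Python. This is only a sketch for checking your own working: the class limits and cumulative frequencies are the ones read out in the session, while the variable and function names are made up for illustration. The width is computed the way it was done in class (upper limit minus lower limit); some textbooks instead use the difference between consecutive lower limits, which would give 3 here, and either way the stated width of four is wrong.

```python
# Class limits and cumulative frequencies as read out in the session.
classes = [(6, 8), (9, 11), (12, 14), (15, 17), (18, 20)]
cumulative = [4, 17, 42, 48, 50]


def frequencies_from_cumulative(cum):
    """Subtract the cumulative frequency one row above from each row."""
    return [cum[0]] + [curr - prev for prev, curr in zip(cum, cum[1:])]


frequencies = frequencies_from_cumulative(cumulative)
n = cumulative[-1]  # the last cumulative frequency equals the sample size

relative = [f / n for f in frequencies]          # frequency / sample size
percentage = [r * 100 for r in relative]         # relative frequency * 100
midpoints = [(lower + upper) / 2 for lower, upper in classes]
# Upper limit minus lower limit, as calculated in the session.
widths = [upper - lower for lower, upper in classes]

for (lower, upper), f, r, p, m in zip(classes, frequencies, relative,
                                      percentage, midpoints):
    print(f"{lower}-{upper}: f={f}, rel={r:.2f}, pct={p:.0f}%, midpoint={m}")
```

The frequencies should add back up to the sample size of 50, which is a quick way to catch a subtraction slip.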