gives you the variability of your data set, meaning how scattered the data is. The measures of variation are these. We can use the range, which tells us how far apart the data is. We can use the variance, which we normally don't interpret, because we use the value of the variance to calculate the standard deviation. The standard deviation we can interpret, because the standard deviation formula takes the value back into the units of the original values. Then we can interpret the values. Then we can calculate, yes?
I wanted to ask: are measures of variation the same as measures of dispersion?
Yes. Measures of spread, measures of variability, measures of dispersion: they mean one and the same thing. Variation tells you how dispersed your data is away from the mean, how variable it is, or how scattered it is around the mean. We also have the coefficient of variation, which we can use to gauge risk and which also tells you the variability of your values.
The range: we did cover the range when we built the frequency distribution. The range is your highest value minus your lowest value. So when you calculate the range, you need to order your data from lowest to highest so that you know what your lowest value is and what your highest value is. And that will tell you the range.
The variance, like I said, we don't interpret, but your variance is the average of the squared deviations of the values from the mean. It tells you only the average of your squared deviations from the mean. Let's look at how we calculate the variance. We can calculate the variance for the sample and we can calculate the variance for the population.
When we calculate the variance for the sample, remember that with the sample, the statistic, we use ordinary letters that we understand. So for the sample variance we use s squared. And s squared is the sum of each observed value minus the sample mean, squared, divided by n minus one. For the population we use Greek letters, and the parameter is sigma squared, the population variance, which is the sum of each observed value minus the population mean, squared, divided by N. Notice the formulas. They look similar but are not exactly the same: the difference is in the denominator. It's n minus one for the sample variance and capital N for the population variance.
Your standard deviation is the square root of your variance. If we take the population variance and put the square root on it, we find the population standard deviation. Your sample standard deviation is the square root of your sample variance, and we use the letter s for the sample standard deviation. It is the same formula; we just put the square root on it. It shows you the variation about the mean, and we are able to interpret it because it converts the answer back to the same units as the original values. We will look at the answers when we calculate the standard deviation. The population parameter, the population standard deviation, is the square root of your population variance, and we use sigma, the symbol that represents the parameter for the population.
Let's look at an example of how we calculate the standard deviation. We're not going to look at the variance separately because the formula for the standard deviation includes the variance: we just put the square root over the variance and that gives us the standard deviation. So let's just look at how we calculate the standard deviation.
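
Written out, the four formulas described in words above look roughly like this. The symbols $x_i$ (an individual observation), $\bar{x}$ (the sample mean) and $\mu$ (the population mean) are standard notation assumed here; the session itself only names $s$, $\sigma$, $n$ and $N$.

```latex
% Sample statistics: n observations, sample mean \bar{x}
\[
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1},
\qquad
s = \sqrt{s^2}
\]

% Population parameters: N observations, population mean \mu
\[
\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N},
\qquad
\sigma = \sqrt{\sigma^2}
\]
```

The only structural difference between the two variances is the denominator: $n - 1$ for the sample and $N$ for the population.
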
On Saturday, I can show you how to use your calculators to calculate the standard deviation, which makes it easier. But for now, you need to know how we compute the standard deviation using the formula. So we are given the sample data, which is 10, 12, 14, up until 24, and we can count how many values there are. There are eight values in this data set. We can calculate the mean; you know how to calculate the mean: the sum of all the values divided by how many there are. If I add all the values and divide by eight, I get a mean of 16. That part we covered in the previous section, so I'm not going to go into detail on it.
Now, to calculate the standard deviation, remember the formula. They gave us sample data, so we use the sample standard deviation, not the population standard deviation. The formula is: s equals the square root of the sum of each observation x minus the sample mean, squared, divided by n minus one. That is the formula that we're using. We know what the values of our x observations are. We can expand this because it's a sum, which means adding up. So we take the first observation minus the mean, squared, add the second observation minus the mean, squared, then the third, up until we get to the last observation. We substitute the value of the mean and divide by n minus one. Our n is eight, so it is eight minus one, and then we have substituted all the values. Then we can do the calculation. 10 minus 16 is minus six; square the answer: minus six times minus six is 36, and you can write the answer down. I didn't go that route of writing out all the answers. So this would have been 36, not 36 squared, just 36. Let me use a blue pen. Christopher is raising his hand.
Christopher, if I can't see your hand, you need to unmute and let me know. Yes?
Yeah, good evening.
I'm listening.
Good evening, ma'am. I raised my hand a long time ago because you were too fast for me. I hope the other students are with you when you're speeding up, but I'm struggling to catch up. I know that we will get the formulas in the exam, but if you can maybe slow it down, ma'am, please.
Okay. We'll slow it down. I don't know where you got lost, so I must go back, and I will start from here.
Okay, maybe just go back. Yes, from there, thanks.
Okay. The range: you order your data from lowest to highest. Your range is your highest value minus your lowest value. So if my highest value is, well, this data set of mine is stretched a little bit. So this must have been there and 12 must be there. This must move there. I think I've stretched it a little bit. This is 13, so this dot is 13. So I only have one, two, four, five, seven, nine, 11, 12 and 13. Those are the points that I have. It's just that my data set here is stretched out. So if my highest value is 13 and my lowest value is one, it's 13 minus one, which equals 12. That's my range.
Standard deviation and variance. The sample variance, which is the statistic, we write as s squared: it equals the sum of your observations minus the mean, squared, divided by n minus one. The population variance, which is your parameter, we write as sigma squared: it is the sum of your observations minus the population mean, squared, divided by N. You will get the formulas in the exam, like you said. The standard deviation is the square root. If I put the square root here, you will see that it is the square root of that, because when I put a square root on this side and a square root on that side, the square root cancels out the square. I will be left with s on one side and the square root on the other side.
So the standard deviation is the square root of your variance. And this is the formula for the sample standard deviation, your statistic. We use s to represent the sample standard deviation. We use sigma to represent the population standard deviation, which is the square root of your population variance.
We used an example. We are given sample data, so we're going to use the formula for the sample standard deviation. There are eight observations in that sample. The mean we've done; I said I'm not repeating that calculation. The mean is the sum of all these values divided by how many there are, which is eight. When you calculate the mean, your mean is 16.
The formula says the sum of each observation minus the mean, squared. So we need to expand this. For every observation we add another term. Since it's observation i, it's that one, whatever the observation is, minus the mean, squared, plus the next observation minus the mean, squared, plus the next one, and so on, and we substitute the values. Then we can substitute the value of our mean. I could have skipped that first step and gone straight to this one, because I could have said 10 minus the mean, which is 10 minus 16, squared, plus 12 minus 16, squared, plus 14 minus 16, squared, and so on, and I add all of them. Divide by n minus one, which is eight minus one. As I said, I skipped that step. Then we get to the next step, which is 10 minus 16, which is minus six; minus six times minus six, because it's squared. You can use your calculator as well. It's 36. So you write 36, plus. 12 minus four is, sorry, 12 minus 16. I'm already in answer mode. 12 minus 16 is minus four. Minus four squared is 16. Plus: 14 minus 16 is minus two. Minus two squared is four. And you continue and do all of them, up to the last one: plus 24 minus 16, squared.
So the dot dot dot means the other values, because I didn't write out all of them. You go and find the other values as well. So you do 15 minus 16 squared and find the answer, 17 minus 16 squared, 18 minus 16 squared, and 18 minus 16 squared again. And then the last one was 24 minus 16, which is eight; eight squared is 64. So the last one will be 64, divided by eight minus one, which is seven. When you have all of them and you add 36 plus 16 plus four plus the other numbers, up until 64, the answer you get is 130, divided by seven. You take 130 divided by seven and then take the square root of the answer. If I divide 130 by seven, I get 18.57143, and when I take the square root of that I get 4.3, to one decimal place. That tells me how far my observations are from the mean. So the typical distance between my observations and the mean is 4.3, and that is the standard deviation.
It means all my observations are scattered around the mean. Say this line represents my mean, and my mean is 16 at this point. If this were a plot with my observations along the x-axis, and this line is my mean at 16, then these values would be scattered around it in some way; that's what it means. So how far are these values from the mean? Maybe one is there, one is there, one is there, one is there, one is there, one is there, one is there, one is there, but the distance is roughly 4.3. That's what the formula tells you when you calculate the standard deviation.
Now, we calculated the standard deviation by taking the square root. If we want to know the variance, your sample variance will just be 130 divided by seven, which gives us 18.57. That is your variance. I'm not going to ask you to do any calculation now because this formula takes forever to calculate.
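
The hand calculation above can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the eight data values are reconstructed from the numbers read out during the session (they reproduce the stated mean of 16 and the total of 130).

```python
import math
import statistics

# Sample data as read out in the session (reconstructed from the transcript).
data = [10, 12, 14, 15, 17, 18, 18, 24]

n = len(data)                      # 8 observations
mean = sum(data) / n               # 128 / 8 = 16.0

# Each observation minus the mean, squared: 36, 16, 4, 1, 1, 4, 4, 64
squared_deviations = [(x - mean) ** 2 for x in data]
sum_sq = sum(squared_deviations)   # 130.0

sample_variance = sum_sq / (n - 1)        # 130 / 7
sample_sd = math.sqrt(sample_variance)    # square root of the variance

print("mean:", mean)
print("sum of squared deviations:", sum_sq)
print("sample variance:", round(sample_variance, 2))   # 18.57
print("sample standard deviation:", round(sample_sd, 2))  # 4.31

# Cross-check against the standard library, which also divides by n - 1.
assert math.isclose(sample_sd, statistics.stdev(data))
```

To two decimals the standard deviation is 4.31; rounded to one decimal it is the 4.3 quoted in the session.
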
On Saturday we will have lots and lots of exercises on this so that you can learn how to calculate the standard deviation. In order for me to complete the session for today, we're going to do the coefficient of variation. And then at quarter past, I think quarter past eight, we'll take a five-minute break to breathe while you do your exercise or activity. Then we will come back at half past, and for the last 30 minutes we'll look at the quartiles.
Okay, so measures of variation. We also have what we call the coefficient of variation, which is a relative measure of variation. It is always expressed as a percentage. It shows your variation relative to the mean. We use the coefficient of variation to compare the variability of two or more products. So let's say you work for an insurance company. I think we used this example earlier when we were doing the introduction. Let's say you work for an insurance company as an investment manager. You look after people's investment portfolios. And you have different kinds of clients that you need to satisfy. Some like to take risks, and some don't like to take risks because they don't want to lose their money. So you can check the variability of each and every product in which you would like to invest your clients' money. The coefficient of variation is one of the measures of variation that you can use to check the variability of each and every investment product that you want to invest in. Then you can decide, based on the type of clients that you have, which product suits the cautious clients and which suits those who like to take risks. The coefficient of variation is just your sample standard deviation divided by your sample mean. That gives you your relative variation.
You take your sample standard deviation divided by your sample mean and multiply that by 100, because we need to get it to a percentage. Let's look at an example. We have stock A, and we know that the average price last year was 50 rand and the standard deviation was five rand. So they've already calculated the average, the mean, and the standard deviation. We just need to substitute into the formula. We know that our coefficient of variation is the sample standard deviation divided by the sample mean, multiplied by 100. Our standard deviation is five rand. Our mean is 50 rand. So it will be five divided by 50 multiplied by 100. Then we get 10%.
Then we have stock B. The average price last year for stock B was 100 rand. The standard deviation, same as stock A, was five rand. Calculating the coefficient of variation, five divided by 100 multiplied by 100 gives us 5%. Notice that both of these stocks have the same standard deviation, but stock B is less variable; it has less relative variation than stock A. It means its values over the year are closer to the average.
So say these are investment A and investment B. If I'm that person who doesn't like to lose money, I will invest in stock B because I know that the fluctuations stay closer to the average. I will still get my return because it's less variable. The return might be smaller, but at least I'm less likely to lose my money. But if I'm that person who likes taking risks, I can invest in stock A because the variation is high. It means sometimes the price might be higher and sometimes it might be lower. So I might earn more money sometimes and I might lose money sometimes.
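
The stock comparison above can be sketched in Python as follows. The function name `coefficient_of_variation` is made up for this illustration; the prices and standard deviations are the ones quoted in the example.

```python
def coefficient_of_variation(sd, mean):
    """Return the coefficient of variation as a percentage (hypothetical helper)."""
    return 100 * sd / mean


# Stock A: average price 50, standard deviation 5
cv_a = coefficient_of_variation(sd=5, mean=50)

# Stock B: average price 100, standard deviation 5
cv_b = coefficient_of_variation(sd=5, mean=100)

print(f"Stock A: {cv_a:.0f}%")   # 10%
print(f"Stock B: {cv_b:.0f}%")   # 5%

# Same standard deviation, but B varies less relative to its mean.
less_variable = "B" if cv_b < cv_a else "A"
print("Less variable stock:", less_variable)
```

Multiplying by 100 before dividing keeps the arithmetic exact for these whole-number inputs; the result is the same as dividing first and then multiplying.
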
So that is how you analyze and check in which stock you would prefer to invest your clients' money, in case you work in an insurance company. But this does not only apply in an insurance company; it can apply in any environment. You can use the coefficient of variation to check the variability of your data.
Based on the data that we used for the standard deviation, we can also calculate the coefficient of variation. Remember, for our data set the mean was 16 and the standard deviation was 4.3. So we can take our standard deviation of 4.3, divide by the mean of 16, and multiply by a hundred. That will give us 26.93 if you keep the unrounded standard deviation of 4.3095 (with exactly 4.3 you get 26.88). And that tells us the variability of that data, or the relative variation of the data that we have.
And with that, remember I said I'm going to give you time to take a break and also do an exercise. Here is your exercise. You are given the number of people living with ASD, and this is the table with the x values, your sample values: 108, 44, 206, 85 and 57. Based on that information, which one of the following statements is incorrect? The mean and the median are equal. The mean is less than the median. The mode is zero. There is no mode. Ignore the "none of the above" in this instance. So you need to calculate the mean and you need to find the median of this data set. That's A.
B: what is the sample standard deviation? So you need to go and calculate the standard deviation of this data set. Remember the standard deviation formula: s equals the square root of the sum of your observations minus the mean (you will have calculated the mean when you answered A), squared, divided by n minus one. There are one, two, three, four, five values, so it should be quick to calculate. And the mean, if people have forgotten how to calculate it: we're using sample data, so the mean is the sum of all observations divided by n.
And remember your median: you can find its position using n plus one divided by two. Since we have an odd number of values, after you have sorted the values you don't even have to use the position. You can go and find the median, the middle value. The last question that you need to do: calculate the coefficient of variation. So remember the coefficient of variation: your standard deviation, which you will have calculated there, divided by the mean, multiplied by 100%. Oh, multiplied by 100. Let me not confuse those who didn't do maths before. Let's not put the 100% and just use 100. Just multiply that by 100, and then you get that answer.
You have up until, let's say, 25 past to do all these answers. You can do your exercise now and take a five-minute break later on, or you can take a break now and then come back and do your exercise. But remember we are coming back at 25 past. Good luck. If you have any questions, I will be here. I'm not going anywhere. I'm just going to mute myself and you can ask if you need to ask. Good luck. You'll have three minutes. Remember, if you have done your answers, you can post them in the chat.
Hello, Z.
Hello.
Just a quick question. For A, should it not read "which of the following statements is correct"?
Which one of the following statements is incorrect.
Okay, but then I'm confused. It looks like there are so many incorrect answers. About three. Never mind.
Never mind, Barth. You answered yourself.
Yes, I answered myself. Thank you.
Okay, great. I see there we have Sviso and Hendrik. And Sarah came in as well. And she's on a roll; she posted twice. It is 25 past. So let's help those who struggled; I can see that the majority of you might have struggled to calculate the standard deviation. There is no pressure at this point, because on Saturday I will show you on the calculator; at the moment I'm assuming that the knowledge or the skill is already known.
So with that, let's take the five minutes to do the answers. I'm going to erase all of them and I'm going to go back so that I answer every question step by step. So question number one. I'm going to do this for you, for everybody. I don't have to ask you. You can just compare your answers with what you hear.
The mean and the median are equal. So you should have calculated the mean. So here, the mean is the sum of all the observations. Somebody is unmuted. Please mute yourself. The mean is the sum of all observations divided by how many there are. So if I add 108, 44, 206, 85, blah, blah, blah, all of them, I get 500, divided by one, two, three, four, five. There are five. So this will be equal to 100. That is my mean. So I know that my mean is 100.
My median: I need to sort the data from lowest to highest. I did sort it. So it was 44, 57, 85, 108, and 206. I don't have to use the position because I can clearly see that my median is 85. What is my mode from the data set that we have? If I look at my data set, there is no mode. So this has no mode. So the mode is zero because there is no mode. There is nothing.
Coming back to the statements, I'm going to start at the bottom. I said ignore this one. There is no mode. Yes, that's true. The mode is zero. Yes, that's true because there is no mode. The mean is less than the median. So our median is 85 and the mean is 100. That's where it is wrong. Two is wrong. So only Sarah picked it up. This is wrong. The mean and the median are equal. That is wrong.
Actually, I think yes, you are right. Also, if I go back to the mode: zero can also be a value in the data set, because if this is a number of people, we can have zero persons. Say this was registrations per day for people who walked in; on certain days we might have no person coming in. So zero can also be another value.
So in terms of this question, I think this comes from last year's tutorial letter. So this one should also have been wrong. I think the question was actually looking for which statement is correct. So this would have been the right one. Okay, so that question possibly had an error.
Okay, so going to the next one, which is the standard deviation. We know that our standard deviation formula is the square root of the sum of your observations minus the mean, squared, divided by n minus one. Since this is very long, I already calculated it during the 15 minutes that we had. So I'm going to write a couple of the terms. I'm not going to write all of them. 44 minus 100 squared, plus, I'm just going to write two of them, 57 minus 100 squared, plus, I'm going to do dot dot dot, so you must include the others. And the last one was 206 minus 100 squared. Divide everything by five minus one.
So if you have calculated everything: 44 minus 100 squared will give you 3136, plus 1849. I can write all of them so that you can double-check your answers. Plus 225, plus 64, plus 11236. Divide everything by four. Now, once I have calculated everything, and I ran out of space, so let's continue here: it equals the square root of the top part divided by four. If I add all of them, I get 16510, divided by four. If I calculate that, I get 4127.5, and I take the square root. My answer for the standard deviation is 64.25, and that is the standard deviation.
So who got it right? Fiso got it. Sara got it and Hendrik got it. So now the last one was the coefficient of variation. When you watch the video, you can pause the video and take down the answers if you want. So now we move to the next question. It was the coefficient of variation. We know that the coefficient of variation is your standard deviation divided by the mean, multiplied by 100.
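
The exercise answers worked through above can be verified with a short Python sketch. The variable names are illustrative; the five values are the ones sorted in the session (44, 57, 85, 108, 206). The last lines apply the coefficient of variation formula that was just stated.

```python
import math
from collections import Counter

# Exercise data, in the order given in the table.
values = [108, 44, 206, 85, 57]
n = len(values)

# A: mean, median and mode
mean = sum(values) / n                   # 500 / 5 = 100.0
ordered = sorted(values)                 # [44, 57, 85, 108, 206]
median = ordered[(n + 1) // 2 - 1]       # position (n + 1) / 2; valid because n is odd
has_mode = max(Counter(values).values()) > 1   # False: no value repeats

# B: sample standard deviation (divide by n - 1)
sum_sq = sum((x - mean) ** 2 for x in values)  # 3136 + 1849 + 225 + 64 + 11236
sample_sd = math.sqrt(sum_sq / (n - 1))

# C: coefficient of variation, as a percentage
cv_percent = 100 * sample_sd / mean

print("mean:", mean, "median:", median, "has a mode:", has_mode)
print("sum of squared deviations:", sum_sq)
print("sample standard deviation:", round(sample_sd, 2))
print("coefficient of variation (%):", round(cv_percent, 2))
```

Because the mean is exactly 100, the coefficient of variation has the same digits as the standard deviation.
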
Our standard deviation is 64.25 because we calculated it there. Divide by our mean, which is 100, and multiply by 100. So the 100 and the 100 cancel out. Then the answer will be 64.25 percent. Even when you do it on your calculator, you will enter 64.25 divided by 100 multiplied by 100, and it will give you 64.25. And let's see. Fiso got it right. Sara didn't answer the question and Hendrik got it right as well. And that's how you do your coefficient of variation, your standard deviation, your mean and your median.
On Saturday, we can work through them again. I think the majority of you might have struggled to calculate it this way because some of you have not done maths before. Don't worry; on Saturday we will go through the calculator itself. We will go step by step. We will make sure that before we leave the session, everybody knows how to use their calculator to calculate the standard deviation. We will also look at how to use your scientific calculator to calculate the standard deviation, because it's easy. When you have a scientific calculator, you're going to save a lot of time. Okay, that is a discussion for Saturday.
Quartiles. Quartiles are also measures of variation because they also tell us how spread out your data is.