was done with the module. We are left with only two chapters, two very small chapters, and then we are done. Then we wait to write the exam.
No, Lizzie, aren't we going to do some rehearsals, revision sessions with you as well, even after the two chapters are done?
Oh no, we're still going to meet until we go and write the exam. We're going to do exam papers. We're going to do exam papers till they come out of our ears. Yes, yes. So my plan is that immediately after you submit assignment five, when we know that you're done with your assignments, we start with the revision. The revision means going back to the assignments and going through each and every assignment. That's one. Once we're done with the revision of all 11 study units, we're going to do one exam paper together, and then I'm going to give you access to the online mock exam that you need to take. Your lecturer will also give you a mock exam, but the mock exam that I give you is the one that we will use as a practice exam, and then we can have discussions on those questions, but you need to take it. The first one will not be timed, so you can do it as many times as you want. You can do it over multiple days; it doesn't really matter. It will be a full exam with 22 or 25 questions, similar to what you will receive in your exam. Then I will have a last exam revision with you, for which you will have to take a timed, exam-type paper online on our Twitter site as well, so that you can practise and see how you will fare in the exam when the time comes. So I'll give you as many opportunities as possible so that you get used to the type of questions and to working online, while we work in the background as well, assisting you and making sure that we iron out some of the challenging areas that you have.
So we still have a long way to go, even though it's a short long way, because I don't know when you will be writing your exam. It might be early October or it might be November, so it depends. Also, the minute you start writing exams, I want to limit the number of engagements that we have, because I need to give you time to do your other modules and write those exams, and I know exams can be exhausting. So I want the whole of September for exam revision. We're going to see how we plan that one out. And yeah, other than that...
Lizzie, yes, if I may make a recommendation or a suggestion when it comes to the revision: I've been doing some revision, and I found that my huge challenge is that you have X number of questions in the exam, and they are randomized based on the chapters we have. You would probably know better. Is it possible to take, for example, a chapter and then focus on exam questions just on the content for that chapter, so we know what the exam questions will look like? I don't know whether the other guys also struggle with things like that. When you do an exam question, you think, what the hell am I supposed to do here? I don't even know where to start.
I think that is the best way, and of the questions we've been using as activities and exercises for every chapter that we have covered, some come from the past exam papers that I had access to. So we already did that. What you need to start preparing is to make sure that you understand the type of questions that you will get: when you read a question, you need to know which chapter it comes from and what you need to do with it. The logic of the exam paper will still follow the study units. I think there will be 25 questions. I'm speaking under correction, because when it was a face-to-face or venue-based exam, it was out of 25.
And because now, with online exams, time running out and so forth, the lecturer might decide to limit the number of questions. I don't know. But let's assume the format stays the same as the normal standard exam. So there are 25 questions. You have 11 study units (study units, not chapters). With 11 study units, two questions per study unit gives 22 questions, and to make up 25 you have three extra questions. So for some study units you might get three questions. For hypothesis testing, you might get three questions. For confidence intervals, you might get three questions. And for discrete probabilities, you might get three questions, because there is basic discrete probability, then binomial, then Poisson. So you never know. It is only by knowing and understanding the format of your exam paper that you can work out how to tackle your exam. So I'm not going to focus on one study unit and then give you questions from random question papers. We've done that. We're going to work through a paper in exam format so that you get used to the exam, and not the study unit per se, so that you can identify the key things, because now study units are combined. We are no longer working on specific concepts. We're now building up to, and preparing you to write, the exam. You will see once we have everything formatted and online as well; it will flow like that. Don't worry too much. Like I said, I've got you. We will do this together.
Okay, so let's start with today's session. Today we're going to look at study unit 10. You must also remember that you don't do ANOVA, which, if you're looking at your prescribed book, will be chapter 10, analysis of variance. You don't cover that in your module. When we look at the chi-square test as well, if you look at your prescribed book, there are many other sections within the chapter, and we don't cover all of them.
The only section you need to worry about or learn is the chi-square test of independence, the chi-square test where we use a contingency table. So by the end of today's session, you should know how and when to use the chi-square test for independence. The chi-square test is like any other hypothesis test; with it, we are testing a relationship between two categorical variables. Because we are testing a relationship, we need to follow the same standard hypothesis-testing steps. That means knowing how to state the hypotheses, the null hypothesis and the alternative hypothesis. We need to know how to find the critical value. We need to know how to calculate the chi-square test statistic. And we need to know how to make a decision and then conclude. You need to know all those steps because, as with hypothesis testing in study unit nine, the options you are given might cover any of the steps: you might be asked to validate or choose which option is correct in terms of the critical value, or the chi-square test statistic, or the decision on whether you reject or do not reject the null hypothesis. So you need to know all the steps.
Okay, so chi-square for independence: we use a contingency table to test the relationship between two categorical variables. Since we're using a contingency table, there will be a number of rows and a number of columns. You know the contingency table; we used it when we were doing basic probabilities as well. We're going to use it here, because one category will be on the rows and the other category will be on the columns, and then there will be your observed values. From those we can calculate some proportions, or rather what we call the expected frequencies, and then we use those to calculate the test statistic, and then we make a decision.
So in terms of stating the hypotheses for the chi-square test, you need to remember the following. Your null hypothesis should always state that things are independent: there is no relationship; one category does not affect the other. Always state that the two categorical variables are independent. If we're looking at rain and gender, then we will say rain and gender are independent. That's how we state the null hypothesis. The alternative will be the opposite of that. The alternative says they are related, but we state it as: rain and gender are dependent. That means they depend on one another: one affects the other, or one has a bearing on the other; there is a relationship between the two. Remember, the null hypothesis always states independent. The alternative says dependent. In your null hypothesis you can also say category one is independent of category two; the alternative will then say category one is dependent on category two. As long as in your null hypothesis you state independent and in your alternative you state dependent. I'm stressing this because you always need to remember it.
When we want to make a decision, we need to calculate the test statistic. Since we are given the observed values, we need to calculate the expected values, and then we use those expected values to calculate the test statistic. The chi-square test statistic is the sum, over all cells, of the observed frequency minus the expected frequency, squared, divided by the expected frequency. Once you have calculated that test statistic, we compare it to the critical value. We find the critical value of a chi-square test by using our alpha. Here we're not going to divide alpha by two, because a chi-square test is a right-skewed test. It is a positively skewed test.
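Written out in symbols, the test statistic described in words above looks like this. The symbols f_o (observed frequency) and f_e (expected frequency) are notation assumed here for readability; the session itself only says "observed" and "expected".

```latex
\chi^2_{\text{STAT}} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}
```

Every cell of the contingency table contributes one term to the sum, and each term uses the expected frequency of that same cell.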
It only has one region of rejection. So your chi-square critical value will be on the upper side, so we use alpha. Our critical value will just be chi-square of alpha, but we also need to use the degrees of freedom, and I'm going to show you on the table how to find it. The degrees of freedom for a chi-square test is the number of rows minus one, times the number of columns minus one. That gives you the degrees of freedom. We use the alpha and the degrees of freedom and go to the table: your degrees of freedom run down the side, your alpha values are across the top, and where they meet inside the table is your critical value. We use that critical value to decide where the region of rejection will be, whatever the value is. We take the test statistic and look at whether it falls in the do-not-reject area or in the rejection area. If it falls in the rejection area, we make a decision and conclude: we reject the null hypothesis and conclude that the two are dependent.
Okay. How do we then calculate these expected frequencies? Easy. We look at the row total and the column total. Say you get a contingency table for rain and gender. For gender, in this instance, we're going to use the two categories that we are familiar with: female and male. For rain, it's whether it's raining, yes, or whether there is no rain: days where it's raining, yes, or days where there is no rain, no. When you get a contingency table like this, if they did not calculate the totals, you need to calculate them, because we're going to use the totals. To calculate the expected frequency, we use the row total times the column total divided by n, which is your grand total. So let's say this is one, this is two, this is three, this is four.
When you are given the observed values, you just calculate the totals by adding. For female, whether it's yes or no, one plus two is three. For male, three plus four is seven. And your grand total, your n, will be equal to 10. You do the same for the columns because you need the column totals. So these are your columns and these are your rows. For the columns, one plus three is four, and two plus four is six. I already have my answers in my head. So, to calculate the expected frequency, let's say we want the joint expected frequency of female and yes. The expected frequency of female and yes is calculated by taking the row total of female, which is three, times the column total of yes, which is four, divided by the grand total, which is 10. Three times four is 12, divided by 10, which is 1.2. That will be your expected frequency. To calculate the expected frequency for female and no, you do the same. Let's call it E for now, because I'm talking about the expected frequency. For the expected frequency of female and no, I use the row total of female, which is three, times the column total of no, which is six, divided by the grand total or sample size: three times six is 18, divided by 10, which is 1.8. That's how you calculate the expected frequency. You will have to calculate it for all the joint events inside the table. Once you have your expected frequencies, you can calculate your test statistic, that is, chi-square for independence. And once you do that, you make your decision and you conclude.
Let's look at an example. Oh, before we look at that example, making a decision: as we already alluded to, we use the critical value to make a decision.
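The row-total-times-column-total rule above can be sketched in a few lines of Python. This is only an illustration using the toy rain-and-gender table from the session (female: 1 yes, 2 no; male: 3 yes, 4 no); the function name `expected_frequencies` is a hypothetical helper, not something from the module.

```python
def expected_frequencies(observed):
    """Expected frequency of each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)  # grand total
    return [[r * c / n for c in col_totals] for r in row_totals]

# Toy table from the session: rows are female, male; columns are rain yes, rain no.
toy = [[1, 2],
       [3, 4]]

expected = expected_frequencies(toy)
for label, row in zip(["female", "male"], expected):
    print(label, row)
```

The first row reproduces the two values worked out by hand, 1.2 for female and yes and 1.8 for female and no; the second row gives the remaining two cells.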
And, as I said, the chi-square test is a right-skewed test. So when making a decision, if the test statistic is greater than the critical value, meaning it falls on that upper side, we reject the null hypothesis. That's how you make a decision. If this is my chi-square critical value, the rule says: if your test statistic falls above the critical value, you reject the null hypothesis. Otherwise, you do not. That's how you make a decision. I always use visuals to make a decision; it makes it easy. If you can remember the signs and the decision rule, you can use that, or you can draw yourself a visual. It's up to you.
Let's look at an example. Here we have the meal plans selected by 200 students, shown in this table. We have the class standing and the number of meals per week. The class standing is the type of student, and the meal plans are those who prefer the 20-per-week meal plan, the 10-per-week meal plan, and those who prefer no meal plan. We have our observed values. We calculated our totals. We have our grand total, which should correspond to the number of students that were selected. Stating the hypotheses is the first thing that you do: state your null hypothesis and your alternative hypothesis. Null: the meal plan and class standing are independent. Alternative: the meal plan and class standing are dependent. We need to calculate the expected frequencies because we will need them when we calculate the test statistic. Remember, with the six steps of hypothesis testing, we say: state the null hypothesis, state what you are given, and calculate whatever other measure you will need to answer the questions coming up. Then find the critical value, then calculate the test statistic, and then make a decision. So it's the same thing here.
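The decision rule above is a single comparison, which a short sketch makes explicit. The function name `decide` and the numbers passed to it are hypothetical, chosen only to exercise both branches of the rule; they are not values from the session.

```python
def decide(chi_square_stat, chi_square_crit):
    """Upper-tail rule: reject H0 only when the statistic exceeds the critical value."""
    if chi_square_stat > chi_square_crit:
        return "reject H0"
    return "do not reject H0"

# Hypothetical numbers, one on each side of the critical value.
print(decide(15.0, 12.0))
print(decide(3.0, 12.0))
```

Because the chi-square test has only one rejection region, there is no second comparison for a lower tail.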
So we are first going to quickly calculate the expected frequencies. To calculate the expected frequency of freshmen and the 20-per-week plan (oh, my pen is misbehaving), we use the row total. My row total is 70, times my column total, which is 70, divided by 200. I think 7 times 7 is 49. Let me use my calculator, because I cannot use my brain anymore with big numbers. 70 times 70 is 4900, divided by 200, and that gives us 24.5. To do 32, the expected frequency for freshmen and 32, we use the row total 70 multiplied by the column total 88, divided by 200, and the answer you get is 30.8. Then complete the whole table. Similarly, for juniors and the 20-per-week plan, the 10, the row total is 30, times 70, divided by 200, and we get an expected frequency of 10.5. And all the expected frequencies are done.
The chi-square test is a good test that you can use to test the relationship between variables in any type of work, but you always need to be mindful that the number of records in each cell should be more than five for the chi-square test to work properly. The example I used earlier, where I had one and two and three and four, is not the normal way of using a chi-square test. Your number of records, the observed value in each cell, should always be greater than five for the chi-square test to work.
Okay, so now that we have our observed values and our expected values, we can calculate the test statistic. We have the sum of: remember, the sum means adding all the values, so it means repeating this for all the cells. The observed minus the expected, squared, divided by the expected. Observed is 24, expected is 24.5. We square the difference and divide by the same expected, which is 24.5.
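As a sketch of the summation just described, the function below adds up (observed minus expected) squared over expected for matching cells. It is run only on the two freshman cells quoted so far (24 against 24.5, and 32 against 30.8), so the printed number is a partial sum, not the full test statistic for the table; the function name is a hypothetical helper.

```python
def chi_square_stat(observed, expected):
    """Sum of (observed - expected)^2 / expected over matching cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of cells")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Only the two cells quoted so far; the full statistic needs every cell in the table.
partial = chi_square_stat([24, 32], [24.5, 30.8])
print(round(partial, 4))
```

Passing the cells as two flat lists in the same order mirrors the advice to keep each expected value next to its own observed value.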
So your observed and your expected should always correspond. What I always do, as you will see when we do the next example, is this: instead of creating another column, which might complicate things, or creating another table, I write my expected values next to my observed values. So instead of creating this, I would have written my expected values here, so that I know which values to substitute where. You just write all your expected values next to your observed values so that you know it is 24 minus 24.5, squared, divided by 24.5 again. You do that for all the values and add them together: plus the next one, which will be 32 minus 30.8, and you square the answer and divide by 30.8, plus, and so on, until you complete the whole table. Then you sum all the values to find your test statistic. In this instance our test statistic is 0.709.
We need to find the critical value as well. For the critical value, let's assume that we were given an alpha, or level of significance, of 5 percent, which is 0.05. So we need to go and find the critical value, chi-square of alpha and the degrees of freedom. Our alpha value is 0.05. Our degrees of freedom is the number of rows (not the row total) minus one, times the number of columns minus one. So if I go back to this table (I'm just going to clear all my ink), I need to calculate my degrees of freedom, which is my number of rows minus one times my number of columns minus one. So I need to count how many rows I have. Note that you do not include the total. Counting only the observed values, we have one, two, three, four rows, so that's four minus one. And how many columns did we have? One, two, three columns, so three minus one. Four minus one is three, three minus one is two, and three times two is six. So my critical value here will be six. Sorry, I mean my degrees of freedom.
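The degrees-of-freedom count above can be written as a one-line helper. This is a minimal sketch; the function name is hypothetical, and the arguments are the numbers of observed rows and columns with the totals left out, as stressed in the session.

```python
def degrees_of_freedom(n_rows, n_cols):
    """(rows - 1) * (columns - 1), counting observed rows and columns only, not totals."""
    return (n_rows - 1) * (n_cols - 1)

# Meal plan table: four class standings by three meal plan options.
df = degrees_of_freedom(4, 3)
print(df)  # 6
```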
So remember, this is four minus one and three minus one, which gave us six. So we have our degrees of freedom, and we need to go and find the critical value on the chi-square table. Bear with me; I think I'm not sharing my screen. Let me share my entire screen. I hope you are able to see the table. The tutorial that I'm using is the one that I shared with you; I said you must start using that one. So let's go and find our chi-square table. I'm just going to make it smaller. So this is the t distribution; the table we want is called critical values of chi-square. It may not be labelled "chi-square" in words, so you just need to remember that the symbol that looks like an x squared is chi-squared; that chi sign is the one we're looking for. So this is the table. I'm going to make it bigger. Also remember to ignore the top part: we are only looking for the alpha values that are closest to the table, not the cumulative probabilities there. And remember, we're looking for chi-square of 0.05 with six degrees of freedom. So we find our alpha value of 0.05 in that column there and our degrees of freedom, which is six, and where they meet is our critical value. Our critical value is 12.592.
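As an optional cross-check on the table lookup, and not something done in the session, the area to the right of the critical value can be computed directly. The sketch below relies on a closed-form expression for the chi-square upper-tail area that holds only when the degrees of freedom are even, which happens to be the case here (six); the function name is hypothetical.

```python
import math

def upper_tail_even_df(x, df):
    """P(chi-square with df degrees of freedom > x); closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

# Area to the right of the table value 12.592 with 6 degrees of freedom.
tail = upper_tail_even_df(12.592, 6)
print(round(tail, 3))
```

The area comes out at about 0.05, which is what the table entry means: 5 percent of the distribution lies above 12.592 when there are six degrees of freedom.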
Now we can go and make a decision. Draw a graph for yourself. Because the distribution is right skewed, just make sure that there is a longer tail on the right. This will be 12.592; that's where my critical value is. I draw my graph because I don't want to always have to remember the rule. Where does my 0.709 fall? It falls somewhere in the white area, so it falls in the do-not-reject region. We can make a decision and say: since our chi-square test statistic of 0.709 is less than our critical value at alpha 0.05 of 12.592, we do not reject the null hypothesis, and in conclusion we can say that there is not sufficient evidence that the meal plan and the class standing are related. That is just one way of putting it, as "not sufficient evidence"; we could also have said that we do not reject the null hypothesis and conclude that the meal plan and class standing are independent. That's how you make a decision and conclude. Any questions? We'll look at another example. Any questions? No questions.
If there are no questions, this next one we're going to do together. According to the Centers for Disease Control and Prevention publication HIV/AIDS Surveillance Report, the number of AIDS cases in South Africa in 2007 is classified by gender and race. Use the information shown below in the contingency table to test whether there is a relationship between gender and race. When you get a table like this, you can see that there is missing information in the table, but we can complete the whole table because we have the totals. To get this value for white females, we know how many whites there are: there were 70 of them. If we also know that there are 40 males who are white, out of 70, we can subtract: 70 minus 40 gives us the answer for that one. That makes it easier for us to complete the whole table. On the total for
male, across all the races there are, we can just add them: 40 plus 32 plus 48, and we can calculate that. Once we have all the values, including 32 and 48, you can add those: it's 32 plus 48. And with the grand total, you can get the remaining total, because it will be 250 minus those values: 250 minus (70 plus 32 plus 48) will give you the answer for that one. Okay, so I want you to complete the totals for all of them, so take out your calculator and calculate. We'll do it step by step. I want the answer for white female. What value do you get? What total do you get for black? It's 32 plus 48, which is 80. And we can do 250 minus (70 plus 80) to get the total for Indians. A hundred? Is it 100? So it's 100. I think we'll have a problem: my values are wrong on the slide. Calculate Indian females. 52. Calculate the male total. 120. 120. And the female total? 130. Okay, so I made a mistake there. When we get to it, we can fix it, so I will rely on you to give me the correct answers.
Okay, so the first step is to state the null hypothesis. State the null hypothesis. "There is no, uh, relationship..." Nope. No. Always state it as either independent or dependent. "Sorry, race and gender are independent." Yes, race and gender are independent. Don't fall into the "relationship" wording. Always remember that we state the null hypothesis with independent. And what will be your alternative? "Race and gender are dependent." Race and gender are dependent.
Step number two: we need to calculate the expected frequencies using row total times column total divided by n, and this is your n. So let's calculate the expected frequency for white male. I'm going to do the first one and you have to do the rest. It will be 70 times 120, because our row total is 70 and our column total is 120, divided by 250. Do the calculation.
33.6. That is our 33.6. Continue and complete the whole table. For white female, it will be row total 70 times column total 130, divided by 250: 36.4. Black male: 38.4. Black female: 41.6. Indian male, and here is where I have incorrect values, but we can fix that: 48. Sorry, I'm trying to fix it as we go along. Okay. And Indian female: 52. So now we have our expected frequencies and we have our observed frequencies.
Step number three is to calculate the chi-square stat, which is the sum of your observed minus your expected (you can see that I'm no longer saying "frequencies", but it's one and the same thing), squared, divided by the expected. I want to start from here, and I need you to do the calculations as well. The first value is (40 minus 33.6) squared divided by 33.6, plus (30 minus 36.4) squared divided by 36.4, plus (32 minus 38.4) squared divided by 38.4, plus (48 minus 41.6) squared divided by 41.6, plus (48 minus 48) squared divided by 48, plus (52 minus 52) squared divided by 52. For the calculations I can also use my calculator; let me use the fraction key. For the first one it's (40 minus 33.6) squared divided by 33.6. You can do the whole equation in one go if you want, or you can do it step by step. When you do it step by step, you will have lots and lots of decimals, and you need to keep at least four decimals so that you don't drop decimals too early. Four or five decimals should be enough. I'm going to keep four decimals for this one; otherwise we might get different answers. And now I forgot what the answer was: 1.2190. And the next one? 1.125. You will need to give it to me slowly so I can write it down: 1.125. I just kept three. You just kept three? Yes. Okay, anyone who kept more than three, or did you only get three? No, I'm not calculating; you should be helping one
another. Yeah, I've got 1.1253. Okay, thank you. Keep four decimals at least. And the next one? 1.066666667. Yeah, we'll round it up to seven: 1.0667. And the next one: 0.9846. And the next ones will be plus zero plus zero, because 48 minus 48 is zero, so we don't even have to do anything there. Add them together: I get 4.3956. Okay, so that is our test statistic; that's step number three. Let's check the digits: four point three nine five six. Yes, 4.3956. Okay.
So now we go to step number four, which is making a decision. Remember the rule; I can write the rule here. It says: if my chi-square stat is greater than my chi-square crit, I reject the null hypothesis. But that means I need to go and find my critical value before I make a decision. So this is the rule; it's not the final thing. Let's go and find the critical value. Our information here is not complete, so let's say alpha is 0.01, one percent. We want the critical value for that alpha and the degrees of freedom, and we know that our degrees of freedom is the number of rows minus one times the number of columns minus one. How many rows do we have? Three? Four? Three. Three minus one; don't count the total. And how many columns? Two. Two minus one. So this will be two and this will be one, and our degrees of freedom here is two. That means our critical value is for 0.01 and two degrees of freedom, and we need to go and find it on the table. Let me display the table. 9.210. We're using that column, and that is our critical value: 9.210. Okay, so now I'm running out of space. Oh come on, let me change my pen color
as well. I must start writing small. Anything that falls in that shaded area, we're going to reject. We found that our critical value was 9.210, right? 9.210, that's our critical value. We calculated our test statistic and found that it was 4.3956. Where does it fall? It falls somewhere in the do-not-reject area. So we can make a decision, and when we make a decision we use the rule. We say: since the chi-square stat of 4.396 (I'm going to use only three decimals) is less than our chi-square critical value at 0.01 of 9.210 (I'm going to use that number because I ran out of space), we do not reject the null hypothesis and conclude that there is not sufficient evidence to show that gender and race are related. That's how you make a decision. Or we can say we do not reject the null hypothesis and conclude that gender and race are independent, since we're not rejecting the null hypothesis. Easy, right? Any questions?
Yes, Lizzie. My problem is only with the first step. I was trying to get to "race and gender are independent". How do you figure out that statement?
Always know that for the chi-square test, the null hypothesis is always "independent". For the two categorical variables, you state the hypothesis using the independence statement. That's all; it's easy and straightforward. It's not like the other hypothesis tests, where you always have to work out whether it is a two-tailed test or a one-tailed test, whether you must use less than or greater than, and whether there should be an equality sign or not. Here it's straightforward: contingency table, chi-square test, null hypothesis independent.
No, it's fine. If it's always independent, I will always have it. It's okay.
Okay, so now it's your turn to do the talking. A certain media company publishes four magazines for the teenage market. The executive
editor of the company would like to know whether the readership preference for the four magazines is independent of gender. A survey among 200 teenagers was carried out and the following contingency table was obtained. And there is your contingency table. Before I look at all the statements, what is missing from this contingency table? The totals. The totals. So you will need to calculate the totals for this contingency table. Which one of the following statements is incorrect? Number one is about the expected value, so it means you'll have to calculate the expected value there, and you need those totals. Number two: the null hypothesis, H naught, is that gender and magazine preference are independent. Remember, we're looking for the incorrect statement. Number three: the alternative states that gender and magazine preference are dependent. Number four: the degrees of freedom is three. Number five: the chi-square curve is symmetric. We're going to answer all of them just to make sure that we understand. I will ask you to quickly calculate the totals, because we need them to calculate the expected value of Youth and girls. I'll give you some time to do that. I'm going to ask you to calculate all the totals because you will need them for the next exercise; exercise two is related to exercise one. So calculate all the totals.
Total for girls is 78. 78. Beat: 56. Youth: 38. Grow: 54. Live: 52. Total for boys is 122. And the grand total should give us 200. Statement number one says calculate the expected value for Youth and girls. So you go to Youth and girls, which is 12. To calculate the expected value for Youth and girls, the row total is 78, times the column total, 38, divided by the grand total, which is 200. What do you get? 14.82. Which is correct. Statement number two: the null hypothesis states that gender and magazine preference are independent. Is that correct or incorrect? Correct. It's correct, because the null hypothesis is "independent". The alternative will be correct as well. To find the degrees of freedom: your number
of rows minus one times the number of columns minus one. How many rows do you have? Two. Two, so two minus one. And how many columns do you have? Four. Four minus one. And what will be your answer? Three. And it means that is correct. Correct. A chi-square is a symmetrical distribution: would you say it's a symmetrical distribution? Even when you go to the table, it will show you that. What kind of a distribution is this? What shape does it have? It's a skewed distribution. Which makes number five the option that you are looking for, because a symmetrical distribution will be a normal distribution. It will have a bell-shaped curve, a normal bell-shaped curve, even though my bell shape is not correct, but it will look like that. And a chi-square is a skewed distribution, a right-skewed distribution. So we only calculated the expected value for Youth and girls, which is that one, 14.82. You need to calculate the expected values for all of them, because we need to go and answer the chi-square test. So did I replicate them correctly? If you have the expected values, just call them out so I can write them. So the first variable. Okay, for 18? Yes, 21.84. For 20, it's 21.06. For 28, it's 20.28. For 38, it's
34.16. For 26, it's 23.18. 23 or 28? 23. 23.18. For 34, it's 32.94. For 24, it's 31.72. Thank you for the teamwork. Now let's complete our chi-square test statistic: 18 minus 21.84, squared, divided by 21.84, plus, for the 12, 12 minus 14.82, squared, divided by 14.82, plus 20 minus 21.06, squared, divided by 21.06, plus 28 minus 20.28, squared, divided by 20.28. It's going to be a long, long one, so I have just completed the first line. I'm just going to complete the rest of the lines, so then somebody must start doing the calculations. And I use the bottom, so I got 6.8916. So you calculated all of it? Yeah, but the guys must please check for me. Oh, it's not answer number three? Yeah, I got the same, 6.8916. Number three, 6.8916. Do we all agree, or are we still calculating? I'm still calculating. I don't know how they got it so quick. Excel. Excel spreadsheet. Wow. No, they say the laziest people are sometimes the most creative. So, wow. Are you willing to share that spreadsheet? We can also share that spreadsheet as part of, yes, we can do that. Okay, so mine is a little different. I have a calculator that has a stacked menu, so I have values I can manipulate and move around. Oh, I guess with manual calculations we're going to take forever to get to the answer. Are we still calculating? Still busy. Yep. Okay, let me know when you're done. I'm done. I've got 6.892. Okay. You can mute, whoever has their mic on. So with the case here, you can do up to three digits, up to three values, before it allows you to continue. So yeah, it will take you forever. Okay, I'm going to get the answer for the three, two. Are you all done now? Did you get the, yes, I got the same answer. Okay, others, are you happy? So in the exam you will not get a table with so many questions where they ask you to calculate the test statistic on its own, because it's time consuming. They will limit that because of the number of hours that you have for the exam. But for your
assignment you might get a bigger table like this, so it means you need to pace yourself. You need to be able to do the calculation and make sure that you calculate correctly as well. So that is the test statistic.
Exercise number three: consider the following table. You are given a contingency table of the buyer's age and the car size bought. Which of the following statements is incorrect? Number one, at alpha of 0.05 the critical value is that, so it means you need to go and find the critical value by using your table of critical values of chi-square, with alpha and the degrees of freedom, which is your number of rows minus one times the number of columns minus one. Number two, you need to check whether the hypothesis is correct; it says the buyer's age is independent of the car size bought. Number three, you need to be able to calculate the test statistic, so it means you need the observed and the expected frequencies. Your expected frequency is your row total times the column total, divided by n, and then you calculate your chi-square test statistic, which is the sum of your observed minus your expected, squared, divided by your expected, in order to get the answer. Number four, find the degrees of freedom, which you should have already calculated, which is your number of rows minus one times the number of columns minus one. And number five says the null hypothesis is rejected, so you need to have the test statistic calculated and you need the critical value in order for you to make a decision. I'm going to give you time. Let's say I will ask again at half past, so take these five minutes to try and work out some of this. How are we doing? Can we answer the ones we already have? Yes, let's do that, and then we can do the others. So number one, at alpha of 0.05, what is the critical value? So did you find the degrees of freedom? Yes, it's four. Okay, so the number of rows, so that
will be two times two, which is equal to four, so that is our degrees of freedom. So you need to go to the table. We need to find 0.05, they said 0.05, yeah, we need to find 0.05 and four: 9.488. So number one is correct, right? Yes. Number two, is that correct? Yes. Also, four is correct, since we've already calculated it. Yeah, and four is also correct. Now what is left is the chi-square test. Have you calculated it? It means we need to go and calculate it. Can we do the spreadsheet? Can they share the screen and show us how they do it, miss, on their spreadsheet? Okay, so I'm not going to be able to do the whole thing. I'm not sure, are you able to share your screen? In my screen? No, you need to give permission. You need to be able to create, um, now I need the table. Sorry, let me see if I can. I'll have to end the show. Let's do this. And you are? Who am I giving permission to? Mine was on the calculator. Who's our Excel colleague? That's the person I'm looking for. Maybe they're on mute. The person who said they are using the spreadsheet, I want to give you permission to share your screen, but I don't know who that is. Sorry, I stepped away, I was on a call. So I can't share about the new question, but I can show what I did with the old one, if that will help. It's fine, you can show with the old one. I think now you should be able to share, and while you do that I will also try and do it for this question that we're busy with. Okay, so you guys want me to show how I calculated it in the spreadsheet? Yes, they want to see the spreadsheet. Yes. Oh, um, well, I obviously plotted the contingency table first. I calculated the expected values next to each of the values, and then when I was done I just added everything up below. I'm interested in the formulas that you use. It's basically the same thing that we did in the exercise, but I just put it in an Excel formula, and I
can send it now. I have it ready to send to you on WhatsApp, if you want to use it. I think you can send it on WhatsApp when the class is over as well. Okay. We're left with almost 20 minutes. Okay, um, unless you want to send it now, and then they can, yeah. Yeah, you can listen as well. Okay, okay, I'll just send it on the group chat. Thank you very much. Thank you. Welcome. If you need any help, you can share a text on the side. Can you kindly show the table again so we can populate it on the spreadsheet? No problem, we can do that. Okay, let's do that. Are you winning? Changing the numbers, but yes, we get it, thank you. Okay, are you still busy? I'm done. Sorry, I'm done. Let me share my screen again, my entire screen. Oh, sorry, I'm still sharing a window; I want to share the entire screen. Okay, so we were given the observed values. These are our observed values, and I calculated the totals for our observed values. Then I used another column to calculate the expected values by using the formula, which is the row total times the column total, divided by the grand total, for all of them. So if you go to the next one, you will see the calculation for each one of them. Once done, I went to do the formula: observed minus the expected, squared. I was lazy to do the square, or maybe it's because I was lazy to go and find the formula for the square. We know that the square is that value multiplied by itself, so I said observed minus the expected, and I multiply by the same thing again, divided by the expected, which is my C12. And I did that for every block: observed minus expected, squared, divided by the expected. Once I had calculated it for all the columns and rows, I added all of them, and that gives me my test statistic. Do you also get the same when you use the spreadsheet? Did you get that,
the same answer? 14.62, which is the same as what they have, and they said you must leave your answer to four decimals. Or are you still busy on the spreadsheet? I think we're busy with Excel. In that Excel I have two, because I'm going to fiddle through my laptop and download it as well. Okay, we can do the same. Let's see if we can do the same. Now you see my messages? Okay, I just want to replicate the same. So in terms of this, we have three columns, so you just need to make sure that you add another row and change this to under 30, 30 to 45 and over 45, and then do the same at the top. We have small, you'll need to know how to use Excel to do this, so you need to change that to small, and this is medium, and this will be net, and this will be large. And because we don't have these last two columns, they will be zero there and zero here, for the sake of the formulas as well, to keep everything intact. Okay, and we're going to add the observed values on the color code. You add the observed values, 10, 24 and 45. Everything should work out, and you have to adjust the formula, because the formula only looked at the two columns. We just adjust the formula by dragging and pressing enter; it should be for all of them now. And also I can just drag this formula, because I need to keep the same. I'll just have to adjust it, drag it so that it still reads the same values. We also need to adjust that. Now we have 22 there, 42 and 35, and you will just need to adjust your formula in the same way. 34, 28, foot, so 300. And I guess also, yeah, we'll have to adjust, because there are three now. I have to add the third line, and I don't have to worry about that one. So this, between the square, it just takes your observed minus your expected, squared, divided by the expected, which was fine. And so on this one you do likewise. The last column is this, which we can also just, because we don't need that. And this, we need to add all the values, so I should get the same. For some reason it doesn't want to add up. Levi, will it be possible for you to
share your spreadsheet as well on the group? Yes, I will share it on the chat as well. Thank you. Something didn't work out right, so let's double-check my values against those values. So that 13, we got it right. 0.02, that's correct. 1.19, that's correct. 1.27, 0.51, 0.58, that's correct. It is on this one. Sorry, on G5 you've got the incorrect one; it should be 48.8 on G. Oh yeah, yes, I see. Sorry, I see, I should be using that column. Sorry, I need to adjust all of this at the bottom. I am now getting values; as you can see, I get the same answer. So we will share both of these spreadsheets. You must start offering Excel courses as well. I do, actually. Oh, you do? Okay. I do. As part of the facilitation of learning for Unisa, I do offer classes for research analytics and basic statistics, and we do a whole lot of things other than just discussing content, because it's literacy. So we do also offer options in terms of using a calculator. We have dedicated sessions on how you use your calculator and how you use your Excel to do the calculations. You will see next time, when we do regression, you will need to use the data analytics, or the data analysis. We will use this to do the regression models, but that is a discussion for the next few weeks to come. So anyway. Sorry, before you start, last question: you converted the decimal to four decimal places. Can you show us how you do that? Okay, so, um, on the tab there is a number tab. There are arrows here with decimals, to increase or decrease decimals. You can use that. Thanks. This will increase the decimals; if you press it, you will have more decimals. If you want to reduce them, you just use that. Also, you need to pay attention to your spreadsheet as well, because depending on how you have formatted your spreadsheet, sometimes in some spreadsheets you use a comma. On mine, my decimals, which are my commas, are points, and I think also on eight, um, at least this one, decimals are points, so it was fine. So yeah, that is to
reduce the options. Okay, so going back to our presentation slides. We stopped at exercise number three. There are other activities; we will look at some of these on Saturday, so it should be easy. You can go and start answering some of these activities on your own. You will have the exercise spreadsheets and you have the notes, so you can use those to answer some, and then on Saturday, when we do more activities and exercises, it will refresh your mind. There are a couple of exercises here, as you can see. Okay, so for exercise four you need to calculate the chi-square test. For exercise five you need to know your five steps of hypothesis testing in terms of chi-square in order to answer all the questions. Also for exercise six, you need to know all the steps of your chi-square hypothesis testing in order to answer which one is the correct one, and the next one as well. So there are a couple of questions here that you can go through.
Just to conclude, because we're left with one minute: you have learned how and when to use the chi-square test, and we have done the calculations. Always remember that with the chi-square test, when stating the hypotheses, the null hypothesis is always stated as independent and the alternative states dependent, which means there is a relationship between the two categorical variables. You also need to be able to find the critical value. Remember, you find your critical value on the table of critical values of chi-square, and we use the alpha and the degrees of freedom, and your degrees of freedom are your number of rows minus one times the number of columns minus one. You also need to be able to calculate the chi-square test statistic, and we shared with you the tools that you can use to calculate it, whether you use the Excel spreadsheet or you use the formula the way we have calculated it in the notes. That means you also need to be able to calculate your expected value, which is your row total times your column total, divided
by the sample size or the grand total. And you should be able to make a decision by looking at the rule, which states that if your test statistic is greater than your critical value, you're going to reject the null hypothesis. With that, it concludes today's session. Any question, comment or query? I have a feeling the assignment is going to be better than the others. All right, so if there are no other comments or questions. Sorry, Liz, I have a question. Assignment three, it's open again. So if, let's say, I didn't do well on assignment three, can I attempt it again, or is it for those who didn't do it at all? No, if you still have submissions, if you didn't use all your three attempts, then you can do it. So as long as an assignment is open, you can redo it, as long as your attempts are still there. So if you only did two attempts, now you can do your third attempt on it. Thank you. Thank you so much. Yes, so also assignment four is extended. If you haven't used up all your attempts, remember, you can use up all three of them. The highest score will be captured, so take your chances. Like with assignment five coming up, like I always say, use the first submission to get an idea of what type of questions there are on your assignment. I'm going to, let me first stop my recording, and then I'm going to talk to you. Just wait two more minutes. Thank you for coming through today. Let's stop the recording, because we don't want to get into trouble.
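For anyone who would rather check the steps from the session in code than in a spreadsheet, here is a minimal Python sketch of exercise one. It is only an illustration under a few assumptions: the function name `chi_square_independence` is made up for this example, the observed counts are the ones called out during the session (girls 18, 12, 20, 28 and boys 38, 26, 34, 24), and the critical value 7.815 for alpha 0.05 with 3 degrees of freedom is taken from a standard chi-square table, not from the session itself.

```python
def chi_square_independence(observed):
    """Return (expected, statistic, df) for a table of observed counts.

    `observed` is a list of rows, each row a list of counts.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected value per cell: row total times column total, divided by grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Test statistic: sum over all cells of (observed - expected)^2 / expected.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Degrees of freedom: (rows - 1) * (columns - 1).
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return expected, statistic, df


# Exercise one, columns in the order Beat, Youth, Grow, Live (as transcribed).
observed = [
    [18, 12, 20, 28],  # girls, row total 78
    [38, 26, 34, 24],  # boys, row total 122
]

expected, statistic, df = chi_square_independence(observed)

# Assumption: critical value read from a standard table (alpha = 0.05, df = 3).
critical_value = 7.815
reject_null = statistic > critical_value

print("Expected value for Youth and girls:", round(expected[0][1], 2))  # 14.82
print("Degrees of freedom:", df)                                        # 3
print("Test statistic:", round(statistic, 4))                           # about 6.8916
print("Reject the null hypothesis?", reject_null)
```

The same function should work for the three-by-three table in exercise three once the observed counts are typed in, with the critical value looked up for 4 degrees of freedom.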