Your exam covers all the study units, from study unit 1 to study unit 11. So when you prepare for the exam, you must prepare for everything, even though your assignment only went up to study unit 10. If you look at your textbook, for those who are using one, it has chapters. You will notice that you do not do chapter 10, which is ANOVA. So please don't go through that chapter; it's not part of your scope. You do chapter 9, then you skip to chapter 11 and then chapter 12. Today we do chapter 11, and I call this session 11. That is only because we are on our eleventh session; it has nothing to do with the chapters of your book. This is study unit 10, which is chapter 11 in your book. Also, I need to mention this: you only do chi-square for independence. Do not do the other sections. Do not do the chi-square goodness-of-fit test. Only do chi-square for independence; that's all we're covering today. With that said, let's look at what we're going to cover today. Since we're doing chi-square, we'll start with a little bit of the basic concepts, so that you get used to the terminology, how we find the critical value, how we calculate the test statistic, and so on. Then we're going to do a hypothesis testing exercise, to which we dedicate 30 minutes, following the same method that we applied with the basic concepts. Then we'll take a break, come back, and do lots and lots of exercises. Some of the exercises come from your tutorial letter. I expect you to at least try those exercises and give feedback, because I cannot do your assignment for you. You need to do it yourselves. When we do the exercises, I expect you to take part and take the initiative to answer those assignment questions. Do not expect me to give you the answers; I expect us to do the activity together. That's one.
Enough with that, let's move on and do chi-square. As I mentioned when we started these sessions, some concepts carry over even when we introduce a new topic. We're still going to be doing hypothesis testing. You know that with hypothesis testing there are six steps that we follow. We state the null hypothesis and the alternative hypothesis. We state what we are given in terms of alpha and other values. We also state what test we're doing; because we're doing chi-square, the test is a chi-square test. Then we find the critical value. Since we're doing a chi-square test, we go to the table of critical values of chi-square. Among your tables you will find one named chi-square or critical values of chi-square. Then we calculate the test statistic; I will show you the formula. Then we use the critical value to make the decision. Once we find the critical value, we are able to determine the region of rejection. With chi-square, there's only one region of rejection, and we always find the critical value using alpha and the degrees of freedom. Once we know where our critical value is and have determined our region of rejection, we can make a decision based on the test statistic and the critical value: are we rejecting or not rejecting the null hypothesis? Those are the basic steps that you always need to remember when we do hypothesis testing. The only new things we're introducing are a new table and a new test statistic. You just need to know those. By the end of this session, you should know how and when to use the chi-square test for independence and be able to use a contingency table to answer chi-square questions. What do I mean by a contingency table?
A contingency table is just a table with rows and columns. Look at this one. It is a contingency table that organizes a sample of 300 people by gender and hand preference. In the rows we have gender, broken down into female and male, and hand preference goes in the columns, broken down into left and right. We used contingency tables when we did probabilities. Remember that? And we said that if, in the exam, they do not give you the totals on the contingency table, you can calculate them yourself. So you quickly calculate those totals, because you will need them. You calculate the row and column totals and you calculate your N, which is your sample size. In this instance, N is 300. We're always going to use N to calculate some of the things we need, especially the expected values. Okay, so how do we then use this contingency table? Remember, the test for independence tells us about the relationship between two categorical variables; here we have two categorical variables, gender and hand preference. So to test for independence, we are testing the relationship between the two categorical variables, one of which is placed in the rows and the other in the columns. When we state the null hypothesis for this test, we always state it with the word independent. So the null hypothesis will always state that the two categorical variables are independent. Here it will state that gender and hand preference are independent, which means there is no relationship between the two of them. But you never state the hypothesis as there is no relationship; we always say are independent. The alternative will say the opposite. It will state that the two categorical variables are dependent, which means there is a relationship between the two variables.
And that's how we state the null hypothesis and the alternative hypothesis. Then, for the test, we also need to calculate the test statistic at some point. Our chi-square test statistic is given by the sum, over all cells, of the observed frequency minus the expected frequency, squared, divided by the expected frequency. What do I mean by observed frequencies? These are your observed frequencies: the 12, the 108, the 24 and the 156. Then we need to calculate the expected frequencies, which we're going to learn how to do just now. So from each of those observed frequencies, we subtract the corresponding expected frequency, which we calculate using the totals of the observed frequencies. Then we find the critical value, because we need it to determine where our region of rejection will be. We find the critical value by using the degrees of freedom, which are the number of rows minus one times the number of columns minus one. If I go back to this table, how many rows do we have? Two rows. And we have two columns. So it will be two minus one times two minus one in this instance, and that gives us degrees of freedom of one. We also use our value of alpha to find the critical value, and I will show you just now how to do that. To calculate the expected frequencies, we use the following. Remember, we have a contingency table, which has rows and columns, and we have the row totals, the column totals and the grand total. So to calculate the expected frequencies, we use the totals.
So to calculate an expected frequency, we take the row total times the column total and divide by our N. So we take that value times that value, divided by that, and it gives us the expected value for this cell. If we want to calculate it for another cell, which I'm going to call X, we take the row total for X times the column total for X and divide by N. That gives us the expected frequencies. Then we make a decision. Remember, for chi-square we only have one region of rejection, and it is only on the right-hand side, in the upper tail. The critical value is chi-square at alpha and the degrees of freedom, and that is where our region of rejection starts. Anything that falls on that side, we reject the null hypothesis. That's all that it says. Once we've calculated the test statistic and found the critical value on the table, we say: if the test statistic falls inside the rejection area, we reject the null hypothesis. If it falls on the other side, we do not reject the null hypothesis. Then how do we find this critical value? Let's say we're using the same table as before, with left-hand and right-hand preference. We know that it was two minus one and two minus one, so our degrees of freedom will be one times one, which is one. Let's say we are testing at the 95% confidence level, so alpha is equal to 0.05. So we have our alpha of 0.05 and degrees of freedom of one. Go to the table. I should have shared the table; let me share it. Okay. So we go to the critical values of chi-square. This is the critical values of t, and there is your critical values of chi-square table.
And you will see that the critical values of chi-square table also shows you where your region of rejection will be. Remember, we're looking for chi-square at 0.05. With this table, we ignore the cumulative probabilities at the top. We are only interested in the upper tail area, and we look for degrees of freedom of one. Our degrees of freedom are down the side. So we come here, we look for 0.05 in the upper tail areas, and we look for degrees of freedom of one. Where they meet is our critical value, which is 3.841. That is our critical value, and that's how we're going to find it. Okay, any questions? No questions. So if there are no questions, let's look at an example. Doing a chi-square test is actually very time-consuming. If they give you a table with many rows and many columns, it takes time. So you need to pace yourself when you come to this question, because it will be one of those questions right at the end of your exam. So let's look at an example. As you can see, this table is quite big. It has four categories for class standing and three categories for number of meals per week. We have the meal plans selected by 200 students, shown below. The students are broken down by class standing, which means the type of student. This is an American term, and the categories are freshman, sophomore, junior and senior. Freshmen are first-year students, sophomores are second-year students, juniors are third-year students, and seniors are in their fourth and final year.
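Before the example continues, it may help to collect the formulas described so far in one place. This is only a summary sketch of what was said above. The symbols f_o, f_e, r, c and n are notation assumed here for the observed frequency, the expected frequency, the number of rows, the number of columns and the sample size; they are not taken from the slides.

```latex
\chi^2_{\text{stat}} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e},
\qquad
f_e = \frac{\text{row total} \times \text{column total}}{n},
\qquad
df = (r - 1)(c - 1)

\text{Reject } H_0 \text{ if } \chi^2_{\text{stat}} > \chi^2_{\alpha,\, df};
\quad \text{otherwise do not reject } H_0.

\text{Gender and hand preference table: } df = (2 - 1)(2 - 1) = 1,
\qquad \chi^2_{0.05,\, 1} = 3.841
```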
So if this is the list of students who selected a meal plan, we want to test whether class standing is independent of the number of meals per week. We state the null hypothesis: meal plan and class standing are independent. The alternative will say that meal plan and class standing are dependent. Once we've stated the hypotheses, we can move to the next thing. What else are we given? Let's say they gave us an alpha of 0.05. And the other thing is that we need to calculate the expected values. It's very important to do that before you calculate your chi-square test statistic. We have to calculate the expected frequencies, remember? For the observed frequencies, we have 24, 32, 14 and so on. We calculated the totals, and we know that 200 students were interviewed. Now we need to calculate the expected frequencies. You know what the expected frequency is: row total multiplied by column total, divided by N. Let's come to this cell. Our row total is 30 and our column total is 70, so it will be 30 multiplied by 70, divided by 200. That gives us 10.5, and that goes there. Now let's go back to the top. To calculate the expected value for 24, we say the row total is 70, multiplied by the column total of 70, divided by our sample size, which is 200. When you do 70 times 70 divided by 200, you get 24.5. To calculate it for 32, we say the row total is 70, multiplied by the column total of 88, divided by 200, and it gives us 30.8. To calculate it for 14, we say the row total is 70, multiplied by the column total of 42, divided by 200, and we get 14.7. And we do the same for the rest of the table. You don't have to complete the totals, but when you add the expected values across each row, they should give you the same row totals as the observed ones: 70, 60, 30 and 40.
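The expected frequencies worked out above all follow the same rule, so they can be sketched in a few lines of Python using only the totals quoted in the session. This is a minimal sketch, not part of the course material: the variable and function names are made up for illustration, the order of the rows is an assumption, and the row totals are taken as 70, 60, 30 and 40 so that they add up to the 200 students.

```python
# Totals quoted in the session for the meal plan example.
# Assumption: the row order is freshman, sophomore, junior, senior.
row_totals = [70, 60, 30, 40]   # class standing totals
col_totals = [70, 88, 42]       # meal plan totals, in the order used on the slide
n = sum(row_totals)             # grand total: 200 students


def expected_frequencies(row_totals, col_totals, n):
    """Return the table of (row total * column total) / n values."""
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_frequencies(row_totals, col_totals, n)

# Print each row of expected frequencies rounded to one decimal place.
for row in expected:
    print([round(value, 1) for value in row])
```

The first row reproduces the 24.5, 30.8 and 14.7 calculated in the session, and the first cell of the row with total 30 reproduces the 10.5.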
But you don't have to do the totals. You just calculate the expected values, only the orange part. Once you have the expected values, you can calculate the test statistic, or you can find the critical value first, whichever you prefer. In this instance, I'm calculating the test statistic. We know that it is the summation of observed minus expected, squared, divided by expected. What it means is that for every observed value, we subtract the corresponding expected value, square the result and divide by that expected value. Then we go and add the next one. So it is observed minus expected, squared, divided by expected, plus observed minus expected, squared, divided by expected, for every one of them. So for 24, that will be 24 minus 24.5; you square the answer and divide by 24.5. Then you go to 32: plus 32 minus 30.8, and you take the square of that, divided by 30.8. Then you continue until you have done all of them, and you add them up. That's what your chi-square test statistic looks like. Once you work all of this out, you get 0.709. Then, to get the chi-square critical value, remember, we need to count our degrees of freedom. We have four rows and three columns. We know that it is the number of rows minus one times the number of columns minus one. We have four rows, so four minus one is three. We have three columns, so three minus one is two. Three times two is equal to six. So we go and find chi-square at 0.05 and six. When you go to the table, you find 0.05 at the top and you go down to the degrees of freedom.
Six. Where they meet, you will find that the value is 12.592. So now that we have the chi-square critical value, we are able to draw the region of rejection, because we know that our test statistic is 0.709 and our critical value is 12.592. So where does 0.709 fall? It falls in the do-not-reject area, and you are able to make that decision. That's all you do. Once you have determined your critical value, which tells you your region of rejection, you find your test statistic and you see where it falls. We can conclude by saying: since our test statistic of 0.709 is less than our critical value, at alpha 0.05, of 12.592, we do not reject the null hypothesis, and we conclude that there is not sufficient evidence that meal plan and class standing are related. And that's how we make the decision. Any questions? In the absence of questions, let's look at more examples. Here I will ask you to participate and help me complete the example, because we've already gone through one. So now we're going to do the next example together as a group. I will give you time to do each activity, and then we come back and go through it step by step together. Okay, I took this from one of the past exam papers. I removed the answer options because I want us to do this as if we don't know the options given to us. So we're going to do the six steps of hypothesis testing for chi-square. A study on the mode of transport that workers use to commute to work and the associated distance covered by each mode of transport is summarized in the table below. It shows the distance, between 0 and 10 kilometers and between 10 and 50 kilometers, and the mode of transport, by car, bus or train. They have already calculated the totals, so we don't even have to worry about that, because they did it for us.
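Looking back at the meal plan decision made above, the rule itself is short enough to sketch in Python. The sketch below is an illustration only, with made-up function names. It applies the one-sided rule to the values from the session (0.709 and 12.592) and, as a side check, confirms that the table value of 12.592 leaves an upper tail area of about 0.05. The tail formula used here is the standard closed form for an even number of degrees of freedom, so the helper deliberately refuses anything else.

```python
import math


def chi2_upper_tail(x, df):
    """Upper tail area P(chi-square > x). This sketch handles even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even degrees of freedom")
    half = x / 2
    # Closed form for even df: exp(-x/2) * sum of (x/2)^i / i! for i < df/2.
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


def decide(test_statistic, critical_value):
    """Reject only when the statistic falls beyond the critical value."""
    return "reject H0" if test_statistic > critical_value else "do not reject H0"


df = (4 - 1) * (3 - 1)      # four class standings, three meal plans
critical_value = 12.592     # table value for alpha = 0.05 and df = 6
test_statistic = 0.709      # value obtained in the session

print(df, decide(test_statistic, critical_value))
print(round(chi2_upper_tail(critical_value, df), 3))
```

In the exam the critical value still comes from the printed table; the tail check is only there to show what the upper tail area heading on that table means.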
The question we need to answer is: are mode of transport and distance covered independent? Test this hypothesis at the 5% level of significance. What will your hypotheses be? So, number one, think about it for two minutes. We need to state our null hypothesis and our alternative hypothesis. One minute, think about it. Okay, anyone: how do we state the null hypothesis and the alternative? Is my mic broken? My mic is not broken. Guys, anyone, anyone, take a step. So nobody wants to try? Yes, Oscar, you can try. You can unmute. Am I audible? Yes. All right. I do want to try, but I'm a bit lost. No, it's fine; this is the process of learning. Whether you say it right or wrong, at least you will know next time. How do you state your null hypothesis? Ash, I'm still lost. I was busy trying to do the activity that you showed us before, but I don't even get the drift. How do you state the null hypothesis for that question? We have mode of transport. Okay. Then we have distance covered. Okay. How will you state the null hypothesis? They are dependent. Your null hypothesis will say? They are dependent. In your null hypothesis, it should always say? Oh, the class, yeah, the class standing are dependent. So you will say mode of transport and distance covered are independent. That will be your null hypothesis. Okay. You can just say transport and distance are independent. The alternative will say the opposite: mode of transport and distance are dependent. That is step one. Number two, we need to state what we are given. We are told to use alpha of 0.05. We can also find the degrees of freedom from here. Our degrees of freedom will be the number of rows minus one times the number of columns minus one. So how many rows do we have?
These are rows and these are columns. So this side is columns, and this side we call rows. How many rows do we have? Two. So we have two rows, so it will be two minus one. And the number of columns? Three columns. Three minus one. So two minus one is? One. And three minus one is equal to? Two. So one times two is two. Next, we need to calculate the expected values. That's all we need to calculate here. I'm going to calculate the first one and I expect you to do the rest. I'm giving you five minutes to do them, and then I will ask you for the answers. To calculate the expected value for bus and 0 to 10 kilometers, remember we take the row total times the column total, divided by N. So what is our row total? It's 53, multiplied by a column total of 45, divided by our sample size of 150. So 53 times 45, divided by 150, gives us 15.9. I'm going to give you five more minutes. Please calculate for the car, for the train and for the 10 to 50 row. Let's say I want to calculate for car. Remember, car is the 32 there. So my row total will be 53, multiplied by my column total of 49, divided by 150. You're always going to divide by 150. So 53 multiplied by 49, divided by 150, gives us 17.3. Continue and complete the whole table. I'll give you time to complete it. Are we done? Yes, ma'am. Okay, good. What is the expected value for train, in the 53 row? I mean, for 11? 19.8. For 11, 19.8, because we say 53 times 56, divided by 150. Okay. Others, for 35, how do we calculate the expected value? 97 multiplied by 45, divided by 150, and I got 29.1. You get 29.1. The next one, for 17? 97 multiplied by 49, divided by 150. The answer is 31.7. The answer is 31.7. And the last one, which is 45.
97 multiplied by 56, divided by 150. The answer is 36.2. 36.2. So now we have what we are given and what we needed to calculate. We move to step number three, which is to state the test that you are doing. Since this is chi-square, the test will be the chi-square test. Step number four: find the critical value. For the critical value, we say chi-square at alpha and the degrees of freedom. We know that our alpha, at 5%, is 0.05, and our degrees of freedom we calculated as two. So go and find the critical value on the table and come back and tell me. You must go to the critical values of chi-square and find 0.05 in the upper tail area, then find degrees of freedom of two on the side. Where they meet, you must tell me what you see. 5.991. The critical value is 5.991. I hope everybody is following, and not only one person in the class, because if you get lost now, you won't be able to do your assignments. And if I don't get participation from all six of you, then I am not going to do your assignment questions as exercises in class. I'm going to skip those and go to the other assessments, because I do not want to do your assignment for you. You need to show initiative when we are doing this, so that you show me that you are really learning what I am teaching you. If you're not learning the skills that I'm giving you now, I need to know that, so that I don't do your assignment, because that would not be doing you justice. Okay. So I expect all six of you to participate. I do not want to hear only one voice in the whole class. I expect to hear from everybody, because there are several steps, and it cannot be that one person does all the steps for you. Okay. So, number five: we need to calculate the test statistic. Here I expect everybody to work as well.
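The totals, degrees of freedom and expected values found in steps two to four can be checked with a short Python sketch. It is only an illustration under stated assumptions: the variable names are made up, and the column order (bus, car, train) is inferred from the column totals of 45, 49 and 56 used in the session rather than read from the exam paper itself.

```python
# Observed counts from the transport example.
# Rows: distance 0 to 10 km, then 10 to 50 km.
# Columns (assumed order): bus, car, train.
observed = [
    [10, 32, 11],
    [35, 17, 45],
]

row_totals = [sum(row) for row in observed]          # one total per distance band
col_totals = [sum(col) for col in zip(*observed)]    # one total per mode of transport
n = sum(row_totals)                                  # sample size

# Degrees of freedom: (number of rows - 1) * (number of columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Expected frequency for each cell: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

print(row_totals, col_totals, n, df)
for row in expected:
    print([round(value, 1) for value in row])
```

Rounded to one decimal, the six expected values match the ones the class worked out: 15.9, 17.3 and 19.8 in the first row, and 29.1, 31.7 and 36.2 in the second.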
So, step number five: we're going to calculate our chi-square test statistic, which is the sum of observed value minus expected value, squared, divided by expected value. I'm going to set up only one part; you're going to complete the rest and do all the calculations. So our observed value is 10, minus our expected value of 15.9. Square the answer and divide it by 15.9. Plus, do the same with 32: 32 minus our expected value, which is 17.3; you square the answer and divide by 17.3. Plus, you continue: 11 minus our expected value for 11, which was 19.8, squared, divided by 19.8. Plus, you do the same for 35, which is 35 minus 29.1, squared, divided by 29.1. I expect you to complete the whole table, and when you're done, you can start answering the question. I'm giving you five more minutes to work out the answer, and when you are done, please tell me. I'm done. Yes, ma'am. Okay, others, are you done? Other people, are you busy or are you done? Still busy. Okay, how far? Okay, I'm done. Okay, since you are all done, I take it that the others are also done, because they are silent. I'm going to go answer by answer, and then we add all of them up. I hope you wrote it that way. If you didn't, I don't know how you did it, but those who wrote it answer by answer can give me the answer for number one. For 10 minus 15.9, squared, divided by 15.9, what do you get? 2.2. Let's keep it to three decimals so that we don't round off too early; let's say it's 2.189. Oh, we can keep it to two decimals, because I don't know how the others have kept theirs. Let's do it that way, since the others don't talk to me. Others, 32 minus 17.3, what do we get? I got 12.47. Okay, let's keep it to one decimal, so it will be 12.5. And the next one? 3.9.
11 minus 19.8, you get? 3.9. 3.9. 35 minus 29.1, what do you get? 1.2. 1.2. And 17 minus 31.7, squared, divided by 31.7, what do you get? 6.8. 6.8. And 45 minus 36.2, correct? Squared, divided by 36.2, what do you get? 2.1. 2.1. The last one is 2.1. Add them all together; what is the answer? 28.7. 28.7? Yes. You get 28.7. So step number six is to make a decision and conclude. If you draw it, the critical value of 5.991 will be there, and this side will be the rejection area. Our 28.7 falls in there, correct? So we reject the null hypothesis and conclude that there is sufficient evidence that mode of transport and distance covered are dependent. And that's how you do the chi-square test.
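Steps five and six of the transport example can be put together in one small Python sketch. This is an illustration under stated assumptions, not the method prescribed by the course: the function name is hypothetical, the column order (bus, car, train) is assumed from the totals used in the session, and the critical value of 5.991 is simply the table value read off in class. The expected values are kept unrounded inside the calculation, so the individual contributions can differ slightly in the second decimal from the hand-rounded ones.

```python
def chi_square_independence(observed):
    """Return (statistic, degrees of freedom, per-cell contributions)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    contributions = []
    for row, r_total in zip(observed, row_totals):
        cells = []
        for obs, c_total in zip(row, col_totals):
            exp = r_total * c_total / n          # expected frequency for this cell
            cells.append((obs - exp) ** 2 / exp)  # (observed - expected)^2 / expected
        contributions.append(cells)
    statistic = sum(sum(cells) for cells in contributions)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, contributions


# Rows: 0 to 10 km, 10 to 50 km. Columns (assumed order): bus, car, train.
observed = [[10, 32, 11], [35, 17, 45]]
statistic, df, contributions = chi_square_independence(observed)

critical_value = 5.991   # table value for alpha = 0.05 and df = 2
decision = "reject H0" if statistic > critical_value else "do not reject H0"

print(round(statistic, 1), df, decision)
```

Rounded to one decimal, the statistic agrees with the 28.7 obtained in class, which lies well beyond 5.991, so the sketch reaches the same decision to reject the null hypothesis.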