For any content-related discussion that you want to have, don't hesitate to send me an email or message me on the WhatsApp group. I've given you the 1510, the 1501 and the 1502 WhatsApp groups. Make sure that you join the group that is relevant to your module so that you don't get confused during the discussions. Okay, so let's start with today's session. Let me know if you are not seeing my presentation. Today we're going to look at how to do hypothesis testing for the difference between two population proportions. This is probably our last session in April, because the next time I see you it will be in May. The other thing I need to ask is about your module, the 1502. Is it a year module or a semester module? And if it's a semester module, when are you writing your exam? Am I alone in this? It's a year module, so you're writing in October/November. Okay, so it's a year module, so at least we have time. Thank you. I was worried that if it's a semester module we would need to re-look at the content. Okay, so in May we're going to look at those topics, and before we leave the session I will show you where to find all this information, including the notes and the recordings. I know that UNISA is having some difficulties uploading some of the videos, but I did share my YouTube channel on the WhatsApp groups. There are some videos that you can use there as well. You just need to subscribe so that every time I upload new videos you get a notification. So let's go on. At least now you know which topic we're going to be discussing when. Please remember, all the sessions start at six o'clock, unless I tell you otherwise, and end at half past seven. Do you have any questions or comments before we start with today's session?
Silence means that there are no questions, queries or comments. As usual, today you need the statistical tables. I hope you have them ready next to you. If you don't know where to find the statistical tables, sometimes they are at the back of your prescribed book, or, if you have past exam papers, they are at the back of those papers. I'm not sure if your module gives you a tutorial letter with tables. If not, those are the resources where you can find the tables. You will also need a calculator. Every time you come to these sessions you must have a calculator, and I think the best calculator to use is a Casio. Because of the complexity of the formulas that you use, a Casio can capture fractions, powers and all that in one go, and you just press equals and you get your answer, but you will need to practice using your Casio calculator. Otherwise, we're going to go step by step when we solve problems. By the end of the session today, you should know how to use hypothesis testing for the proportions of two independent populations, and also how to form a confidence interval for the difference between two population proportions. So let's learn how to do that. Because we're doing hypothesis testing, remember, there are six steps of hypothesis testing, and since you'll write a multiple choice question paper, any of those six steps can be part of the multiple choice options that the lecturer wants you to answer. So you need to know how to do all six steps and understand how to get the answer from each of them. Step number one is to state your null hypothesis and your alternative hypothesis. And remember, the alternative hypothesis carries what the researcher wants to prove, and the null hypothesis always carries the equal sign.
Then, in step number two, you need to identify the things that are given, including the level of significance and the sample sizes, and, because here we're talking about proportions, which proportions you are given as well. In step number three you need to identify the appropriate test statistic. For proportions we use only one test statistic, which is the Z test statistic, so it should be easy to do. Step number four is to determine the critical value. Here you determine the critical value of the Z test statistic using the Z table, which is also called the standardized normal distribution table. We use the critical values to identify the region of rejection. The symbol stated in your alternative hypothesis will tell you whether you are doing a one-tail test or a two-tail test, and you identify the region of rejection based on the critical values that you have. In step number five you calculate the test statistic that you identified in step number three. And in step number six you make your decision and conclude. Your decision is based on your test statistic and the critical value. If the p-value is used, you compare the p-value with the level of significance, your alpha value, to make the decision. When you make a decision, you either reject the null hypothesis or do not reject the null hypothesis. Never state it otherwise. Don't say we are not rejecting the null hypothesis, and don't say we fail to reject the null hypothesis. That's not how you are supposed to state it in your statistics module. We say we reject the null hypothesis or we do not reject the null hypothesis. Okay, so with any hypothesis test there are some assumptions, and there is a goal: you need to know what you are trying to prove.
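As a rough sketch of step six, the p-value comparison could be written in Python like this. The function names `p_value` and `decide`, the tail labels and the example numbers are assumptions made for illustration and do not come from the module. The standard normal distribution comes from Python's built-in `statistics` module.

```python
from statistics import NormalDist


def p_value(z_stat, tail):
    """P-value of a Z test statistic; tail is 'lower', 'upper' or 'two'."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "lower":
        return std_normal.cdf(z_stat)
    if tail == "upper":
        return 1 - std_normal.cdf(z_stat)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z_stat)))
    raise ValueError("tail must be 'lower', 'upper' or 'two'")


def decide(z_stat, alpha, tail):
    """Reject H0 when the p-value is smaller than the level of significance."""
    if p_value(z_stat, tail) < alpha:
        return "reject H0"
    return "do not reject H0"


# Illustrative values only (not from the session).
print(decide(2.5, 0.05, "upper"))
print(decide(1.0, 0.05, "two"))
```

The wording of the two possible results follows the rule from the session: we either reject or do not reject the null hypothesis.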
So, in terms of proportions, the goal is to test a hypothesis or form a confidence interval for the difference between two population proportions. So we say population proportion one is equal to population proportion two, that is, there is no difference between them, or we say they are different. That is what we are testing. The assumptions state that the sample size times the sample proportion should be greater than or equal to five for sample one, and sample size two times sample proportion two should also be greater than or equal to five. The same holds for the complement of both: the sample size times one minus the proportion should be greater than or equal to five. The values that you use to calculate are called the estimates: sample estimate one minus sample estimate two. These are the ones that we're going to use in the formula to decide whether we reject or do not reject the null hypothesis. Some of these might be small. Yeah, sorry. These are also what we call the sample statistics. Sample proportion one minus sample proportion two is the point estimate for the difference between the two population proportions. Okay, so in terms of the null hypothesis, we always assume that the null hypothesis is true. So we assume that population proportion one is equal to population proportion two, and we pool the two sample estimates. We call it the pooled estimate because it combines the two samples into one overall proportion. The pooled estimate, p with a bar, is given by the number of observations satisfying the condition in sample one plus the number satisfying it in sample two, divided by sample size one plus sample size two: p-bar = (X1 + X2) / (n1 + n2).
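A minimal sketch of the pooled estimate and the sample size check described above, in Python. The function names and the counts used (40 out of 100 and 20 out of 100) are made up for illustration and are not from the session.

```python
def pooled_proportion(x1, n1, x2, n2):
    """Pooled estimate p-bar = (X1 + X2) / (n1 + n2)."""
    return (x1 + x2) / (n1 + n2)


def sample_is_large_enough(x, n, threshold=5):
    """Check n * p-hat >= 5 and n * (1 - p-hat) >= 5 for one sample."""
    p_hat = x / n
    return n * p_hat >= threshold and n * (1 - p_hat) >= threshold


# Hypothetical counts, chosen only to show the calculation.
x1, n1 = 40, 100
x2, n2 = 20, 100

p_bar = pooled_proportion(x1, n1, x2, n2)
print("pooled estimate:", p_bar)
print("sample one ok:", sample_is_large_enough(x1, n1))
print("sample two ok:", sample_is_large_enough(x2, n2))
```

Both samples have to pass the check before the Z test statistic is used.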
We will use this when we calculate the test statistic as well. Okay, the test statistic that we're going to use, we said, is the Z statistic. You also need to remember all these things. In the exam you will be given all the formulas. You just need to identify the correct formula, because you're not going to be told what each formula calculates. You need to recognise that a formula is for proportions by looking at the symbols in it. The Z test statistic for proportions is calculated as the point estimate, sample proportion one minus sample proportion two, minus the hypothesized difference, population proportion one minus population proportion two. Under the null hypothesis, population proportion one minus population proportion two is equal to zero, so normally we don't even see it in the formula. That is divided by the standard error, which is a square root. Remember, we spoke about the pooled estimate. The standard error is the square root of the pooled estimate times one minus the pooled estimate times one over sample size one plus one over sample size two. Written out: Z = (p-hat 1 - p-hat 2 - 0) / sqrt( p-bar (1 - p-bar) (1/n1 + 1/n2) ). Now remember, 1502 builds on 1501. If you're doing both of them at once, I feel sorry for you, because some of these concepts are discussed in detail in 1501. UNISA should actually not allow you to register for 1501 and 1502 at the same time, because you need to apply the knowledge you gain in 1501 in 1502. In 1501 you learn about all these point estimates as well. So if the question does not give you a sample proportion, which is p-hat 1, you need to know that it is given by the number of observations satisfying the condition divided by the sample size. Take the point estimate, or sample statistic, p-hat 1 from sample one.
It is given by the number of observations satisfying the condition in sample one divided by the sample size of sample one: p-hat 1 = X1 / n1. That gives you the point estimate. So if they didn't give you this, you need to calculate it using the observations given divided by the sample size, for both samples. Okay, so let's look at how we do the hypothesis testing. Remember, one of the steps is to identify your critical values, but first you always state your null hypothesis and your alternative hypothesis. If your null hypothesis states that population proportion one minus population proportion two is greater than or equal to zero, then the alternative states that population proportion one minus population proportion two is less than zero. That is a one-tail test, and we concentrate on the symbol in the alternative, the less than. The less than tells us where our region of rejection is: because it points to the left, our region of rejection is on the left. We put our critical value there to define the region of rejection. When we make decisions, if our test statistic falls in the blue shaded area, we reject the null hypothesis. Otherwise, if it falls in the white area, we do not reject the null hypothesis. In words, we state it this way: we reject the null hypothesis if the Z test statistic is less than the negative Z critical value. Now, because it's a one-tail test, when we find the critical value we do not divide alpha by two. Only when it is a two-tail test do we divide alpha by two. For the upper-tail test, which is also a one-tail test, we look at the alternative. When the alternative says greater than, the region of rejection is on the right-hand side.
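The Z test statistic and the lower-tail rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the counts (45 out of 100 and 60 out of 100) are hypothetical, and the critical value is taken from the inverse of the standard normal distribution instead of being read off a printed Z table.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Z = (p-hat 1 - p-hat 2 - 0) / sqrt(p-bar (1 - p-bar) (1/n1 + 1/n2))."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # pooled estimate
    std_error = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat - 0) / std_error


def lower_tail_rejects(z_stat, alpha):
    """One-tail (lower) test: reject H0 when Z is below the negative critical value."""
    negative_critical = NormalDist().inv_cdf(alpha)  # alpha is NOT divided by two
    return z_stat < negative_critical


# Hypothetical counts, chosen only to show the calculation.
z_stat = two_proportion_z(45, 100, 60, 100)
print("Z stat:", round(z_stat, 3))
print("reject H0 at alpha 0.05 (lower tail):", lower_tail_rejects(z_stat, 0.05))
```

For an upper-tail test the comparison flips: reject when the Z stat is greater than the positive critical value.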
And again, if your Z test statistic falls in the shaded blue area, we reject the null hypothesis. Otherwise, we do not reject the null hypothesis. For a two-tail test, the alternative is given by not equal. That means there are two tails, two regions of rejection, where we can make a decision. So our alpha is divided by two, and we find the critical value based on alpha divided by two. Once we have the critical values, we have two areas where we can reject. You go and calculate the Z test statistic. If it falls in the white area, we do not reject. Let's say you calculated your Z stat and you found that it was minus 1.23. Because it's negative, you look at the left-hand side, and if it falls in the shaded area there, beyond the negative critical value, you reject. If your Z stat was 1.23, it is on the positive side, so you look at the right-hand side, and if it falls in the shaded area there, you also reject. So depending on the answer you get, you look at whether it falls in the rejection region on the right-hand side or on the left-hand side. Otherwise, if it falls in the white area, you do not reject the null hypothesis. So: Z stat less than the negative critical value, we reject; Z stat greater than the positive critical value, we reject. You have two options to reject your null hypothesis. Now let's look at an example. Is there a significant difference between the proportion of men and the proportion of women who will vote yes on proposition A? So we need to find out whether there is a difference between the proportion of men and the proportion of women who vote yes on this proposition. We are also given some facts with this statement. They will not just give you the statement, they will also give you some facts to help you answer the question. In a random sample, 36 of 72 men and 31 of 50 women voted yes.
So 36 men voted yes and 31 women voted yes. Now, they didn't give us the proportions here, they just gave us the observations for sample one, which is the men, and sample two, which is the women. For the men, the number of observations is 36 and the sample size is 72. For the women it is 31 out of 50. So there were 72 men and 50 women. We now know our X1 and n1 and our X2 and n2. That's all we know right now. They also give us an alpha of 0.05, and they ask, is there a difference? That means we need to do a hypothesis test. And here I'm already applying the error analysis method prompts, because I have identified the values that I need. Now I can go and do my hypothesis test. The first step is to state the null hypothesis and the alternative hypothesis. The null hypothesis says there is no difference between population proportion one and population proportion two. It means that both proportions are equal. My alternative states that the proportions are not equal. The reason I'm picking a two-tailed, not equal, alternative is that the statement didn't say the men are more than the women, or the women are fewer than the men, or something like that. They didn't give you those clues that point to a direction. It's just a general equal or not equal. Okay, so this is a two-tailed test. That means I'm going to have two regions of rejection, so when I find my critical value I'm going to divide my alpha by two. And we know that our alpha is 0.05, right? That's what they gave us, 0.05. Okay, so now I must state what else I am given. They didn't give us the sample proportions, but we can calculate them, because they gave us the observations that go with them.
We know that for men we had 36 out of 72, which is 0.5, or 50%. And for women, 31 out of 50 is 0.62. So we do have our sample proportions. We also need to calculate our pooled variance, remember? Sorry, our pooled estimate, which is X1 plus X2 divided by n1 plus n2. So we say 36 plus 31 divided by 72 plus 50, and we get our pooled p with a bar, p-bar, which is 0.549. Now we almost have everything that we need, so we can continue. In this instance I'm going to calculate the test statistic, and we just substitute the values. p-hat 1 was 0.5, minus p-hat 2, sample proportion two, which was 0.62. The hypothesized difference in proportions, remember, is always equal to zero, because we take it from the null hypothesis. Divide by the square root of our pooled estimate, 0.549, times one minus our pooled estimate of 0.549, times one over 72, which was our sample size one, plus one over 50, which is our sample size two. You can put this into your Casio calculator, the one with the fraction button. When you get the answer, it will give you minus 1.31. Next, finding the critical value. To find the critical value, remember, for a two-tail test we use Z of alpha divided by two. We were told that our alpha is 0.05, so we divide it by two and find our critical value using Z of 0.025. Now, to find this Z value we need to go to the Z table. Depending on your module, your Z table will look like this: the rows down the side start with 0.0, 0.1, 0.2 and so forth, and the columns across the top start with 0.00.
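To check the substitution for the voting example, the same calculation can be written out in Python. This is only a sketch: the variable names are my own, and the pooled estimate is carried at full precision instead of being rounded to 0.549 first, which does not change the answer to two decimals.

```python
import math

# Voting example: 36 of 72 men and 31 of 50 women voted yes.
x1, n1 = 36, 72
x2, n2 = 31, 50

p1_hat = x1 / n1                   # sample proportion for men
p2_hat = x2 / n2                   # sample proportion for women
p_bar = (x1 + x2) / (n1 + n2)      # pooled estimate

std_error = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
z_stat = (p1_hat - p2_hat - 0) / std_error

print("p-hat 1:", p1_hat)
print("p-hat 2:", p2_hat)
print("p-bar:", round(p_bar, 3))
print("Z stat:", round(z_stat, 2))
```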
Then you will have 0.01 and so forth across the top. Now, to find the critical value, we go inside the table and look for 0.025. Depending on the table that you are using, the values are given to four decimals, so it will be 0.0250. You take this value, go outwards along the row, and find the value that it corresponds with, which is 1.9. Then you go up the column, and at the top you will see 0.06. Together that gives you 1.96, and because we're doing a two-tail test, you put plus or minus in front, because there is one on the negative side and one on the positive side. That's how you find your critical value. I'm going to clear all of this. So now we have our critical values. We mark minus 1.96 and plus 1.96 and we shade the areas beyond them. Now we need to find out where minus 1.31 is. Our minus 1.31 falls in the do not reject area. It's not bigger than 1.96, and it's not less than minus 1.96; it lies between them. So it falls in the do not reject area. The decision is that we do not reject the null hypothesis, and in conclusion we state that there is not sufficient evidence of a difference between the proportions of men and women who will vote yes. That's how you make a decision in hypothesis testing. So let's do some exercises. Consider the following results from independent samples taken from two populations. The key words here are independent samples selected from two populations.
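Going back to the voting example for a moment, the table lookup and the decision can be mirrored in Python. This is a sketch under one assumption: the critical value comes from the inverse of the standard normal distribution in Python's `statistics` module, which plays the role of the printed Z table. The Z stat of minus 1.31 is the value from the session.

```python
from statistics import NormalDist

alpha = 0.05
z_stat = -1.31  # test statistic from the voting example

# Two-tail test: alpha is split in two, so we look up the point that
# leaves alpha / 2 = 0.025 in the upper tail.
critical = NormalDist().inv_cdf(1 - alpha / 2)

reject = z_stat < -critical or z_stat > critical
decision = "reject H0" if reject else "do not reject H0"

print("critical values: plus or minus", round(critical, 2))
print("decision:", decision)
```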
Now, looking at the table, we are given sample one and sample two, the sample sizes, and the other key word here is proportion, the sample proportion. That means we're not doing a t test, even though they said independent samples. In the first session that we had, we looked at independent samples and calculated the t test, because there we were given the standard deviations and the means. Here you are not given that, but you are given the proportions. So clearly here we're doing inference about two population proportions. Okay, so what else are we given? We are given the sample sizes, the sample proportions p-hat 1 and p-hat 2, and the observations, the number of successes X1 and X2. The professor wants to investigate whether there is a difference between the two population proportions, and we are told the level of significance, which is our alpha. The question asks whether the test statistic for the difference between the proportions is this value. That means we need to calculate the test statistic for the difference between the two population proportions. That's what they are asking us to calculate. We need to calculate Z stat, and that is sample proportion one minus sample proportion two. I'm not going to put the minus zero, because it doesn't add any value to the formula. Divide by the square root of the pooled estimate times one minus the pooled estimate times one over n1 plus one over n2. Now you can substitute the values. The only thing that they didn't give us here is the pooled estimate, so we need to calculate it. p-bar is equal to X1 plus X2 divided by n1 plus n2.
That's what we know from the formula. Our X1, they gave us, is 192, and X2 is 108. Divide by n1 plus n2, which is 400 plus 300. Take your calculator. If you have a Casio calculator, you can use the fraction mode. If you don't have a Casio, then you have to calculate the top first, 192 plus 108, which is 300, and divide by 400 plus 300, which is 700. I hope you get the same answer as me. You need to be able to calculate this. I'm going to leave it to four decimals: 0.4286. So we can come back and substitute. Our point estimate one is 0.48, minus point estimate two, 0.36. Divide by the square root of 0.4286 times one minus 0.4286, times one over sample size one, which is 400, plus one over 300. Okay, I'm going to use my fraction mode because the equation is very long. 0.48 minus 0.36, divided by the square root of 0.4286. I am using a Casio. You must let me know if you're not getting the same answer as me. Don't just take everything as I tell you; I want you to also do the calculations. Let me clear everything. Square root of, just give me a sec. If you already have the answer, you must also let us know. 0.4287, what am I doing? That is 0.4286, times, open bracket, one minus 0.4286, close bracket, open bracket, fraction, one over 400, plus one over 300. What do you get? You can post it on the chat. If you're getting a different answer to me, well, I'm also getting a different answer to myself, of course. What do you get? My answer is not what I am looking for; I get 3 point something. Unless I rounded off too early on the answer that we got there. Let's do it this way. I will do it step by step, because I'm not getting the answer that I have there.
0.48 minus 0.36, since you guys are also not talking to me, is 0.12. Unless you guys have disappeared and I'm alone here. We are with you. Okay. Is anyone getting a different answer? One over 400 plus one over 300, times 0.4286, times, open bracket, one minus 0.4286, close bracket. Take the square root of the answer. Why am I not getting the answer? Wait, wait, wait. One divided by 400, plus one divided by 300, times 0.4286, times, open bracket, one minus 0.4286, close bracket, equals. What answer do you get? Give me a second. Okay. I'm trying to use the calculator on my phone but I'm getting errors. I'm going to get the real Casio, because I was trying to use the one on my phone. I think now I broke it. I don't know how, but it seems I broke my calculator. Wonderful. Thank you. I usually have an online calculator, but at the moment it's not working well. So let's do this for the last time. I'm going to use the Casio. 0.48 minus 0.36, divided by the square root of. Instead of calculating the point estimates manually, I will use the values that we have here. For the pooled estimate, instead of leaving it to four decimals, I'm going to leave it to five decimals and see if it makes any difference. Divide by the square root of 0.42857 times, open bracket, one minus 0.42857, close bracket, open bracket, one over 400 plus one over 300; the order doesn't really matter. Let's see if I get the same answer. Equals, and my answer is 3.17. I still get the same answer, which is not among the options here. So my calculator was right. So probably there is something wrong with the answers that they have here. Unless they rounded off too early. I'm just going to check what happens if I round off some of these values.
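The rounding question raised here can be checked with a short Python sketch. It recomputes the Z stat for this exercise (192 out of 400 and 108 out of 300) with the pooled estimate rounded to two, four and five decimals and with no rounding at all. The function name and the loop are my own illustration, not part of the module.

```python
import math


def z_from_pooled(p1_hat, p2_hat, p_bar, n1, n2):
    """Z stat for a given (possibly rounded) pooled estimate."""
    std_error = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / std_error


x1, n1 = 192, 400
x2, n2 = 108, 300
p1_hat, p2_hat = x1 / n1, x2 / n2       # 0.48 and 0.36
exact_p_bar = (x1 + x2) / (n1 + n2)     # 300 / 700

pooled_versions = [
    ("2 decimals", round(exact_p_bar, 2)),
    ("4 decimals", round(exact_p_bar, 4)),
    ("5 decimals", round(exact_p_bar, 5)),
    ("no rounding", exact_p_bar),
]

results = {}
for label, p_bar in pooled_versions:
    results[label] = round(z_from_pooled(p1_hat, p2_hat, p_bar, n1, n2), 2)
    print(label, p_bar, results[label])
```

Every version gives the same Z stat to two decimals, so early rounding of the pooled estimate does not explain a mismatch with the listed options.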
So I get 3.17; I hope that is what you also get. I've also tried it three times and I get the same answer. Now I just want to round off the pooled estimate that we have here, because sometimes that makes a difference. Let's round it to 0.43, and I must do the same in both places. Delete, delete, delete, 0.43, and answer. Let's see what we get. Equals. I still get the same 3.17. So probably there is something wrong with the answers here. It is 3.17. Unless it's because we rounded off too early. Let's see what happens if I don't round off at all, because if you round off too early you sometimes don't get the correct values. Let's do all of it. We have 192 plus 108, equals, divide by 700, equals. I'm just going to use all the digits that I have, 0.428571429, and see if it makes any difference. I know that I always tell people, please do not round off too early; write out all the digits and only round off when you get to the final answer. Nope, that is not the case here. I still get the same answer. So probably the options here are incorrect. Okay, I'm not going to waste any more time on that question. Let's look at number two. A sample of size 150 from a population has 40 successes. So we are given n and we are given X. And a sample of size 200 from another population has 30 successes. So the first one is n1 and X1, and the next one is n2 and X2. That's what we are given. The question is about testing the hypothesis that the proportion of successes in population one exceeds the proportion in population two. Exceeds.
It means it's greater than the proportion of successes in population two. That's what they say: population one exceeds population two. And remember, like I said, you need to learn this from STA 1501. Your null hypothesis should always have an equal sign. Because what the researcher wants to prove cannot go into the null hypothesis, it goes into the alternative hypothesis, and the null hypothesis is the statement that we set up against it. Okay. Which one of the following statements is incorrect? Before I can answer this question, I'm going to take a step back and do the hypothesis test. Step number one is to state your null hypothesis and alternative. Your null hypothesis states that population proportion one minus population proportion two is equal to zero. Your alternative is that population proportion one minus population proportion two is greater than zero. It's greater than zero because that's what the researcher wants to prove. It doesn't really matter whether you write less than or equal to or just the equal sign in your null hypothesis, because the null hypothesis always has an equal sign in it. So we always put the equal sign, and we teach this in STA 1501. Okay, so that's the first step. Step number two: what is my alpha? Alpha will be given somewhere in the question. They gave us our alpha, which is 0.05. I can calculate my pooled estimate, but before that I need to calculate my p-hat 1. What is my p-hat 1? It's X1 over n1, which is 40 over 150. And what is the answer? You guys want me to give you all the answers. 40 divided by 150 is 0.2666 recurring. Okay. Keeping two decimals is fine, or let's say four decimals.
They keep four decimals: 0.2666 recurring, and to four decimals the last digit becomes seven, so 0.2667. Now let's do p-hat 2, which is X2 divided by n2. X2 is 30, divided by 200, which is 0.15. Now let's calculate the pooled estimate. We are still on step two; we're just going to do everything in step two. It is X1 plus X2 divided by n1 plus n2. That is 40 plus 30 divided by 150 plus 200, which is 70 divided by 350, and that is equal to 0.2. So we're done with step number two. Step number three is to find the critical value. We are doing a one-tail test, so we find the critical value using Z alpha, and we know that alpha is 0.05. I need to find a table. Do you have your table? Go and look for the Z table. I keep on misplacing my past exam paper. Do you have your tables in front of you? For some reason I don't have mine. Just give me a second. The last time we met we did look at a table, right? The t table. For some reason today that table is missing in action. I'm going to show you from this 1510 table. I hope it looks exactly the same as your tables. Let's see if they have a table at the back. Yes, they do. It is usually called the cumulative standardized normal distribution table. It's the Z table. It has a positive and a negative side, and we're going to look at the negative side of that table. Okay, so this is what it looks like, the cumulative standardized normal distribution. We're looking for 0.05, remember? So you come inside the table and look down through the values until you get close to 0.05.
There we go, I found it: 0.0495. Next to it there is 0.0505, which has already passed 0.05. Neither of them is exactly 0.05, which lies halfway between the two. We can use this one, 0.0495. If we take that value and go out along the row, remember we first go this way, we find 1.6. You must write that number down; you can ignore the negative sign in front. Then you go up the column and take only the last digit, which is 5, and put it next to the 1.6. That gives 1.65, and because 0.05 lies halfway between the entries for 1.64 and 1.65, the value normally used is 1.645. So that is our critical value. Now let's go back to our presentation. Okay, so now we know that our critical value is 1.645. That is based on everything that we are doing here. Right, don't look at the question options yet. Step number four is to calculate the test statistic. Oh, we can also already mark the region of rejection here, because we can draw it: the alternative says greater than, so the region of rejection is on the right-hand side. Here we have 1.645, and if the test statistic falls beyond it, we reject the null hypothesis. So let's calculate the test statistic. Step number four: the Z test statistic is p-hat 1 minus p-hat 2, divided by the square root of our pooled estimate, oh, why am I putting in the values already? Sorry, my bad. I'm writing the formula first: the square root of p-bar times one minus p-bar times one over n1 plus one over n2. Now we substitute our estimates. We calculated them: p-hat 1 is 0.2667, minus p-hat 2, which is 0.15, over the square root of 0.2 times one minus 0.2 times one over 150 plus one over 200. Let's do the calculation: 0.2667 minus 0.15, divided by the square root of, open bracket,
It's fine. Point two, open bracket, one minus point two, close bracket, open bracket, fraction, one over 150, plus fraction, one over 200, close bracket, and answer. What do you get? I get 2.701. I'm going to leave it at that. What do you get? Do you also get the same answer? Maybe I've done something wrong. Have you calculated? Okay, silence means, I don't know. Nobody is saying anything. Are we still calculating or are we not calculating? Still calculating. Okay, you need to tell me, Galoko, because if you are all quiet, then I don't know what's happening. Yes, I'm getting 2.7010802. You do get the same, okay.
Can you also calculate just the value underneath the square root, only the square root part? I just realized we also need to find the standard error. Do you have the answer? Zero comma zero four, some number nine. 0.043. 0.043, yes.
Okay, so let's see how we answer the questions. Now we have all the information that we require. I'm going to use another pen, another colour. So let's look at this. The question is asking, which one of the following statements is incorrect? So we're looking for the incorrect one. Number one states that proportion one is 0.2667 and proportion two is 0.25. What do we have? 0.15. 30 divided by 200, that's what we calculated, right? 0.15, yes. So this is incorrect, because it says 0.25 and it should be 0.15: the sample of size 200 from population two has 30 successes. So that is incorrect, and we could probably stop right there, because that is the incorrect answer. But I'm having a problem with this question, because if you go on, other options are also incorrect, unless the question was not asking what is incorrect.
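The arithmetic for steps two and four can be reproduced with a short script. This is a minimal sketch using only the counts quoted in the session (40 out of 150 and 30 out of 200); the variable names are illustrative, and p-hat one is rounded to four decimals first because that is how it was keyed into the calculator.

```python
from math import sqrt

# Counts from the worked example: 40 successes out of 150, 30 out of 200.
x1, n1 = 40, 150
x2, n2 = 30, 200

# Step 2: sample proportions and the pooled estimate.
p1_hat = round(x1 / n1, 4)  # rounded to four decimals, as in the session
p2_hat = x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)

# Step 4: standard error under the null hypothesis, then the z statistic.
std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z_stat = (p1_hat - p2_hat) / std_error

print(f"p1_hat    = {p1_hat:.4f}")
print(f"p2_hat    = {p2_hat:.4f}")
print(f"p_pooled  = {p_pooled:.4f}")
print(f"std_error = {std_error:.4f}")
print(f"z_stat    = {z_stat:.4f}")
```

Carrying the unrounded p-hat one through instead changes the statistic only in the fourth decimal, so the comparison with the options is not affected.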
So this is what you're going to get from your past exam papers as well. The next option says the estimated standard error of p-hat one minus p-hat two is 0.19. We found that it is 0.043, so that is incorrect, right? The test statistic: they say it is 2.67, and we calculated 2.701, so that is also incorrect. I'm going to skip this one and come here. The rejection region at the alpha level for a two-tail test will be given by those two values. Now, we know that this question is asking for a one-tail test, but here they say a two-tail test. For a two-tail test, the z critical value uses alpha divided by two, which is 0.025, and therefore the critical value will be 1.96. And that's where the challenge is with this question.
If you look at those two options, they are probably both correct, because for the proportions, if we use the standardized normal distribution, it means our population needs to be normally distributed, and if it's not normally distributed, the sample sizes must be large. If you look at our sample sizes, 150 and 200, n is greater than 30, which is large enough. Otherwise we can look at the other assumptions, because the other assumption says n one times p one should be greater than or equal to five. Oh, sorry, n times p, but we were not given p either. So between those two, if we go with the option about the population not being normally distributed, let's go back to the first slide that we had, where the assumptions do not state that the population needs to be normally distributed. But usually for the z distribution your population has to be normally distributed, because we use the cumulative standardized normal distribution, which is a z-score. So that's two questions already where there are challenges.
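A table lookup can be cross-checked with the inverse of the standard normal distribution. The sketch below uses Python's standard library and assumes the significance level is the 0.05 mentioned in the session; the variable names are illustrative. Under that assumption the one-tail critical value comes out near 1.645 and the two-tail value near 1.96, while 2.58 corresponds to an upper-tail area of about 0.005, so it is worth confirming which alpha the question specifies.

```python
from statistics import NormalDist

alpha = 0.05  # assumed significance level, as quoted in the session
std_normal = NormalDist()  # mean 0, standard deviation 1

# Upper-tail critical value for a one-tail test (all of alpha in one tail).
z_one_tail = std_normal.inv_cdf(1 - alpha)

# Critical value for a two-tail test (alpha split equally between the tails).
z_two_tail = std_normal.inv_cdf(1 - alpha / 2)

# Upper-tail area beyond 2.58, to see which alpha that value corresponds to.
tail_area_258 = 1 - std_normal.cdf(2.58)

print(f"one-tail critical value: {z_one_tail:.3f}")
print(f"two-tail critical value: {z_two_tail:.3f}")
print(f"area above 2.58:         {tail_area_258:.4f}")
```

The test statistic of about 2.70 is larger than all three of these values, so the decision to reject the null hypothesis is the same whichever one applies.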
Okay, so let's hope the other questions don't have the same problems that we see here. So either one or two will be correct. If we base the answer only on the question as given, this is not a two-tail test, it's a one-tail test, so this one will be incorrect and that one will be incorrect. But nowhere do the assumptions state that the samples have to be normally distributed as well. So, okay, moving on to the next one.
Let's look at this question. I'm not going to do all the steps of the hypothesis test. We're going to go through the statements one by one, but I just gave you some insight: when you answer the questions, you can apply the hypothesis testing steps first and then come and answer the question, because that makes it easier for you to identify where the errors are, or which one is the correct one and which one is the incorrect one.
Okay, let's look at this one. Consider a hypothesis test for the population proportions with the null hypothesis that p one is equal to p two, which we can also write as a null hypothesis stating that p one minus p two is equal to zero. Given the following information: your numbers of successes, x one and x two, are given, and the independent sample sizes n one and n two from the two populations are given. We use the data above to decide whether the percentage of population one is less than, and this is a keyword, less than the percentage of population two. So it means in our alternative hypothesis we'll state that p one is less than p two, or we can state it in this manner: H naught is p one minus p two is less than zero. You can state it in that manner. So which one of the following statements is incorrect? I've already answered... oh, sorry, this is H one, not H naught. I've already answered number one, because I've stated number one here. So this is correct. Which one of the following statements is incorrect?
We're looking for the one that is not correct. The sample proportion p-hat one is equal to 0.5 and p-hat two is 0.6. So you need to go and calculate p-hat one, which is x one over n one. X one, you are given 10, and n one is 20, so that will be 10 divided by 20, which is 0.5. And p-hat two is x two over n two, which is 18 over 30, which is 0.6. Therefore it means this is also correct, right? Then we move to the next one. The z test statistic for the two proportions is appropriate. We know that for proportions we always use the z test, right? So that is also correct. When we identify the test statistic, it's always going to be a z for the proportions. The pooled estimate for the population proportion is 0.56. So we need to go and calculate the pooled estimate, which is x one plus x two divided by n one plus n two, which is 10 plus 18 divided by 20 plus 30. It's 0.56, which means that is correct. Then they say find the standard error. We know that the standard error is the square root of your pooled estimate times one minus the pooled estimate, times one over n one plus one over n two. So it will be given by the square root of 0.56 times one minus 0.56, times one over 20 plus one over 30. What do you get? Do you have an answer? I get 0.14329 and some numbers, which is totally different to what they give. Do you also get the same? Yes. Okay, so that is the incorrect answer.
So let's look at the next one. A sample of size 100 selected from one population has 50 successes, and a sample... I think we did this. Oh no, no, this is a different one. And a sample of size 150 selected from the second population has 90 successes. So x one is 50 and n one is 100, x two is 90 and n two is 150. You need to go and find the pooled proportion. I am not going to calculate it for you. I'm just going to give you the formula.
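The option-by-option check for the question with 10 successes out of 20 and 18 out of 30 can be sketched as follows. Only the counts come from the session; the variable names are illustrative.

```python
from math import sqrt

# Counts from the question: 10 successes out of 20, 18 out of 30.
x1, n1 = 10, 20
x2, n2 = 18, 30

# Sample proportions quoted in the first option.
p1_hat = x1 / n1
p2_hat = x2 / n2

# Pooled estimate of the common proportion under the null hypothesis.
p_pooled = (x1 + x2) / (n1 + n2)

# Standard error of the difference, based on the pooled estimate.
std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

print(f"p1_hat    = {p1_hat:.2f}")
print(f"p2_hat    = {p2_hat:.2f}")
print(f"p_pooled  = {p_pooled:.2f}")
print(f"std_error = {std_error:.5f}")
```

Each printed value can then be compared with the number stated in the corresponding option to find the one that does not match.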
It should be quick and easy to calculate. Do you have the answer? 0.56. So you would have said 50 plus 90, divided by 100 plus 150, right? Yes. Which will give you 0.56, which is option four.
Okay, I know that time has gone, but in terms of the confidence interval: we concentrated more on the hypothesis testing activities, but the same principles also apply when you need to calculate confidence intervals. It will be sample proportion one minus sample proportion two, plus or minus the critical value, and here we have to find the critical value by dividing alpha by two, times the standard error, which is the square root of p-hat one times one minus p-hat one, divided by n one, plus p-hat two, which is sample proportion two, times one minus sample proportion two, divided by n two.
So let's use the same information that we had previously, when we did the hypothesis test. We need to construct the confidence interval on the same information. We know our x one, n one, x two and n two, and we are told what alpha is, so we can calculate p-hat one. We know it's 36 divided by 72, and we did find that it was 0.5, and p-hat two, which is 31 divided by 50, we found was 0.62. So it's easy to calculate the confidence interval. Remember it's p-hat. So it's p-hat one minus p-hat two, plus or minus, because a confidence interval has both sides: we find the lower limit and then the upper limit. So we find the minus side and the plus side; plus or minus defines your upper and lower limits. Then z alpha divided by two, which means we find the critical value after dividing alpha by two. We know that the critical value for alpha divided by two is 1.96, from the previous activity that we did. That is multiplied by the square root of p-hat one times one minus p-hat one, divided by n one, plus p-hat two times one minus p-hat two, over n two. And that is the formula. Let's go back and check against the slide: p-hat one, one minus p-hat one. So we are on the right track.
Okay, so now we can just substitute the values. P-hat one is 0.5, minus 0.62, plus or minus our critical value of 1.96, times the square root of 0.5 times one minus 0.5, everything over n of 72, plus 0.62 times one minus 0.62, everything over n of 50. So first let's do the calculation. Let's remove the brackets and then work out the values after the plus or minus. We say point five minus point six two is equal to minus 0.12. So I'm just going to write here minus 0.12, plus or minus, and now I'm going to do everything behind the plus or minus. So it's one point nine six, times, the square root of, open bracket, and the first one will be a fraction, which is point five, open bracket, one minus point five, close bracket, divided by 72, and then plus, fraction again, point six two, open bracket, one minus point six two, close bracket, divided by 50, and I need to close the bracket, so let's go one up, one up, close the bracket, and equals. And I get 0.177314, plus other numbers; I'm just going to stop right there. So in terms of the confidence interval, we need to find the upper limit and the lower limit. The sides might change around because we have a minus there.
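The margin of error just computed, and the interval limits that follow from it, can be checked with a few lines of code. This is a minimal sketch using the counts quoted in the session (36 out of 72 and 31 out of 50) and the critical value 1.96 read from the table; the variable names are illustrative.

```python
from math import sqrt

# Counts from the session: 36 successes out of 72, 31 out of 50.
x1, n1 = 36, 72
x2, n2 = 31, 50
z_crit = 1.96  # two-tail critical value for alpha = 0.05, as read from the table

p1_hat = x1 / n1
p2_hat = x2 / n2
diff = p1_hat - p2_hat

# For a confidence interval the standard error uses each sample's own
# proportion (unpooled), unlike the pooled version used in the z test.
std_error = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
margin = z_crit * std_error

lower = diff - margin
upper = diff + margin

print(f"difference = {diff:.4f}")
print(f"margin     = {margin:.4f}")
print(f"interval   = ({lower:.4f}, {upper:.4f})")
```

Because the difference is negative, the lower limit comes from subtracting the margin and the upper limit from adding it; an interval that contains zero is consistent with no difference between the two proportions at that confidence level.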
So I'm going to start with the minus side first: minus 0.12 minus 0.1773147. And then I must do the other side, so this is a comma, let me put it somewhere here so you can see: minus 0.12 plus 0.1773147. So we can calculate that. Minus point one two minus point one seven seven three one four seven, equals, and the answer on the lower side will be minus 0.2973, I'm just going to keep four decimals. And on the upper side we'll have minus point one two plus point one seven seven three one four seven, which is 0.0573. And that is your confidence interval, and that's how you will answer the questions relating to confidence intervals. So you can answer this using the same question: we have already calculated the sample proportions, and you just use that same information to find your confidence interval. The formula looks almost exactly the same as the one for the test statistic.
And that concludes today's session. Any questions, any queries, any comments? Unfortunately, I am so sorry about the two questions that we could not answer because of the errors in the questions. I will remove them for future reference, because I didn't expect that we were going to have problems with those questions. Any questions? I have posted the link to the register in the chat. Please make sure that you complete the register. If there are no questions or comments or queries, I will see you next Tuesday. Have a lovely evening. Bye. Thank you. Thank you so much. Bye.