So we can move forward. Thanks. Let's continue and look at question five. Question five is on normal probabilities. You are given all the information required: the data was collected using five questions on a five-point scale, it is normally distributed with a mean of 13 and a standard deviation of 3, and n is your number of cases. So the first question is, what is the proportion? Remember, we covered this during our session. A proportion is a decimal, a percentage means you multiply the proportion by 100, and the number is the actual count that we need to calculate.

You also need to know how to read the table. We're going to use the normal distribution table, where we use the smaller portion, the larger portion, and the mean to z columns to find the probability, or the proportion, that we are looking for. Then we also need to remember the z-score formula. It looks like this: z equals x minus the mean, divided by the standard deviation. The n of 100 we're going to use when we calculate the last value, but for now we don't need it. You are given your mean, you are given your standard deviation, and your x value will always be the value in the question. The other thing to take into consideration when you answer the question is the sign: greater than, less than, or between.

So how will we answer 5.1? Are you guys back? Am I talking to myself? Did I mute myself? Okay, back to the board. So that will be z greater than x minus the mean, divided by the standard deviation. Then I say 7 minus 13, divided by 3. Then I got minus 6 over 3, which equals minus 2.
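The z-score step just worked through can be sketched in a few lines of Python. This is only an illustration: the helper name `z_score` is made up for this sketch, and the mean of 13, standard deviation of 3 and x value of 7 are the ones given in the question.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Standardise a raw score: z = (x - mean) / sd."""
    return (x - mean) / sd


# Question 5.1: x = 7, mean = 13, standard deviation = 3.
z_for_7 = z_score(7, 13, 3)
print(z_for_7)  # -2.0
```

The sign of the result matters: a negative z means the raw score lies below the mean, which is what decides which side of the graph you shade.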
So I don't have to write out all of this z working, because I'm running out of space anyway. Okay, then I went to the z table to check where 2 goes, and under which column. Then I got 0.977. Because the sign is greater than and z is a negative value, do we go to the larger portion or the smaller portion? Is it the smaller portion? If I draw a graph, which we always use, negative 2 will be here. And because it says greater than, we're looking for this area. Yes. So what does this area mean? This is the bigger area. It's the larger portion. So we need to go to the larger portion column, and in the table we're looking for 2. You need to scroll until you see 2. That will be on the second page. You can find the table in your study guide, if they give you one, or in your tutorial letters. This one I'm using is from the past exam papers. So 2 is there, and we're looking at the larger portion. And that will be the answer that you are looking for, the proportion. Now, because I am running out of space, I'm not writing this out properly, but you should write it in a way that shows how you answered the question. The whole answer should be: the probability of z greater than minus 2 equals 0.97725. How many sevens? Two sevens, then two, five. That's how you write your answer as a proportion, and that's what they are looking for. So that is 5.1.

Let's go to the next one. I'm also going to run out of space here, so I'll see how I can squeeze it in. Number two is z for less than 19. Here our x is 19, minus the mean of 13, divided by 3. That gives 6 over 3, which equals 2. Therefore the probability that we're looking for is the probability that z is less than 2. Is that correct? Yes.
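If you want to check a table lookup, the larger portion can be reproduced with the standard normal cumulative probability. The sketch below assumes Python's standard `math.erf`; the function name `normal_cdf` is just a label chosen here, and the printed table in your study material may round slightly differently.

```python
import math


def normal_cdf(z: float) -> float:
    """P(Z < z) for the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# 5.1: P(Z > -2), the larger portion for z = 2.
p_greater_than_minus_2 = 1.0 - normal_cdf(-2.0)

# 5.2: P(Z < 2), the same area by symmetry of the curve.
p_less_than_2 = normal_cdf(2.0)

print(round(p_greater_than_minus_2, 5))  # 0.97725
print(round(p_less_than_2, 5))           # 0.97725
```

Both lookups land on the same larger-portion value because shading right of minus 2 and shading left of plus 2 cover mirror-image areas.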
So what is that probability? You can do the same and draw yourself a graph that represents it. And the z value is the same size, isn't it? Because now we're looking at positive 2. We go to the positive 2 side, and we're looking for less than. Less than means this area, to the left of positive 2, which is the larger portion as well. Since it's the larger portion, the value is the same: 0.97725. But remember, the question asks for the percentage of students. So we have to multiply this by 100. And the answer here will be? 97.725%. Or you can give it to two decimals, which will be 97.73%, or to one decimal, which will be 97.7%. And that answers 5.2.

Now, 5.3 is asking you to find the actual number of students that fall between the two values. What will be the number of students with a raw score between 7 and 19? Since we have already calculated the z-scores for both 7 and 19, all we need to do is use the mean to z values for 7 and 19. Both z-scores have size 2, so the mean to z value is the same for both. Because 7 and 19 lie on opposite sides of the mean, we add the two values together. We know that to calculate a between, we use mean to z, isn't it? This will not be a very difficult one, because the mean to z for both of them will be the same. So let's go to the mean to z for 2. Let's use this space for it. We want the probability that X, not Z, lies between 7 and 19. Sorry about the flickering. I'm not going to repeat the z-score calculations. I'm just going to use the values we got. We know the z-score for 7.
It was minus 2, so z lies between minus 2 and 2. To use the mean to z column, we say the mean to z of minus 2 plus the mean to z of 2. I'm just going to put it in brackets like that. So we're going to use the same value twice. That will be 0.47725 plus 0.47725. And that is how much? 0.9545, yes. Now remember, this is the proportion. This is the probability of a score between 7 and 19. The question was, what number of students falls between? So we need to know the number of students. How many students do we have? We have 100. So we take the probability that z lies between negative 2 and 2, which is 0.9545, and multiply it by 100 to find the number. How many people? 95.45. Rounded off, it will be 95. Depending on what the memorandum looks like, you can write both: write the decimal value, then round it off and say, therefore, it will be 95 students. Because if you round off 95.45, you get 95. And that is how many students scored between those values. That is question five.

I hope you wrote this down, because I won't be able to send this document unless I send it as a screenshot. Let's see. I will send this by email and on WhatsApp as well. OK. Let's move on to question six.

Sorry, sorry, Lizzie. Before we move on, guys: can you explain again this thing of the larger portion and the smaller portion, with the aid of a diagram? I always find it very difficult to understand.

OK, so let's use this side.
When you work with the table, you need to take into consideration the sign provided: greater than or less than. And you must write this down somewhere, because you can use it as a reference. I'll first say it without the diagram, and then use the diagram just now. Say the sign is greater than and your z value is negative. Look at your diagram, which looks like this, and find the negative value. Remember, in the middle of your graph it's zero. Any value on this side of zero, the left, is negative. Because the sign says greater than, you shade the area greater than your z value, which is everything to the right of z. So maybe instead of using A in my notes I must use z: this is the case when z is negative. If you look at the area you shaded, there is a small part you didn't shade, but you shaded the bigger side. So on the table you go to the larger portion.

Now what if the sign stays as greater than but the z value is positive? You still draw your graph like that. Remember, in the middle it's zero. But z is positive, so z sits here on the right side. The sign says greater than, so we still shade to the right of z. If you look at this graph, you can clearly see which side is shaded. Is it the larger portion or the smaller portion that is shaded? The smaller portion.
So you're going to go to the smaller portion.

Okay, if I can check my understanding: after drawing the diagram, you go for the shaded area? Yes.

Now let's take the sign being less than, for example. The sign is less than and your z value is negative. I draw my graph and my zero there. The negative z is on the left-hand side. Look at the sign. Which side am I going to shade? The left. I shade the left side because the sign is less than. So the sign also guides you on which area to shade. Greater than says shade to the right. Less than says shade to the left. Let's make it that way: greater, shade right; less, shade left. You need to remember that. So for the negative z, greater than shaded the bigger side and less than shades the smaller side.

And if the sign is less than and your z-score is positive: there is your zero in the middle, and there is your z. The sign says less than, so I shade to the left, all of this. And then clearly I can see that I need to go to the larger portion.

And then the between is always easy. For between, we always use the mean to z. We add the mean to z values. You just look for the mean to z value.
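The shading rule described above can be written down as a small lookup, which some people find easier to remember than the diagram. This is a sketch only: the function names `table_column` and `mean_to_z` are invented for illustration, the mean to z area is computed with Python's `math.erf` rather than read from the printed table, and the 100 students and z values of minus 2 and 2 come from question 5.3.

```python
import math


def table_column(sign: str, z: float) -> str:
    """Which column of the normal table to read for a given sign and z."""
    if sign == "between":
        return "mean to z"
    shade_right = sign == ">"   # greater than: shade to the right of z
    z_is_negative = z < 0
    # Shading right of a negative z, or left of a positive z, covers
    # more than half of the curve, so it is the larger portion.
    return "larger portion" if shade_right == z_is_negative else "smaller portion"


def mean_to_z(z: float) -> float:
    """Area between the mean (z = 0) and z; identical for +z and -z."""
    return 0.5 * math.erf(abs(z) / math.sqrt(2.0))


print(table_column(">", -2))  # larger portion
print(table_column(">", 2))   # smaller portion
print(table_column("<", -2))  # smaller portion
print(table_column("<", 2))   # larger portion

# Question 5.3: proportion between z = -2 and z = 2, then the head count.
proportion_between = mean_to_z(-2) + mean_to_z(2)
print(round(proportion_between, 4))     # 0.9545
print(round(proportion_between * 100))  # 95
```

Adding the two mean to z values only applies when the two z-scores sit on opposite sides of the mean, as they do here.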
If you have the two mean to z values, you just take that value and that value and add them together. That's easy to do. Okay. Thanks, that's very nice. It's clear now. No problem.

Okay. So let's move to question six. Let me just make this bigger or smaller. Let's reduce it a little bit. Two departments have entered a crossword puzzle competition, and the competition offers 15,000 worth of prizes from a leading computer store. Department A has 300 entries. Department B has 400 entries. The report from the competition administrator is that 1,000 entries qualified and, from the solutions provided, both departments qualified. What is the probability that the two departments will win first and second place? How do we answer this?

Okay. I wrote P of department A, times P of department B. Then I went on and said 300 divided by 1,000, times 400 over 999.

Yes. The first thing to recognize is that this is a joint probability question: what is the probability that department A and department B win first and second place? They didn't say which one wins first. So we calculate the probability of A times the probability of B. That means we need to find the probability of department A winning a prize and the probability of department B winning a prize. That is the first thing we need to do. I think for A it is 0.3 and for B it is 0.4. Well, wait, let's work through this. Correct me here.
So we first need to identify the probability that they win a prize, because the question asks for the probability that the two departments win first prize and second prize. First, how many marks is it? Three marks. Sorry, someone's background has something going on. Okay. Let's say you win the first prize and your friend wins the second prize. I think that was the example used in your study guide. We can use the same scenario here. Say the first prize goes to department A and the second prize goes to department B. What is the probability that the first prize goes to department A? That will be 300 divided by 1,000, isn't it? Yes. Divided by 1,000, not 100, please. That will be how much? 0.3. Then what is the probability of department B winning a prize? 0.4? No, remember the first prize already went to department A, so it's 400 divided by 999, which is still about 0.4.

Sorry. Can you repeat this part about the 999, with the first prize and the second prize? When do we change to 999? Yes. We already know that the first prize went to one of department A's entries, so there are 999 entries left. Then the second prize goes to one of the remaining entries. Okay. So then we can use the same information and calculate, because that's all we need. We just substitute into the formula that we have.
So the probability of A and B winning first and second can be given by the probability of A, 0.3, multiplied by the probability of B, which is about 0.4. And that equals 0.12. Sorry, I also need to make this clarification: this is if department A wins the first prize. Now we need to do the same for when department B wins the first prize. I'm just going to do it here. If department B wins the first prize, it will be 400 divided by 1,000, multiplied by 300 divided by 999. And that will give you how much? It also gives 0.12. Now, what is the probability that the two departments win first and second prize, in either order? There you can just use the addition rule. So 0.12 plus 0.12? Yes. So to answer this question, that is 0.12 plus 0.12, which equals 0.24. You just need to follow those steps. Instead of writing 0.3 times 0.4, we could have substituted directly: 300 divided by 1,000, times 400 divided by 999, and it would have given us 0.12. The last part is to add both of them to find the probability that departments A and B win first prize and second prize. So the probability of first and second prize equals 0.12 plus 0.12, which equals 0.24. That is the answer to the question. And this is for three marks. So probably this is one mark.
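The two orderings and the addition rule can be checked with exact fractions. This sketch assumes, as in the working above, that every qualifying entry is equally likely to win and that the second prize is drawn from the 999 entries left after the first; the variable names are illustrative only.

```python
from fractions import Fraction

entries_a, entries_b, total = 300, 400, 1000

# Department A wins first prize, then department B wins second prize
# from the 999 remaining entries.
p_a_then_b = Fraction(entries_a, total) * Fraction(entries_b, total - 1)

# Department B wins first prize, then department A wins second prize.
p_b_then_a = Fraction(entries_b, total) * Fraction(entries_a, total - 1)

# Addition rule: the two orderings cannot both happen.
p_first_and_second = p_a_then_b + p_b_then_a

print(round(float(p_a_then_b), 4))          # 0.1201
print(round(float(p_b_then_a), 4))          # 0.1201
print(round(float(p_first_and_second), 4))  # 0.2402
```

To two decimals this is the 0.12 plus 0.12, giving 0.24, from the worked answer.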
So one mark for doing department A first, one mark for doing department B first, and one mark for getting the probability of first and second. That is the answer to the question. Okay. That's a lot of working for so few marks. Yes, I know. I agree. It's one mark each for showing that calculation. So you just need to show department A first and department B first. And that is question six.

Now let's move on to question seven. Now we get into hypothesis testing. Question seven says: you, as a personnel officer of an organization, want to determine whether there is a difference in the productivity of employees from two departments in the organization. Remember, this is two departments. To determine this, you choose two groups consisting of 10 employees from each department. Therefore they are independent of one another. What happens in one department does not have any bearing on what happens in the other department. So these are two independent samples. The productivity of employees from each department is measured in terms of the average units produced per workday in a month. In order to draw a conclusion, you need to determine whether there is a significant difference between department A's productivity score and department B's productivity score. Because this is hypothesis testing, you need to read the question carefully to see what you need to do. Here you need to determine whether there is a significant difference between A and B. They didn't say anything about greater than or less than.
So you take that as your two-tailed, or what you call a non-directional, test. If that is the case, formulate the hypotheses. How do you formulate them? You have the mean of A and the mean of B. So how do you state your null hypothesis? Okay, I said mu of department A minus mu of department B equals zero. Mu A minus mu B equals zero. That's how you state your null hypothesis. Or you could have just said mu A equals mu B. It would still mean the same thing. You get one mark for that. That is your null hypothesis.

7.2: formulate the appropriate hypothesis in words. That's very important. You see, not in symbols. The first one said in symbols, this one says in words. So you need to be very careful with this. If 7.1 had said in words, you would have said: there is no difference between the mean productivity score of department A and the mean productivity score of department B. That would have sufficed if it said words. So now what will the alternative hypothesis be? We know that we're doing a non-directional test. So what would you say is your alternative hypothesis? There is a difference. You just repeat the same statement. I'm not going to write it all out: there is a difference between department A's productivity scores and those of department B. That's how you state it.
Don't say significant, just the difference, because you haven't yet proven anything to be significant. If they had asked for symbols, since we're doing a non-directional test, your alternative would be: the mean of A is not equal to the mean of B. Or you would have said the mean of A minus the mean of B is not equal to zero. That's how you state the alternative hypothesis, not the null, in symbol format. So take that into consideration when you answer the question. It's only one mark. Don't use symbols when they ask you for words. If they ask for words, use words.

7.3: assuming that the data are normally distributed, select an appropriate test statistic and calculate the test statistic. So here they're asking you to do two things. You don't have to say this is the appropriate test statistic. You just need to write down the test statistic that you are going to calculate. So what will the test statistic be? It is the pooled variance t-test. So you're going to use t.

Okay, we say t equals the mean of A minus the mean of B, over the square root of the standard deviation of A over its sample size plus the standard deviation of B over its sample size. Not the standard deviation, but the variance: s squared of A over the sample size of A, plus s squared of B over the sample size of B. Then our mean of A is 7.7, so it is 7.7 minus 4.52, over the square root of, open bracket, 2.23 over 10 plus 1.16 over 10. We got 3.18 over the square root of... I will just give the total of those two numbers, which is 0.339. Okay, so you just give me the overall answer for the two terms underneath. Yes. So everyone needs to know how to use their calculator to find these values. Then after taking the square root it will be 3.18 over 0.582.
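The calculation just described can be reproduced in one go, which is a useful check on the calculator work. This is a sketch under the assumptions stated in the session: 2.23 and 1.16 are the two sample variances, both samples have 10 employees, and the function name `two_sample_t` is chosen here purely for illustration.

```python
import math


def two_sample_t(mean_a: float, mean_b: float,
                 var_a: float, var_b: float,
                 n_a: int, n_b: int) -> float:
    """t = (mean_a - mean_b) / sqrt(var_a / n_a + var_b / n_b)."""
    standard_error = math.sqrt(var_a / n_a + var_b / n_b)
    return (mean_a - mean_b) / standard_error


# Question 7.3: means 7.7 and 4.52, variances 2.23 and 1.16, n = 10 each.
standard_error = math.sqrt(2.23 / 10 + 1.16 / 10)
t_stat = two_sample_t(7.7, 4.52, 2.23, 1.16, 10, 10)

print(round(standard_error, 3))  # 0.582
print(round(t_stat, 2))          # 5.46
```

Note that the square root applies only to the sum underneath; t itself is never squared.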
Okay, so the top one is 3.18, divided by the square root of 0.339. Then I get really confused about how to remove the square root, because we have t. Is our t going to be t squared? You just say 3.18 divided by the square root of 0.339; t is not squared. There is a square root key on your calculator. What kind of calculator do you have? It's a Casio. The square root is going to be 0.582. You can use the fraction key on your Casio, and you could have entered the whole expression on your calculator in one go. So it's 3.18 over 0.582. I ran out of space. So then what will the answer be? The final answer will be 5.5. Or, to two decimals, is it 5.46 or 5.48? It's 5.46. Okay. Let me just double check on my calculator as well. 3.18 divided by... Oh, come on, I used the old calculator. Yes, it's 5.46. So that is your test statistic.

Then the next question says, determine the degrees of freedom. How do we find the degrees of freedom? n of A, is it plus or minus? Plus n of B, minus 2, because there are two groups. Which is 10 plus 10 minus 2, which equals 18. Yes. That is your degrees of freedom.

The next one says determine the critical value. So we need to go to the t table. Because we're doing a two-tailed test, it will be t of alpha divided by 2, with the degrees of freedom. What is our alpha? We were given 0.05. So you take 0.05 and divide by 2. That is t of 0.025 with 18 degrees of freedom. So we go to the tables and look for the t table.
We want the table that says t table. Oh, okay. I forgot that your tables look different from my statistical tables. Sorry, my bad. Ignore what I just said about dividing by 2. Let's go back, because I realized I was teaching you the pure stats way of doing things. Yeah, I was getting lost, because I've never divided by 2. With your tables, t is given by alpha and the degrees of freedom, so that will be t of 0.05 with 18 degrees of freedom. Because we're doing a two-tailed test, we look under two-tailed for 0.05. I was splitting the two-tailed 0.05 into the upper tail and the lower tail, and 0.025 is the value for each tail. So you guys don't have to divide by 2. Your table has two-tailed values for a two-tailed test and one-tailed values for a one-tailed test. So we use 0.05, and our degrees of freedom was 18. Where the two meet, that is your critical value. And your critical value is? 2.1009. That is your critical value, for one mark.

Then the last step: using this information, interpret the results in terms of the rejection or non-rejection area of the null hypothesis. Here it helps to use a diagram. You don't have to, but it's easy to represent your information on a diagram, because then you will not get lost when you are interpreting your answers. Since it's a two-tailed test you can do it this way. You have your upper tail on that side and your lower tail on this side. Those are your rejection areas. Anything that falls in the shaded area on this side, you reject the null hypothesis. Anything that falls in the shaded area on that side, you reject the null hypothesis. That's the easy part with this.
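The degrees of freedom and the two rejection areas can be sketched as a simple comparison. The critical value of 2.1009 is not computed here; it is the value read from the t table in the session (two-tailed, 0.05, 18 degrees of freedom), and the test statistic of 5.46 is the one calculated for 7.3. The function names are hypothetical, chosen only for this illustration.

```python
def degrees_of_freedom(n_a: int, n_b: int) -> int:
    """Two independent samples: df = n_a + n_b - 2."""
    return n_a + n_b - 2


def reject_null_two_tailed(t_stat: float, critical_value: float) -> bool:
    """Reject when t falls in either tail, beyond -critical or +critical."""
    return abs(t_stat) > critical_value


df = degrees_of_freedom(10, 10)
T_CRITICAL = 2.1009  # read from the t table: two-tailed 0.05, df = 18
T_STAT = 5.46        # test statistic from question 7.3

print(df)                                          # 18
print(reject_null_two_tailed(T_STAT, T_CRITICAL))  # True
```

Taking the absolute value covers both tails at once: a test statistic of minus 5.46 would land in the lower rejection area in the same way.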
Since we have found our critical value to be 2.1009, this side will be negative 2.1009 and this side will be positive 2.1009. I'm just writing 2.10. Now we locate where our test statistic falls. Our test statistic is 5.46. If it were negative we would use the negative side. It's positive, so we use the positive side. And it falls in the rejection area. This is the decision rule, and it makes life easier to do it this way so you don't get confused: if your test statistic, not the critical value, the t value that we calculated, is greater than your critical value, then we reject the null hypothesis. This is just the decision rule, given to you as information. Keep it in the back of your mind, or write it down so that you can refer to it. The question is only asking you for one thing: interpret your results in terms of rejection or non-rejection. Based on this decision rule, are we rejecting or not rejecting the null hypothesis? What is your answer? The answer for 7.6 will say: the test statistic of 5.46 is greater than the t critical value of 2.1009, therefore we reject the null hypothesis. You just write that statement. That is it. You get that one mark. That is 7.6.

7.7: interpret. We know that we're rejecting the null hypothesis. Now we need to go back to the statement in the null hypothesis. Remember that statement? There is no difference between department A and department B.
Now if we are rejecting that statement, we're saying there is a difference, because we're taking the alternative statement. We're rejecting the null hypothesis that says there is no difference. How do you interpret that? In plain language, you see, the question says in your plain language. So don't use textbook language, use everyday language. But even though they say plain language, there are a couple of things you need to take into consideration. The level of significance tells you the size of the rejection area, and you can express it as the certainty you have about your conclusion. Because it's 5%, you can say 95% confidence, or 95% certain that, and so on. So you need to take into consideration the level of significance, the result of whether you're rejecting or not rejecting the null hypothesis, and the statement you made in your hypothesis. And because they ask with how much certainty you can conclude this: the level was 5%, so the certainty is 1 minus 5%, or 1 minus 0.05, which is 95% certainty. So you state that you are 95% certain that there is a difference between... how did we state this? There is a significant difference. Yes, you can say it the way you want to say it, and correct me if I'm wrong.
I have said: the research concludes with 95% certainty that there is a significant difference in performance between department A and department B. There we go, you put it nicely, and that's how you say it, as long as you mention the 95% certainty and confirm that there is a significant difference. So you take the 95% and everything you stated in your alternative hypothesis: with 95% certainty, there is a significant difference between department A's productivity scores and those of department B. The only thing you add is the 95% certainty and the word significant. That is if you are rejecting the null hypothesis. If we were not rejecting the null hypothesis, the statement would look different. You would have said: with 95% certainty, there is no significant difference. Okay, that is 7.7. Any questions on the hypothesis testing questions?

So now let's move on to ANOVA. Sorry, just give me a second. Before we move to ANOVA, I want to confirm something about hypothesis testing, because I don't know which hypothesis testing question they will give you. There is one thing that can make a difference. In this instance we have independent samples, department A and department B, and we used this formula. For those who don't know why: we used this test statistic formula because the sample sizes are equal. If the sample sizes are not equal, then you will have to use the pooled variances formula, where you first calculate the pooled variance and then substitute it back into
the formula. So you need to take that into consideration when you answer questions like this. The first thing you need to recognize is: are the sample sizes the same? If they are equal, then it's fine to use this formula. If they are not equal, then we use what we call the pooled variance, because of the unequal sizes. So the formula that you must look for is your N1 minus 1 times the variance of group 1, plus your N2 minus 1 times the variance of group 2, divided by N1 plus N2 minus 2. That is the pooled variance that you will use in your t-test. And if I remember the formula correctly, your t for that pooled variance will be your mean 1 minus your mean 2, divided by the square root of the pooled variance divided by N1 plus the pooled variance divided by N2. I think so. You will use this formula, which looks different to this one, because your pooled variance, S squared pooled, is given by N1 minus 1 times S squared 1, is it plus or minus? I can't even remember, so you need to look at the exact formula. I think it's a plus. It's a plus: plus N2 minus 1 times S squared 2, divided by N1 plus N2 minus 2. That will be the pooled variance formula. But this is if your N is different: say for department A they selected 10, and for department B they selected 11 or 12, then you use that one. So take note of those small things. They can make a huge difference, because the answers you get will be different, and the way you make decisions will be different as well, not because of the critical value, but because of the test statistic that you would have calculated. So take that into consideration: with unequal sample sizes, which is your N, we use that. Just remember that. I wanted to highlight it because you will
never know; in your exam they might not give you the same as this past exam. Okay, so let's look at question 8, which is ANOVA. With ANOVA there are several formulas that you need to familiarize yourself with, and you need to know how to calculate them in order to answer this question. So this is a table with all the information you need. We need to find out if there is a significant difference between the student attitudes towards statistics at first-, second- and third-year level. And how do you know that you're doing ANOVA? It's because now you're not given two groups, you're given three groups. When you have three groups or three measures, you're going to use ANOVA. If you're given two, you're going to use a t-test or z-test to answer the question. Okay, so they gave you the null hypothesis: they say the means of all three levels are the same. We are also given the level of significance, which is 0,01. Okay, so 8.1. You need to take into consideration all the information given in this table; it is very important. Okay, let me just erase that manually. So let's look at 8.1. 8.1 says: choose an appropriate test statistic. Now, in order to do this, there are many other things that need to happen to calculate the test statistic for this hypothesis, and you can see that this is out of 8 marks. So it means you will need to calculate your sum of squares total, your sum of squares for error, and your sum of squares for treatment or group, then your mean squares for group and for error, and then calculate the test statistic. So this will take you most of the time. We are about to start. To answer 8.1, we first need to calculate your SST, which is your SS total, which is given by the sum of your x squared minus the sum of your x, squared,
divided by N; not everything, only that second part is divided by N. So you will have to use that formula. Actually, we do have the formulas here; why am I scratching my head if we know what formulas are there? We will need to calculate all of these, because we need to find that one, and for that we also need to calculate all of these. So the first one we are calculating is that one: the sum of x squared, minus the sum of x, squared, divided by N. That is the first step. Based on this information, note the bar has shifted: there is meant to be a bar on top of this, and you can see that there is a line that shifted. So the sum of x squared is this value; we just take 587. So replace this whole thing with that value, minus the sum of x. Your sum of x is 85, but it's squared, so you just need to put in the square, divided by N. And remember your N; let's go back to the formula. N is a capital letter N. You can see that in the other formulas the n is different, so your capital letter N is different from the small letter n. You need to take that into consideration. What did I use? I used the small letter n in the formula, so let's change that to capital letter N. So that will be 15. And what is the answer? It will be 587 minus 85 squared divided by 15. What do you get? 105.3? That's what I got, 105.33. I'm just going to keep two decimals. You need to make sure that you keep most of the decimals while you're still working. I'll keep two decimals, or you can keep one, it doesn't really matter at this point, but I'm going to keep two decimals. So we have our SST. We can go find the next formula we need to use. We can use this formula, SS group. Your SS group is the sum of your mean j minus the overall mean, squared. So that is the mean of every group: it's that one minus the overall mean, that one minus the overall mean, and so on, because it's the summation of them squared. We just need to calculate it that way. So I'm going to calculate it; I'm just going to use
this part here. So we have SS group equals, and in the formula it's a small n, so small n times the sum of mean j, which is the mean of each group, minus the overall mean, because that's what the formula says, and squared. Our small n is 5, times, and then we need to do all of this. Okay, let's go back there: you can see that it says the sum of that, so we need to do the sum over the individual groups. So that will be 6 minus 5.67, and we need to square that, plus, because it says the summation, and then we do the same: 2.6 minus 5.67, squared, plus the last one, 8.4 minus 5.67, squared, and we can close the bracket. So you can calculate that and get the answer, and I'll also calculate on my side so that we can have the same answer. How much do you get? Just let me know. I got 84.9315. Do you all agree with that answer? Yes. I also got this one; I don't know what that was. The answer here will be 84.933-something, so I'm just going to keep two decimals: that will be 84.93. So that is the SS group. Then let's go back to our formulas, because we can use the formulas to guide us on what we need. We've calculated that, we've calculated that, and we also need to calculate this one, so we need SS error. I'm just going to do SS error on my right. SS error is given by your total minus the group, so it's SST minus SSG. We've calculated SST, which was 105.33, minus SSG, 84.93. What do we get? 20.37. 20.37, that's the second one. I got 20.37. So depending on how many digits you kept, your answer might look different to the one we have in front of us. 105.33 minus 84.93: in terms of the one I'm looking at, I get 20.4. It will depend; use the value you see on your calculator, as long as it's not far away from the values we are getting. This is mainly because of how many decimals we are keeping, so they should not penalize you for that. I don't think they expect you to be mathematicians, but just to
see that you understand the concept as well. Okay, so we're done with the SS. Now, our aim is to calculate this one; this is our final thing, the test statistic. That is the aim. In order to calculate it, we also need to calculate the mean squares, as you can see there. So we can start with the mean squares, but you see there at the bottom there are the degrees of freedom. So it means we need to go and calculate the degrees of freedom, all of them, and then come and substitute into the formula. Let's start with the degrees of freedom for total. I'm going to write it in here so that I don't lose the space for the other things. The degrees of freedom for total is capital letter N minus one. Capital letter N minus one is 15 minus one, therefore it equals 14. Then we can do the degrees of freedom for group, which is the number of groups minus one. For df group, there are three groups, minus one, therefore it equals two. And the last one: degrees of freedom for error is the number of groups times the sample size of the group minus one. So df error will be three times (five minus one), which equals three times four: 12. We have the degrees of freedom; now we can go and calculate our mean squares. Mean square group is SS group divided by degrees of freedom group, which is easy: MSG is SSG divided by dfG. SSG we found was 84.93, and our degrees of freedom for group is two, so the answer I got is 42.47. That's what I got as well, 42.47. And I guess that's what most of you would have gotten. Okay, so we need to go and calculate the last one, because the test statistic is MS group divided by MS error. MS error is SS error divided by degrees of freedom error. So MS error is given by SS error divided
by degrees of freedom for error. We did calculate SS error: it was 20.37, depending on your values. Remember, the values that you got when you answered, you must substitute them as you have them. That is divided by the degrees of freedom for error, which is 12. And what is your MS error? I got 1.697, which is 1.7. Let's keep two decimals. I just want to keep it at 1.69. 1.70? Okay, 20.37 divided by 12 is 1.697. Oh yes, I see where you are going, so yes, just leave it at 1.70. Okay, now the final thing we need is the test statistic; that is what we are aiming for. Our test statistic is MS group divided by MS error, so our F is MSG over MSE; you can write it in full, MS group over MS error. Our MS group was 42.47, divided by 1.70, and that equals... 25.4? 24.98. And that is for 8 marks, so let me just see how many marks we have: 1, 2, 3, 4, 5, 6, 7, 8, 9. Okay, so it's probably one mark each, and then the degrees of freedom probably all together constitute one mark, because if I count this way it's 1, 2, 3, 4, 5, 6, 7, 8; one mark will come from the degrees of freedom. Now, we are supposed to present the table; they don't need to see all of this working, but they want the table. So you don't actually need to show all the calculations, but what is very important is what they say: choose the appropriate test statistic and calculate the test statistic. There is no way that you can calculate the test statistic without these steps. So you don't have to show all of them, and once you're done, then we can present it in the table. You remember the table looks like this. That's 25 marks? Yeah, that's a lot. Okay, for some reason my lines don't want to cross over, so let's erase this one; it must be here. So remember, this will be your source, your degrees of freedom, your sum of squares... It's not writing anymore, we can't see. I think your pen is not writing anymore. Can you also not see what I'm writing? You ended up drawing, but now
what you are writing, the degrees of what, the sum of what, we cannot see that. Can you see the table? Yes, the table is there, but not what you are writing in the table. Are you sure you're not seeing the MS, the SS heading? Yes, that's what I was writing, the headings. I'm just writing the headings for now. So now we can do the rows: the group, the error (group is usually called treatment; I'm not sure if you want to use treatment or group, it doesn't really matter), and then the total. Maybe I can do it like that, and like this as well. So here you will have the degrees of freedom: you will say 2, 12, 14. Is it 14? 12. You will say 12, and then you will have 14 there. For your sum of squares, here you will have your SS group, it's 84.93, so you can just say SSG equals. I'm not sure if they want you to label them; look at the example in your study guide as well, how they did it, so you can do the same as what they have, just so that you don't lose marks. Then SST, which is 105.33, and MSG, which is 42.47, and SSE, which is 20.37. You don't have to worry about the MST. And then here you say F equals 24.98. So if you had already calculated them outside the table, I don't think there should be any problem when you just substitute the values into it. It's 8 marks, so probably one mark for the table and one mark for each of the calculations. I don't know how they mark it. That would be one mark for each one; probably they will mark it inside the table. So one mark for the table, 1, 2, 3, 4, 5, 6, 7, 8, something like that. I don't know, but it's 8 marks. 8.2: determine the critical value which will help you decide whether or not you will reject the null hypothesis. So this is ANOVA; remember, with ANOVA we use the F distribution. With the F distribution you have your degrees of freedom one and degrees of freedom two. Your degrees of freedom one will be for whatever was at the top in the F test. Which one was at the top? Group. So degrees of freedom for group is two, and degrees of freedom for error is 12. So here we look for
degrees of freedom for group and degrees of freedom for error. So let's go there. It is two and 12, so we're looking for two at the top and 12 on the side, and that will be 3.89. That is your... oh, sorry, there's another thing. Wait, wait, not too quick. The other thing we need to take into consideration is the alpha value. What did they say? Our alpha is 0.01, and we need to take that into consideration. That table is for 0.05, so we need to look for the 0.01 table. So, our degrees of freedom are two and 12: two at the top and 12 on the left, and where they meet is 6.93. Let's write that down: your critical value is given by your alpha and v1 and v2, our degrees of freedom 1 and degrees of freedom 2, which is your F of 0.01, 2 and 12, and that should give you 6.93. Now, based on that information, the other thing that can also help is looking at your F-distribution. This graph gives you guidance: with the F-distribution it's always a one-tail test. Whatever we're doing, the rejection area will always be on the right-hand side, the smaller shaded area. So if this is our critical value of 6.93, what was the test statistic that we got? We got a test statistic of about 24.98, so we can make a decision based on that. 6.93 is your F-critical value, and this is our decision rule: anything that falls here, we reject the null hypothesis. Now, our test statistic is 24.98; it will fall in the rejection area. The question is, do you reject the null hypothesis? Then you can say: your F-statistic of 24.98 is greater than your F-critical value of 6.93, therefore we reject the null hypothesis. I don't have to write it in full; I can write it in symbols: we reject the null hypothesis. That is the decision you make. 8.4 says interpret your findings. Now, the same as what we did before; remember, now this is at 0.01.
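The question 8 working, from the sums of squares through to the reject decision, can be retraced with a short Python sketch. It is only an illustration, not the marking memo: the variable names are made up for this example, the input figures are the ones quoted in the session (sum of x squared 587, sum of x 85, group means 6, 2.6 and 8.4, five students per group), and the critical value 6.93 is typed in from the printed F table rather than computed.

```python
# Summary values quoted in the session for question 8 (assumed to be read
# correctly off the question paper's table).
sum_x_squared = 587             # sum of all squared scores
sum_x = 85                      # sum of all scores
group_means = [6.0, 2.6, 8.4]   # first-, second- and third-year means
n_per_group = 5                 # small n
k = len(group_means)            # number of groups
N = n_per_group * k             # capital N = 15

grand_mean = sum_x / N

# Sums of squares
ss_total = sum_x_squared - sum_x ** 2 / N
ss_group = n_per_group * sum((m - grand_mean) ** 2 for m in group_means)
ss_error = ss_total - ss_group

# Degrees of freedom
df_total = N - 1
df_group = k - 1
df_error = k * (n_per_group - 1)

# Mean squares and the F test statistic
ms_group = ss_group / df_group
ms_error = ss_error / df_error
f_stat = ms_group / ms_error

# Critical value copied from the F table for alpha = 0.01, df = (2, 12).
F_CRITICAL = 6.93
reject_null = f_stat > F_CRITICAL

# ANOVA summary table
print(f"{'Source':<8}{'df':>4}{'SS':>10}{'MS':>10}{'F':>8}")
print(f"{'Group':<8}{df_group:>4}{ss_group:>10.2f}{ms_group:>10.2f}{f_stat:>8.2f}")
print(f"{'Error':<8}{df_error:>4}{ss_error:>10.2f}{ms_error:>10.2f}")
print(f"{'Total':<8}{df_total:>4}{ss_total:>10.2f}")
print("Reject the null hypothesis:", reject_null)
```

Because the sketch carries full precision between steps, SS error comes out as 20.40 rather than the 20.37 mentioned in the session; F still rounds to 24.98, which is the kind of small rounding difference discussed above.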
Anyone who wants to take a stab at this? How do we conclude? Always go back to your null hypothesis. Our null hypothesis stated that the means are the same, so there is no difference, because the question was: is there a significant difference? So the null says there is no difference between all three levels. Now, if we are rejecting the statement that there is no difference, how do we then conclude? Anyone who wants to take a stab? Okay, Mina. I said there is a significant difference in the students' attitudes towards statistics, and then I said this can be concluded with 99 percent certainty. Thank you very much, that's how you conclude. You don't even have to put the 99 percent certainty in front; you can also include it at the end, and you said it perfectly. There is a difference: there is a significant difference between the student attitudes at first-, second- and third-year level. Okay, that is question number eight. Let's move on to question number nine; we're almost done. I'm just making things clear on the document, so that when I send it you have the information that you need. How many more questions? This is the last one. Okay, so now we are on the chi-square test, which is another test, for categorical variables: we test the relationship between two categorical variables. So now we will test whether there is a difference between first- and second-semester students regarding their level of satisfaction relating to the assessment method. Semester one students responded and semester two students responded, and you collected the information. So this is a two-by-three chi-square table. They also did calculate the chi-square test statistic, so your test statistic is already calculated. You just need to make sure that you read the question carefully and answer. 9.1: make a contingency table to represent the given information, clearly indicating the observed and the expected. So now, since they
say create a contingency table, you need to rewrite this contingency table, and somewhere you will have to tell them that these are your observed values and anything in brackets is your expected value, something like that, so that your table clearly indicates what is what. Okay, or you can create a copy of this: you can say observed, and then replicate the same layout here and say expected. It depends on how you want to do it. You can have another set of columns here that shows expected for yes, no and not sure. You can have it this way; there is nothing wrong with that. You can just redraw it and have observed at the top here. So let's remove this. Because they don't say how they want you to show the table, you can have it like this, where you have your observed and then your expected next to it. But the other thing that you need to remember is to go and complete the totals here at the bottom, and the totals apply only to the observed values. So 60 plus 40, it's 100; 22 plus 68 is 90; and 90. I forgot, we also need the total here at the end, so there should be another total column there. You need that total, and the total here will be 60 plus 22 plus 30. For some reason now it means I must rethink my table. You will need to have the totals; it's very important if you're going to calculate the expected frequencies. So this will be 112. I also need the grand total here at the bottom. So these are your totals; I've already calculated that this was 100, and 90, and 90 again, then 168, and the grand total is 280. Okay, so you need that, but you also need the expected values. You can extend this and create the expected values on the side; there is no right or wrong answer. Or you can use the same table, but you need to clearly indicate which values are observed and which are expected. If I'm going to rewrite them
on here. So now I need to make this clear, clearly indicating your observed and expected. Let's do that. Expected for 60 will be 112 times 100 divided by 280, and you can do the same for the rest of them. Expected for 22 will be 112 times 90 divided by 280. So you can do it for all of them. I'll give you a minute while I go to the bathroom; see if you are able to calculate all of them, and then I will come back. What do you want us to do? No, I want to do it, not in class, I don't have, you know. So, have you tried to answer? Just give me a sec, let me write all those values here on the question paper. I also want to show you something while we're busy with this. I'm going to stop sharing and then share my entire screen. The reason I don't like sharing my entire screen is that people are then able to see everything, because I've got so many applications open. Okay, so I have this template that I use in statistics sessions. It makes it easier for students; it's a cheat sheet, something like that. I'm going to hide some of this for our class today, because what I did was answer the question that we have here. This template has most permutations of contingency tables, not all of them, based on previous activities that the statistics students would have done. For our question, I just used the two-by-three, because I needed to identify what kind of contingency table we have, and then I substituted the values into it. So, moving back to your question, let's come back here. You can see that for our question we have our observed values, with yes, no and not sure, and semester one and semester two, and then we calculated the totals. This template does the same. So here I have observed values; it means you can write a heading like "observed values", then substitute all the observed values the way you like it, the
way they are asked, and the totals. You can see it calculated the totals as well. And here are my expected values. I just need to remove that highlight. So there are our expected values. Remember we asked, what is our expected value in terms of our question? We said the expected value for 60 is 112 multiplied by 100, divided by 280. What did you get? 40. It equals 40, as you can see on the template as well: our expected value for 60 equals 40. And that is just the calculation that I did, in the same way. If I double-click on it you will be able to see that; let me make the screen bigger. If I click on this, you will see that it takes the observed total of semester one, multiplied by the observed total for yes, and divides by the grand total of 280. So the template does the calculations for you automatically; all you need to do is feed it this information, and then it does all the calculation. So there is your expected value, which is 40; in terms of our question that was 40. And in terms of 22, it is 112 multiplied by 90, divided by 280. What do you have? 36. So therefore it is 36, and the next one is 36 as well. If you use this kind of template, then you can say 40, putting all your expected values into brackets, then 36, and so forth. Or you can create another table, or you can use the same table and write all the values in there. I'm just going to use the brackets for this, back in my slide. 40, 54, 54... am I not doing that right? 60... sorry, my bad: it's 40, 36, 36 for semester one, and 60, 54, 54 for semester two. So there are your observed and your expected frequencies as
well. And then the next question asks... I just need to remove all this, and that is why I don't like working in a PDF, because erasing values in a PDF is more difficult than erasing on a PowerPoint slide. Just give me a sec to do this. We should be done by half past, because this is almost done, or even within five minutes. So we have our expected values in a frequency table. You can create it this way, or you can put the expected frequency values next to the observed frequencies, but also label them, because it says clearly indicate your observed and your expected. How you draw the table doesn't really matter, as long as the two can be clearly identified. So instead of creating this separate observed table, I could have just extended this table with a heading here, like I said: this will be the observed, and next to it will be my expected frequencies. And I will have here 40, 36, 36 and 60, 54, 54. Like I said, there is no right or wrong answer in terms of how you create your contingency table, as long as you are able to clearly explain the values. You can create one where it's just this table with the expected values in brackets, but then we need to clearly indicate that the expected values are the ones in brackets, so that they know what they are looking at. Okay, so it's only for three marks. Depending on how you want to draw it, use any format you want. Next: determine the degrees of freedom. The degrees of freedom for chi-square is the number of rows minus one, times the number of columns minus one. Remember, when you count the rows and columns, don't count the additional expected value columns if you created them, and don't count the totals either, only the observed categories. Okay, so these are rows and these are columns. How many rows do you have? Two, there are only
two, so it's two minus one. How many columns do you have? Three; there are three columns, minus one. And that will be two minus one, which is one, times three minus one, which is two, so it's one times two, which equals two. So that is your degrees of freedom. Now we need to go find the critical value. Your chi-square critical value is given by alpha and the degrees of freedom. Your alpha is 0,05 and the degrees of freedom is two, so we need to go to the chi-square table. Pay attention to the level of significance, which is your alpha, and your degrees of freedom: these are your degrees of freedom and these are your alpha values. So our degrees of freedom is two and our alpha is 0,05; where 0,05 and two meet, that is 5,9915. So that is 5,9915. The last two questions: do we reject the null hypothesis? Similarly with the chi-square, the graph is there: anything that falls here, we reject the null hypothesis. This should help with the decision rule. Come on, what's my English now? I think my brain is tired, really, I can feel it. Decision rule. So that is the decision rule, and we can use it as our base. We know that anything that falls in the small shaded area, we're going to reject the null hypothesis. The critical value is 5,9915; I'm going to write it here so that we have it. This decision rule is just to help us make that decision. So the critical value is here: our chi-square critical value of 5,9915 is here, and this is the area of rejection. Now, looking at our test statistic: remember, the test statistic, which is chi-square, they gave us; they calculated it already. So we can compare that with the critical value and make a decision. So anything that
falls here, we're going to reject the null hypothesis. So now, in terms of this, do we reject the null hypothesis? We do, because of our chi-square test statistic, and you always have to remember to write this: the chi-square of 27.38 is greater than the critical value of 5,9915, therefore we reject the null hypothesis. That is the only statement you need to write, for one mark. The last question is how you interpret your findings, and with how much certainty. That is the last part that you need. Remember, we always need to go back to the statement that was given to us, which is very long in this instance. We didn't talk about the null hypothesis and the alternative hypothesis for the chi-square test. You need to always remember that the null hypothesis should always state independence and the alternative should always state dependence. Independent means there is no relationship, and the alternative will say there is a relationship. But we can also state it in relation to the statement that was given: is there a difference between first- and second-semester students regarding their level of satisfaction relating to the assessment method used in this module during the lockdown period? So in your null hypothesis you would have said there is no significant relationship, and in your alternative you would have said there is a significant relationship, because remember, this is a measure of relationships. So how do you conclude? Anyone, for two marks? Okay, if there is nobody, because I think our brains are all tired, it's the same thing that we have been doing the whole time. When you answer this question, because it's a 5% level of significance, you could say: with 95% certainty there is a significant difference, not "no significant difference",
because we rejected the null hypothesis, which said there is no relationship. So here we would say there is a relationship between the semester and the level of satisfaction, or you could have said there is a significant difference between first- and second-semester students regarding their levels of satisfaction. And with that, it concludes our exam prep session number one, which will be our last session. I'm just going to put on my mic for now, and I'm going to ask you if there are any other questions. This is me, for those who joined late and don't know Elizabeth Boye: this is Elizabeth Boye. So, are there any other questions, anything that I can help with before we call it quits? I'm going to stop the recording now so that we can have a free and open discussion. I'm going to stop it. Sorry.
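The question 9 steps (totals, expected frequencies, degrees of freedom and the decision) can be sketched in Python as follows. This is a hedged illustration only: the observed counts are reconstructed from the figures mentioned in the session (row totals 112 and 168, column totals 100, 90 and 90), so treat them as an assumption and check them against the question paper. The test statistic 27.38 is the value given in the question and the critical value 5.9915 is copied from the chi-square table; neither is recomputed here. The variable names are made up for the example.

```python
# Observed counts per semester in the column order yes, no, not sure.
# Reconstructed from the totals quoted in the session (an assumption).
observed = {
    "Semester 1": [60, 22, 30],
    "Semester 2": [40, 68, 60],
}

# Totals are taken over the observed values only.
row_totals = {name: sum(counts) for name, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(row_totals.values())

# Expected frequency = row total * column total / grand total
expected = {
    name: [row_totals[name] * col_total / grand_total for col_total in col_totals]
    for name in observed
}

# Degrees of freedom = (rows - 1) * (columns - 1), counting observed
# categories only (no total or expected columns).
df = (len(observed) - 1) * (len(col_totals) - 1)

CHI2_STAT = 27.38        # test statistic as given in the question
CHI2_CRITICAL = 5.9915   # table value for alpha = 0.05, df = 2
reject_null = CHI2_STAT > CHI2_CRITICAL

# Contingency table with expected frequencies in brackets.
for name, counts in observed.items():
    cells = "  ".join(f"{o} ({e:.0f})" for o, e in zip(counts, expected[name]))
    print(f"{name}: {cells}  total {row_totals[name]}")
print("Column totals:", col_totals, "grand total", grand_total)
print("df =", df, "| reject the null hypothesis:", reject_null)
```

Printing the observed value with the expected value in brackets mirrors one of the table layouts suggested in the session; a separate expected table would serve equally well.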