your formal session two, where we're going to discuss hypothesis testing comparing two samples. I know that the last time we met we were discussing one sample, so today we're going to learn how we do hypothesis testing when we look at two samples or two groups. So we can get to it.

For the rest of July, this is the session, and then hopefully by next week or the following week I will let you know on the WhatsApp group, or at the beginning of July, when our next session is and what topics we're going to be covering for August. I need to send them first to Jacques to confirm, and then they can book the sessions. And remember, we meet bi-weekly, not on a weekly basis, so we're going to meet again after a second week.

I was going to ask if you have any question or query. Before we start with today's session, do you have anything to say or comment about? I guess everybody knows where to find all the information, including all the slides for today as well. I have updated the folder with the notes for today's session and for the past sessions that we had, and I also posted in advance the other two sessions that I think might follow in August, so you must go to the notes section and check. Is there anyone who wants to say something?
It's me. I want to ask where the previous sessions are, because this is my first time joining the sessions. I couldn't register, I've been struggling to register, so it's my first time and I don't know where to find the previous notes. Is it possible...

Before the end of the session I will show you where to find them. I will put the link for you in the chat; it's also on the WhatsApp group. If you are not on the WhatsApp group, then we can also share that with you. I will share the link, and if we have time I will also show you, when you go to myUnisa, because everything is on myUnisa and is easily accessible there, how you can access the recordings of the previous sessions and the notes. I will show you that at the end of the session.

Thank you so much.

All right, unless there is another question... Okay, if there are none, then let's look at hypothesis testing when we look at two groups. I have statistical tables here, but for your module you do not have to use any statistical table for finding the p-values, the probabilities or the critical values, because in your module they don't expect you to do technical things. They expect you to know the basic concepts, how certain things are arrived at and how a decision is arrived at. So I'm not going to show you tables, but I will make reference to where we find the information when we do some of the things in hypothesis testing, especially when we make decisions.

You also need a calculator, because there are certain questions where they will ask you to do some calculations. You need a proper calculator, because sometimes a normal calculator doesn't do what you need it to do; there are functions that you need. For those on the WhatsApp group, I did send you a link so that you can download a calculator onto your phone. Maybe I can show you my calculator: it will look exactly like this calculator, but
it will be on your phone, so that if you don't have a calculator, or a proper calculator, you can follow the steps that we're going to be doing and work it out by yourself using your phone or your calculator. This is a Casio. Normally you find this Casio at Pick n Pay, right, and Checkers and Game. Anyway, the Grade 12 and Grade 11 learners are encouraged to use the Casio calculator, so it's easy to get, and I think it's reasonable enough as well.

Okay, so by the end of the session you should know how to use hypothesis testing for comparing the difference between two means of two independent population groups, or two means of related population groups. When we talk about independent population groups, it means the groups come from two separate populations, or you create two separate groups that do not influence one another. With related groups, you want to test two groups that are related, or that can influence one another. An example of related groups is where you do a pre-test and then a post-test: the same group is exposed to the pre-test and also to the post-test, so one will influence the other. How you did before will influence how you do now. So we're going to look at how we make a decision based on the two scenarios.

Like I already explained, for independent samples you have two groups. Maybe you have males and females, or you have a group from Gauteng and a group from the Western Cape, and you want to compare their opinions on certain methods, so you do a test based on those two groups. Also, when we talk about independent groups, it means the observations that are in one group cannot be in the other group. And when we talk about related samples, which are dependent samples, it's the same group, but you test the before and the after.
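To make the distinction concrete, here is a minimal sketch of how the two kinds of data are laid out. All variable names and scores are hypothetical, invented only for illustration; they do not come from the session.

```python
# Hypothetical scores, invented for illustration only.

# Independent samples: two separate groups of different people.
# No observation appears in both groups.
group_one = [14, 17, 12, 16, 15]   # e.g. respondents from one province
group_two = [11, 13, 15, 10]       # e.g. respondents from another province

# Related (dependent) samples: the same people measured twice,
# so the scores line up person by person.
before = [70, 65, 80, 75]          # pre-test score of each person
after = [74, 66, 85, 79]           # post-test score of the same person

# Because each "after" score belongs to a specific "before" score,
# related samples can be summarised by one difference per person.
differences = [post - pre for pre, post in zip(before, after)]

print(len(group_one), len(group_two))  # sizes of the two independent groups
print(differences)                     # per-person change, after minus before
```

In the related case the two lists must have the same length, because every person contributes one "before" and one "after" value.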
So you look at before they take medication and after they take medication. It's the same group of people: you first test them before you expose them to the treatment, and then again after it. So that is the two-sample test.

We're going to start by looking at the independent samples. When we talk about hypothesis testing for independent samples, we want to test whether there are differences between the two groups. It doesn't matter whether the samples that we select come from the same population or from two different population groups. Usually, for independent samples, there are two scenarios that can happen.

The first is when the population variances are unknown but are assumed to be equal. If the population variances, or the population standard deviations, are equal, then we're going to use the t-test, and when we calculate the standard error we use the pooled variance. The reason why I'm not going to show you the pooled variance is that you are not expected to know how to calculate it, I think also because it's time-consuming. But just for interest's sake, know that there is what we call a pooled variance, so when you calculate the test statistic you use the pooled variance for equal population variances.

The one that we're going to concentrate on is when the population variances or standard deviations are unknown, and we know that when they are unknown we use the t-test statistic, but they are unequal. There we're going to use the normal standard error, and I'm going to show you how we calculate that t-test as well.

And always remember, when we do hypothesis testing, your null hypothesis and alternative hypothesis always refer to your population parameters. So it will say the population mean of group one minus the population mean of group two, or the population
mean of group one is equal to the population mean of group two. That's what we want to test. I already explained that for the one we use the pooled-variance t-test, and for the other one we use the normal separate-variance t-test, which I will show you at a later stage.

The other thing that you need to know is that when you do independent samples, there are certain assumptions that need to be met. For equal variances, the assumptions are: the samples need to be randomly selected and independently drawn; the population needs to be normally distributed, with a sample size of more than 30; and the population variances need to be unknown. This is for the independent groups where the variances are assumed to be equal. For those that are assumed not to be equal, the only thing that changes is that the population variances are assumed not to be equal. That's the only difference between the two.

When we do hypothesis testing, remember, the first step is always to state the null hypothesis and the alternative hypothesis. For independent samples, regardless of which one we're doing, whether the population variances are unknown and you do a pooled-variance t-test or you do a separate-variance t-test, reading the statement that the researcher wants to prove will guide you in terms of how your alternative hypothesis needs to be stated. Are we testing for the lower tail, which uses the less than? Are we testing for the upper tail, which uses the greater than? Or are we doing a two-tailed test? The first two are one-directional tests, and the two-tailed test is a non-directional test. So you need to read the question carefully when they give you the statement, in order to identify whether you are doing a one-directional or a two-tailed test. It's very, very important.
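For reference, the three ways of stating the hypotheses for two independent groups can be written as follows. The symbols \mu_1 and \mu_2 for the two population means are notation chosen here; the slides may label them differently.

```latex
\begin{align*}
\text{Lower-tail (one-directional):}\quad
  & H_0: \mu_1 - \mu_2 = 0 \quad \text{vs.} \quad H_1: \mu_1 - \mu_2 < 0 \\
\text{Upper-tail (one-directional):}\quad
  & H_0: \mu_1 - \mu_2 = 0 \quad \text{vs.} \quad H_1: \mu_1 - \mu_2 > 0 \\
\text{Two-tailed (non-directional):}\quad
  & H_0: \mu_1 - \mu_2 = 0 \quad \text{vs.} \quad H_1: \mu_1 - \mu_2 \neq 0
\end{align*}
```

Each line can equally be written without the difference, for example H_1: \mu_1 < \mu_2 for the lower-tail case.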
For example, if the statement talks about a decrease, then you need to know that you are going to do a lower-tail test. Your null hypothesis will stay the same; remember, the null hypothesis always has the equal sign. The alternative hypothesis will have a less than, and we can state the hypothesis in this manner, or we can say the mean of population group one minus the mean of population group two is less than zero. If they talk about an increase, then we know that we're doing a one-directional upper-tail test, and we can state the alternative as the mean of population group one is greater than the mean of population group two, or state it in the other manner. For a two-tailed test, they might say "is there a difference" or "is there a change". Then your null hypothesis will say there is no change or no difference, and your alternative will say there is a difference, so you put an equal in the null and a not equal in the alternative.

Once you have stated your hypotheses and you know which test you're doing, one-directional or two-directional, you need to make a decision. Visually, for a one-directional lower-tail test, if we use the critical values, these are what we call critical values, then we reject the null hypothesis if the test statistic that you calculated is less than your critical value. For an upper-tail test, where the critical value is in the upper tail, the positive side of the graph, we reject the null hypothesis if the test statistic is greater than your critical value. For a two-tailed test, we look at the two regions of rejection, the upper tail and the lower tail, and if the test statistic falls in either one of the two, we reject the null hypothesis.

In your module, I think most of the time you use the p-value, and for the decision with the p-value we always state that if
your p-value is less than alpha, which is your level of significance, then we reject the null hypothesis. So you can make a decision based on the p-value and alpha, or based on the critical value and the test statistic, and get to the same decision.

Just to remind you, the steps that you always need to remember when you do hypothesis testing are these. The first step is to state the null hypothesis and the alternative hypothesis, in order to know whether you're doing a one-directional test or a non-directional test. In step number two, you need to define what kind of test you're doing. You need to read the statement to know what you are given: are you given the population standard deviation, one sample or two groups, and what level of significance are you given? State those, so that once you know the facts given in your statement you can say for sure that you're doing a t-test. When you're doing a t-test, it's also very important to know whether you are doing a pooled test or a separate-variance t-test, which in this instance we just call a t-test, and which will have a subscript c at the end, or it can be written as t-stat. Once you have identified what kind of test you're doing, you need to calculate it: take a calculator, substitute the values into the formula and calculate your test statistic. Once you have your test statistic, you can find the p-value and make a decision, or you can find the critical value and make a decision, and that is the final step. When you make the decision, you always refer back to your null hypothesis statement, what you hypothesized the value should be at the beginning. So when you make your conclusion after you make a decision, when you
conclude, it needs to refer back to how you stated your null hypothesis.

Okay, so let's look at an example of hypothesis testing for independent samples. Oh yes, I'm just giving you an example of the steps. The null hypothesis will state that there is no difference between the treatment and control groups, and the alternative will say there is a difference. The key thing here is recognizing that your null hypothesis should state that there is no difference, because if they are equal there is no difference, and your alternative will state that there is a difference. If the difference is an increase in the treatment or the control responses, then it will be an upper-tail test; if it says there is a decrease in the control group, then it will be a lower-tail test; and if it just says there is a difference, then you will use a not equal, which gives you a non-directional test. Here I'm just showing you one example of what you can have in your alternative hypothesis.

Then you need to state what type of test you're doing, based on the information that you have. You will state that you're doing a t-test, and then you compute the test statistic. In this instance, if we're looking at independent samples where the population variances are not equal, we use the t-test, which will be your sample mean difference minus your population mean difference. Because in your null hypothesis you always state that the population mean difference is equal to zero, that term will always be equal to zero. Is there a question? Please make sure that you mute your microphone. So that term will always be equal to zero, and therefore the equation will always be the sample mean difference divided by the standard error, which is the square root of your sample variance one divided by the
sample size one, plus sample variance two divided by sample size two, and that will give you the test statistic. Then we can use the p-value that we get from a computer application that generates that information for us, and make a decision; or, if we use the critical value, we make a decision based on the critical values.

Okay, so let's look at an example. We are interested in whether the type of movie someone sees at the theater affects their mood when they leave. We decided to ask people about their mood as they leave one of two movies: the comedy, which is group one and had 20 people in it, and the horror movie, which is group two and also had 20 people watching. Our data are coded so that higher scores indicate a more positive mood.

Sorry, quick one: do we ask questions as we go along?

Yes.

Okay, quick one. I see the two groups here both have n equal to 20. Looking at the information you provided earlier, do these two n's always have to be equal in size, or something like that? Is that the case?

They can be different sizes. The n's can be different because they come from two different groups. It could have been that 18 people came from the comedy and 20 watched the horror movie.

Okay, thank you.

So for the independent groups the sample sizes don't have to be equal. Only for the paired test later on do they need to be equal, because it's the same group.

Okay, thank you.

Okay, so we are told that a higher score indicates a more positive mood, and we are given some sample statistics. From the statement we know the sample sizes, and they give us the sample mean for group one and group two and the sample standard deviation for group one and group two. So, as researchers, we
know we have good reason to believe that people leaving the comedy will be in a better mood, so we use a one-tailed test at alpha 0.05 to test our hypothesis. So they gave us the information that we require: they are telling us that we're going to be doing a one-tailed test at alpha 0.05, and we also have all the facts that we require.

My slides are scrambled. The first thing we do is to state the null hypothesis and the alternative hypothesis. The null hypothesis will state that there is no difference, so the mean for population group one minus the mean for population group two is equal to zero, or the mean of population group one is equal to the mean of population group two. For the alternative, we are already told that we're doing a one-tailed test, and we were told that a higher score means a more positive mood, so we can assume the direction is greater than. So we say mean one will be bigger than mean two, or we can say the difference between mean one and mean two will be greater than zero. That's step number one.

Now we go and find the other information that we are given, and we state it, because it's going to help us identify what kind of test we're doing. We know that we're doing a t-test, and here we can also assume that our population variances are unknown and different. That is why we're using this t-test. We substitute the values into the formula of our t-test. Remember, the top will be your sample mean one minus sample mean two, divided by the square root of your standard deviation squared, or your variance, of group one. Because we were given the standard deviation, it will be 3.20 squared divided by 20. Plus, for the standard deviation of group two, we had 3.18, and because we calculate using the variance, we square the standard deviation and divide by the sample size, which was 20. And when you calculate, you get 4.461.
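Written out, the separate-variance test statistic described above looks like this. The notation (x-bar for sample means, s for sample standard deviations, n for sample sizes) is chosen here, and reading the two standard deviations as 3.20 and 3.18 is an assumption about where the decimal points fall in the spoken figures. The sample means are not restated in the spoken explanation, so the numerator is left symbolic.

```latex
\begin{align*}
t &= \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}
          {\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
   = \frac{\bar{x}_1 - \bar{x}_2}
          {\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
   \qquad \text{since } \mu_1 - \mu_2 = 0 \text{ under } H_0 \\[1ex]
\text{standard error} &= \sqrt{\frac{3.20^2}{20} + \frac{3.18^2}{20}}
   = \sqrt{0.512 + 0.50562}
   = \sqrt{1.01762}
   \approx 1.009
\end{align*}
```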
That is our t-test statistic. Then we go and find the critical value, which we would have found earlier from the t-table. Remember, for your module you do not have to learn the tables, because you will never get the tables in the exam; they will supply you with the information that you require to make a decision. So in this instance, if we're using the critical value, the critical value will be given to you to make a decision. With our critical value we can create a region of rejection. We know that we're doing a one-tailed test, which is a one-directional test, and it's in the greater-than direction, so our region of rejection will be on the greater-than side. Anything that falls above the critical value, we reject at alpha 0.05; anything that falls on the left-hand side of the critical value, we do not reject. So for everything that falls in the white area, we do not reject. Our t-test statistic is 4.461, or we can say about 4.5, so it falls in the blue shaded area. Therefore we reject the null hypothesis, because it falls in the rejection area, and we can conclude that the average mood after the comedy is better than the mood after a horror movie. That's how you make a decision.

Like I said, sometimes you can make a decision based on the critical value, but sometimes we need to make a decision based on the p-value. Taking a package like Excel, when you put the values into Excel and run the t-test, it will generate a t-test table with all the summary statistics about the test that you just ran, and one of the measures included will be the p-value. If you use the same information, the p-value that we get from that computer-generated output is 0.007, and we know the rule says if our p-value is less than alpha we reject the null hypothesis. So
our p-value is 0.007, which is less than our level of significance, alpha of 0.05. Therefore we reach the same conclusion whether we use the critical value or the p-value: we reject the null hypothesis and draw the same conclusion. And that's how you do hypothesis testing.

Because I didn't check my slides before I published... I have an exercise for you. Consider the following statistics regarding the post-training attitude score. We have group one and group two, and we are given the mean and the standard deviation of each group. What are the values in this table? What do we call those values, the means and the standard deviations?

I would say number one.

Why number one?

Because those are part of the population. I don't know how to explain it, but I know it's number one.

If we are doing hypothesis testing, will you call those part of the population?

I would say yes.

Any other person? Okay, so those are not the population values. You actually have a hint given to you at the beginning of the statement, in what you are looking at.

The test statistics?

No, they are not test statistics. Because we hijacked the sessions, we started right in the middle, actually right at the end, you must remember that you get what we call a population and a sample. If we have measures that come from a population, we call them parameters. Because the population is too big, we select a sample, and the measures that we calculate from the sample we call statistics. So everywhere you read about statistics, you must know that we are referring to the measures that come from a sample. And remember, we use the sample measures to make conclusions about the population, because most of the time we do not know the population parameters, especially when we come to inferential statistics. So we use
the sample statistics to make conclusions about the population parameters, because we do not have the information about the population parameters. So the answer for this question would have been option number two. When you read questions, you must also pay attention to key words in the question, because sometimes they nudge you to the right answer. A test statistic, in this regard, is the method that we use to make conclusions. From the example, the test statistic is that formula that we use to calculate; it's not the mean or the standard deviation. If they had said "consider the following parameters", then you would have said these are the population parameters. But because they said "consider the following statistics", and knowing also that we are doing hypothesis testing where we look at two groups, these are your sample statistics. So this is your x-bar and this is your s. Okay, good.

Exercise two: if we know that group one has 20 and group two has 20, calculate the value of the test statistic. So now is your chance to do the calculation. t is equal to the mean for group one, oh, minus, it should be minus, it's the difference, the mean of group two, divided by the standard error, which is the square root of the variance of group one over n of group one plus the variance, which is the standard deviation squared, of group two over n of group two. So you just need to substitute the values. Remember, this will be your x-bar one and this will be your s one. I will help you with substituting the values; this is the mean for group two. Let's see if you are able to calculate this. Remember, you can post your answers. Let me give you a little bit of time. Remember also to mute yourself, especially when you have kids around. Ellen, please mute. I've also posted the register in the chat for those who joined late.

Are you winning?

We're getting there, we're working on it.

Okay. Are you still busy? Are you winning? You need
help?

I have 1.3088. I'm not sure if you're...

1.3088, that's what you have? Okay.

I also have the same.

Okay, so we can answer the question now. The mean for group one is 21.65, minus the mean of group two, which is 20.40, divided by the square root of our standard deviation squared, which is 2.99 squared, over our n for group one, which is 20, plus 3.05 squared over 20. Then I'm going to use my calculator. With this calculator it's easy, because I have this fraction button, so if I press that I can type the whole equation into my calculator. The top is 21.65, oh sorry, 21.65 minus 20.40. Then I use my arrows to go down to the bottom, and the bottom is the square root, so I press the square root function. But I also have two fractions there. I do my first fraction, and in that first fraction my first value is 2.99, and I square that by pressing the square button. Then I use my arrow to go down and put in the 20. So I'm done with the first part, but I need to add the second one. Using my left arrow, I go to the end and then press the plus button, not the multiply, delete, delete, plus, and do another fraction, which is 3.05 squared, then go down and put in the 20. Once I'm done, I can also use my arrow two or three times so that it goes back to the top, and press equals: 1.3088. If we round it off to one decimal, we get a positive 1.3, so our answer is option number three. That's how we answer the questions.

Excuse me, the calculator link that you sent for the online calculator, it's not the same as this one here. I don't see the arrows on that one.

There are arrows; they will be right in the middle. Let me open it from my side as well, on my phone. It's called Casio... okay, I don't even know the name, it has no name. Cal E S. Oh sorry, I opened the wrong thing. The
arrows are under the white shift. There are two, yes, there are two arrows: there is an arrow that looks like that, and one that looks like that, and then there is the up arrow that looks like that.

100%.

Yes, do you see them?

Yes, I do.

Those are the arrows on your calculator.

Okay, thank you.

You see them?

Yes, I do see them, thanks.

So those are the left, right, down and up arrows. Because you don't have enough space on your phone, they made the buttons bigger so that your finger can fit and click.

Okay, last exercise relating to this, then we move to the next part of the session. Suppose the two-tailed p-value for the t-test of a difference between two means in the previous question is 0.19. So, relating to what we just calculated, they say the two-tailed p-value was this, and if alpha is set to 0.1, what will be the decision regarding the null hypothesis? They just want you to look at the test: are we doing a two-tailed test or a one-tailed test? Based on the information that we have, we're going to assume that we're doing a two-tailed test in this regard, or rather, yes, a one-tailed test, because our result is a plus. So this is the answer for a one-tailed test, and this will be an answer for a two-tailed test. Option three is for the one-tailed test. The question says suppose that we were given a two-tailed value. Remember, to find a one-tailed p-value, we take the two-tailed p-value and divide it by two: the one-tailed p-value is half of the two-tailed p-value. We know that we're doing a one-tailed test, and we are given a two-tailed p-value. So what will be the decision?
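A minimal sketch of this exercise in Python, using the figures from the previous question (means 21.65 and 20.40, standard deviations 2.99 and 3.05, 20 people per group) and a two-tailed p-value of 0.19, the value used when the answer is worked out. The function names are illustrative only, and halving the two-tailed p-value assumes the observed difference lies in the direction stated by the alternative hypothesis.

```python
import math


def separate_variance_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic for two independent samples, assuming H0: mu1 - mu2 = 0."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


def one_tailed_p(two_tailed_p):
    """Half of the two-tailed p-value (difference assumed to be in the H1 direction)."""
    return two_tailed_p / 2


def decide(p_value, alpha):
    """Decision rule: reject H0 only when the p-value is less than alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"


t_stat = separate_variance_t(21.65, 2.99, 20, 20.40, 3.05, 20)
p_one = one_tailed_p(0.19)

print(round(t_stat, 4))      # 1.3088, positive, so the difference is in the upper tail
print(p_one)                 # 0.095
print(decide(p_one, 0.10))   # one-tailed: 0.095 is less than 0.10
print(decide(0.19, 0.10))    # two-tailed: 0.19 is not less than 0.10
```

The last two lines show why the choice between a one-tailed and a two-tailed test matters here: the same data lead to different decisions at alpha 0.10.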
To get to that decision, we need to find the p-value first. You need to find the p-value for a one-tailed test, so we take 0.19 and divide it by two. What do you get?

0.095.

That will be 0.095. Once you have that, you need to make a decision. Remember the rule: if the p-value is less than alpha, we reject the null hypothesis. Based on the information that we just collected, now make a decision.

We accept the hypothesis.

Remember what your p-value is; that's the first thing that you need to check. We need to reject, because 0.095 is less than 0.10. So we reject the null hypothesis.

Quick question.

Yes?

Quick question on that one: 0.095, would I not round it off? Because if you round it off, it goes to 0.1.

No, we're not going to round it off. We use it the same way as we see it now.

Why not?

Also because the other options are not even correct. But we're not going to round it off; we leave it as is, because that's the value. Do not round off the value of your alpha either, because sometimes you will get an alpha value of 0.005, and you cannot say that is the same as 0.01. They are two different values. An alpha value of 0.025 is not the same as 0.03; they are different. So when you work with p-values, don't round them off; use them as they are. So we know that this one was not correct, and this one is not correct anyway, because our p-value was 0.095. You need to use the information that you have to guide you in terms of what you need to be doing next.

Okay, so I need to run through this one. The other thing that you need to know and remember in your module is that we talk about the
effect size, which is Cohen's d. We can find it when we look at hypothesis testing for one sample, we can find it when we look at correlation, where we get the relationship between two variables, and so forth. We use the effect size to compare the sizes of the means, whether they are close to one another or whether there is a huge difference, and there is a metric that you use to test the level of importance of the difference that you calculate. To find the correct effect size for the test that you are running, you need to use this formula, which says your mean for group one minus the mean of group two, divided by the pooled standard deviation, which is also what we call the estimated standard deviation.

If we look at the example that we had previously, let's assume that we calculated the pooled value. The formula to calculate the pooled standard deviation, which is s_p, and this is the reason why they don't want you to calculate it, is your s1 squared times one over n1 plus s2... I might be doing something wrong... minus one, I think there's a minus one somewhere. It's a very complex formula, and I think we divide everything... I don't want to write a formula that I don't know by heart. It's a very long, complex formula, and that is the reason why they don't want you to do the pooled variance or pooled standard deviation formulas; they will give it to you. In this instance I've calculated it and found that it was 3.88, so the effect size is 1.15. It means the two group means are about one standard deviation from one another. So when we do some interpretation as well, we can
also look at it in terms of how large it is. If your d value is greater than 0.8, it means there is a big gap, a huge difference, between the groups. So if I look at 1.15, it is large because it is greater than 0.8, and therefore the two groups differ. If I need to interpret it, it means the two group means are about 1.157 standard deviations away from each other. That is how we use the effect size. Okay, so here is your exercise relating to the Cohen's d effect size. Based on the sample, you are given group one, group two, and all children pooled, so they give you the pooled statistics. Here we have our s_p, which is the pooled standard deviation. They also give you the sample standard deviation, the sample mean and the sample size for each group of children. They tell you that the researcher calculated the test statistic and found that it was 4.196, and she used a computer program to determine the p-value, not the level of significance but the p-value, the probability, for a two-tail, non-directional test, and found that it was 0.02. If we wanted the one-tail p-value we would have to divide this by two. So it is a two-tail test, which is highly significant. She is, however, concerned that the significant result may be due to the relatively large sample sizes, so she decides to calculate the effect size to determine whether the results are meaningful irrespective of this. She needs to calculate Cohen's d using the formula. You are given the sample mean for group one and the sample mean for group two, and you are given s pooled, which I'm going to call the pooled standard deviation. So you are given s, which is your standard
deviation for the pooled values. Now you need to calculate this, and once you have calculated it, use this table to guide you on whether the d is large, medium or small. I would say medium. Did you calculate already? Yes. Okay, so what did you do to calculate? 55 minus 49. Sorry, on the past exam papers, where you see a gap it means there is a decimal point missing, so that should be 5.5 and 4.9. You must pay attention to that, especially when you are working with old exam papers. So it is 5.5 minus 4.9, divided by 1.0. What did you get? We can calculate it: 5.5 minus 4.9, equals, divided by 1, which is the same as the answer that I got. On your calculator, if you get fractions like this, there is an S⇔D button which changes between fractions and decimals; you just press that and you will get an answer of 0.6. It's giving me a syntax error. It gives you a syntax error? Did you do 5.5 minus 4.9? You must make sure that you didn't have any other number on there; you can use the AC button to clear your calculator. 5.5 minus 4.9 should give you 0.6. Are you winning? Okay, so the answer is 0.6, which is between 0.4 and 0.8, which is medium. Any questions before we move on? It's still a syntax error. What calculator are you using? A Casio. I don't know why you're getting a syntax error. Sorry ma'am, I got 0.6, I got the answer, but I don't understand how it becomes medium, small or large; I think I missed that one. Between 0.4 and 0.8 it is medium. Oh okay, I now see the table. You take this value and you look it up against the effect size in the table. Thank you so much. I don't know why you are getting a syntax error. On your calculator, if you just do 5.5 minus 4.9 and you press equals, what do you see? Syntax error. Okay, so you need to reset your calculator; I think there is something wrong with
your calculator. Since the Casio calculator does not have a reset button, it is tricky to do the reset. Press the Shift button, and then press the Mode button, the one that has "Mode Setup" written on it. I see that, yes, it's written Mode Setup. It's giving me a list of options. Press number one, where it says one, and then press one again, and again, and again. Do that, and then go and do your calculation again: 5.5 minus 4.9. For the decimal point, at the bottom, between the zero and the ×10^x button, there is a dot; you need to press that one, next to the zero. Don't press the comma at the top, next to the negative sign. 0.6, you got it? Okay, thank you. Then you are ready to do any calculation when we get to a question where we do calculations. All right, any questions? If there are no questions, then let's look at hypothesis testing for dependent groups, or related samples, or the paired-samples t-test. With related populations or samples we need to make some assumptions. Both populations need to be normally distributed; if they are not normally distributed, then your sample size n should be large enough. Because we are looking at related samples, we often look at the before and the after, the pre-test and the post-test. We no longer compare the two means directly; we calculate the difference for each pair of scores and use those differences to calculate the test statistic. That means we need to be able to calculate the mean difference, which is the sum of all the differences of your observations divided by how many there are. We also need to calculate the standard deviation of the differences. Usually they will give you a table with observations, and then you
calculate. You will have your pre and post in a table. Let's say we have five children in a class: the first student, the second, third, fourth and fifth. We give them a pre-test on some concepts and we get the scores: let's say this one scores 80, this one scores 10, this one 30, this one 50 and this one 60. Then we take them through a lesson where they learn something new, something they didn't know about, and we give them the same test again. This will be the post-test. On the post-test the first one gets 100, because now she knows most of the things, number two gets 30, the third one's score improves to 50, and the other two improve as well because of the information we gave them. To calculate, we need to find the difference for each student. So we say 80 minus 100, which is minus 20; that is our D, our difference. 10 minus 30 is minus 20, and I must not forget the minus. 30 minus 50 is minus 20, and you get the idea. Then we take the differences and calculate their mean, the sample mean difference, and I will show you the calculation later on. Then we calculate the standard deviation of the differences, which is easy to do on your calculator; I will show you on the calculator when we do an example. Once we have calculated these, we can calculate the test statistic, because it uses the differences. When you state your null and alternative hypotheses, you will have been given the population mean difference under the null. The test statistic is the sample mean difference minus that population mean difference, divided by the standard error, which is the standard deviation of the differences divided by the square root of n. That is the test statistic for the paired samples. The decision is the same as before: we can make the decision for a one-tail test or a two-tail test, or
what we call a directional test and a non-directional test. So let's look at an example. This is where I said I would show you how to calculate some of the things that we need, like the mean and the standard deviation, so that it saves you time in case they ask you to do that. Assume that you send your salespeople to a customer service training workshop, and you want to test whether the training has made any difference: is there a decrease in the number of complaints? You collected the data, and this is the information. The key thing here is "decrease"; pay attention to that word. They didn't say increase, they said decrease, so it is a one-tail test and it will have a less-than sign in your alternative hypothesis. We have the names of the salespeople, the complaints before they went on the workshop, and the complaints that came through after the workshop. We need to calculate the difference. In this instance we take column two minus column one; in my earlier example I used one minus two. So here we are taking after minus before: 4 minus 6 is minus 2, 6 minus 20 is minus 14, 2 minus 3 is minus 1, 0 minus 0 is 0, so there are still no complaints, and 0 minus 4 is minus 4. Quick one. Yes? How are we supposed to take this, 6 minus 4 or 4 minus 6? You can use 6 minus 4; it doesn't really matter that much, as long as you do it the same way for every pair and the sign in your alternative hypothesis matches. I'm using after minus before; you can use before minus after. Okay, so now we need to calculate the mean and the standard deviation. I'm going to hide all the other values because I don't need them; I only need the differences. I need to capture this information on my calculator to save time, rather than use the formulas, because the one for the standard deviation is very complex. You can come back to the video and watch how I did this, so I'm going to do
it as quickly as possible so that we have enough time for the activities. I need to put my calculator in stat mode, so I press Mode, then 2 for stat, and I select the first option, which is 1. It gives me a table to capture my information, which is minus 2, minus 14 and so on. There is the negative button, so I enter minus 2 and press equals, and it moves to the next line. I just continue: minus 14, equals; minus 1, equals; 0, equals, because you must capture everything exactly as you calculated it; and minus 4, equals. One, two, three, four, five: there are five values, so I've captured everything, and I can leave the table on my calculator. Now I'm ready to calculate the mean. Remember, the mean is the sum of all these values, which is minus 21, divided by how many there are, which is five. To get the mean on the calculator I press Shift and Stat, which is button number 1, then I go to option 4 because that's where the means are, and the mean is option 2, so I press 2 and then equals. I can see that if I take minus 21 and divide by 5, I get the same answer, which is minus 4.2. Next is the standard deviation, which is this formula: the sum of each observation minus the mean, squared, divided by n minus 1, and then the square root. It says: take minus 2 minus minus 4.2 and square the answer, plus minus 14 minus minus 4.2 squared, plus minus 1 minus minus 4.2 squared, and so on until you get to minus 4, then divide everything by 5 minus 1. Once you've calculated what is inside the square root, take the square root. It is a long calculation with a long formula. On the calculator we just press Shift, Stat and 4 again, and we look for sx, which is on button number 4, and press equals. The answer is 5.67. If you do the long calculation you will get 5.674, which is the same as the answer that we have here.
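The calculator steps above can also be checked in a few lines of Python. This is only a minimal sketch, assuming the five after-minus-before differences from the complaints example; the variable names are illustrative and not part of the module material. The last two lines apply the paired test statistic described earlier (mean difference minus the hypothesised difference of zero, divided by the standard error).

```python
import math
import statistics

# After-minus-before differences from the complaints example
differences = [-2, -14, -1, 0, -4]

n = len(differences)

# Mean difference: sum of the differences divided by how many there are
mean_diff = statistics.mean(differences)

# Sample standard deviation of the differences (divides by n - 1),
# the same quantity the calculator reports as sx
sd_diff = statistics.stdev(differences)

# Standard error and paired t statistic, assuming the null hypothesis
# states a population mean difference of 0
std_error = sd_diff / math.sqrt(n)
t_stat = (mean_diff - 0) / std_error

print(f"n = {n}")
print(f"mean difference = {mean_diff:.2f}")
print(f"standard deviation = {sd_diff:.2f}")
print(f"t statistic = {t_stat:.2f}")
```

Using `statistics.stdev` rather than `statistics.pstdev` matters here, because the sample formula divides by n minus 1, as in the long calculation.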
So that is our standard deviation. Now we can go and do our hypothesis test. We need to check whether there is a decrease in the number of complaints at alpha 0.10. State the null hypothesis and the alternative. Remember, it said there is a decrease, which is less than, so in our alternative hypothesis we state that the population mean difference is less than zero. Step number two is to state what you are given, which helps you to know which test you are calculating, whether it is the t-test for the difference or the t-test for independent samples. We can also find the critical value, which gives us the region of rejection; our degrees of freedom, n minus 1, which is 4, help us to find the critical value. Then we calculate the test statistic. Remember, we calculated our mean difference and found that it was minus 4.2. The population mean difference is zero, and it is always going to be zero. We divide by the standard error, which is the standard deviation we just found, 5.67, divided by the square root of 5, because there were five observations. The answer we get is minus 1.66. Using our critical value, we define the region of rejection and make our decision. Since the sign in our alternative hypothesis is less than, the rejection region is on the negative side, so our critical value is minus 1.533. Our test statistic was minus 1.66, so it falls in the rejection area. The null says there is no difference and the alternative says there is a decrease in the number of complaints, so we reject the null hypothesis and conclude that there is a significant decrease in the number of complaints. That is how you make the decision. Now, here is an Excel output taking the same kind of values
that we had. The thing with the Excel output is that it doesn't calculate the mean difference. All it does is calculate the mean of the pre and the mean of the post, which we don't need; all these statistics relate to each sample, the pre and the post. What it also does is calculate the test statistic. Remember, it is the same as what we got; if I go back, we got minus 1.66, and in this instance Excel left it as a positive. It also calculated the p-value, so we can use the p-value from here. We know that we are doing a one-tail test, so we take the one-tail p-value, which is 0.086. We know that our alpha is 0.10. Our p-value of 0.086 is less than 0.10, so we still reject the null hypothesis. We get the same decision as we did with the critical value. Any questions? If there are no questions, then we can move on to the activities. Remember, for using your calculator in stat mode, when you get to that part you can pause the video and go step by step to make sure you understand how I did it. But it is going to be very rare for you to be asked to calculate the mean and the standard deviation in the exam. So let's look at more activities; we will work through them together. If you want to take a two-minute break you can do so, and then we can use the 30 minutes that are left to go through the activities. If there are no comments, let's continue and work through the exercises. A researcher wants to test the following hypothesis, and they gave you the hypothesis: the alternative states that group one is greater than group two. On the basis of the data provided, the output from a computer program indicates a t value of 1.72, and the p-value for the two-tail test is given as p equals 0
comma 0 5 6, that is, 0.056. What should the researcher do to evaluate the results at a significance level alpha of 0.05? It is similar to the exercise that we just did. What do you need to do? The first thing you need to look at is the sign in your alternative hypothesis. This one is directional, so it is a one-tail test, and we are given the p-value of a two-tail test. So what is the p-value? 0.056. And what will the one-tail p-value be? For one tail, we take the two-tail p-value and divide it by two, which gives 0.028; we did this not so long ago. We don't even have to work it out, because the question asks what you should do to evaluate the results. Look at the options. Which option will it be? Option number one. Yes, it will be option number one, where we take the p-value and divide it by two. You always need to look at the sign you are given in the alternative and compare it to the p-value that you are given. If they give you a one-tail p-value and the hypothesis is two-tail, or non-directional, which is not equal, then you take your one-tail p-value and multiply it by two, and then option two would have been correct. As for this option, dividing alpha by two, we do not do that. You cannot divide your alpha to compare it to the p-value; we leave alpha as is. So option one is the correct one. Before exercise two, let me go back to this. If you look at this question, I took it from the 2017 past exam paper, and the activity that we did previously I took from a Tutorial Letter 101 activity. You can see that your questions are almost exactly the same, so you just need to pay attention to small things like the values and the detail given for each question. You will notice that most of the questions look familiar, as if we have done them before. So, exercise two. A researcher suspects there is
a difference between the creative ability of boys and girls in a school for gifted children. She uses a test for creativity that has been standardized in such a way that the mean creative ability score for the general population is 50. Which of the following is a possible way to state the null hypothesis? What will the null hypothesis be? I would say two. Why two? The researcher suspects that there is a difference, so the null should be that there is no difference. Yes, but what does option two tell you? Your value there is 50, so a difference would have to be greater than 50 or less than 50. But we are testing the difference between two groups, and this option is about one population compared with a value; it is not the difference between two groups. So this is incorrect. I'll say option three. When we state the hypotheses we never state them using the sample statistics, so three would not be correct. The only option that is correct is option number one. Always use the population parameters to state the null hypothesis, based on the information that we had. Number three: in which of the following research situations is it most likely that a test for comparing independent groups will be used? Now you need to ask yourself a lot of questions here. We are comparing two independent groups, and when we compare two groups, what do we check? We check whether there are any differences. So, reading those three statements, which one is the correct one? I would say two. It will be two, because two is talking about evaluating the differences, whereas the others are evaluating the development of verbal skills or evaluating the effectiveness of a new medication, but never the difference between groups. So you also need to make sure that you understand what the hypothesis test for independent groups is aiming to do. We always look at the
difference. In the same way, next time when we meet we will do relationships, and you always need to think about what we are talking about when we look at a relationship. Key things, small things, will tell you what you are doing. What is Cohen's d? It is the effect size; you can check the definition of Cohen's d. In which circumstances can the z-test for comparing two independent means not be used? I know that we didn't do z, but think about everything that we did today; today we were talking about the t-test. Remember, last time we spoke about hypothesis testing for the mean for one sample, and we said that if the population standard deviation is known we use z, and if the population standard deviation is unknown we use t. Based on that information, in which circumstance? And remember that sigma is your population parameter. Option four talks about the sample standard deviation, which is not correct, and option four also speaks about unknown. Option two talks about the correct thing, but it says unknown. When do we use z? I just gave you the answer: when it is known. Option one says it is known and available to the researcher, so option one will be the correct answer. When your population parameters are known and available to the researcher we use the z-test, even for independent samples; if they are unknown, that's when we use the t-test, and in your module we always use the t-test for independent samples. Oh, when not to be used. Yes, you are right, when not to be used. Sorry, my bad. So the circumstance in which a z-test for comparing two independent means cannot be used is when the population standard deviations for the two groups are unknown; then we cannot use the z-test. This one won't be right, because we never refer to the means, we
never refer to the sample, and this one we would definitely use if we were looking at the t-test. Sorry, the answer is that option two was right. My bad; I keep telling you to watch the small things, and I forgot to read the whole question. Those small things happen, so this is also a key thing that we need to remember. Okay, so B: two samples may be regarded as independent when there is no systematic relationship between the composition of one and the other. That would be correct. Next: a market researcher is asked to conduct a study to examine people's reactions to a movie trailer. He draws a random sample of 20 males and 20 females who saw the trailer. He asks them to indicate how likely it is that they will go and see the movie on a seven-point scale, where one indicates "not at all" and seven indicates "definitely". He wants to establish whether males and females differ in their intention to see the movie based on exposure to the trailer. Suppose that the researcher finds that the means and standard deviations for the samples are as follows; they have given you the measures. Which is the appropriate way to indicate the researcher's hypothesis that is to be tested? Remember the key words: decrease, increase, change, difference. This is your key statement. If there is anything there that involves an increase, a decrease or just a difference, you will know: increase will be greater than, decrease will be less than, and just a difference or a change will be not equal. I think the answer is three. The answer will be three. Number one you do not even need to look at, because it uses the sample mean. And since the statement nowhere mentions less or greater, increase or decrease, a directional option won't be the right one. So option three is the only option that is correct. Based on the same information, which is the appropriate t-test to calculate to evaluate the significance of the hypothesis? Ask yourself: you're
testing two groups. Is it a test for the difference between two independent samples, for a single sample, or for two dependent samples? One, two or three? It would have been three if they were using only males and tested them before they go in and after they come out. It's number one. It will be number one, because there are two groups. Ask yourself about those two groups, males and females: can males be part of the females, or females be part of the males? No, they cannot, so the groups are independent. Next: a researcher is asked by a motivational speaker to establish whether a workshop on assertiveness training is effective. The researcher decides to use a particular questionnaire which tests an individual's level of assertiveness. He presents the questionnaire to each of a sample of 50 participants in the workshop before it begins, and once again after it has ended, to the same participants. When analyzing the results the researcher should use a test for... Number two. It will be number two, because he does the before and after on the same sample. Then, a sample of 70 to be tested, and the whole story again: which formula do we use to test the before and after? Okay, we can help you: option two. This one is to test one sample, these are to test independent samples, and this one is to test dependent samples. So are we dealing with dependent or independent? Remember, it's the same question as number three: with before and after we are doing dependent; if it was male and female we would be doing independent, so it would have been this one. So: before and after, or male and female; I'm just going to use one variable, gender, as the example. I'm lazy to read the whole sentence, but let's do that because we might miss something. A social psychologist wants to test how long people will wait before responding to cries for help from an unknown person. The psychologist wants to confirm his suspicion that people will take less time to react when they hear a female voice than when they hear a male voice. He tests this on a
sample of n = 15 people, who are told, one at a time, to sit in a waiting room until they are called for an interview. While they wait, each participant hears a call for help from a male or a female voice, which is actually a recording. The dependent variable is the number of seconds each participant waits until they go to investigate or try to help. The sample statistics are calculated as follows: for the male voice the information is there, and for the female voice the information is there. Given the findings, what type of statistical test will the psychologist have to do to confirm the relevant statistical hypothesis? He wants to confirm that people will take less time to help when they hear a female voice. But if you look at the readings there, the female voice takes 15.3 seconds, which is much smaller than the male one. Yes, but the answer cannot be that no test is needed, because what he is testing is how long people take to go and assist when they hear a person cry for help. It is not saying how long; it just says whether people take more or less time, whether they will take more time on one side or less time on the other side. Yes, but he wants to test how long people will wait before responding to a cry; that's what the researcher wants to test. The other thing that will help you know what type of test you are going to be doing is that last statement. So it cannot be "no statistical test"; there is a statistical test that he needs to do. The options where no statistical test is necessary are irrelevant, because there is a statistical test that he can do: he recorded the number of seconds people took to go and assist while they were waiting in the room, so he has the information to do the test. What you need to ask yourself is: are you doing a one-tail test? Which one will be the
relevant one? Are you doing a one-tail statistical test? We're looking for the type of statistical test, not the type of method, whether it's dependent or independent. We are way past that; we know that we're doing an independent test, but we want to know what type of statistical test needs to be conducted. Let me ask a question. If I can see from the readings that the values I get sort of nullify my initial suspicion, why would I continue and test? No, it doesn't nullify your suspicion. My suspicion is that people will take less time. Yes, that's your suspicion, but we need to test that. Remember, there are two sides to a coin: there is the claim and there is the opposite of it. The researcher's opinion is his claim, and his claim is that they take less time. We need to test whether the researcher's statement is right or wrong; that's what we need to do. So what type of test will we be doing to test this? Is it a one-tail test or a two-tail test, based on the researcher's assertion, based on his claim? This is very important, and it is why you need to take all of that into consideration. Remember, when we were doing this activity, I said look for words like these in your statement, because they will tell you whether you are doing a directional, one-tail test or a two-tail test. This statement is no different from everything that we have been doing. What type of test is it? "Less time" is what the researcher wants to prove. So it's a two-tail test? It's taking less time, so it means we're going to be doing a one-tail test to test that assertion. Then you state the null hypothesis, do the calculation for the test statistic and make a conclusion;
that's all you will need to do. But before you do that, before you know where you need to make the decision, the important thing is: is it two-tail, where I will find two regions of rejection and use the two-tail p-value, or is it one-tail, where we use the one-tail p-value to make that decision? So the answer for this one is a one-tail test. I'm not clear on that, because last time you said the two-tail is for male and female. No, no, that was for independent: independent was male and female, dependent was before and after. Okay, now I see. For less than and greater than I was talking about this: increase, which is greater than, or decrease, which is less than, tells you that it's one-directional and which direction. Okay, now I see. So if they had said it takes more time, it would have been greater than. Yes, more time would have been greater than; less time is less than, which is a decrease. If the researcher had said, "I want to confirm my suspicion that people take a different amount of time", without mentioning less or more, then it would have been a difference, not equal, because then there would just be a difference between how long they take when they hear a female voice and how long they take when they hear a male voice. We are left with two minutes, so let's see if we can answer this last question. A researcher wants to test the hypothesis, and they gave us the null hypothesis and the alternative hypothesis. On the basis of the data provided, the output from a computer program indicates that a t value of 1.72 was found, with a two-tail p-value of zero... and I think this is almost the same question as we did before. What should the researcher do to evaluate this result? One. I think four. Whoever said number four is correct. Pay attention: the sign is a
two-directional, or two-tail, sign; this is a two-tail test. They gave you a two-tail p-value and the alpha, so you do not multiply and you do not divide. The only thing you need to do, because you have the p-value and the alpha, is to compare your p-value with the given alpha. Tricky. On the previous one the sign was greater than, so it was a one-tail test and we needed to divide; now this is a two-tail test, so we just compare. That concludes today's session. I do have a few other exercises, not too many, exercise 14 and exercise 15. If you want, we can continue the conversation on WhatsApp. If you're not sure about the answers for 13 and 14, sorry, 12, 13 and 15, and 15 has an A and a B, we can have those discussions on WhatsApp. Otherwise, thank you for participating today and being part of the discussion. If there are any comments, questions or queries, now is your time to ask. Please make sure that you also complete the register if you haven't; I will repost it, because I know that sometimes it disappears for those who joined late. If there are no questions or comments, I will close off the session. Today we looked at independent samples. Remember, for independent samples one group does not affect the other and does not depend on the other; they are independent, so one group does not contain members of the other group. Dependent is the same group, but we test it twice, the pre and the post, the before and the after. With that, I hope you will have a lovely evening, but before you leave