Welcome to another session. There are two more sessions left for September, and then we are done. Remember, if you have any technical issues, send an email to cityentact at unisa.ac.za. And if you have any module content queries for your Psych3704, where you are unsure about certain things that you are doing or you don't understand how to calculate certain things, you can send me an email, and that is my email there, and copy cityentact.ac.za. So welcome to your next session, where we discuss hypothesis testing for comparing two samples. Looking at the last two sessions that we are left with, we're going to look at correlation analysis later on. Next week, Tuesday is the 20th. And then on the 7th of September, we're going to look at the chi-squared test. And that will be it. We will have covered everything you need to know in terms of technical skills. I will have shared everything I know to assist you in answering your questions and being ready to write your exam. Are there any questions for me before we start with today's session? Any burning questions or comments or queries that you have? Going once, going twice, and we are off. Let's dig in and look at hypothesis testing when we compare two samples. In order to do this hypothesis testing, I would normally say you require a statistical table, but we realized last time we met that in your module they don't want you to learn how to use the tables; you just need to know the concepts relating to hypothesis testing. And you should be able to calculate certain things related to the hypothesis test in order to make a decision. So we're not going to worry too much about statistical tables, but we're going to look at formulas and how we use our calculators to calculate those formulas.
By the end of the session today, you should know how to use hypothesis testing to compare the difference between, one, the means of two independent populations and, two, the means of two related populations. Remember, in this session we're covering how to compare the difference between two groups. We're going to start by looking at independent groups, which means those two groups have no effect on each other. They are independent of one another. The second type means the groups do have some sort of effect on each other, and here we're talking about the before and the after. We're going to look at some examples and some exercises so that you feel comfortable, when you're reading a statement, deciding: is this for independent samples, or is this for dependent or related samples? Okay. So with two samples, as I've just said in my intro, when we talk about two independent samples, we're talking about two groups that do not have an effect on or a relation to one another. One group cannot influence what the other group does. They are independent even in the way you draw the data: it will be drawn independently for each group. For example, take two independent groups where one group is males and the other group is females. Both of them are independent of one another; one cannot influence the other. In real life it sometimes happens, but in terms of statistics, one cannot influence the other. Another example can be where you have a group of people who vote yes and a group of people who vote no to having a traditional music concert at a school, or something like that. Those are independent groups because one cannot influence the other. When we talk about related samples, it means they are dependent on one another.
And here we refer to situations where going through something might influence your experience. So you have your expectation, and when you go through a process you experience something, and then you can score both and say: my expectation was five; after I experienced this, my score is now two. So there are two instances that influence one another. Or we can refer to this as the before and after. So we give you a test, you write the test. After you write the test, we put you in a classroom, we explain the concepts to you, and then we expect you to write another test. Now you have already gone through a process where you unpacked the concepts, and then you write another test. So there was some sort of influence between the first instance when you wrote the test and the second instance. Another one, if we go into the medical field, is treatment: before treatment and after treatment. We can also look at the results of both. So those are the two areas where you should be able to identify whether you have independent samples or related samples. Okay. So when we do hypothesis testing to compare two groups, there are some goals that we need to set at the beginning, some sort of hypothesis statement that you state. If our goal is to test a hypothesis about the difference between two populations, population one and population two, we are going to use the point estimates, which are your sample statistic values. So we're going to use the mean of sample one minus the mean of sample two for the difference between the two groups, regardless of whether the samples are independent or dependent. However, there are some assumptions that you need to take into consideration when you do the test. In terms of your module, we are only going to concentrate on one, which concerns the population standard deviations of the two populations.
They are unknown. They can either be assumed equal or assumed unequal. We're not going to concentrate on the assumed-equal case, because that is when you use the pooled variances; when they are not assumed equal, we use the separate variances. So now we're going to first concentrate on the independent samples. As we already explained, they should be unrelated. They should be independent. The sample selected from one population should have no effect on the sample that is selected from the other population. Or they can come from the same population, but they need to be independent of one another, like male and female. The assumptions that need to be met are that the samples need to be randomly and independently drawn or selected, the populations should be normally distributed, and both of the sample sizes should be bigger than 30. And your population variances, obviously, will be unknown. When the population variances are unknown but assumed equal, we use the pooled-variance T test, and when they are not assumed equal, we use the separate-variance T test. Most of the time in your module, you will be using the separate-variance test. The pooled variance you don't have to worry about. They do cover a little bit of it in your module, but they don't explain it in detail, and you are not expected to know how to calculate it. You are expected to calculate the separate-variance T test, and we're going to look at the examples just now. I'm not going to repeat the same, because here too the samples need to be randomly selected, and the variances are assumed not to be equal to one another.
Last week we stated hypotheses for one population, and we looked at the hypothesis testing steps. We said the very important step in your hypothesis test is to state your null hypothesis and your alternative hypothesis, based on the type of test: whether it's a directional test or a non-directional test, and whether it's one-tail or two-tail. The two ideas are almost exactly the same: when it's one-tail we say it's directional, and when it's two-tail we say it's non-directional. So based on that, you need to know how to state your null hypothesis and your alternative hypothesis, because when you make decisions, you need to make the right decision based on your hypothesis statement. Now, how do you even know when you read the statement? What are the keywords that you have to think about when you are reading the statement, in order to know what kind of sign you need to put in your alternative hypothesis? I'm going to start on the far right. If they say there is a difference between the means of the two groups, then you must know that they are talking about a two-tail or non-directional test. Or if they say there is no change or there is a change, they are talking about a non-directional, two-tail test. And this is how you will state your hypothesis: in your null, you state that the mean of population one is equal to the mean of population two. In your alternative, you state that the mean of one is not equal to the mean of two. Or you can state it in this fashion: the mean of one minus the mean of two is equal to zero, because if the difference between the two is zero, it means there is no difference. And your alternative will state that the difference between mean one and mean two is not equal to zero, because then they are different.
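Written out in symbols, the two equivalent ways of stating the non-directional (two-tail) hypotheses described above look like this. The subscripts 1 and 2 are just labels for the two populations, and the notation H_1 for the alternative is an assumption; your module may write it differently.

```latex
\begin{align*}
H_0 &: \mu_1 = \mu_2    & &\text{or equivalently} & H_0 &: \mu_1 - \mu_2 = 0 \\
H_1 &: \mu_1 \neq \mu_2 & &\text{or equivalently} & H_1 &: \mu_1 - \mu_2 \neq 0
\end{align*}
```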
And that is for a two-tail test. For a one-tail test, or what we call a directional test, you need to listen for keywords. If they say it increased, then you must know that your alternative will have a greater-than sign, which is the right-hand side. If they say it is more, it is bigger, it is larger, then you know that you're talking about the upper tail. If they say it is less, it decreased, it is smaller, then you know that you're talking about a directional test on the lower side. Then you will use the less-than sign in your alternative. It goes in your alternative because the sign in the alternative is the sign that you use to make your decision. Your null hypothesis statement always has an equality sign, so it can simply be written with an equal sign every time, but your alternative will have different signs. Okay. So the sign in your alternative will help you to make decisions, because we need to know whether you are going to make the decision on the right, on the left, or in both areas. It helps you to create your regions of rejection. And because we're going to be calculating the T test, you will calculate the test statistic, they will give you the critical value, and you use your critical value to create this region of rejection. If it's on the lower side, it will be a negative critical value. If it's on the right side, it will be a positive critical value. For two-tailed tests, you will have two regions of rejection, and you will reject the null hypothesis if your test statistic falls in a rejection region. And we're going to look at the example just now. Let's refresh our minds about the four steps of hypothesis testing. Remember, first you need to be able to state the null hypothesis. I've just shown you how to state that.
Number two, you need to be able to determine the kind of method you're going to use to make a decision: either you create your region of rejection or you use your P value. Number three, you need to be able to calculate your test statistic, and here we're going to use the separate-variance T test. You need to know how to calculate that. Number four, you need to know how to make a decision and conclude, based on what you have calculated or what they have given you. Okay, so in simple terms, when I say you need to know how to state the null hypothesis, you will say there is no difference between the means of the treatment and control groups. And this is how you translate that worded sentence into a statistical statement: H naught says the mean of treatment is equal to the mean of control. For the alternative, let's assume they told us about a difference. So the alternative will say there is a difference between the two groups, and we put the not-equal sign. Step number two, we need to know what type of decision you're going to be making. We're going to be using a T test statistic, and this will be the test statistic that you need to know how to calculate. Now, the hypothesized difference between the population means will always be the same. Remember, we can also state the null hypothesis as: the mean of treatment minus the mean of control is equal to zero. We will always hypothesize it to be zero. So therefore it means this will be equal to zero.
And therefore the test statistic will always just be the mean of one minus the mean of two, divided by the standard error, which is the square root of the sample variance of one divided by the sample size of one, plus the sample variance of two divided by the sample size of two. That is how you get the formula, so it will always be stated like that. And the last step is for you to make a decision, based on either the P value or the critical value. If you are basing your decision on a critical value, you will need to know which side it is on: the lower side, the upper side, or both sides when you have a non-directional test. If you are using a P value, remember the decision rule. The decision rule states that if your P value, which is the probability value, and they will usually give it to you, is less than your alpha value, which is your level of significance, then we reject the null hypothesis. Otherwise, we do not reject the null hypothesis. So you can either use the critical value and the test statistic, or you can use the P value and alpha, to make a decision. Let's look at an example. Say we are interested in whether the type of movie someone sees at a theater affects their mood when they leave. We decide to ask people about their mood as they leave one of two movies. One group watched a comedy, so that's our group one, and one group watched a horror film. You can see that these two groups are independent of one another. It's not like the same group also went and watched the horror. No, they are separate groups: one watched the comedy and the other watched the horror film. And our data was recorded and coded so that higher scores indicate a more positive mood. Now, when you read questions like this, you need to always be mindful of every statement, everything that they tell you, and highlight the things that stand out.
For example, I just spoke about higher scores indicating a more positive mood, but that's not the end, because I still need to know what they want to hypothesize. They have already calculated the statistics of the two groups. So for group one the sample size N is 20, and for group two it is 20. For the sample mean, which is my X bar, the mean of one is 10.65 and the mean of two is 6.15. So they've already calculated it. If they hadn't, they would have given you the data and you would calculate the means yourself. The sample standard deviation, which is your S, is 3.20 for group one and 3.18 for group two. We have good reason to believe that people leaving a comedy will be in a better mood. So we use a one-tail test at alpha of 0.05 to test the hypothesis. Why are we saying we're using a one-tail test? As I already highlighted, the key thing I read from the statement is that higher scores indicate a more positive mood. So already they are telling me that if I need to do some hypothesis testing, I will have to test for higher scores. It means one group has higher scores than the other. Therefore it will be a directional test, or what we call a one-tail test. I need to keep interchanging between the two terms so that you get used to how your Psych 3704 states certain things, because it does not always say one-tail test; it also says directional test. So let's start with our hypothesis testing. Step number one, we need to state our null hypothesis and our alternative. For our null hypothesis, we know that the hypothesized difference between the means will always be equal to 0. And because we're doing a one-tail test and it said higher, the first thing you need to notice is that higher means greater than. So our alternative hypothesis will say the difference between mean one and mean two is greater than 0. Or, put differently, mean one is greater than mean two.
Remember, mean one is our comedy and mean two is our horror. I also identify a couple of things that I will need when I do my hypothesis test. Then I go ahead and do step number two. I know that I'm going to be doing a T test, and I found, not calculated, my critical value. You don't have to worry about how to find your critical value. If they want you to use a critical value, they will give it to you in the exam or in your assignment, because they don't give you tables. So that value you will be given, and your alpha value you will be given. Then for the test statistic, we substitute. Remember the formula: our T is the mean of one minus the mean of two, divided by the square root of the variance of one over the sample size of one plus the variance of two over the sample size of two. And then we just substitute all the values that we have into the formula, and we get our answer of 4.461. Now we're ready to go and make a decision. So we go to our region of rejection, because our critical value creates our region of rejection. We know that it is on the right-hand side because the sign was a greater than. Therefore it is on the upper-tail side, and the critical value was 1.86. So there is our region of rejection. We need to take our test statistic and see where it falls. And when we look at the test statistic, it falls in the region of rejection. Therefore we reject the null hypothesis at alpha of 0.05 and conclude that the average mood after a comedy is better than the mood after a horror movie. And that's how you will do your hypothesis testing. Now, I've done it all in one go; in the exam or in your assignment, they will give you each portion of what I went through separately. They can give you one question to see if you understand how to state the hypothesis. They can ask you to calculate the test statistic.
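As a cross-check on the calculator work, the movie example can be reproduced with a few lines of Python. This is only a sketch: the function name `separate_variance_t` is made up for illustration, and the critical value of 1.86 is simply the one quoted in the session, not something the code looks up.

```python
import math


def separate_variance_t(mean1, sd1, n1, mean2, sd2, n2):
    """Separate-variance t statistic for two independent samples.

    The hypothesized difference between the population means is taken as 0,
    so the numerator is just the difference between the sample means.
    """
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Summary statistics from the movie example (group 1 = comedy, group 2 = horror).
t_stat = separate_variance_t(10.65, 3.20, 20, 6.15, 3.18, 20)

# Critical value as quoted in the session; in an exam it would be given to you.
critical_value = 1.86

# Upper-tail (directional) test: reject H0 when t falls above the critical value.
reject_h0 = t_stat > critical_value

print(f"t = {t_stat:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

The standard deviations are squared inside the function, so you pass in S as given in the table rather than the variance.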
And make sure that you understand how to use your calculator to calculate the test statistic. And they can ask you to conclude, and when they ask you to conclude, they're either going to give you an alpha value and the P value, or they will give you a critical value. And because you will have calculated the test statistic, you can then make a decision. That is why you need to know all four steps of hypothesis testing. Okay. Are there any questions? Before we look at... Oh, sorry. I forgot to mention the P value. In terms of the P value, I've already explained what the rule is. Say they give you the P value and they say, make a decision based on the P value. They told you that the P value is 0.0007, and they gave you your alpha value of 0.05. Because the test is one-sided, the P value I see there is the value I use. That P value is less than my alpha of 0.05. Therefore, I'm going to reject the null hypothesis, based on the same information. So whether you use the critical value or your P value, you will still get the same answer. When we find more activities or exercises dealing with the P value, I will explain in more detail. But if not, remind me before the session ends to explain how the P value works, because it's not as straightforward when you have a two-tail test, and I need to explain that. Okay. Are there any questions? If there are no questions, then let's start looking at the exercise. Here I expect you to talk to me so that I'm not the only person who is talking. Consider the following statistics regarding the post-training attitude score. Here we have group one and group two, and we are given the mean and the standard deviation of group one and group two. The first question they are asking us in this exercise is: what are these values that are shown in the table? Are these values population parameters?
Meaning, do they come from the population? Or are these sample statistics, measures coming from a sample? Or are these test statistics, things that we got when we calculated T, which is our test statistic? Anyone? Do you know what these are? The answer is already there. They are not test statistics, because a test statistic is what you get when you calculate the Z test or the T test. They are not population parameters, because for group one and group two we always use the sample statistics, not the population parameters. And already here they told you "the following statistics", so they are telling you that this comes from the sample: this will be your sample mean and your sample standard deviation. Look, you need to speak to me, because I cannot be the only person working. The way for me to know whether you understand the work or need more help is if you engage with the content with me. I know it's nice to listen to my sweet voice, my husky sweet voice, but I also want to hear from you. So let's look at the next question. Consider the following statistics, which is the same table that we looked at. We can go down to the question. It says for group one N is 20. So they give us more information, because they didn't give us that in the beginning. N is 20 for group one, and for group two N is 20. What is the value of the test statistic? Here, they are expecting you to calculate your T test statistic by using the formula: the mean of one minus the mean of two, divided by the square root of the variance of one, which is the sample standard deviation of one squared, divided by the sample size of one, plus the sample standard deviation of two squared divided by the sample size of two. Now, we know that this is the mean. So this will be the mean of one and this will be the mean of two, and this will be your standard deviation. That's all. This is your standard deviation. This is your standard deviation.
So with the standard deviation of one and the standard deviation of two, we just substitute into the formula. Our mean one minus mean two is 21.65 minus 20.40, divided by the square root of 2.99 squared divided by 20, because they told me that N is 20, plus 3.05 squared divided by 20. I just want to stop this and share my entire screen again so I can go to my calculator. If you have a Casio calculator, then your calculation will be easy. If you don't have a Casio, you will have to calculate each step bit by bit, writing down the answers. You start with everything that is at the top and write down the answer for the top. Then go underneath the fraction, under the square root. Do the first bit, 2.99 to the power of 2 divided by 20, and write the answer; then 3.05 to the power of 2 divided by 20, and write the answer. Add the two answers, take the square root of the sum, and write the answer, and you must write all the digits. Do not take a shortcut and round off. If it's 10 digits, write them down as they appear on your calculator. Once you're done, you take the answer of the top divided by the answer of the bottom, and you will get the answer that they are looking for. If you have a Casio, we can use the fraction key, because the first part is the top of the fraction and the second part is the bottom of the fraction. So we start by capturing our data: 21.65 minus 20.40. We go down and put the square root, and under our square root there are two fractions. The first fraction is 2.99, and I must square that, then I go down and divide by 20. I use my arrow to move out of that fraction and I put the plus sign with another fraction: 3.05, square it, go down, 20, and press equal. And that is the answer. The answer is 1.31, and they want it to one decimal.
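The same substitution can be written out step by step. The values are the ones read off the exercise table in the session, and the intermediate figures are rounded here only for display; on the calculator you would carry all the digits.

```latex
\begin{align*}
t &= \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
   = \frac{21.65 - 20.40}{\sqrt{\dfrac{2.99^2}{20} + \dfrac{3.05^2}{20}}} \\[6pt]
  &= \frac{1.25}{\sqrt{0.447005 + 0.465125}}
   = \frac{1.25}{\sqrt{0.91213}}
   \approx \frac{1.25}{0.9551}
   \approx 1.309
\end{align*}
```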
And to round off to one decimal, the rule says: if the digit to the right of the place you want to round off to is bigger than or equal to five, you add one to the digit in the place you are rounding off to. Otherwise, you leave it as it is. So our answer is 1.30. The zero is not bigger than or equal to five, so we do nothing. The answer will be 1.3, and hence our answer will be option three. And that's how easy it is. Are there any questions? Since you allow me to just talk, talk, talk, we'll get to more questions, and I expect you to engage with the content, because I can say as much as I can, but at the end of the day you need to take the initiative to do the work as well. Okay, exercise three. It says: suppose the two-tail p-value for the test of the difference between the two means in the previous question is 0.19. If alpha is set to 0.10, what is the decision regarding the null hypothesis? And here I can give you the background on the p-value. Remember, we have a directional test and a non-directional test, right? A directional test is a one-tail test and a non-directional test is a two-tail test. Now, here they tell us that the two-tail p-value from the previous question is 0.19. If I want to take this two-tail p-value and get the one-tail p-value, I must divide it, because remember you have two sides. This 0.19 has this side and that side. So you have to divide it by two to get the portion for each of them, because for the two-tail test we added both of those areas to get 0.19. So for a one directional test, it will be 0.19 divided by two, which is equal to 0.095. So the answer will be 0.095.
My p-value here for one tail will be 0.095. And then we are only talking about either this side or that side. So this can be 0.095 and that can be 0.095, depending on which direction you were testing. If it's on this side, the p-value will just be 0.095, and if it's on that side, it is 0.095 in the same way. And adding both of them should give you the 0.19. Those are the things that I wanted to explain when it comes to the p-value. You need to remember this, because sometimes in your exam and in your assignment they might ask you: if I was given a p-value from a two-tail test, what will be my p-value for a one-tail test? Or, if I have a one-tail p-value of this much, what will be my two-tail p-value? You need to know the concept of calculating both of them. Right, that is unrelated to the question that we have on our hands, so let's answer the question. I was just explaining the concept of the p-value since we didn't dwell too much on it, so I hope you made a note of it. Suppose the two-tail p-value for the test of the difference between the two means in the previous question is 0.19, and the alpha value is set to 0.10. What is the decision regarding our H naught? You always go back to the rule. What does the decision rule say? This is your Bible for decisions. That's where you go. That is your law. You always refer back to your law and quote your law. The decision rule says: if my p-value is less than my alpha value, I reject the null hypothesis. That is the rule. It's not something that you can swap and change. So if that is the case, let's look at what we have. Our p-value is 0.19. Our alpha value is 0.10. Is my p-value less than or greater than my alpha? My p-value is greater than alpha, therefore we do not reject the null hypothesis. So now you can go and answer the question.
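The p-value rule and the one-tail/two-tail conversion can be sketched in a few lines of Python. The function names are made up for illustration, and the halving shortcut assumes the sample difference lies in the direction stated in the alternative hypothesis.

```python
def one_tail_from_two_tail(p_two_tail):
    """Split a two-tail p-value into the area of a single tail."""
    return p_two_tail / 2


def two_tail_from_one_tail(p_one_tail):
    """Combine both tails: the two-tail p-value is double the one-tail value."""
    return p_one_tail * 2


def decide(p_value, alpha):
    """Decision rule: reject H0 only when the p-value is less than alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"


# Exercise values: two-tail p-value of 0.19 tested at alpha = 0.10.
print(decide(0.19, 0.10))

# The same p-value converted to a one-tail value, for comparison.
print(one_tail_from_two_tail(0.19))
```

Note that the conversion only matters when the p-value you are given does not match the kind of test you are asked about; in this exercise both are two-tail, so 0.19 is compared with alpha directly.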
So once you know all that, you can go and answer the question and see which statement is correct. So what will be the decision? The decision will be that we do not reject the null hypothesis, because our p-value of 0.19 is greater than our alpha of 0.10. Remember, this sign means big and this sign means small. If it's eating the letters, it means small. If it's making a right caret, it means big. That is in terms of the sign, and you always need to remember that. I know that we are not all mathematicians who always remember which is greater than and which is less than; sometimes it's very confusing. But just always remember: if the sign eats the letters, it means small. If the sign is pushing the letters, then it means big. Right. Moving on. "I'd like to ask a question about this one." Please ask. "Since it's a two-tailed test, why did you not divide it by two?" Because we are not asked to make a decision based on a one-tailed test. We are given the p-value for a two-tailed test and we are asked what the decision will be, so we just make a decision based on that p-value. Now let's assume instead that they told us this p-value was from a one-tailed test, but they ask what the decision regarding H0 will be if we test two-tailed. "Okay, so it's about the test." Yes. If they gave you the one-tailed p-value and they ask you to make a decision based on a two-tailed test, then you take this value, 0.19, and you add another 0.19, because the two-tailed value is both one-tailed areas combined. Right. And the other way around: going back to the original, if they asked for a one-tail decision, then we take 0.19 and divide it by two. "Okay. Thank you." That is why I explained it first on the side, so that you can understand it the next time we get a question like this in the activities. But I see that we are left with 30 minutes and I still need to do one more topic. So this next part I'm just going to rush through; it is just for information.
So we also have what we call Cohen's d for effect size. You will need to go and read about Cohen's d, because those who started this process with me know the disclaimer: I'm not a psychological researcher. I'm not even exposed too much to that. I'm more of a statistician. I only explain things that I know, and for things that I do not know I will always refer you back to your notes. I will just mention them so that you can see I don't skip them. So this one I'm just going to mention; I'm not going to go into detail. What you will be expected to do is to know how to explain Cohen's d for effect size. You need to be able to calculate Cohen's d. They will give you the pooled standard deviation and both sample means, and you can calculate it, because Cohen's d is just the difference between the means divided by the pooled standard deviation. And you also need to know how to interpret Cohen's d. Here it says if d is equal to one, it indicates that the two groups differ by one standard deviation. If d is equal to two, it indicates that they differ by two standard deviations, and so on and so forth. So if you get an answer of four, it means they differ by four standard deviations. If it's three, it's three standard deviations. Also, in terms of the effect size, when you calculate it you might get a value like 0.4 or 0.1 or 0.9. You need to know what that means in terms of importance. If it's between 0.4 and 0.8, it has medium importance. If it is greater than 0.8, it has a large effect. So values of one, two, or three already indicate a large effect. Let's look at an example. Say you calculated Cohen's d, and this is the answer that you get. It's 1.15.
And they ask you, how do you interpret this? What is this 1.15? You will say it shows a large practical importance of the difference between the two means. That's all that you would say, because it is above one, so it's greater than 0.8. If you calculated d and the answer was 0.38, no, let's not make it 0.38, let's just leave it at 0.3. Let's say the answer was 0.3. 0.3 is less than 0.4, therefore it has a small effect. It has a small practical importance. That's how you will make a decision based on that. So let's look at an example from your actual data. They will give you a table that looks like this, where it says that, based on the sample, she finds the following. The table has the sample size and the sample mean, which is our x bar. They've got group one and group two, so our x bar one and x bar two have already been given to you. From the formula, you will know how to calculate that. And they gave you the standard deviations. Now the challenge would be that, if they didn't give you the pooled standard deviation, you would need to know how to calculate it. But they will have given it to you after all, because they don't expect you to know how to calculate it. It's a complex calculation, so they save you a whole lot of time. So the pooled value is that one there. You just substitute the values into the equation and calculate. So let's substitute. It's 5.5 minus, because it's mean one minus mean two. So 5.5 minus, my slide decided to move, 5.5 minus 4.9, divided by one, because it's one comma zero. The other thing: when you look at old UNISA papers, especially the past exam papers, watch for a space in the middle of a number. When there are no spaces, it's a whole number. When there is a space, a decimal comma is missing there. So always treat that space as a comma. Okay. So let's calculate our Cohen's d.
So 5.5 minus 4.9, equals, divide by one. It will just give me 0.6. So that is the answer that I get, 0.6. So, based on the calculated effect size, that's what they asked me: the researcher can conclude that the practical importance of her finding is what? Now they will also be nice to you and give you this table in the exam as well, so you just use the table as a reference to make a decision. So where is 0.6? Medium. It's medium. So it will be number two. Yeah. So that's how it is. It's simple and straightforward how you calculate this. You don't even have to scratch your head and say, how do I calculate SP? They will give it to you. So the pooled value is your SP, and it comes from the standard deviations, because this is just your S for group one, your S for group two, and this is S for P, for pooled. Okay. So let's use the next 30 minutes to learn more about paired samples. When we talk about paired samples, I'm not going to start again and explain the before and the after; we've dealt with that. Because we're working with two measurements, before and after, we have to find the difference between the two, so that we can use those differences to calculate measures such as the mean and the standard deviation. And we treat that as one sample, because it's actually one sample from the same population, but we collected data twice from the same subjects. So we need to find the differences and work with those. When we do this, we are eliminating the variation that exists among those subjects. And the assumption will always be that both populations should be normally distributed. If they are not normally distributed, then it means we need to use a large sample, which is more than that. Okay. So for related samples, there are a couple of things that you still need to calculate. Sometimes they will give them to you, but in case they don't give you these measures already calculated, you need to know how to calculate them.
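Stepping back to the effect size example worked through above (means of 5.5 and 4.9, pooled value of 1.0), the calculation and the lookup in the interpretation table can be sketched as follows. This is a minimal sketch: the function names are hypothetical, the cut-offs of 0.4 and 0.8 are the ones quoted in this session, and taking the absolute value of d before classifying it is an assumption added here so that the order of the groups does not matter.

```python
def cohens_d(mean_1, mean_2, pooled_sd):
    """Cohen's d: difference between the two means divided by the pooled SD."""
    return (mean_1 - mean_2) / pooled_sd


def practical_importance(d):
    """Classify d with the cut-offs quoted in the session (0.4 and 0.8).

    Assumption: the sign of d is ignored, only its size is classified.
    """
    size = abs(d)
    if size < 0.4:
        return "small"
    if size <= 0.8:
        return "medium"
    return "large"


d = cohens_d(5.5, 4.9, 1.0)
print(round(d, 2), practical_importance(d))   # the worked example from the slide
print(practical_importance(1.15))             # the earlier 1.15 example
print(practical_importance(0.3))              # the earlier 0.3 example
```

If a different textbook table is supplied in the exam, only the two cut-off values in `practical_importance` would need to change.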
For example, the difference will be your observation for the before minus your observation for the after. Once you have all the differences, you need to calculate the mean of those differences by taking the sum of the differences and dividing by how many there are. If there were 20, we take all 20 records, subtract one from the other, take the sum of all the differences and divide by 20. You also need to know how to calculate the standard deviation of the differences: you take each difference minus the mean difference, square it, sum all of those, divide by n minus one, which is the sample size minus one, and take the square root. We're going to look at examples so that you can see how we calculate them, but I'm 100% sure that they are not going to have you do this calculation. They will give you the differences, and they can then ask you to make a decision or something like that. We will look at examples from your past exam papers to see what is expected from you. As for the test statistic, I'm 100% sure that they will ask you to calculate it. To calculate the test statistic, we use the mean of your differences minus the population mean difference, which they will give to you, divided by the standard error. The standard error is your standard deviation of the differences, which you would have calculated or they would have given to you, divided by the square root of n, which is your sample size. In the same way, because this is hypothesis testing for paired samples, you need to know how to make a decision based on the critical value, depending on whether you're doing a one-tailed test or a two-tailed test, directional or non-directional. It's the same, using the t-test. Now let's look at an example. Assume you send your salespeople to a customer service training workshop, and now you want to test whether the training made a difference by decreasing the number of complaints you have collected.
The question is that you want to do a hypothesis test for the difference, to see whether there is a decreased number of complaints. You collected the data before they went on this training and after they went on the training. So this is the information, and these are the salespeople in your company. You've got five salespeople. Before they went on training, these were the number of complaints for each one of them. After training, these are the number of complaints. Now we need to calculate the differences. You can say group one minus group two or you can say group two minus group one; it won't make much difference, as long as you keep the same order throughout, because the order sets the sign. Here we take after minus before. So four minus six is minus two, six minus 20 is minus 14, two minus three is minus one, zero minus zero is zero, and zero minus four is minus four. And the sum of the differences is minus 21. So we're going to use these differences to do our test statistic. First we need to calculate the mean. The mean is taking all of them, adding all of them, which is minus 21, divided by how many there are: one, two, three, four, five. So you will say minus 21 divided by five, and that will give you minus 4.2. That is the mean. For the standard deviation, we're going to take the sum of the squared differences between each difference and the mean. So you're going to say minus two minus minus 4.2, and you square that answer. Plus minus 14 minus minus 4.2, because minus 4.2 is our mean, squared. Plus, you repeat that for all of them, and divide by n minus one. Our n is five. Then you take the square root, and the answer for your standard deviation will be 5.67. Now, what I expect is that in your module they will give you the mean difference and they will give you the standard deviation of the differences, and they can ask you to calculate the test statistic, because these are complex calculations and they don't want you to drown in the calculation. That is why we statisticians are there.
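The difference, mean and standard deviation steps for the five salespeople can be checked with a short script. This is a sketch under stated assumptions: the before and after counts are read off the subtractions in the worked example, the differences are taken as after minus before, and the variable names are chosen only for this illustration.

```python
import math

# Complaints per salesperson, as read from the worked example.
before = [6, 20, 3, 0, 4]
after = [4, 6, 2, 0, 0]

# Step 1: one difference per person (after minus before).
differences = [a - b for a, b in zip(after, before)]

# Step 2: mean of the differences.
n = len(differences)
mean_diff = sum(differences) / n

# Step 3: sample standard deviation of the differences (divide by n - 1).
sum_squared_dev = sum((d - mean_diff) ** 2 for d in differences)
sd_diff = math.sqrt(sum_squared_dev / (n - 1))

print("differences:", differences)
print("sum:", sum(differences))
print("mean difference:", round(mean_diff, 2))
print("standard deviation:", round(sd_diff, 2))
```

Because the data are treated as a single sample of differences, the same three steps would apply to any before and after table of this shape.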
When you want to undertake research, you always consult with us, and then we do your statistics and send them back to you, and you go and publish your report. But you need to be able to interpret and discuss your results. That is why we take you through all this. It's very important, because you cannot just assume that your statistician will do a good job, and then go off and present and get grilled there because you don't know what is happening. So you need to know this. These are the basic things about how everything is built up, so that when questions come up, you are able to explain. So the mean and the standard deviation are given, and now let's go and do our hypothesis test. Our question is, has the training made a difference in the number of complaints, at an alpha value of 0.10? So this is our alpha value. The first thing we do is to state the null hypothesis. If I go back to the question, you still remember the key words in the statement. These are the things that you need to look out for. The key word was that the number has decreased, and that goes into our alternative hypothesis. I told you in the beginning that the null hypothesis always has an equal sign. So whatever sign we put there, even if we put greater than or equal to, there will always be an equality in it. The most important sign goes in your alternative. And we know that it said decreased, and decreased means less than. So the alternative says the mean difference is less than zero. And we know the other things that were given to us and the other things that we calculated: we were given the alpha value and we calculated the mean difference. Some of these things they will give to you. It's still the same with the critical value, which creates the region of rejection.
You don't have to worry about how we calculate that or where we get it; they will give it to you if they want you to use it. Then we can calculate our test statistic. Our mean difference, which we calculated, was minus 4.20. Remember, the hypothesized mean difference is always equal to zero. We divide by the standard error, which is our standard deviation of the differences, which we calculated in the beginning as 5.67, divided by the square root of 5. When you calculate this, you will get an answer of minus 1.66. And now let's make a decision based on the critical value. We can draw our region of rejection. Remember, the sign said less than, so it will be on the lower side. Our critical value was minus 1.533, and our test value is minus 1.66. Where does it fall? Does it fall in the do-not-reject area or in the reject area? It falls in the rejection area. So we make a decision to reject the null hypothesis and conclude that there is a statistically significant decrease in the number of complaints. Because we reject the null hypothesis, we state that the training did make a difference in terms of the number of complaints after people attended it. And that's how you will make a decision. If you're working in HR, this can be a good thing, because when people talk about performance management and say that people are not doing the work in the call center, you can just say, okay, let's look at their call center complaints, if you have that feedback and those scores. Then you suggest training: they must go on communication or ethics or customer service training, or whatever training you can take them to. When they come back after a month, you collect the data again and you test whether taking them to training improved the processes within your call center, and then you can make your decision. These are very important tests that you need to learn.
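The test statistic and the critical value decision above can be reproduced in a few lines. This is a sketch only: the function name `paired_t_statistic` is hypothetical, and the critical value of minus 1.533 is simply taken as given in the session rather than looked up or computed.

```python
import math


def paired_t_statistic(mean_diff, sd_diff, n, hypothesized_diff=0.0):
    """t = (mean difference - hypothesized difference) / (sd_diff / sqrt(n))."""
    standard_error = sd_diff / math.sqrt(n)
    return (mean_diff - hypothesized_diff) / standard_error


# Values from the salespeople example.
t_stat = paired_t_statistic(mean_diff=-4.2, sd_diff=5.67, n=5)

# Lower-tailed test (H1: mean difference < 0), critical value as given.
critical_value = -1.533
reject_h0 = t_stat < critical_value

print("t statistic:", round(t_stat, 2))
print("reject H0" if reject_h0 else "do not reject H0")
```

For an upper-tailed alternative the comparison would flip to `t_stat > critical_value`, and for a two-tailed test the absolute value of the statistic would be compared with the positive two-tailed critical value.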
Okay, so moving on. Excel also has this capability. Sometimes in your module they will refer to, or give you, screenshots of Excel output. Excel has the capability to do data analysis like running a regression or a correlation, or running any of the tests that we just went through, and this output I ran out of Excel. It creates an output, and you have to select the correct test from all the tests that you can run in Excel. Here I selected the paired t-test, and you can see most of the things that we have calculated. Our test statistic was minus 1.66, and you can see the 1.66 there. What else was there that we calculated? Okay, the other thing is that it doesn't calculate the differences. So you won't be able to see the differences, but you will be able to see your hypothesized mean difference, which is always equal to zero. If you want the differences, you will have to calculate them outside of this output. What it does is just create the mean and the variance of the individual groups, the before and the after. And then Excel also automatically gives you your p-values: a p-value for one tail and a p-value for two tails. You will see that if you take the p-value for one tail, this value there, and multiply it by two, you get this other value. If you take that value and divide it by two, you get the value that I put the arrow on. You will also have your critical values, one for a one-tailed test and one for a two-tailed test. For the two-tailed test the critical value applies on both sides, as a plus and a minus, and you can use these critical values to make your decision as well. So we are left with 15 minutes. Let's look at more exercises, unless you have a question before I continue. Going once, going twice. And while we are at going and going, please complete the register. I'm just going to post it again in the chat for those who haven't completed it.
Okay, so let's look at more activities. This is another question: a researcher wants to test the following hypothesis. The null hypothesis states that mean one is equal to mean two, and the alternative states that mean one is greater than mean two. So already, based on the statement that they have given you, you can sense that this is an independent test, right? Based on the sign, you can also say this is a directional test, or a one-tailed test. Those are the things I can already assume from the statement. The other thing to always remember is that for all the activities that we do, I also post where I got the question from, so that when you go back and find past exam papers, you will know that we have dealt with them in class as well. On the basis of the data provided, the output of the computer program indicates that a t value of 1.72 was found, with a two-tailed p-value. You can see they wrote two-tailed test, and here we have a one-tailed test. It is given as p equal to 0.056. What should the researcher do to evaluate these results at the level of significance alpha? So they gave you a two-tailed p-value, but the researcher is testing a one-tailed hypothesis. Look carefully at the question. I have come to realize that in your module it is not always straightforward; you need to read your question and the options given to guide you in terms of what you need to do. So let's first read the statements and then decide what the correct answer for this question will be. Number one: divide the p-value by two before comparing it to alpha. Number two: multiply the p-value by two before comparing it with alpha. Number three: divide alpha by two before comparing it to the p-value. So what do we do?
Number one. It will be number one: we need to divide the p-value first, so we divide the p-value of 0.056 by two before we compare it. That's all. Let's go to the next question. A researcher suspects there is a difference between the creative ability of boys and girls, and that is something you need to take note of, boys and girls, in a school for gifted children. She uses a test for creativity that has been standardized in such a way that the mean creative ability score for the general population is 50. Which of the following is a possible way to state the null hypothesis? You read the question, and the statement says there is a difference between the creative ability of boys and girls. What type of test is this? Is this an independent test or a dependent test? Independent. It is an independent test. And how would you state your null hypothesis? I think number one. It will be number one, because in the null hypothesis we always use the population parameter. We do not use the sample statistic. So number one will be the correct way of answering the question. Yes. Look at the next question. How will we state the alternative for the same question? Number three. It will be number three, because here they just say difference. Remember, when I was showing you the three ways of making a decision, I had some key words: increase, decrease, difference. With difference it will be a two-tailed test, because they didn't say anything like decreased or smaller or less than, or greater than or bigger than or more than or increased. So always remember: if they just mention difference, know that it is not equal in your alternative. So number three will be the answer. I think this will be our second last question, if we still have more. In which of the following research situations is it most likely that a test for comparing independent groups will be used?
Now here you will have to think long and hard and interrogate the statements, because the statements are very long. Remember, they are asking where it is most likely that the test will be comparing two independent groups, and remember what the definition of independent groups is. One: evaluate the effectiveness of a new medicine used for pain relief by measuring how much the pain is reduced after taking the medication. Two: evaluate the difference in self-esteem between persons who actively participate in sport and those who do not participate in sport. Three: evaluate the development of verbal skills between the ages of two and three for a sample of girls. Let me assist. The first one talks about the before and after, because it says a pain reliever is evaluated by measuring how much the pain is reduced after taking the medication. So it means before and after. Number three says evaluate the development for a sample of girls. So it's just one group, because it's only those girls, measured at two ages. It's the same sample. Number two says evaluate persons participating in sport and those not participating in sport. So there are two groups, and they are independent. That's how you highlight and look for the key words in the sentence that will give you the answer. Are there any questions? We are left with five minutes. Let's see if we have more questions. Oh, now we get Cohen's d. Cohen's d refers to what? Is it one, two, or three? Remember everything we discussed today. I know two of them are mentioned here, but there is something that we didn't dwell on too much. Do you still remember what Cohen's d is? It's not three. It's not one. It is two. It's the effect size. So let's see. Oh, we still have more questions. Okay. This one I'm going to skip, the reason being that it talks about the z-test. We used that last week, so I want to skip this one.
We can deal with this type of question when we do exam preparation. Let's deal with what we discussed today. A market researcher is asked to conduct a study to examine people's reactions to a movie trailer. He draws a random sample of 20 males and 20 females who saw the trailer. He asks them to indicate how likely it is that they will go and see the movie on a seven-point scale, where one indicates not at all and seven indicates definitely. He wants to establish whether males and females differ in their intentions to see the movie based on exposure to the trailer. Suppose the researcher finds that the mean and the standard deviation for each group are as follows. For the males, the mean is 5.7 and the standard deviation is 2.1. For the females, the mean is 4.19 and the standard deviation is 1.6. What is an appropriate way to indicate the researcher's hypothesis which is to be tested? Go back to the question. Read the question, especially the last part: he wants to establish whether males and females differ in their intentions based on their exposure to the trailer. The first thing to note is that this is independent anyway, because there is nothing else that they are asking you. So the key word there is differ, different. How will we state the null hypothesis? I'm going to help you. The first one will always be wrong. Anywhere you see x bar in your null hypothesis statement, you must always assume that it is incorrect. So you are left with two now. Which one is the correct one? Number one, number two or number three? It will be number three, because differ means equal in the null hypothesis and not equal in the alternative. So number three will be the answer. So always keep in mind those key words like increase, decrease, less than. I know that I'm repeating them again and again and again and again and again.
I just want to make sure that you don't forget, so that even when you are sleeping you always remember me saying greater than, greater than, less than. And you will see when you go and write the exam that you will still remember those things. Okay. It is half past. Let's see. I'm not going to answer all the questions, because then it means I'm not teaching you anything if I give you all the answers. So here is another question that you can do on your own. They are asking you which test will be appropriate. So you need to identify whether this is for one group, for independent groups or for paired samples. That is what they are asking you there. This one is the same thing as that one. Read the statement: is this a before and after situation, or are these two independent groups, or is this for one group? What will you be comparing? That's all they want you to find out. So you need to read the question and identify the key things from the question. Remember, if there are things that you're not sure about and you want me to assist in unpacking them, you can send me an email and CC the support address at unisa.ac.za given at the start of the session, and I should be able to assist you. Exercise nine. You need to identify which formula you're going to use to answer the question. Is it one or two or three? You need to read the question and see whether it is for one group. This one will be for one sample, this one will be for independent samples, and this one will be for paired samples. So, reading the statement, you should be able to answer that, because I gave you the answers already. Exercise 10 as well. You should be able to answer whether this is a one-tailed or a two-tailed test by reading the question and identifying key words, such as whether they talk about less than, greater than, more than, equal. Things like that. I'm not sure here; I'm just looking at key words. It might not be, but you need to be able to read the sentence and take note of what is given in it. Question 11.
A researcher wants to test the hypothesis on the basis of this. This is one of those. They gave you a two-tailed value and they asked you to make a decision. What will the step be? Remember, on this one, read the question and the statements first so that you know which will be the appropriate answer. Will you just take this value and compare it, or are you going to divide it? You just need to know how you're going to use the p-value here; we dealt with that. There are more questions. How many questions did I have? On this one you just need to select the appropriate hypothesis test. On this one you need to select whether this is independent samples, one group, or paired samples. I think I've got more questions in this session than in most of the others. Do you have any questions? We have only one minute left before I close the recording. No questions. Just to recap, we looked at how we compare two samples, whether we are given independent samples or paired samples. With independent samples, the groups have no effect on one another. Paired samples is the situation where you deal with the before and the after. Remember, you need to be able to calculate Cohen's d. You also need to know how to calculate the test statistic, in case they give you questions like that. Based on some of your past exam papers, you can see that nowhere do they ask you to calculate the test statistic, but you can never be too prepared. Make sure that you know how to calculate some of these, so that if in one of the exams they give you a question where you need to do calculations, you are able to do that. The other thing to always remember is the alternative hypothesis. How you state it will be based on the statement that they give you. Remember to look out for the key words like less than, greater than, more than, decreased, increased.
They give you guidance in terms of how you state your alternative hypothesis. Difference, different or equal will always mean equal or not equal. What else is there? That's all. Oh, you need to know how to work with the p-value as well. If it's a two-tailed p-value and they ask you about one tail, you need to know that you divide the two-tailed p-value by two to get the one-sided value. If you are given the one-sided value and you are asked to find the two-tailed value, you need to know that you multiply by two. And the decision rule, remember the decision rule: you compare your p-value with your alpha value to make a decision, and if your p-value is smaller than your alpha value, you reject the null hypothesis. And that concludes today's session. I will see you next week, when we look at, I think, regression or correlation or chi-square. One of those, but it will be about testing relationships. Have a lovely evening. Thank you very much. Enjoy your evening.