Please remember to complete the register so that we can have that out of the way. Today we're going to discuss ANOVA: how to calculate the analysis of variance and how to complete the table, because with ANOVA you will sometimes need to complete the table in order to answer the questions they are asking. And always remember that in all our sessions we refer to the Newman Error Prompt, where we ask ourselves before we answer: what is the question asking us to do, what facts are given in the question, and what formulas do we need to be aware of? Only then can we start calculating. Okay, so let's look at our session plan for the next couple of weeks. Today we're looking at ANOVA. I have taken out the F test for two population variances; we can do that at a later stage, so today we concentrate only on the ANOVA analysis. The following week we look at the chi-square test, then the three non-parametric tests, then linear regression, and lastly time series. If we have enough time, we can go back to the F test for two population variances. So that is the schedule for this month. Do you have any questions or queries before we start with today's session? If there are no questions, we can dive into this week's session.
Like I said, we're going to be looking at ANOVA, the one-way analysis of variance. The requirements for this: you need to know the formulas, and, something I forgot to mention here, you also need your tables. I'm not sure how your tutorial letters look this year; if you have a tutorial letter in PDF format, the tables are included in it, and we will need those tables. This one is from the 2020 Tutorial Letter 101, which has the tables at the back. Today we're using the F test, so we need to look up the critical values of F, and this is the table we are going to be using. I will explain it a little later. Coming back to our presentation, we also need a calculator. Remember, this is maths: you will be doing calculations, and we will be introducing many formulas that you need to remember and know how to use.
Okay, by the end of the session you should have learned the concepts of experimental design, and you should know how to use the one-way analysis of variance to test for differences among the means of several populations, which we refer to here as the groups. Later on we're also going to learn how to use a randomized block design to do the same, to test the differences among the groups. In the ANOVA setting, an investigator wants to control one or more factors of interest, and each factor needs to contain two or more levels. For example, if you have male and female as the gender category and on the other side you have yes and no, you can do a chi-square test. But if you have male, female and unknown, and on the other hand you have TV, bars, chocolate, ice cream, in terms of things that people are interested in, then you are comparing two factors that have different levels: gender has three levels and the things you are interested in has five levels, so you can use an ANOVA. If they only have one level or two levels, then you need to use the chi-square test. The levels can be either numeric or categorical; with ANOVA it is no problem whether they are numeric or categorical, you can still use an ANOVA test. The different levels produce different groups, so they cannot be the same groups; the levels need to be different in each group as well. Think of each group as a sample from a different population. That is what you always need to bear in mind when you work with ANOVA: think of the groups as samples selected from different populations, even though they might come from the same population and you just want to check the
differences amongst the groups, but think of them as if they are distinct. We also need to observe the effects on the dependent variable, so you will have to choose which one will be your dependent variable. When you do the analysis that way, you need to make sure that the groups are different, and you also need to convert all the categorical values to an interval scale so that you are able to apply the ANOVA. When it comes to the experimental design, remember that it means you can select the things you want to experiment with before you run your normal data analysis, so you first need to plan how you're going to collect the data that you will use to calculate the ANOVA. We can run three different types of ANOVA. There is the one-way ANOVA, which is the one we're going to concentrate on, as well as the randomized block design, which we will also look at. In your module you're not going to cover two-way ANOVA, which looks at the interaction effect; we only look at one effect. In this session we're not going to touch much on the Tukey multiple comparison, and we're not going to touch much on the Levene test for homogeneity of variance either. You will have to go and read about those because they form part of the tutorial; here we only deal with the skills, and I'm going to show you at a basic level how to do an ANOVA test. So, if we need to do a completely randomized design, our experimental units need to be randomly assigned to the groups, and we need to assume that the subjects are homogeneous. Only one factor, or independent variable, needs to be part of the analysis, and that factor needs to have two or more levels. Then we are able to analyse it with a one-way ANOVA if those assumptions
are met. In order to evaluate the difference among the means of three or more groups, the following assumptions also need to be met. Your populations need to be normally distributed, because ANOVA is one of those tests that you run on normally distributed populations, since you want to infer the results back to the population. Your populations should have equal variances, and the samples should be randomly and independently drawn, which is why you need to treat them as if they come from separate populations. Examples are accident rates for the first, second and third shift, or expected mileage for five brands; there we can test the difference among the shifts in terms of the accident rate. Because we are testing something with ANOVA, namely the difference among the groups, we need to state a null hypothesis and an alternative hypothesis. The null hypothesis states that all the population means are equal. In the previous example we had three shifts, so we can say the mean for shift one, the mean for shift two, the mean for shift three, and so forth, are all equal. All population means being equal means that there is no factor effect, so there is no variation among the groups that you are using. The alternative states the opposite: not all the population means are the same. There can be one or two population means that are the same, but not all of them; at least one population mean is different, and that means there is a factor effect. That factor effect does not
mean that all the population means will be different under the alternative hypothesis; some of them might be the same. So let's look at this example of stating the null hypothesis. If we say that the population means are equal, then with three populations, which you can see are blue, red and green, all of them have the same mean at the dotted line, so the null hypothesis is true here, and we can test that. The alternative states otherwise. It might state that mean one is not the same as mean two, which is not the same as mean three; or that mean one is the same as mean two but not the same as mean three; or that none of them are the same. That is the alternative hypothesis: at least one of them should not be the same as the others, or all of them can be different. Okay, so how are we going to calculate the ANOVA? We need to understand certain definitions and calculations. With a one-way ANOVA we always want to calculate what we call the total variation, and the total variation is partitioned into two parts. What do we mean by the total variation? In formula format it is referred to as the sum of squares total, or the total sum of squares, SST, which is just the aggregate variation of the individual data values across the various factor levels. It is split into two, the group and the error, because there will be some errors in between. The SS group, SSG, is also referred to as the sum of squares among the groups, or the among-group variation; it is the variation among the factor sample means. Then there are the errors, which we call the sum of squares within the groups, so those we get within the
groups are the errors; I will demonstrate just now how each one plays a role. The within-group error, SSE, which we can also call the within-group variation, is the variation that exists among the data values within a particular factor level. To explain that: remember we said the total variation is the aggregated variation of the individual data values across the various factor levels. In terms of a mathematical formula, the total variation, or sum of squares total, is calculated as the sum of x squared, minus the square of the sum of x divided by N: SST = Σx² - (Σx)²/N. Those two terms are different. The first one is the sum of x squared, which means x squared plus x squared plus x squared plus x squared. The second is the sum of x, where I add all the x's together and then take the square of that total, divided by N. To illustrate the total variation: here we have three groups, group one, group two and group three, and each group has different values in it. The green dots represent group one, and the dots represent the points or responses, your x values. If I take the mean of all these values, which is the mean of group one, group two and group three together, I am able to see the differences between the data points and that mean, and that is the total variation we are talking about. The variation among the groups, the SS group, is given by n times the sum of the squared differences between each group mean and the mean of all the groups: SSG = n × Σ(group mean - overall mean)². So if I have three groups, let's go back there, I'm going to calculate the mean of this group, which will be mean one; I'm going to calculate the mean of group two, which will be mean two,
and I'm going to calculate the mean of group three, which will be mean three. But I'm also going to calculate the mean of all three groups together, which is this overall mean, x bar. So for the sum of squares among the groups, it is the mean of group one minus the mean of all the groups, squared, plus the mean of group two minus the mean of all the groups, squared, and so forth, multiplied by n. When we do the example you will see how to use this formula. This shows you the variation that exists among the groups; if I have group one and group two here, you can see that the means of these two groups are different. Then we also have what we call the mean square measures. The mean square group, MSG, corresponds to the sum of squares of the groups: it is given by your SS group divided by the degrees of freedom, and the degrees of freedom in this instance is the number of groups you have minus one, k - 1. Your within-group variation, which is also called the SSE, is given by your sum of squares total, your SST, minus the SS group; so SST minus SSG gives you your SSE, the within-group variation. The mean square error, MSE, you calculate as the SSE divided by the degrees of freedom, and the degrees of freedom here is the number of groups times the sample size per group minus one, k(n - 1). This is just summing the variation within each group and then adding it up over all the groups. You will see how all of this comes into play when we look at the ANOVA table. So, obtaining the mean square measures: we already explained this. We looked at the mean square among, which is your MS group, and your MSE, which is the mean square within, and we looked at
the degrees of freedom. Then there is the MS total, our mean square total, which is given by the sum of squares total divided by N minus one, where N is the number of all the values across the groups. So if there are three groups with five values each, N equals 15, because it is three times five. Okay, so this is the one-way ANOVA table that we need to complete when we answer the question. You need the source, which is the group, the error and the total. You need the degrees of freedom: for the group it's k - 1, for the error it's k(n - 1), and for the total it's N - 1. Then the sums of squares, your SS. You calculate your SS group; remember, your SS group is n times the sum of each group mean minus the mean of all the groups, squared. We also need the SSE, and remember that the SSE is your SST minus your SS group, which means you first calculate your SS group and your SST. Your SST you calculate as the sum of x squared minus the square of the sum of x divided by N. So you calculate the group, then the total, and then you can calculate your SS error. The MS group, which goes into the mean square column, is your SS group divided by its degrees of freedom; you just take the sum of squares divided by the degrees of freedom. The same happens with the mean square error: you take the SS error divided by its degrees of freedom. That is not the end. You also need to calculate the test statistic, your F test statistic. For the one-way ANOVA we use the MS group divided by the MS error, and that gives you your F test statistic. But you are not done, because we're doing hypothesis testing, so we
need to check whether we are going to reject the null hypothesis or not. That means you need to find your critical value, and for that we use the degrees of freedom for the group and for the error. Your degrees of freedom one is k - 1, which is for the group, or among the groups, and your degrees of freedom two, for the error, is within the groups, which is k(n - 1), where n is your sample size per group. Okay, so how do we find the critical value? You find it in this table, which gives you the critical values of the F distribution, and you look at the value of your alpha as well as the degrees of freedom. Let me rotate the table; where is my rotate? Okay. If you look at this table, it is always split by the alpha value. So if they tell you that alpha is 0.05, you come to this side. There are other alpha values, such as 0.025, as you can see here, so you scroll through the table to find the right alpha value. Then you have your degrees of freedom one and degrees of freedom two. Remember, we use df1 in the presentation, and on your table it will say v1 for degrees of freedom one and v2 for degrees of freedom two. You look for your degrees of freedom one at the top and your degrees of freedom two on the left, and the value where the two cross is your critical value. That is how you use the table for an ANOVA test. Once you have your critical value, you can make a decision: if your test statistic is greater than your critical value, it will fall
in this turquoise area, and then you reject the null hypothesis; if it falls in this white area, you do not reject the null hypothesis. So let's look at an example. You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the 0.05 significance level, is there a difference in the mean distance? They give us the measurements for the three clubs in terms of the distance they yielded. You know, when you hit a golf ball it travels a certain distance, so those are the distances recorded for club one, club two and club three. Now let's read the question. We are given five measurements for each club, so we are given our n, the sample size within each of the groups. From here we can already identify our k, the number of groups, which is three, and our n, which is five, because there are five measurements in each. What else are we given? The level of significance, alpha, which is 0.05. So when we have to find the critical value, we look under alpha of 0.05, work out the degrees of freedom, and read off the critical value. We have all the information we need for our hypothesis test. Now remember the steps of hypothesis testing: the first step is to state your null hypothesis. And have I started the recording? Maybe I haven't. Oh, we did; I just wanted to check, because sometimes you forget and then we don't have any recording. So now let's start with the hypothesis test, because the question is: is there a difference in the mean distance? We just need to find out if there is a difference among the three groups that we have. The first thing I can do is visualise the data so that I
can see where the points are. You don't have to do that in your exam; I'm just demonstrating. So here is my group one, and I can see where the mean of group one is; I can look at group two and see where its mean is; and I can do the same with group three. I'm able to calculate the mean knowing that the mean is just the sum of the observations divided by how many there are. So if I add all the observations for club one and divide by five, I get 249.2, which is the mean that I see there, and I can see that three distances were above the mean of club one and two distances were below. I can also look at club two and club three in the same way: some values are above the group mean, one or two sit almost on the line, and two of them are below. But I can also calculate the mean of all of them, because I can take the mean of club one plus the mean of club two plus the mean of club three and divide by three, since there are three. So it is 249.2 plus 226 plus 205.8, divided by three, which gives me 227. With regards to club one, club two and club three, the mean of all of them is 227, and I can see that for club one all the values are above the overall average, and for club two only two points are above the overall average. This is just for visualisation purposes and not something you need to know much about. Now there are a couple of other things that I need to calculate. I know what the means are, but if I'm going to calculate SST, SSG and SSE, and MSG, MSE and MST, in order to complete the entire table, there are formulas. For example, we know that SSG uses the mean
of group one, so that one will be easy to calculate, because I've got the mean of each group and I've got the overall mean; but we will get to that. To calculate SST we need the sum of x squared minus the square of the sum of x, divided by N. Here I can calculate the sum for each club: the sum for club one is 1,246, the sum for club two is 1,130, and the sum for club three is 1,029. So I have that one part, but I still need the sum of x squared. To get it I must square each one of these values and add them up, which would take me forever, so I've already calculated it, and that is the sum of x squared. If I add all the sums together, I get the sum of x, which is the value that we have here, and with the sum of x squared already calculated we are able to calculate the sum of squares total. We also know the means; I just brought them over from that side. Those were the group means that we calculated at the beginning and the mean of all of them. Then we also have the sample size per group, the overall sample size, and the number of groups, some of which we already identified on the previous slide. So now let's calculate the sum of squares total, which is the sum of x squared minus the square of the sum of x divided by N. The sum of x squared was 778,771, I'm hoping I'm reading that correctly, there are so many digits, minus 3,405 squared divided by 15, where 3,405 is the sum of x and 15 is our N. Remember, our N in this instance is the capital letter N, because it covers all three groups, and that gives us 5,836. On the slide this small n should be changed to a capital N, because it needs to include all three groups. For the sum of squares group we do use the small n, which is five: five times the mean of club one minus the mean of the
population, squared, plus the mean of club two minus the overall mean of 227, squared, plus the same for the rest of them; then you multiply by five, and the answer we get is 4,716.40. That is our SS group. Then we can calculate the SSE. We know that it is the total minus the group, so we take 5,836 minus 4,716.4, and that gives us 1,119.6. Now that we have our SSG, our SSE and our SST, we can calculate our mean squares. Let's calculate the mean square group, which is your SS group divided by its degrees of freedom, k - 1. We know that our k is three and our SS group is 4,716.4, so we divide by three minus one and the answer is 2,358.2. We can calculate the MSE, which is your SSE divided by its degrees of freedom, k(n - 1). We know that k is three and n is five, so that is 1,119.6 divided by 12, and the answer we get is 93.3. Now that we have the MS group and the MS error, we can calculate our test statistic, which is the mean square group divided by the mean square error: 2,358.2 divided by 93.3, which gives us 25.275. This is all part of the table, so with my table complete I can go ahead with my hypothesis test, because I've got all the information I need. The first step: we state our null hypothesis and our alternative. The null is that the population means are equal; the alternative is that not all the means are equal. Then we need to find the critical value. We have identified that our alpha is 0.05. For the degrees of freedom, remember, this will be v1 and this will be v2. The first is three minus one, which equals two, so v1 equals two. For v2, five minus one is four, and four times three is twelve, so v2 equals twelve. So we have our v1 and v2, and we
can go to the table. Our alpha is 0.05; we look for v1 of 2 and v2 of 12, and where they meet is our critical value, 3.89. That is how you identify your critical value. You can draw the graph, because we're going to use it to state whether we are rejecting the null hypothesis or not. Remember, we already calculated the test statistic, so I don't have to go back and recalculate it: it is 25.275. We can locate where it falls; it falls way beyond 3.89, because our critical value marks off a region of rejection and the test statistic is somewhere inside it. Therefore we reject the null hypothesis at alpha of 0.05. Because we are rejecting the claim that the means are all equal, we conclude that there is evidence that at least one mean differs from the rest of the groups. And that's how you do analysis of variance. Any questions before I move to an Excel template? No questions at this point, Lisa.
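The golf club calculation above can be retraced with a few lines of Python. This is only a sketch built from the summary figures quoted in the session (the three club totals, the sum of x squared of 778,771, k = 3 and n = 5), not the Excel template itself. It takes club two's total as 1,130, which is its mean of 226 times five measurements and is what makes the three totals add up to 3,405. The critical value of 3.89 is an assumption read from an F table at alpha 0.05 with 2 and 12 degrees of freedom, and all variable names are made up for illustration.

```python
# One-way ANOVA table for the golf club example, from summary figures only.
k = 3                              # number of groups (clubs)
n = 5                              # measurements per club
N = k * n                          # total number of observations

group_sums = [1246, 1130, 1029]    # sum of x for club 1, club 2, club 3
sum_x_squared = 778_771            # sum of every squared observation, as quoted

group_means = [s / n for s in group_sums]
grand_mean = sum(group_sums) / N

# Sums of squares
sst = sum_x_squared - sum(group_sums) ** 2 / N               # total
ssg = n * sum((m - grand_mean) ** 2 for m in group_means)    # among groups
sse = sst - ssg                                              # within groups (error)

# Degrees of freedom and mean squares
df_group = k - 1
df_error = k * (n - 1)
msg = ssg / df_group
mse = sse / df_error

# Test statistic and decision
f_stat = msg / mse
critical_value = 3.89              # assumption: F table, alpha 0.05, df 2 and 12
reject_null = f_stat > critical_value

print(f"SST={sst:.1f}  SSG={ssg:.1f}  SSE={sse:.1f}")
print(f"MSG={msg:.1f}  MSE={mse:.1f}  F={f_stat:.3f}  reject H0: {reject_null}")
```

Changing `group_sums`, `sum_x_squared`, `k` and `n` lets you check your own calculator work on another equal-sized data set; the critical value still has to be looked up in the table for the new degrees of freedom.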
Okay, so with the same information you can also use Excel to create an ANOVA, and this is a screenshot of that Excel output. As you can see, there are your counts and your sum of squares measures, so if you go back to your sum of squares measures, all those values, the means and the n, are included in your Excel template as well. When you run it in Excel you get an output that looks like this: it gives you your sum of squares measures, your means and your variances, and then you also find the ANOVA table. In your module you are also expected to draw up this ANOVA table, or to complete it if they give you one. So it gives you the groups and the error, and you can see all the values that we calculated just as they are here. It will also give you the p-value, but we're not going to talk about the p-value. The critical value it sometimes gives you and sometimes doesn't; if it does, you use the critical value and the test statistic to make your decision. That was just an example. So far we have described the one-way analysis of variance: we looked at the logic of ANOVA, the assumptions, and how to find the test statistic for the difference among groups. Now let's look at an example from one of the past exam papers so that we can practise. This question from May/June says: a partially completed analysis of variance (ANOVA) table for a completely randomized design is shown. What is very important when we read the question is the keywords that we always need to look for. A completely randomized design will only have your groups and error, and in this instance the groups are called treatments. Later on, when we look at the experimental design, you will see a different table.
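For a partially completed table like the one in this exam question, the missing cells follow from the ones that are given. The sketch below assumes the four entries read out in the walkthrough of this question (degrees of freedom 6 for treatment and 41 for total, sums of squares 17.5 for treatment and 46.5 for total). The function name and dictionary keys are hypothetical, chosen only for illustration.

```python
def complete_anova_table(df_treatment, df_total, ss_treatment, ss_total):
    """Fill in a one-way ANOVA table from its treatment and total rows."""
    # The error row is whatever is left after removing the treatment row.
    df_error = df_total - df_treatment
    ss_error = ss_total - ss_treatment

    # Mean square = sum of squares divided by its degrees of freedom.
    ms_treatment = ss_treatment / df_treatment
    ms_error = ss_error / df_error

    return {
        "k": df_treatment + 1,        # number of treatments, from k - 1
        "N": df_total + 1,            # total sample size, from N - 1
        "df_error": df_error,
        "ss_error": ss_error,
        "ms_treatment": ms_treatment,
        "ms_error": ms_error,
        "F": ms_treatment / ms_error,
    }


table = complete_anova_table(df_treatment=6, df_total=41,
                             ss_treatment=17.5, ss_total=46.5)
for name, value in table.items():
    print(f"{name}: {value:.4f}" if isinstance(value, float) else f"{name}: {value}")
```

Keeping the full precision until the last step avoids the rounding problem that comes from dividing two already rounded mean squares.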
So this table provides us with the source, the sum of squares measures, the mean square measures and the F test, and there are question marks as well. We need to be able to complete the entire table without panicking. Let me draw the same table next to it as a generic table: I'm going to call this the source, and the rows are treatment, error and total. Now, treatment and total look almost the same when abbreviated, but they are two different measures. We always need to remember that the degrees of freedom for treatment is the number of groups minus one, k - 1; for the error it is the number of groups times the sample size minus one, k(n - 1); and for the total it is the total sample size minus one, N - 1. So if you look at the degrees of freedom, you should already see what is happening. The six means that k - 1 equals six. If I need to find out what my k is, I can say k is six plus one, which means k is seven, so there were seven groups in this dataset. To find the degrees of freedom for error, I know it is given by k(n - 1), but I don't know what my n is. It is still easy to find, because the k - 1 plus the k(n - 1) should give me N - 1. So where I have this question mark I can say it is 41 minus six, and that gives me the degrees of freedom for error. What is 41 minus six? It's 35. So that is my degrees of freedom here, and I can write it in there: 35. I also know that my k is seven, which I can write down here; I don't have to use it any more right now, but I think we might need it when we answer some of this
question. So what else can I do? They didn't give me this value, which is the sum of squares error. I am given the sum of squares for treatment, which I'm going to call SSG, for group, rather than SST, for argument's sake, so that I don't have two T's. Then there is SSE, which I don't have and which we need to find, and I'm also told what SST, the total, is. I know that SSE is SST minus SSG, so if I have my SST and my SSG I can just substitute: 46.5 minus 17.5, and that gives 24.5. So the answer here is 24.5. Now I can calculate my MSG and MSE, because I need to find my test statistic, my F test, as well. Sorry, I'm interrupting: for SSE I got 29. You got 29? What did I do wrong? Oh sorry, my bad, I calculated something wrong. It's just 29, right? Just 29, yes; I think I pressed something wrong on the calculator. Okay, so that is 29. In order to calculate the F test we need MSG and we also need MSE, even though they didn't put a question mark there. It doesn't matter whether they put a question mark or not; the fact is that there is a question mark on the F, and the question also asks about the F test. So we need to calculate MSG, which is your SSG divided by your degrees of freedom, k - 1: our SSG of 17.5 divided by our degrees of freedom of six, which they have already given us. Once you've calculated that, you can also calculate your MSE, which is your SSE divided by k(n - 1), which we have also calculated: 29 divided by 35. So let's calculate. 17.5 divided by six, what do you get? 2.9167; I'm going to leave it at four decimals. And what is 29 divided by 35? That's 0.8286. Now, we're not done: we need to calculate the F stat, which is the test statistic, and
let's calculate the F stat, which is your MSG divided by your MSE. Our MSG is 2.9167, divided by 0.8286 (and we might have a problem of rounding off too early), equals 3.52. I'm just going to leave it at two decimals: 3.52. So we have that value there, that value there and that value there, and we have completed our ANOVA table.

So let's look at the answers. Which statement is incorrect? "The number of treatments involved in the experiment is six." We're looking for the incorrect answer. How many treatments do we have? That is our k. The degrees of freedom, k minus one, are six, so we have seven treatments, and this is the incorrect one. But we can also check the logic of the other options. "The total sample size is equal to 42." In this instance I'm going to use a capital N for the total, to differentiate it from the sample size, because there is a small n and a capital N. The total degrees of freedom are 41. How did we get 41? It is N minus 1, so N is equal to 42, and the total sample size is 42. That is correct. "The mean square for treatment, MST, is 2.9167." What did we get? 2.9167, right? That is correct, because our MSG is our treatment; since we have two T's, we changed it to MSG. "The mean square for error, MSE, is 0.8286", which is what we calculated, so that is correct. And the test statistic we also calculated, which is correct. That's how some of your questions will look in the exam, and the same applies when you are calculating the ANOVA for the experimental design. Okay, any questions?

I wanted to find out, do we get those formulas for the exam? Okay, yes, you do get the formulas. Now, because you're writing an online exam, it depends: if this year they go back to normal and say it is venue-based, you will receive the formulas. For the online exams you just need to bring your
own piece of paper that has all the formulas, kept close by so that you can use it as a reference. For example, your lecturer might send you, closer to the exam, a formula sheet that you can use to answer the exam questions, because with a formula sheet they also remove all the noise, all the irrelevant formulas that you might find, and give you only the formulas that you need for that exam. It will depend on your lecturer, but formulas are given.

Okay, so now let's look at how we do a randomized block design. Similar to the one-way ANOVA, for the randomized design there is a way you can create blocks within the groups. With this one we test for equal population means as well, but we want to control for possible variation from a second factor, and the levels of that secondary factor are called the blocks. Because we're adding another level, we are going to split the total variation into three parts to accommodate the blocks. So we're going to say our sum of squares total, or the total variation, is given by the variation among the groups, the variation among the blocks, and the errors. We have used these before: SST, SSA and SSE, and we know that SSE is SST minus the SS for the groups. Now, because we do have the blocks, we also need to take them into consideration. The formula for SST stays as is, the formula for SSA stays as is, and we only need to make an adjustment when we calculate SSE, to accommodate the blocks. Instead of just saying SSE is SST minus SSA, we are going to say SSE is SST minus (SSA plus SSBL), where SSBL is for the blocks. In terms of finding the degrees of freedom and calculating the MS for the blocks, the formula will be different, because then it will be SSB divided by its degrees of freedom, and here the degrees of freedom will be r minus one, and for among the groups it will be c
minus one, and then for the errors it will be (r minus one) times (c minus one). Your c will be the number of groups, the number of populations that you have, and r will be the number of blocks that you have. So let's look at an example of an ANOVA table for a randomized block design. That is SSB, SSA, SSE and SST, and your degrees of freedom are r minus one, c minus one, and (r minus one) times (c minus one). For the total it will be your number of populations times your number of blocks, which just gives you the total number of observations that you have, minus one. When we calculate the test statistic, you're going to have to calculate it for both among the blocks and among the groups: for the blocks you're going to say MSB divided by MSE, and for among the groups we're going to say MSA divided by MSE.

So let's do the test. In the hypothesis test, the null hypothesis will still state that the population means are all the same, and your alternative hypothesis will state that not all population means are equal. You're going to be asked to calculate the test statistic either among the blocks or among the groups, so you need to always remember that. If they say calculate the test statistic among the groups, then you need to know that you're going to be using the MS among the groups, which comes from your SSA, and among the blocks we use the one from SSB. You're also going to find the critical value and make a conclusion: either you reject the null hypothesis or you do not. The method of the previous session still applies.

Okay, so let's look at an example. Here you are given an ANOVA table for a randomized block design, and there it is, the keyword: block design. I can also see that in my table I have three things; previously you only had two things, so now, with three, I'm looking at a block design. It is shown below, where the treatment will refer to the
blood pressure drugs, and the blocks refer to different groups of men with high blood pressure. Use the given ANOVA table to answer the question: can we infer at the five percent level of significance that the treatments differ? That is what the question is asking. So the first step, number one, is to state the null hypothesis and the alternative hypothesis. It's easy. Null hypothesis: mean one is equal to mean two, is equal to mean three... how many are there? There are so many, so I'm just going to say, up to the last mean. And the alternative: not all population means (I can just use mu) are the same. That is our hypothesis. Number two: we are given our alpha, which is 0.05, therefore we can go and find our degrees of freedom and our critical value. Our degrees of freedom v1 and v2, remember, we take from here. Now you need to be very, very careful with this. Remember, now we have treatments and blocks, so you need to always remember that blocks, r minus one, will be your v1, and groups will be your v2, right? So let's go and do that. It will not really matter that much, but let's say our v1 is four and our v2 is six; we can also go and check what happens if we do it the other way around. So let's go and find the critical value. We said the values are four and six, and we said we can also check whether six and four will give us the same answer. So four is our v1 and six is our v2, and for this one we get 4.53. If we use the other way around, six and four, can you see that it doesn't give you the same answer? It will give you two different answers and that will confuse you, so you need to make sure that you assign the right v1 and v2 to the right groups. So in this instance we can say this is our v1 and this is our v2, and we found that it is 4.53. So our F at 0.05, which is our critical value, is four comma, what did we say, five
three, four comma five three. Now we need to find our test statistic. You can see that they have two different test statistics, but we need to read the question carefully: it asks whether the treatments differ, so it means we only require that one. Actually, I even did it all wrong, because we're not supposed to take this one; we're supposed to take the degrees of freedom of the error, not of the blocks. So v2 is your error, which is 22... sorry, my bad, I read the wrong number, it's 24. So therefore the critical value is also not correct, since I used the wrong one. Your v1 is for the treatment, because we're testing the mean difference for the treatment, so we must remember it's not the same as the previous one: v1 is for the treatment and v2 will be the error. We need to go back, because we did it all wrong. So v1, we said, is four, and v2 is 24. We go to four, then we go to 24, and where they meet is 2.78.

Now we know what our test statistic is. Our test statistic is the F that they have calculated, which is 14.61; I leave it at two decimals. Now we need to go and make a decision. I can just draw it. This F test is always a one-sided test, so here will be your F critical value, which is 2.78, and anything that falls in the shaded area we are going to reject. 14.61 falls in the shaded area, so our decision will be: we reject the null hypothesis at alpha of 0.05, and we conclude that there is sufficient evidence that not all population means are the same. And that's how you conduct a hypothesis test.

Let's look at another... sorry, this is the same one. That's how you state your null hypothesis and how you find your critical value. I did it manually, but you can see that we followed all the steps, and we conclude that we reject the null hypothesis: at least two treatments differ, or they are not all the same. Okay, so in the exam you do get tables like this; they will already have calculated
and completed everything; they just ask you the questions. So all you need to remember is what is calculated where: how do we calculate the treatment degrees of freedom, how do we calculate the block degrees of freedom, how do we find the critical values, how do we state the null hypothesis.

So looking at this question, question number 12, it says: given the information in the table, which one of the following statements is incorrect? Number one, is that true or false? Based on what we know about stating hypotheses for ANOVA, is that statement correct or incorrect? The statement will be correct, because always remember that when you do ANOVA, the null hypothesis will state that all population means are equal. Okay, the alternative: is that statement correct or incorrect? It's correct, because it has to state that at least two means differ, or it can say that not all of the means are the same. "The number of blocks used is three." Is that true? How do we find the number of blocks? The degrees of freedom, do we still remember how we find them? It's r minus one for the blocks and c minus one for the treatments. So are there three blocks? Yes, that is true, because r minus one is equal to two in this instance, therefore r is equal to three: if I move the one to the other side, it adds to the two and gives three. "The number of treatments used is four." Is that correct? That's incorrect. That will be the incorrect one, because they have four degrees of freedom for treatment, which is c minus one, so c is equal to five. "The number of observations collected in this experiment is 15." That will be true, because remember that will be (r minus one) times (c minus one)... oh sorry, I'm using the error one. The total is rc minus one, so the number of observations is rc, which is three times five, and it will give us 15. Am I
doing something wrong now? Yeah, let me not even go there.

Okay, and this is another example of how they can ask questions in the exam. If you look at this, they give you some details about the one-way ANOVA, some calculations that they already did. They give you your n, your mean and your standard deviation. You don't even need the standard deviations s1, s2, s3, but they give you that... oh, sample one, sample two, sample three; I'm not sure what they refer to with s1, s2, s3. Then they also give you your sum of squares total and sum of squares error, and they're asking you to calculate your test statistic. What they haven't given you here is your SSG, and they didn't give you your MSG. But if you can remember that SSE is equal to SST minus SSG, you can find SSG, right? Because if I move SSG to this side it will be positive, and if I move SSE to that side it will be negative, so SSG is SST minus SSE, and then I can just substitute the values. So I have 560 minus 693, and that is minus 133. If I have my SSG and my SSE, then I should be able to calculate my MSG, but now my MSG will be negative.

Otherwise, you can use the formula, so let's see if we can also get the SSG that way. Remember that your SSG is given by your n times the sum of each group mean, which are the means that you are given, minus the mean of all of them, squared. So you can calculate the overall mean, which is the mean that we're going to substitute there: 40 plus 48 plus 50 equals 138, divided by 3, which is equal to 46. And what is your n? Is n 10 plus 10 plus 10? Oh no, n will be the small one, it's 10, because then we're going to multiply everything by that. So it's 10, and we're going to say 40 minus 46 squared, plus (because it is the summation) 48 minus 46 squared, plus 50 minus 46 squared, close bracket, and we can
calculate everything. I'm just going to open my calculator and we'll do it all at once, that is, if it will allow me to include all of them: 10 times, open bracket, 40 minus 46 squared, plus 48 minus 46 squared, plus 50 minus 46 squared, close bracket, and equals. I get 560. Yes, I've got the same as well. So it looks like these people calculated SST wrong; that is why we got a negative answer, which does not make sense. So your SSG is 560, right? They did something wrong there. That is why I was wondering why it would be a negative value; it cannot be a negative value there.

Okay, but then it's fine, because now we have SSG and SSE, so we can calculate MSG, which is SSG divided by the degrees of freedom. Maybe these are the degrees of freedom; I am not sure now with this question that they have. So I'm going to use this as the degrees of freedom, so that will be... no, it cannot be five minus one, it cannot be. There are three groups, so it will be k minus one. Oh okay, it is k minus one. So our SSG is 560, divided by our groups, there are three, so it's three minus one. And what is 560 divided by two? It's 280. We can do the same with MSE. MSE is SSE over k times (n minus one). Our SSE is 693, over our k, there are three groups, times 10 minus one, which is 693 over 27, because 9 times 3 is 27. So 693 divided by 27 is 25.66666. I'm just going to keep four decimals, so I'm going to remove one of the sixes and put a seven: 25.6667.

Okay, so the question was to calculate the F test statistic. F stat will be given by your MSG divided by your MSE. MSG is 280, divided by 25.6667, equals 10.90... I think I must leave it at four decimals for now, and if I leave it at four decimals then it will be 10.9091. Okay, so let's look at the answers. So that is the correct answer. The tricky part here is to also be mindful of the values that you have, because if you had taken what you have here and did what I did initially, then you
would have gotten the wrong answer. So this is misleading; we can just ignore that. If you do verify your data and you see that your SST is smaller than your SSE, then it means there is something wrong, because we know that SST is your SSG plus your SSE. That's one thing that we know. Okay, so those are typical questions that you will get in the exam when it comes to ANOVA. I do hope you can find more activities that you can go through.

So just in closing, to wrap up: we have come to the end of the session. We have described the one-way ANOVA. We looked at the logic behind the analysis of variance, we looked at the assumptions, and we looked at how we do the calculations in terms of finding the test statistic and making the decision, which means doing the hypothesis test for the difference among the groups. We also looked at the randomized block design, where we looked at the block effect in terms of how we test for differences when we also include some blocks. Any questions, any comments, any queries? We have one minute left. Are there any questions? No, no questions, it was a good session actually. I think it's just more practice to get used to using the formulas as well. Yes, yes. All right, so if there are no questions we can go and enjoy our evening. Have a lovely evening, and bye bye. Thank you. Please don't forget to complete the register.
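
For practice after the session, the two one-way ANOVA examples worked through above can be checked with a short Python sketch. This is only an illustration: the function names `complete_one_way_table` and `ssg_from_means` are made up for this sketch, equal group sizes are assumed, and the numbers are the ones read out in the session (SST 46.5, SSG 17.5 with 6 and 35 degrees of freedom; group means 40, 48, 50 with n = 10 and SSE 693).

```python
def complete_one_way_table(sst, ssg, df_groups, df_error):
    """Fill in SSE, MSG, MSE and F from SST, SSG and the degrees of freedom."""
    sse = sst - ssg            # SSE = SST - SSG
    msg = ssg / df_groups      # MSG = SSG / (k - 1)
    mse = sse / df_error       # MSE = SSE / (k(n - 1))
    return {"SSE": sse, "MSG": msg, "MSE": mse, "F": msg / mse}


def ssg_from_means(group_means, n_per_group):
    """SSG = n * sum((group mean - overall mean)^2), assuming equal group sizes."""
    overall_mean = sum(group_means) / len(group_means)
    return n_per_group * sum((m - overall_mean) ** 2 for m in group_means)


# First table from the session: SST = 46.5, SSG = 17.5, df = 6 and 35.
first = complete_one_way_table(sst=46.5, ssg=17.5, df_groups=6, df_error=35)

# Last example: means 40, 48, 50 with n = 10 per group and SSE = 693.
k, n = 3, 10
ssg = ssg_from_means([40, 48, 50], n_per_group=n)
last_f = (ssg / (k - 1)) / (693 / (k * (n - 1)))

print({name: round(value, 4) for name, value in first.items()})
print(ssg, round(last_f, 4))
```

Running it should reproduce the values found by hand, which is a quick way to catch a calculator slip like the 24.5 versus 29 one. It does not look up critical values; those still come from the F table.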