So now we will construct the analysis of variance (ANOVA) table. The ANOVA table has 5 columns. The first column is the source of variation: the different factors and their interactions. The second is the degrees of freedom associated with each factor and interaction. The third is the sum of squares linked to each factor and interaction. The fourth is the mean square, which is the sum of squares divided by the degrees of freedom. The fifth is the F value, which is the mean square of the effect divided by the mean square of the error. Last but not least, error is also a source of variation, and the degrees of freedom for error would be abc(n-1), that is 2 x 2 x 2 x (2-1) = 8 for the present case, as we have only 2 repeats per run.
You can see that the sum of squares and the degrees of freedom are used to find the mean square. Each main effect or interaction has only one degree of freedom; the error has 8. So for A you divide 110.25 by 1 and get 110.25, and you divide the sum of squares due to error, 2.9397, by 8 and get 0.3675. Then 110.25 divided by 0.3675 is approximately 300. Similarly, the sum of squares for B is 172.1344, and dividing that by the mean square error of 0.3675 gives 468.4. For C the mean square is 0.25 by 1, which is 0.25, and 0.25 divided by 0.3675 is 0.68. Similarly you find it for D, and you can see that except for C all the other F values are pretty high.
For the critical F value we choose a level of significance of 0.05, with 1 numerator degree of freedom and 8 denominator degrees of freedom, so the critical F value is 5.32. You compare the actual F value with the critical F value and see whether it exceeds 5.32.
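The arithmetic above can be sketched in a few lines of Python. This is only an illustration using the sums of squares quoted in the lecture; the function name `anova_rows` and the table layout are assumptions made for this example, and the critical value 5.32 is taken from the lecture rather than computed.

```python
# Sums of squares quoted in the lecture for the first half fraction
# (8 runs, 2 repeats each). Every effect has 1 degree of freedom.
sum_sq = {"A": 110.25, "B": 172.1344, "C": 0.25}
ss_error, df_error = 2.9397, 8
f_critical = 5.32  # F(0.05; 1, 8), value taken from the lecture

ms_error = ss_error / df_error  # about 0.3675


def anova_rows(sum_sq, ms_error, f_critical, df_effect=1):
    """Return (source, df, SS, MS, F0, significant) for each effect."""
    rows = []
    for source, ss in sum_sq.items():
        ms = ss / df_effect   # mean square = SS / degrees of freedom
        f0 = ms / ms_error    # F value = MS(effect) / MS(error)
        rows.append((source, df_effect, ss, ms, f0, f0 > f_critical))
    return rows


table = anova_rows(sum_sq, ms_error, f_critical)
for source, df, ss, ms, f0, significant in table:
    print(f"{source:>2} {df:>2} {ss:>9.4f} {ms:>9.4f} {f0:>7.2f} {significant}")
```

Only A, B and C are listed because those are the sums of squares stated explicitly at this point; the other effects can be added to the dictionary in the same way.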
So 300 is obviously greater than 5.32, and 468.4 is even more so. The value 0.68 for factor C is, however, less than 5.32, and for factor D 317.3 is higher than the critical F value. So you can see that factor C is not significant here. Analysis of variance has been used to find that C is insignificant, and this is highlighted in the slide. What we mean by C being insignificant is that factor C is not influencing the response, which agrees with the null hypothesis that factor C is not important. On the other hand, the null hypotheses that said factor A is not important and factor B is not important are rejected by the F test.
You also look at the p value. For factor C the p value is pretty high at 0.433, so the null hypothesis stands for this case. The critical probability value was 0.05, and any p value lower than 0.05 would lead to rejection of the null hypothesis. For factor C the p value came to 0.433, which is greater than 0.05, and hence we do not reject the null hypothesis.
What is this p value? It is tied to the type 1 error. Then what is the type 1 error? The type 1 error is wrongly rejecting the null hypothesis when it is actually true. The p value is the smallest level of significance at which the data in hand would let us reject the null hypothesis; in other words, it is the risk of a type 1 error we would be taking on if we rejected. If that risk is very small, we reject the null hypothesis. If it is pretty high, we do not. In this case the risk is as high as 0.433, so we cannot reject the null hypothesis which says that factor C is insignificant.
We can also calculate the p values for factor A, factor B and so on. But let us first complete the ANOVA table for the interactions. We have the binary interactions.
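The p values mentioned above can be checked without statistical tables. The sketch below uses only the Python standard library and relies on the fact that an F variable with 1 numerator degree of freedom is the square of a t variable, so the upper-tail F probability equals the two-sided t tail probability. The function names and the choice of Simpson's rule are assumptions made for this illustration, not the lecturer's method.

```python
import math


def t_pdf(t, nu):
    """Density of Student's t distribution with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + t * t / nu) ** (-(nu + 1) / 2)


def p_value_f_1df(f0, df_error, steps=2000):
    """Upper-tail probability P(F > f0) for F with (1, df_error) d.o.f.

    Since F(1, nu) = t(nu)^2, P(F > f0) = 1 - 2 * integral of t_pdf over
    [0, sqrt(f0)]. The integral uses Simpson's rule (steps must be even).
    """
    upper = math.sqrt(f0)
    h = upper / steps
    total = t_pdf(0.0, df_error) + t_pdf(upper, df_error)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df_error)
    return 1 - 2 * (h / 3) * total


p_c = p_value_f_1df(0.68, 8)   # factor C: well above 0.05
p_a = p_value_f_1df(300.0, 8)  # factor A: far below 0.05
print(f"p(C) = {p_c:.3f}, p(A) = {p_a:.1e}")
```

This shortcut only works for effects with a single numerator degree of freedom, which is the case for every effect in a two-level design.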
We do not have all the binary interactions. 4C2 would give 6 binary interactions, but we are showing only 3 because the other 3 are aliased with these 3. When you look at AB, the same procedure is followed: the sum of squares divided by the degrees of freedom gives the mean square. We are talking about a single combination, so we have a single degree of freedom. Here 18.438 is divided by the mean square error, which is again 0.3675 (it is the same for all the effects and their interactions), and that comes to 50.17. This is obviously higher than 5.32, and hence we reject the null hypothesis.
So what we are doing here is finding the F value, comparing it with the critical value of 5.32, and then either accepting or rejecting the null hypothesis. If the F value is very different from the critical value, we are confident that we made the decision correctly. But if the critical value is 5.32 and the F value is 5.33 or 5.30, it is a marginal case, and it is very difficult to confidently reject or accept the null hypothesis. Usually we do not get values very close to the critical value; we are able to reject or accept the null hypothesis by a comfortable margin. Anyway, we will be finding the p values, and very rarely will you find a p value of 0.051 or 0.049.
When you look at all these, it appears from the F values that all 3 binary interactions are significant: AB, which is aliased with CD; AC, which is aliased with BD; and AD, which is aliased with BC. They lie in the rejection region, and hence we reject the null hypothesis which says that these binary interactions are unimportant.
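The aliasing of the binary interactions can be verified directly from the design matrix. The sketch below assumes the half fraction is generated with D = ABC (defining relation I = ABCD); the helper names `column` and `aliased` are made up for this example.

```python
from itertools import product

# Full 2^3 design in A, B, C; the fourth factor is generated as D = ABC,
# which corresponds to the defining relation I = ABCD.
runs = [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]


def column(term, run):
    """Product of the coded levels named in `term`, e.g. 'AB' or 'ACD'."""
    levels = dict(zip("ABCD", run))
    value = 1
    for letter in term:
        value *= levels[letter]
    return value


def aliased(term1, term2):
    """True if both terms have identical contrast columns over all runs."""
    return all(column(term1, r) == column(term2, r) for r in runs)


for pair in [("AB", "CD"), ("AC", "BD"), ("AD", "BC"), ("A", "BCD"), ("C", "ABD")]:
    print(pair, aliased(*pair))
```

Two terms with identical columns produce the same contrast, so the 8 runs cannot tell them apart; that is what aliasing means in practice.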
So among the different effects we have considered, A, B, C, D, AB, AC and AD, only C was found to be insignificant and all the other effects are significant. We are not looking at ternary interactions here because the ternary interaction ABC is aliased with factor D, BCD is aliased with factor A, and ACD is aliased with factor B. The ternary interactions are aliased with the main factors, so we are not able to obtain them separately.
So we can write down the model equation, which is given in this form. Here you have the intercept, then the terms due to factor 1 or factor A, factor B, factor C or factor 3, and factor D, then the interaction between A and B, the interaction between A and C, and the interaction between A and D. Based on the ANOVA table, the term beta 3 x3 is a candidate to be removed from the model because factor C is not significant.
I have not given the p value in this table, but you can calculate the actual p value corresponding to the F value using Excel or another spreadsheet application, or you can go to the probability tables for the F distribution. The tables are difficult to use for this purpose, because they list only selected probabilities such as 0.05, 0.025, 0.1 and 0.01. If you want the probability corresponding to a particular F value, say it turns out to be 0.31, you cannot get it from tables that only list 0.05 and 0.1. In this sense the spreadsheet becomes a very valuable tool. What you do in Excel is simply enter FDIST with the F value, the numerator degrees of freedom and the denominator degrees of freedom.
So for example I may put FDIST(300, 1, 8) and I will get the probability. That probability will be very small because the F value is so high and lies well inside the rejection region. That is how you find the value of p. I request you to try out a few cases using your spreadsheet or any other statistical software that you have access to and find the p values.
Plugging in the values we found for the fractional factorial design, the predicted y is
y = 25.03 + (5.25/2) x1 + (6.56/2) x2 + (0.25/2) x3 - (5.4/2) x4 + (2.147/2) x1 x2 - (3.307/2) x1 x3 - (5.45/2) x1 x4.
Except for the factor 3 or factor C term, which is the candidate for removal, all the others stay in the model. You are dividing each effect by 2 to account for the jump from minus 1 to plus 1, and the effects divided by 2 are the coefficients of the model equation. You can also see that the coefficient for factor C is much smaller than the coefficients of the other terms. Since the factors are coded as minus 1 and plus 1, their actual values do not matter and the coefficients are all on the same basis; that is another advantage of coding your experimental data.
But without doing a proper analysis of variance we should not conclude that certain factors are important and others are not, even after coding. It may be a 0.05 case or a 0.04 case, and we might have arbitrarily accepted the null hypothesis and dropped the term from the model when it was really a marginal case. So to be very sure, and to quantify your answers without any subjectivity, it is important to do the ANOVA table, and for that you need some estimate of the error variability. Sometimes, if you do not do repeats, you may think you can inspect the model equation and find out which terms are important and which are not. That is not recommended.
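As a small illustration of how the coded model equation quoted above is used, the sketch below turns the effects into coefficients (effect divided by 2) and evaluates the prediction at coded settings. The dictionary layout and the function name `predict` are assumptions made for this example; the numbers are the ones quoted in the lecture.

```python
intercept = 25.03

# Effects from the first fraction, keyed by the factor indices in each term
# (1 = A, 2 = B, 3 = C, 4 = D).
effects = {
    (1,): 5.25, (2,): 6.56, (3,): 0.25, (4,): -5.40,
    (1, 2): 2.147, (1, 3): -3.307, (1, 4): -5.45,
}

# Each coefficient is half the effect, because the coded variable
# jumps by 2 units when it goes from -1 to +1.
coefficients = {term: effect / 2 for term, effect in effects.items()}


def predict(x1, x2, x3, x4):
    """Predicted response at coded factor levels (each -1 or +1)."""
    x = {1: x1, 2: x2, 3: x3, 4: x4}
    y = intercept
    for term, beta in coefficients.items():
        level = 1
        for i in term:
            level *= x[i]
        y += beta * level
    return y


print(predict(1, 1, 1, 1))
print(predict(-1, -1, -1, -1))
```

Dropping the x3 term, as the ANOVA suggests, only means deleting the `(3,)` entry from `effects`; the remaining coefficients do not change because the columns of the coded design are orthogonal.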
Do repeats so that you get an idea of the error variability. Then you can carry out the analysis of variance, where the mean square of a particular effect is divided by the mean square of the error, make the correct conclusion, and also quantify why you rejected some factors and retained the others. So it is very important to do repeats, find the error variability, do the analysis of variance, and get the p values or see whether the F statistic lies in the acceptance region or in the rejection region.
We have the aliasing table here, where we list the sources of variation we could detect in the ANOVA table. Even though we are talking about the factors A, B, C (which was found to be insignificant) and D, and the interactions AB, AC and AD, we are really talking about their aliases as well, and that is what the statistical software will also report. The aliases are BCD, ACD, ABD, ABC, CD, BD and BC. So if factor C is insignificant, we may automatically think ABD is insignificant too. Montgomery has an interesting discussion on this.
What he says is that the combination of C and ABD is insignificant, but this combination could in principle involve a very high value of C and a very low value of ABD. Even though they are individually powerful, when they combine they become weak: C is highly positive, ABD is highly negative, and the net effect appears to be insignificant. C on its own and ABD on its own may have been exerting very strong effects in opposite senses, and then you cannot afford to ignore them. That is one argument somebody may put forth, but it is very rarely the case. Montgomery also says that the simple explanation, that both C and ABD are insignificant, is usually the correct one, even though you can find special or extreme cases where it does not apply. So the net result is that if factor C is found to be insignificant from the fractional factorial design, the interaction aliased with it, here ABD, may also be deemed insignificant, but we never know whether that is a correct assumption until we carry out the second fraction also.
To repeat, main effects are aliased with 3 factor interactions, and 2 factor interactions are aliased with other 2 factor interactions. Factor C was found to be insignificant, and it is likely that ABD is also insignificant, because it is rather unlikely that ABD is large and negative and C is large and positive so that they nearly cancel each other out, as said by Montgomery (2009).
Also note that factor C is insignificant but the binary interaction involving C, namely AC, was significant. Let us verify that. Factor C had a very high p value, so the null hypothesis that C has no effect was accepted, and its F0 value was very low at 0.68. On the other hand, AC has an F0 value of 119.03.
So somebody may naturally ask: if C is insignificant, how come AC is exerting such a large effect? If C had a large effect, I could expect AC to have a strong effect, because we know A also has a strong effect. But C is insignificant, so how come AC is showing up so strongly? Something is not correct. Actually, if you recollect our discussion, AC is aliased with BD. So rather than the effect of AC showing up as significant, it may be BD that is contributing a sufficient sum of squares to make the F value as high as 119.03.
To repeat the contents of the slide: factor C is insignificant but the binary interaction involving C is significant. Whether A and C somehow combine to give a high interaction on their own, or whether the alias of AC, namely BD, is responsible, will only be known after the complete factorial design is carried out.
Since factor C is insignificant, we may ignore C and consider the design as a 2^3 design involving only factors A, B and D. What we are doing is using the first fraction itself, with its 8 runs, as a 2^3 design in A, B and D. Let us see what results we get. The carrier fluid does not have an effect on the absorption, and hence we ignore it in our analysis. When you ignore C, you find that C is no longer there and you have A, B, D, AB, AD, BD and ABD, and you get the effect values, the coefficients and the sums of squares. Now let us compare the partial 2^4 design with the truncated 2^3 design where C has been deleted. The results are not as surprising as they may seem.
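The effect values, coefficients and sums of squares reported by the software are tied together by simple relations. The sketch below assumes the standard result for two-level designs that the sum of squares of an effect equals N times the squared effect divided by 4, where N is the total number of observations (16 here: 8 runs with 2 repeats). That formula is not stated in the lecture, and the function name is made up, but it reproduces the sums of squares quoted earlier.

```python
def sum_of_squares(effect, total_observations):
    """Sum of squares of a two-level effect: N * effect^2 / 4 (assumed formula)."""
    return total_observations * effect ** 2 / 4


N = 16  # 8 runs x 2 repeats in the first fraction

# Effects quoted in the lecture for the first fraction.
effects = {"A": 5.25, "B": 6.56, "C": 0.25, "AB": 2.147}

for name, effect in effects.items():
    coefficient = effect / 2  # model coefficient in coded units
    ss = sum_of_squares(effect, N)
    print(f"{name:>2}: effect={effect:7.3f}  coeff={coefficient:7.4f}  SS={ss:9.4f}")
```

Because the truncated 2^3 analysis uses the same 16 observations, the same relation holds there, which is one way to see why relabeling the columns cannot change the sums of squares.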
The effect of A is identical, B is identical, D is identical, and AB is also identical. Instead of AC you have BD, and the values are identical. For AD you have the same -5.45, and instead of C you have ABD, which is 0.25. That is because it is the same response: instead of considering C you are deleting it completely and bringing in factor D as the third factor, and since C is no longer in the picture you get BD and ABD. These were exactly the terms aliased with AC and C: BD was aliased with AC, and ABD was aliased with C. Even the sums of squares are identical, as you may notice from this slide, and so is the total sum of squares. So essentially the truncated 2^3 design is telling us the same thing as the partial 2^4 design, but it gives prominence to the aliased terms BD and ABD because C is no longer in the picture. That was a brief interlude, which you can carry out with the statistical software without any additional runs. I used Minitab here.
The next and final step is to do the complete 2^4 design. We have already seen that once you have done the first fraction based on the generator I = ABCD, or D = ABC, you can construct the next fraction based on I = -ABCD. In other words, we look at the ABCD column in the design matrix and take all the minus ones. We have seen this previously; I am just bringing it to your attention. The plus ones in ABCD constitute the first fraction, and those settings are given in blue colour. The remaining ones are all minus ones; those settings are given in red colour and will now be run in the second half. So we will perform experiments at the settings where minus one is present in ABCD.
So we can use I = -ABCD, or D = -ABC, conduct the experiments for the second half using the settings listed in the table I showed a couple of slides back, and then analyse the results. This was the first fraction. Even though the entries are coded in red and blue, please do not confuse the colours with the first and second fractions. I am just showing the first repeat in red and the second repeat in blue for the first fraction; maybe I should have used different colours. So this is the first repeat and the second repeat for the first fraction.
Then you look at the second fraction. It corresponds to the table entries of minus one in the ABCD column. The settings are the same for both repeats, and you can see the responses given here. Let us check D = -ABC. This is the D column, and it should equal -ABC. In this row the product of the three entries for A, B and C is minus one, but you have plus one for D. That is because D is minus the product of these three entries, and minus of minus one is plus one. And in this row ABC is plus one but D is minus one, again because D = -ABC. So that is consistent; everything is correct.
So you have the second fraction given here, and we can analyse its results separately. You find these values reported here. By now I think you should be confident enough to carry out these calculations on your own, and the results are here anyway, so you can do the calculations and verify. You find the values of the effects and divide by two to get the coefficients, and you also find the sums of squares. In the second fraction you have A, B, C, D, AB, AC and AD. You do not have the three factor interactions because they are aliased with the main effects, and the two factor interactions are aliased with the other two factor interactions, so you have only these entries here.
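The split of the full 2^4 design into the two half fractions, and the D = -ABC check described above, can be sketched as follows. The variable and function names are assumptions made for this example.

```python
from itertools import product


def abcd(run):
    """Entry of the ABCD column for one run of coded levels (A, B, C, D)."""
    a, b, c, d = run
    return a * b * c * d


# All 16 settings of the full 2^4 design in coded units.
full_design = list(product((-1, 1), repeat=4))

# First fraction: ABCD = +1 (I = ABCD); second fraction: ABCD = -1 (I = -ABCD).
first_fraction = [run for run in full_design if abcd(run) == 1]
second_fraction = [run for run in full_design if abcd(run) == -1]

# In the second fraction every run should satisfy D = -ABC.
for a, b, c, d in second_fraction:
    print(f"{a:+d} {b:+d} {c:+d} {d:+d}   -ABC = {-a * b * c:+d}")
```

Each printed row should show the D entry equal to the -ABC value beside it, which is the same consistency check done by hand on the slide.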
Now what I have done is combine the two fractions. The moment I combine the first fraction with the second fraction I get the complete 2^4 design, and hence I can analyse all the effects. So here you have A, B, C and D, and you also have both AB and CD. Even the effects that were aliased in the fractional factorial design show up separately in this plot, which means this is the full design. There is no aliasing here because it is a combination of the two fractions. You have BCD, which in the fractional factorial design was aliased with A; now A and BCD are reported separately. You can see that in the full design so many of these effects are insignificant. Only 3 plus 3, that is 6 of the total number, are actually significant. This itself gives us a clue that performing the fractional factorial design was a wise idea.
The results from the full factorial are presented here. The effect value is given here, the coefficient is obtained by dividing the effect by 2, and the sums of squares are also included. The sum of squares values are now higher because you are combining the two fractions to get the full design. I gave you the values for the single factors and the binary interactions; now you also have the ternary interactions and the full ABCD interaction. You can see that their values are pretty small, the coefficients are half of that and hence even smaller, and more noticeably their sums of squares are pretty much negligible compared to the 100s and 30s you had previously. You also have the error sum of squares, which is 3.359. When you do the ANOVA, we get the source of variation, degrees of freedom, sum of squares, mean square and the F value, and as before we find the critical F value
corresponding to a 0.05 level of significance, 1 degree of freedom in the numerator and 16 degrees of freedom in the denominator (the error degrees of freedom). We get 4.49. We compare the F values with 4.49 and see which lie in the acceptance region and which lie in the rejection region.
Factor C lies in the acceptance region. This is old news; we already saw it from the first fraction. Now we see that BC also lies in the acceptance region, and so does CD. So any binary interaction associated with C also lies in the acceptance region. Coming back to C, its F value is pretty small at 0.71, so it lies in the acceptance region, while the F values of the other main effects listed here lie in the rejection region. When you look again, the binary interactions associated with C, namely BC and CD, have pretty low F values, so they lie in the acceptance region; remember the critical F value is 4.49. Even AC has a low F value and lies in the acceptance region, which means you can accept the null hypothesis that the binary interaction between A and C is negligible. Again, the AC interaction involves factor C. So from this analysis any binary or even ternary interaction with C becomes negligible: AC is negligible, BC is negligible and CD is negligible, but the other binary interactions are significant. The ternary interactions are also there, and you can find that all their F values are pretty small compared to the critical F value of 4.49. So all the ternary interactions and the quaternary interaction will vanish.
As a matter of curiosity, what was the error sum of squares previously and what is it now? Previously the error sum of squares was 2.9397, and now it is 3.359. It has increased because we have considered the second fraction also.
But it is not linear. It is not as if exactly half the error comes from the first half and exactly half from the second half, because error is random and you cannot predict how it is going to behave. The error contribution from the second half seems to be lower than the contribution from the first half, and this kind of difference is to be expected. So you have the full equation here, which is pretty huge, and you can identify all the single factor, binary, ternary and quaternary contributions. I just plug the values of the coefficients into this model equation.
This table compares the effects (not the coefficients) from the different designs. These are the effect values for the full 2^4 design, and these are the effect values for the fractional 2^4 design (the first fraction) and the truncated 2^3 design. For A you have 5.325, 5.25 and 5.25, so these values match closely. For C the full design gives 0.138, slightly different from the 0.25 of the fractional factorial design, but C is insignificant anyway, so we do not really care what number it takes. D is -5.545 for the full factorial design and -5.40 for the fraction. AB is 2.206 against 2.147. For AC the values are quite different. The fractional factorial design wrongly attributed a large negative effect to AC, but the full factorial design sets things in order by saying that AC is not significant; it contributes only 0.19. It is BD, the interaction aliased with AC, that contributes so much of the effect.
But we could have guessed this from the fractional factorial design alone. There C was shown to be insignificant, and if C is insignificant, AC is also likely to be insignificant. Hence, even in the fractional factorial design, the effect could have been attributed to the binary interaction BD, because AC is aliased with BD.
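The way the two half fractions separate AC from BD can be written down compactly. The relations below are a sketch under the assumption that the first fraction uses I = ABCD and the second uses I = -ABCD, as in this lecture; the symbols \ell^{(1)}_{AC} and \ell^{(2)}_{AC} are notation introduced here for the value estimated from the AC column in each fraction.

```latex
\begin{align}
\ell^{(1)}_{AC} &\rightarrow AC + BD, &
\ell^{(2)}_{AC} &\rightarrow AC - BD, \\
AC &= \tfrac{1}{2}\left(\ell^{(1)}_{AC} + \ell^{(2)}_{AC}\right), &
BD &= \tfrac{1}{2}\left(\ell^{(1)}_{AC} - \ell^{(2)}_{AC}\right).
\end{align}
```

Adding the two fraction estimates isolates AC and subtracting them isolates BD, which is why a large estimate in the first fraction is compatible with a small AC effect in the full design.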
So that is what you get. Then you have the AD interactions, which are comparable. BC was not considered in the fractional factorial design because BC was aliased with AD, and CD was aliased with AB. Even though we did not find these interactions, they were eventually proven to be insignificant in the full factorial design: BC is insignificant and CD is insignificant, and they were not detected in the fractional factorial design. The truncated design also found BD, because we ignored factor C in that analysis, and that interaction effect value is comparable to the BD value from the full factorial design. So I think we are now coming to the end of the story. All the ternary interactions and the quaternary interaction are insignificant, as shown here.
We can now also compare the sums of squares, which are listed in this table. The full factorial design has larger sums of squares because it uses both half fractions, and you can see that it gives very small sums of squares to the ternary interactions and the quaternary interaction.
Now we can list the fractional factorial design model and the truncated design model. Everything is identical because the same data were used. The only difference is that, since factor C was ignored, the AC interaction is shown as the BD interaction, but the coefficient is one and the same. The same data are used, but C is thrown out and we consider only A, B and D, so the aliased effect BD is shown instead of AC. The full factorial design model is given here; after dropping the insignificant terms we get this model.
So now we can compare the fractional factorial design model with the full factorial design model. The two are comparable. Main effect A is pretty much the same, and main effect B is also comparable. Because the main effects are aliased only with 3 factor interactions, we do not have to worry about them.
So they are pretty accurately represented even in the fractional factorial design model, and factor D is also accurately represented. C was thrown out anyway because it was not significant in either the fractional factorial design or the full factorial design. Then you see that the AB interactions are comparable. The only problematic case is the AC and BD pair, which is not accurately represented here because the AC interaction is aliased with the BD interaction. But when you know that C is insignificant, you can say that the BD interaction has actually shown up as the AC interaction in the fractional factorial design. Otherwise the fractional factorial design did a very good job.
I request you to go through this. Even the fractional factorial design would have sufficed if you could assume that, since C is insignificant, so is AC. You could have concluded this from your process knowledge, and hence, even in the fractional factorial design, you could have taken the aliased effect BD as the main contributing binary interaction in the AC and BD combination. Even though you are estimating AC, it is actually the aliased interaction BD which is showing up.
So we will conclude now. We have covered quite a bit of ground. I request you to go through the problems and try to solve them on your own to the extent possible, or check some of the calculations against the results shown here. This way you can develop confidence in your problem solving and analysis technique. See what else you can think of about the fractional factorial design and how it may be modified to suit your purposes. What would have happened if you had taken a one quarter fraction of a 2^4 design? What effects would you have identified? What effects could probably have been omitted even in the quarter fraction?
So please do these calculations and see what results you get. Thank you for your attention.