Welcome to Dealing with Materials Data, where we look at the collection, analysis and interpretation of data from materials science and engineering. We are in the module on fitting and graphical handling of data. In this session we are going to discuss analysis of variance, ANOVA. We have already learnt about ANOVA in other sessions of this course; in this session we will learn how to do ANOVA on the computer using R.
What is ANOVA? Suppose the same experiment is done by 3 different labs, and each lab does it more than once. Each lab will then have its own mean value for the experiment, and different labs will give you different means. What we are testing is whether these observations are statistically the same, or whether the differences between the labs are statistically significant and, if they are different, how different. This is done using one-way ANOVA. You have seen an example already; I will use the same example and just show you how to do it using R.
The labs are named A, B and C. The sample is uniform and the same type of equipment was used by all 3 labs; only the calibration was independent. Each lab calibrated according to its own standards and then did the measurement. What they measured is the composition of 2 alloy components; let us call them X and Y. So there are 3 labs, A, B and C, measuring 2 quantities, X and Y, and we are trying to see whether the measurements of X and Y from the 3 labs are statistically the same or different.
This is the data. The lab is A, then B, then C. For component X, there are the measurements from A, from B and from C. Similarly, for component Y there are 3 measurements from A, 3 measurements from B and 3 measurements from C.
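As a reminder of what one-way ANOVA computes for a layout like this, the F statistic can be written as below. The notation is introduced here for illustration and is not taken from the lecture: $k$ is the number of labs, $n_i$ the number of measurements from lab $i$, $N$ the total number of measurements, $\bar{y}_i$ the mean for lab $i$ and $\bar{y}$ the overall mean.

```latex
\[
F = \frac{SS_{\text{between}} / (k - 1)}{SS_{\text{within}} / (N - k)},
\qquad
SS_{\text{between}} = \sum_{i=1}^{k} n_i \,(\bar{y}_i - \bar{y})^2,
\qquad
SS_{\text{within}} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 .
\]
```

With 3 labs and 3 measurements each, $k = 3$ and $N = 9$, so the F value is compared against an F distribution with 2 and 6 degrees of freedom.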
So let us take this data and do the analysis. We are going to do a very simple one-way ANOVA, but there is a good book, Practical Regression and Anova using R by Julian Faraway, which is available for free at the R site; you should download it and take a look. We will also go from here to design of experiments and ANOVA, but as a case study, in the next module.
To do the analysis, let us start R. Here is the analysis for component X. The lab has 3 levels, A, B and C, and we have to give the data in the corresponding order, A, B, C, A, B, C, because that is how the data will be read. In the table, the data is laid out with all of A first, all of B next and all of C third. But we are giving the labs as A, B, C, so we should give the first measurement from each of the three labs, then the second measurement from each, and then the third. This is important, because if the values are not given in the order that matches the lab labels you will get wrong results; you can try this as a test for yourself. You can see the values being entered: 5.59, then 5.38, then 5.64, which is the first value for C; then 5.68, 5.76, and then the second value, 5.64, 5.72, and then 5.53, 5.59 and 5.74.
These are the values that we have read, and the idea is very simple: set up a statistical model that connects the values to the labs, and then do ANOVA on that model. You get an F value, and whether it is small or large tells you whether the differences between the labs are within the statistical scatter, so that the data can be accepted as the same, or whether the differences are statistically significant.
Let us do it for Y; it is the same type of exercise. The numbers have to be fed in correctly, and then you call for the statistical model, do an ANOVA, and get the analysis of variance table. You can see that the F value is 0.86 in one case and 18 in the other. Where the F value is 18 the differences are probably significant, and where it is 0.86 the differences are probably not significant.
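The steps just described can be sketched in R roughly as follows. This is a minimal sketch and not the exact commands typed in the session: the numbers are hypothetical placeholders (not the lab data from the table), and the variable names `x`, `lab`, `model` and `tab` are chosen here for illustration. It also shows what happens when the lab labels do not match the order of the values.

```r
# Hypothetical placeholder measurements, NOT the data from the lecture.
# Entered in the interleaved order A, B, C, A, B, C, A, B, C.
x <- c(5.1, 5.4, 5.9,
       5.2, 5.5, 6.0,
       5.0, 5.6, 6.1)

# Lab labels repeated in the same interleaved order as the values.
lab <- factor(rep(c("A", "B", "C"), times = 3))

# Statistical model connecting the values to the labs, then ANOVA on it.
model <- lm(x ~ lab)
tab <- anova(model)
print(tab)

# F value and p-value for the lab effect are in the first row of the table.
f_value <- tab[["F value"]][1]
p_value <- tab[["Pr(>F)"]][1]

# Deliberately mismatched labels (A, A, A, B, B, B, C, C, C) applied to the
# same interleaved values: each "lab" now mixes readings from all three labs.
lab_wrong <- factor(rep(c("A", "B", "C"), each = 3))
f_wrong <- anova(lm(x ~ lab_wrong))[["F value"]][1]

cat("F with matching labels:  ", f_value, "\n")
cat("F with mismatched labels:", f_wrong, "\n")
```

With these placeholder numbers the two F values differ greatly, which is the point about ordering: R has no way of knowing that the labels and the values are out of step.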
This is something that you have already seen in the other part of the course; this session is just to show you that you can do it in R, with just a one-line command, and the book by Faraway gives you more information.
To summarize, fitting leads naturally to analysis of variance, because different labs can do the same experiment, do the fitting and give you the data. Remember that when we were looking at the data collected by NIST from different sources, this was not done: everything was mixed together, the entire data set was looked at, and an average was made out of it. However, different labs measure in different ways, so there could be small systematic differences between their measurements, or the measurements could be statistically significantly different. That is not something we checked when we took the copper data; we just took all the data from all sources, mixed them up and did an analysis. That can sometimes be not quite correct. You have to do careful experiments and establish that, statistically speaking, the different labs or groups actually give you the same value, and the way to do that is to use ANOVA.
In the case of the NIST data, of course, the raw data is not available, so what was reported in the literature was taken and the analysis was done on that. You can do that kind of analysis also, and we will see in our case study that you can look at the literature, take the numbers and do the analysis. But ANOVA is essential if you want to compare across different groups and labs, or if you want to set up a standard, so that you can take the values from any lab and decide whether they are acceptable or not. So it is very important to do this kind of analysis. With R, of course, you can do it, and you have learnt the idea behind the analysis in the other part of the course.
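One way to act on this point is to run the ANOVA first and only average everything together if no lab-to-lab difference shows up. The sketch below uses hypothetical numbers (not the NIST or copper data). The 0.05 significance level and the names `d`, `alpha` and `pooled_mean` are assumptions made for illustration.

```r
# Hypothetical measurements of one quantity from three sources.
d <- data.frame(
  value = c(8.9, 9.1, 9.0,   # lab A
            9.2, 8.8, 9.0,   # lab B
            9.1, 9.0, 9.2),  # lab C
  lab = factor(rep(c("A", "B", "C"), each = 3))
)

# One-way ANOVA of value against lab.
tab <- anova(lm(value ~ lab, data = d))
f_value <- tab[["F value"]][1]
p_value <- tab[["Pr(>F)"]][1]

alpha <- 0.05  # assumed significance level, not specified in the lecture

if (p_value > alpha) {
  # No significant lab effect detected: pooling is at least not contradicted.
  pooled_mean <- mean(d$value)
  cat("No significant lab effect; pooled mean =", pooled_mean, "\n")
} else {
  # Labs differ: report per-lab means instead of one pooled number.
  print(tapply(d$value, d$lab, mean))
}
```

A large p-value only means that no difference was detected with the data at hand; with very few measurements per lab, a real systematic offset can still go unnoticed.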
So we have come to the end of this module. In the next module we will take up some case studies to better fix the ideas that we have learnt over these 5 modules. Thank you.