Welcome to Dealing with Materials Data, where we look at the collection, analysis and interpretation of data from materials science and engineering. We are in the module on data processing, and in this session we are going to learn about estimating the mean and the mean squared deviation of data.
These are properties of the data. If you assume that the data is normally distributed, for example, the mean tells you the value about which you see a spread. Even if you do not assume anything about the distribution of the data, the mean gives you an estimate, the maximum likelihood estimate, for the given parameter. The mean squared deviation tells you the spread of the data, and the root mean squared deviation is just the square root of this quantity, so it is also a measure of the spread.
The mean is very simple: you add all the values and divide by the total number of data points. If you have data with unequal statistical weights, you have to multiply each value by its weight before taking the average. In fact, for data with unequal statistical weights, any average is calculated by weighting with the weighting function W_i and normalizing by the sum of the W_i. If the W_i are themselves normalized so that they sum to 1, the average is just the sum of W_i X_i. We will see an example: we will use the cluster size frequency data that we digitized earlier to take means of this type.
The mean squared deviation from the average works like this: for every value you subtract the mean and square the result, and then you take the mean of those squared values, which is why it is called the mean squared deviation. So (X_i minus mean) squared is the squared deviation from the average, and you take its mean by summing over all the data points and dividing by the total number of data points.
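The definitions described in words above can be written out as follows. This is only a restatement of the spoken description; the symbols N (number of data points) and the bar notation for the average are assumptions introduced here for compactness.

```latex
\begin{align*}
\bar{X} &= \frac{1}{N}\sum_{i=1}^{N} X_i
  && \text{(mean, equal weights)} \\
\bar{X}_W &= \frac{\sum_{i} W_i X_i}{\sum_{i} W_i}
  && \text{(mean, unequal statistical weights } W_i\text{)} \\
\bar{X}_W &= \sum_{i} W_i X_i
  && \text{(when the weights are normalized, } \textstyle\sum_i W_i = 1\text{)} \\
\mathrm{MSD} &= \frac{1}{N}\sum_{i=1}^{N} \left(X_i - \bar{X}\right)^2
  && \text{(mean squared deviation)} \\
\mathrm{RMSD} &= \sqrt{\mathrm{MSD}}
  && \text{(root mean squared deviation)}
\end{align*}
```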
The root mean squared deviation from the average is just the square root of this value: this is the MSD, and so the RMSD is just its square root. Let us do this for the data that we have; let us start R and begin the exercise.
I am going to take the copper conductivity data. The mean is 101.32; that is straightforward. What did we do next? We took the conductivity and subtracted the mean from every value, we squared it, we took the sum of all those squares, and we divided by the total number of points. That was the mean squared deviation, and of course you can also print the root mean squared deviation. You can see that the mean squared deviation is about 0.00996 and the root mean squared deviation is 0.09979959.
How do we do this for data with different statistical weights? Let us do that exercise also. First let us calculate the average. We need to read the data and find out how many data points there are. Then we multiply each value by its weight, which is the frequency, add them all up, and divide by the sum of the frequencies: the frequency is the statistical weight W_i, and dividing by its sum normalizes it. This is the way to calculate the average. It comes out as 239. If you plot the data, as you have seen, the peak is somewhere around 240, so an average close to 240 is expected.
Another way is to normalize the weights themselves first so that they add up to 1; of course you get the same number, because it is just algebra and nothing else is different. Now let us calculate the mean squared deviation and the root mean squared deviation for this weighted data.
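The steps described so far can be sketched in code. The session itself works in R and the transcript does not show the commands, so what follows is a minimal Python sketch of the same arithmetic, not the code used in the lecture. The function names and the small data lists are hypothetical and chosen only for illustration; they are not the conductivity or cluster size data.

```python
import math


def mean(values):
    """Add all values and divide by the number of data points."""
    return sum(values) / len(values)


def msd(values):
    """Mean squared deviation: the mean of (x - mean)^2 over all points."""
    avg = mean(values)
    return sum((x - avg) ** 2 for x in values) / len(values)


def rmsd(values):
    """Root mean squared deviation: square root of the MSD."""
    return math.sqrt(msd(values))


def weighted_mean(values, weights):
    """Sum of w_i * x_i, normalized by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


def normalize(weights):
    """Rescale the weights so that they add up to 1."""
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical raw measurements (equal weights).
    raw = [1.0, 2.0, 3.0, 4.0]
    print(mean(raw), msd(raw), rmsd(raw))

    # Hypothetical size/frequency table: the frequency is the weight.
    sizes = [200, 240, 280]
    counts = [2, 5, 3]
    print(weighted_mean(sizes, counts))
    # Normalizing the weights first gives the same average.
    print(weighted_mean(sizes, normalize(counts)))
```

Both weighted calls give the same average because dividing every weight by the same total cancels out in the ratio.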
It is done the same way: we take each x value, subtract the average and square the result, but now this has to be weighted by the weighting function, which is what we calculated just now. Remember that it is the second type of weighting function: I first summed all the frequencies and divided each frequency by that sum to get the weighting factor. This weighting factor is the statistical weight that I am going to use for the calculation. So the mean squared deviation is the sum of the weights times the squared deviations, divided by the sum of the weights, and since the weights sum to 1, that division changes nothing. I see that the mean squared deviation is 1020, and the root mean squared deviation is of course just the square root of the MSD. So you get about 32 as the spread of this data, which is what is given here also.
To summarize, in this session we have seen that you can take data, which could be raw data or data with unequal statistical weights that you got from some analyzed data. In both cases you can calculate the average, and you can calculate the spread of the data by looking at how far from the average the data points lie; we use the mean squared deviation and the root mean squared deviation to get this spread.
We have shown this for two cases: one is the copper conductivity data, and the other is the particle size of titanium aggregates that we took from the literature. We will have more such exercises during this week's sessions for you to become familiar with this kind of analysis. Thank you.
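For reference, the weighted spread calculation from this session can be sketched as follows. The session works in R and the transcript does not show the commands, so this is a Python sketch under stated assumptions rather than the lecture's code. The function names and the size/frequency table are hypothetical; they are not the titanium aggregate data.

```python
import math


def weighted_msd(values, weights):
    """Weighted mean squared deviation from the weighted average."""
    total = sum(weights)
    # Normalize the weights so that they add up to 1.
    w = [wi / total for wi in weights]
    # Weighted average: sum of w_i * x_i with normalized weights.
    avg = sum(wi * x for wi, x in zip(w, values))
    # Each squared deviation is weighted by its normalized weight.
    return sum(wi * (x - avg) ** 2 for wi, x in zip(w, values))


def weighted_rmsd(values, weights):
    """Square root of the weighted mean squared deviation."""
    return math.sqrt(weighted_msd(values, weights))


if __name__ == "__main__":
    # Hypothetical size/frequency table: the frequency is the weight.
    sizes = [200, 240, 280]
    counts = [2, 5, 3]
    print(weighted_msd(sizes, counts))
    print(weighted_rmsd(sizes, counts))
```

Normalizing the weights inside the function means the caller can pass raw frequencies directly, which matches the "divide by the sum of the frequencies" step described in the session.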