Welcome to Dealing with Materials Data, where we look at the collection, analysis and interpretation of data from materials science and engineering. We are in the module on fitting and graphical handling of data, and in this session we are going to look at data which has a known functional form.

We have looked at data which we know to be linear, and we know how to fit that. We also know of data which has a power-law or exponential type of dependence, and we can turn that into a linear form by taking the logarithm of the expression. You can then use the transformed data and do the analysis. Here is a third case, which is also very common, where we know that there is an explicit functional form, but that form is not a power law, not an exponential and not linear. How do we deal with such data? That is the exercise we are going to do.

For this exercise we are going to consider the variation of the specific heat of copper with temperature in the low-temperature regime, between 0 and 20 kelvin. The data is from the paper on the heat capacity of reference materials, copper and tungsten, by White and Collocott. First we are going to plot the data and see whether we can guess the functional dependence. Then, knowing what the functional dependence is (it is known that Cp = AT + BT³), we are going to see whether we can determine A and B. So these are the two things that we are going to do with this data.

So let us do the exercise. We are in the right directory, so let us start. We read the data, which is the specific heat of copper at low temperatures; it is in CSV format. I am going to store the specific heat in the variable Cp and the temperature in the variable temp. We know that the data fits Cp = AT + BT³.
So I am going to transform T³ into another variable, Z. In terms of Z the expression becomes linear, so we are going to fit a linear model for Cp. Cp depends on the temperature and on the cube of the temperature, but the cube has now been turned into a linear variable: the model is linear in Z, although in terms of temperature it is actually cubic.

Next we generate a sequence of numbers from 0 to 20.2 in steps of 0.01 and use the fitted coefficients to calculate the curve, which is a + bT + cT³, where a is the fitted intercept. We plot that curve in red, and we plot the points that we have from the experiment. You can see that the curve goes through the data very nicely.

You can also plot the residuals. You can see that the residuals are really not random; they have some clear functional dependence. The error in this data is not normally distributed. You can look at the normal Q-Q plot (qqnorm) also, and you will find that that is the case: it is really not a straight line. So there is some systematic error in the data, which you can make out by doing this analysis.

If you look at fit, you see that it is giving an intercept. Then there is the coefficient for temp, which is A, and the coefficient for Z, which is the coefficient for T³, that is, B. So for AT + BT³, A is −1.041 × 10⁻³ and B is 5.897 × 10⁻⁵. The coefficient B looks very small, but it multiplies T³. For example, at T = 20, T³ is 20 × 20 × 20 = 8000, so BT³ is about 0.47. The number looks small, but because it multiplies a large number it still gives a sizeable contribution.

You can check this yourself. When you plot the fitted line, you can leave out this term. Fit coefficient 3 is very small, so if you plot the line without it, how does it look?
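The steps described so far can be sketched in R roughly as follows. This is a minimal sketch, not the script used in the session: the data here is synthetic (the values A_true, B_true and the noise level are made-up assumptions, not the White and Collocott measurements), and the variable name y_no_cubic is introduced only for illustration.

```r
# Synthetic stand-in for the measured data (NOT the White and Collocott values).
set.seed(1)
temp   <- seq(1, 20, by = 1)            # temperature in K
A_true <- 2e-4                          # hypothetical linear coefficient
B_true <- 6e-5                          # hypothetical cubic coefficient
Cp     <- A_true * temp + B_true * temp^3 + rnorm(length(temp), sd = 1e-3)

Z   <- temp^3                           # transformed variable: the model is linear in Z
fit <- lm(Cp ~ temp + Z)                # lm() includes an intercept by default

# Smooth curve from the fitted coefficients: intercept + b*T + c*T^3
x <- seq(0, 20.2, by = 0.01)
y <- coef(fit)[1] + coef(fit)[2] * x + coef(fit)[3] * x^3
y_no_cubic <- coef(fit)[1] + coef(fit)[2] * x   # same curve with the T^3 term left out

plot(x, y, type = "l", col = "red", xlab = "T (K)", ylab = "Cp")
points(temp, Cp)                        # data points on top of the fitted curve
lines(x, y_no_cubic, lty = 2)           # dashed: what the curve looks like without T^3

plot(temp, residuals(fit))              # look for structure in the residuals
qqnorm(residuals(fit))                  # normal Q-Q plot of the residuals
qqline(residuals(fit))

print(coef(fit))                        # intercept, coefficient of temp, coefficient of Z
```

Because the synthetic noise is drawn from a normal distribution, the residual and Q-Q plots from this sketch will not show the systematic pattern seen with the real data; the sketch only illustrates the sequence of calls.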
You can see that it looks very different. Even though the coefficient is small, the term does contribute, because the number it multiplies is very large. You can play around like this just to get a feel for these things.

We do see, however, that the fit gives an intercept, whereas we know from theory that the specific heat should fit AT + BT³: at T = 0 we should not get any intercept. How do we tell R that it should not include a constant? We tried temp plus Z, but it adds the constant, so I have to say explicitly that the constant should be 0. Well, it is very easy: you just add a 0 to the formula. That is the only difference between the two scripts that we have written. This script is exactly the same as our earlier script, except that the formula is now temp + Z + 0 to indicate that the intercept should be 0. So there are only two coefficients: one for temp, and the other for the cube of temp, which is called Z in our case.

You can run it, and of course you find that it fits. If you look at the fit parameters, there are coefficients only for temp and Z, and they are different from what we got in the earlier case. The temperature coefficient is now very different: it was 1.04 × 10⁻³ in magnitude earlier and it has become 2.2 × 10⁻⁴. That is because there was an error coming from the intercept, which should have been 0 but was not. On the other hand, the parameter that multiplies Z was about 5.9 × 10⁻⁵ earlier and is about 5.7 × 10⁻⁵ now. That change is relatively small, but because this number multiplies T³, even small changes here can make a difference. So this is a much better fit.

So we now know exactly how to fit to a given polynomial form. We have fitted to AT + BT³. There are also other functional forms that are known to fit the specific heat in other temperature ranges.
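The zero-intercept fit described above can be sketched in R as follows. Again this is only a sketch on synthetic data: A_true, B_true and the noise level are assumptions made for illustration, and the names fit_free and fit_zero are not from the session.

```r
# Synthetic stand-in for the measured data (NOT the White and Collocott values).
set.seed(1)
temp   <- seq(1, 20, by = 1)            # temperature in K
A_true <- 2e-4                          # hypothetical linear coefficient
B_true <- 6e-5                          # hypothetical cubic coefficient
Cp     <- A_true * temp + B_true * temp^3 + rnorm(length(temp), sd = 1e-3)
Z      <- temp^3                        # transformed variable

fit_free <- lm(Cp ~ temp + Z)           # default: intercept is estimated too
fit_zero <- lm(Cp ~ temp + Z + 0)       # "+ 0" forces the intercept to be zero

print(coef(fit_free))                   # three coefficients: (Intercept), temp, Z
print(coef(fit_zero))                   # two coefficients: temp, Z
```

In an R formula, writing - 1 instead of + 0 removes the intercept in the same way, so either spelling can be used.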
Sometimes the form contains terms such as AT, BT², CT⁻² and things like that. So we will do one more exercise.

In all these cases, if the data were perfect, you could even fit a third-order polynomial, and it should give 0 for the intercept and for the T² term. But that is not going to happen, because there are errors in the data. If you fit a generic form, the fit will give you some numbers for those quantities also, which should in principle be 0. In those cases it is useful to put in the functional form explicitly. This is something that we have seen in the other case also, where theoretically we know that the dependence should be T to the power 2.45, but if you just fit for a generic exponent b it gives you a number, and that number was something like 2.9, so it was very different from 2.45, roughly 20% off. To make sure that you do not make such errors, you fit exactly to the known function. In this case we said the intercept should be 0, so you can demand that it be 0, and if you do that you will get the right parameters.

To summarize, we are looking at fitting. We take data in which there are independent parameters that we change, and we look at the parameter that we measure. There are errors in this process, and taking them into account, we ask whether we can say anything about the functional relationship between the independent variables and the measurements. We have looked at several cases: linear data, quantities that can be turned into linear form, and general functional dependence, and how to do the fitting in each. There is more data that we are going to use. As I mentioned at the beginning of this module, there is a lot of data on copper in terms of grain size, temperature, percentage of cold work and so on, and their effect on, for example, the strength of copper.
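The earlier point about fitting a generic third-order polynomial to noisy data can be illustrated with a small R sketch. The data is synthetic and generated from AT + BT³ with assumed values A_true and B_true, so the intercept and the T² term are exactly zero in the underlying model; the names fit_poly and fit_form are hypothetical.

```r
# Synthetic data generated from A*T + B*T^3 plus noise (assumed values, not real data).
set.seed(1)
temp   <- seq(1, 20, by = 1)            # temperature in K
A_true <- 2e-4                          # hypothetical linear coefficient
B_true <- 6e-5                          # hypothetical cubic coefficient
Cp     <- A_true * temp + B_true * temp^3 + rnorm(length(temp), sd = 1e-3)

# Generic cubic: intercept, T, T^2 and T^3 are all free to vary.
fit_poly <- lm(Cp ~ temp + I(temp^2) + I(temp^3))

# Known functional form: only T and T^3, with the intercept forced to zero.
fit_form <- lm(Cp ~ temp + I(temp^3) + 0)

print(coef(fit_poly))                   # intercept and T^2 terms come out nonzero
print(coef(fit_form))                   # only the two coefficients the theory allows
```

Here I(temp^3) builds the cubic term inside the formula, which is an alternative to defining a separate variable Z first.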
So we are going to look at such data, and we are going to do more of this analysis and more of this fitting. In general, as we have mentioned, there can be different reasons for fitting. It could be that we know there is a functional form and we want to get the parameters. It could be that we do not know the functional form and we want to guess it from the data. And sometimes we are not even interested in getting a functional form; the fit is just a way of making a simple mathematical model which mimics what happens in the real experiment. You can use it like a tool: I have done some 20 experiments, I know the data, and based on this I am going to predict what my output parameter will be for a given input parameter.

In that sense also this is very powerful. These are the techniques that are useful for building statistical models for making predictions, which we are not going to get into in this course. But when you get into problems like machine learning or artificial intelligence, they are all based on these ideas. There is this idea of fitting a statistical model, and that fitting is done, in some sense, by minimizing the error, that is, the deviation of the data from the fitted line. So it is a combination of statistical tools and optimization tools that makes it possible for us to build such predictive models. Regression, or this fitting exercise, is the very basic statistical concept that we need to understand in order to understand these models better. So we are going to spend some more time and do a few more exercises on fitting and graphical handling of data before we move on to some case studies to close this course. Thank you.