Welcome to Dealing with Materials Data. This course is about the collection, analysis and interpretation of data from materials science and engineering. We are in the module on using R to do descriptive statistics. We have already looked at ways of looking at data and visualizing data, at preparing and presenting rank-based reports of the data, and at preparing and presenting summary-based reports. We are going to continue with the presentation of data, and in this session we are going to look at presenting experimental results in which the data comes with error bars.

Often we know that the data was obtained with certain error bars. How do we incorporate this information when we present the experimental results, and how do we use it in interpreting them? If you interpret results without the error bars, you might make wrong interpretations. So we want to look at these two aspects.

For doing this we are going to go back to the conductivity data for electrolytic tough pitch copper from the thesis of Dr. Harshavardina. This data consists of 5 columns. The first is the percentage deformation that was given to the sample: 2.9 percent, 8.7 percent, 12.1 percent, et cetera, up to 60 percent deformation. In the deformed state the conductivity measurements were made, and the mean conductivity and standard deviation are given. Then all these deformed samples were annealed for the same amount of time, and at the end of the annealing treatment the conductivity was measured again, and the mean conductivity and standard deviation are given. In each of these cases 20 measurements were taken to calculate the mean conductivity and standard deviation. So the measurement was repeated 20 times to make sure that you get a proper mean conductivity and standard deviation; each of these values is the result of 20 measurements.
You deform by 2.9 percent, make 20 measurements and get these quantities; you anneal the sample, make another 20 measurements and get these values. So this is the data, and it is stored as a CSV file, which is what we are going to load and use. As you can see, the mean conductivity of the deformed samples decreases with deformation, while that of the annealed samples remains more or less constant. There is also a large conductivity reported for the 60 percent deformed samples, in both the deformed state and the annealed state, as compared to the other values. We will look at this aspect also as we move along.

So you can plot data with error bars, and scatter plots help identify the outlier. You can also ask the question of why the measurement is not consistent: as the sample is deformed the conductivity goes down, and suddenly you find that the value has increased. Is this even meaningful? Can a 60 percent deformed sample have a conductivity much higher than a sample with only about 3 percent deformation? This is a question that we have to look at.

To answer it, there is something known as skin thickness. The skin thickness, delta, is given in millimeters, and it is approximately equal to 664 divided by the square root of f times mu_r times sigma. Here f is the frequency of the eddy current probe, which in this case was 60 kilohertz; mu_r is the relative permeability of copper, which you can read from tables, and it is 0.999994 (five 9s and a 4); and sigma is the conductivity of the sample in percentage IACS. We know that the conductivity of the ETP copper that we are dealing with is somewhere around 100, 102, et cetera.
Let us say that it is about 100 percent IACS; then delta is of the order of 0.3 millimeters. If the sample thickness is comparable to this value, then the results of eddy current measurements are not reliable. The reason why we are getting meaningless and inconsistent numbers for the 60 percent deformed samples is that, given our initial thickness, when we made 60 percent deformation on these samples the thickness became comparable to this skin thickness. The realization that we need to look at this value and find out why it is not consistent came from looking at the data. Once you pay attention to such outliers and try to understand why they exist, you learn a little bit more about the experiment, the data, the material and so on.

So it is very important to look at the outliers and try to make sense of such results, because everything else being the same, you made 20 measurements and got a number like this. Of course you can see that the standard deviation is also relatively high, but it is high for all of 30, 40 and 60 percent; even in the annealed samples these give relatively higher standard deviations. But the numbers themselves do not make sense, and that is because of this reason.

So it is very important to pay attention to outliers and learn from them. All plots, when the information is available, should be plotted with error bars. If you have information on the standard deviation, which you generally would, you should always put the error bars in the plots. It is incorrect to plot without error bars, and I will show you why and how. One should always pay attention to outliers, and when we look at trends in the plots we should also use the information coming from the error bars to interpret the results. This is what we are going to learn from this session, and we are going to learn it by doing the analysis in R.
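As a worked check of the skin-thickness estimate quoted above, here is the arithmetic written out with the values stated in this session: f = 60 kilohertz entered in hertz, mu_r = 0.999994, and sigma taken as 100 percent IACS. The rounding in the intermediate step is mine.

```latex
\delta\,[\mathrm{mm}] \;\approx\; \frac{664}{\sqrt{f\,\mu_r\,\sigma}}
\;=\; \frac{664}{\sqrt{(6\times10^{4})\,(0.999994)\,(100)}}
\;\approx\; \frac{664}{2449.5}
\;\approx\; 0.27\ \mathrm{mm}
```

So delta comes out at roughly 0.27 millimeters, which is the "order of 0.3 millimeters" mentioned above.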
So let us do that. It is R version 3.6.1, and we are in the Dealing with Materials Data directory, so we are ready to start. Let us read the data and plot it. This imports the data, which is the conductivity of deformed and annealed copper. We are going to use ggplot, and as usual you have to tell ggplot which is the data and what it is in the data that you are trying to plot. So let us open the data file to understand what we are plotting. The columns are percentage deformation, mean conductivity of the deformed samples, standard deviation for the deformed samples, mean conductivity of the annealed samples, and standard deviation for the annealed samples.

What we want to plot is the mean conductivity against percentage deformation. This is for the annealed samples, so it is percentage deformation against the mean value after annealing. We are going to put points, and typically you will see that students also join the data points with a line, so I have also added a line. The x axis is percentage deformation, the y axis is conductivity in percentage IACS, and the title is deformation versus conductivity.

Let us do this, and you get a plot like this. I have sometimes seen this kind of plot interpreted as follows: initially the conductivity of these annealed samples remains constant with deformation, then it decreases, then it increases, then it decreases, and then it increases a lot. If you just look at the data points and join them by lines, that seems to be the way the data is behaving.
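A minimal sketch of the kind of plot just described (points joined by a line, no error bars) might look like the following. The numbers in the data frame are illustrative placeholders, not the thesis data, and the column names `deformation` and `mean_annealed` are hypothetical; with the real CSV you would read the file instead and use its own column names.

```r
library(ggplot2)

# ILLUSTRATIVE PLACEHOLDER VALUES ONLY: these are NOT the thesis data.
# Column names are hypothetical stand-ins for the CSV columns.
cu <- data.frame(
  deformation   = c(2.9, 8.7, 12.1, 30, 40, 60),
  mean_annealed = c(101.2, 101.2, 101.0, 101.3, 101.1, 102.4)
)

# Points joined by a line, with axis labels and a title, but no error bars.
p_line <- ggplot(cu, aes(x = deformation, y = mean_annealed)) +
  geom_point() +
  geom_line() +
  labs(x = "Percentage deformation",
       y = "Conductivity (% IACS)",
       title = "Deformation versus conductivity")

print(p_line)
```

Joining the points invites the eye to read every small rise and fall as a real trend, which is exactly the trap discussed here.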
However, these interpretations are wrong, because we know that, irrespective of what the deformation was, if the sample is well annealed the conductivity should remain more or less constant. It is not clear why there is suddenly an increase at the end, but all of these should be the same value; that is the expectation from knowing the physics and materials science of this system. You can also see this if you plot with the error bars, so that is what we are going to do.

We have already read the data, so that is not needed again. We are going to use ggplot; the data is x, and we are going to look at the deformed samples first, so it is x1 versus x2. We are going to put points, label the x and y axes, and give the title deformation versus conductivity. The data points are plotted with error bars. What is the error bar? We take plus or minus one standard deviation about the mean. That is what we have done here, so let us look at the plot.

You can see that the conductivity decreases with deformation, and these are the error bars. For these two points there is some overlap, but for several other pairs of points the conductivity has certainly decreased. So you can see a trend: the conductivity decreases with increasing percentage deformation in the deformed samples, except that 60 percent shows a really large value, more than even something that is deformed only a few percent, which does not make sense.

So that is one. Let us now do the same kind of plot for the annealed samples. We have the data, we plot column 1 against column 4, and the error bar is given by the standard deviation, which is in the fifth column, drawn with the error bar geometry.
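The two error-bar plots described here could be sketched roughly as below, using `geom_errorbar` with the mean plus or minus one standard deviation. Again, the values are illustrative placeholders rather than the thesis data, and all column names are hypothetical; the bar `width` is simply a cosmetic choice of mine.

```r
library(ggplot2)

# ILLUSTRATIVE PLACEHOLDER VALUES ONLY: these are NOT the thesis data.
# Hypothetical column names mirroring the five columns of the CSV.
cu <- data.frame(
  deformation   = c(2.9, 8.7, 12.1, 30, 40, 60),
  mean_deformed = c(100.8, 100.4, 100.1, 99.3, 98.9, 102.5),
  sd_deformed   = c(0.2, 0.2, 0.3, 0.5, 0.5, 0.6),
  mean_annealed = c(101.2, 101.2, 101.0, 101.3, 101.1, 102.4),
  sd_annealed   = c(0.2, 0.2, 0.2, 0.4, 0.4, 0.5)
)

# Deformed samples: points with error bars of mean +/- one standard deviation.
p_deformed <- ggplot(cu, aes(x = deformation, y = mean_deformed)) +
  geom_point() +
  geom_errorbar(aes(ymin = mean_deformed - sd_deformed,
                    ymax = mean_deformed + sd_deformed),
                width = 1) +
  labs(x = "Percentage deformation",
       y = "Conductivity (% IACS)",
       title = "Deformation versus conductivity (deformed)")

# Annealed samples: the same kind of plot using the annealed mean and SD.
p_annealed <- ggplot(cu, aes(x = deformation, y = mean_annealed)) +
  geom_point() +
  geom_errorbar(aes(ymin = mean_annealed - sd_annealed,
                    ymax = mean_annealed + sd_annealed),
                width = 1) +
  labs(x = "Percentage deformation",
       y = "Conductivity (% IACS)",
       title = "Deformation versus conductivity (annealed)")

print(p_deformed)
print(p_annealed)
```

Leaving out the connecting line is deliberate in this sketch: with the error bars drawn, overlap between neighbouring points is what tells you whether a change is real.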
Now with the error bars you can see that, because the error bars span all these points, except these two, and even here one of them is just touching, the conductivity remains more or less constant, which makes sense because that is the expectation. So after annealing we see that, irrespective of what percentage deformation you gave, annealing got rid of all the effects of it, and the conductivity is more or less constant. And of course 60 percent is giving you something that is not meaningful, like we mentioned. Here this is clearer: the data is somewhere here, and one should consider this to be a straight line. There is a little bit of deviation here, so it is a straight line with a small deviation.

So why is the 60 percent deformed sample showing a different value? Like I mentioned, it is because of the skin thickness. So let us calculate that delta. It is 664 divided by the square root of 60,000 (that is, 60 kilohertz) times mu_r times the conductivity, which is about 100, and the delta value is about 0.3 millimeters. It so happened that in the 60 percent deformed case the thickness of the sample is about 2 or 2 and a half times this 0.3 millimeters, and that is the reason why we got this outlier.

So we now understand why this happens. Of course the conductivity is not really 100. We know that because if you have a deformed sample, say 60 percent deformed, maybe its conductivity will go down by about 3 percentage points IACS, and very well annealed samples can also give you above 100; we had a mean of about 101, but sometimes you can also get 102, et cetera. So this leads us to the next question.
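The delta calculation just described can be sketched in R as a small function. The function name `skin_thickness_mm` and its argument names are my own, not from the session; the constant 664, the 60 kilohertz probe frequency and the relative permeability 0.999994 are the values quoted in the lecture.

```r
# Approximate skin thickness in millimeters, from the formula quoted in the
# lecture: delta ~ 664 / sqrt(f * mu_r * sigma), with f in hertz and sigma
# in percentage IACS. Function and argument names are hypothetical.
skin_thickness_mm <- function(sigma_iacs, f_hz = 60e3, mu_r = 0.999994) {
  664 / sqrt(f_hz * mu_r * sigma_iacs)
}

# Conductivity taken as 100 percent IACS, as in the session.
delta_100 <- skin_thickness_mm(100)
print(round(delta_100, 2))

# The function is vectorised, so nearby conductivities can be checked at once.
delta_range <- skin_thickness_mm(c(100, 101, 102))
print(round(delta_range, 3))
```

For conductivities between 100 and 102 percent IACS the result stays close to 0.27 millimeters, which is why it is quoted as about 0.3 millimeters.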
We found an outlier and we found an explanation for it. But it so happens that one of the quantities that goes into that formula is not known exactly; it is not just a number, it has some uncertainty or error. What is the effect of that error on the quantity that we are calculating? We just used 100, but we could have used 102, or we could have used 98. And if somebody has some slightly impure copper, with a conductivity of, say, 40 percent IACS for that copper alloy, then what is the skin depth? So if there is a large variation in the conductivity, what happens to the skin depth? Or if there is some small error or uncertainty in the measurement, what happens to it? This is the next question that we want to address, and we will do that in the next session. Thank you.