Welcome to Dealing with Materials Data. In this course we look at the collection, analysis and interpretation of materials data, and specifically at how to use R to do these things. This is the first module, the introduction to R, and we are coming almost to the end of it. So I want to do a few more data sets and case studies using R: import the data, manipulate it, plot it, and save the figures and the data. That is what we are going to do in this session, which is a session on case studies in importing and plotting data.

Once you import the data, you can manipulate it, both the numerical data and the strings. I am going to show you an example of how to do that and how to save the modified data as a CSV file. Like I mentioned, it is also possible to use write to write to a file, but that is something we are not going to look at, at least at the moment; if it is needed later, we will do it. So the first case study is to take a data set, do some manipulation on it, and save the modified data.

The second case study considers another data set. It is the same as the elements data set, except that it now also has the Young's modulus data. Once we have the Young's modulus, we want to know how to plot the specific modulus, that is, modulus divided by density, against the melting temperature. Previously we just took two columns and plotted them; now we want to do some algebraic manipulation on the columns and plot the modified data. And you might be wondering why such plots are needed.
These plots are very useful. They are known as property charts or Ashby plots, and they are used in learning about properties as well as in the selection of materials for engineering applications. Some of you might have seen such plots in courses on properties and selection, and now you will see how to generate them for yourself. We are going to make the melting point versus specific modulus plot, and such a plot will also show you the extreme points. We are going to look at how the points are distributed, and specifically we are going to learn that iron is an extreme point. Then we are going to manipulate the plot so that we leave out iron and look at what happens.

In the process there is a programming lesson that I want to emphasize at this point. Whenever R returns an error message, you should read it and take the corresponding action, and if it gives a warning, you have to learn what the warning is about. This is what conversing with the machine means: you have to read the messages, understand them, and respond to them. It is exactly like knowing a language. If you say you know a specific language, it means that when people ask you or tell you something, you are able to understand and respond. In a similar fashion, when you say you know the R programming language, what is expected is that you are able to converse with the machine: if it gives messages, be they errors or warnings, you are able to read, understand and respond.

I know that the tendency for beginning programmers is to look everything up on the internet and copy and paste the solutions. That is sometimes useful, and it can help you pick things up fast. But if you want to be a good programmer, at some point it is better to read the error messages and try to decode them yourself before taking help from the net.
Of course the net is a very good resource and you should utilize it, but before doing that it is sometimes useful to try things out on your own, because that way, even when you get the solution from the net, you will have a better understanding. So it is important to read the error and warning messages. You will see that when we do some of the manipulations R does give some messages, and you should pay attention to them.

The third case study is for specific strength. We will see that for things like Ashby plots a logarithmic scale is sometimes needed, and you will also see why property charts or Ashby maps are useful. You are going to notice that the materials are grouped according to the kind of material they are, and you will also see some grouping according to geometry: for example, all fibers and whiskers, irrespective of what type of material they are, fall together in one group. It is also very useful to notice the outliers, and in this case the outlier happens to be a natural composite, namely wood.

So this is what we want to do in this session. We are going to take some data in CSV format, import it, use ggplot for plotting, and do a little bit of manipulation. We are going to pay more attention to the data and try to learn about the data by doing this. This is going to be almost our last session on the introduction to R, so let us go do that.

The first case study takes the elements data that we have already used, so let me open R and take these commands. We are going to read the CSV file from data, then order the elements according to melting point, and then write this data into a new CSV file called elements melting point sorted.
So that is what we want to do. If we look at x, you can see that it is sorted according to melting point. Previously aluminum came first, and you can still see the original numbering: aluminum was 1, nickel was 2, gold was 3, silver was 4, and so on. That ordering has changed because the rows now go in increasing order of melting point, starting at 321, rising through 660, and so on. We have also written a new data file, the elements Tm sorted file, so let us go there and look at it. In that file the elements are now listed according to melting point, and this data is available to us.

I also have a name sorted file here, so let me remove it. Now there is no name sorted file in this directory, so let us sort according to names and write that file. Again we are going to read the data and write the data. The difference from the previous case is that there we said the elements have to be ordered according to melting point, and here we are going to say order according to the element names.

What does it mean to order according to names? As you will see, it orders alphabetically, and the result is then written to a CSV file. If we now look at x, you can see aluminum, beryllium, cadmium, chromium, copper and so on, so this is alphabetically ordered, and this is what gets written out. The element names sorted CSV file is now available to us, and as you can see it is sorted in alphabetical order.
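The sorting and saving steps described above can be sketched as follows. This is a minimal sketch, not the session's actual commands: the data frame is a small stand-in for the elements file, the column names `element` and `melting_point` are assumptions, and the output goes to a temporary directory so the example is self-contained.

```r
# Small stand-in for the elements data (column names are assumptions;
# melting points are rounded values in degrees Celsius).
elements <- data.frame(
  element = c("Aluminium", "Nickel", "Gold", "Silver", "Cadmium"),
  melting_point = c(660, 1455, 1064, 962, 321),
  stringsAsFactors = FALSE
)

# order() returns the row indices that put a column in ascending order;
# indexing the data frame with them rearranges whole rows.
by_tm <- elements[order(elements$melting_point), ]

# A character column is ordered alphabetically.
by_name <- elements[order(elements$element), ]

# decreasing = TRUE reverses the direction (the default is FALSE).
by_tm_desc <- elements[order(elements$melting_point, decreasing = TRUE), ]

# write.csv() is the counterpart of read.csv().
out_file <- file.path(tempdir(), "elements_tm_sorted.csv")
write.csv(by_tm, out_file, row.names = FALSE)

# Read the file back to confirm what was written.
check <- read.csv(out_file)
print(by_tm)
```

Printing `by_tm` shows the original row numbers in the left margin, which is how the change in ordering can be seen at a glance.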
So in other words, it is possible to take data, import it and manipulate it, and you can manipulate numerical data as well as strings. You can order alphabetically, or order numbers in ascending or descending order, and then you can write the result out. write.csv is the equivalent of read.csv for writing CSV files. The help for order will also tell you what to do if you want descending rather than ascending order. The default is decreasing equal to FALSE, which means the order is increasing, so the decreasing argument decides whether the sort order is increasing or decreasing. You can read the help file to get more information, but this tells you how to manipulate data, order it, write it, and so on.

Now let us look at the second data file. Like I said, there is an elements 2.csv data file which has the extra information: in addition to the earlier data it also has the Young's modulus. We want to read this data and work with it. So we are going to read the data from elements 2.csv into elements, and we are going to order it according to the specific modulus, because the Young's modulus column divided by the density column is the specific modulus. The ordering is just an exercise; it is not essential. We are also going to store this quantity as the specific modulus.

Then we are going to use ggplot2 to plot the specific stiffness, that is, the specific modulus, versus melting point, and we are going to save it as a PDF file. The dev.off at the end means that R executes all the plotting commands and then closes the PDF file. The plotting commands are as usual. ggplot is told to take the data x; the aesthetics are that x is specific modulus, y is melting point, and the color is according to crystal structure; the geometry is a point, so it is a scatter plot; and the points are labeled according to element, with a justification for the label. We have done this several times.

When you do this, there is no plot that you can see on the screen, because the figure was written to the PDF file. As you can see in that figure, in the plot of specific modulus versus melting point most of the points fall in one column. There is nothing beyond 0.05 except for iron, which is really an outlier: its specific modulus, that is, modulus divided by density, is very high, somewhere beyond 0.15. Different elements have different densities, and if you account for those differences by normalizing the modulus by the density, the ratio for iron is quite different from that of all the other elements. And at a given specific modulus the melting temperature differs from element to element: cadmium, beryllium, titanium and tungsten all fall at almost similar values of specific modulus, but their melting points are different.

So that is what we see, and there was no plot on the screen because we plotted to the file. Of course, if you issue the plotting command again after dev.off, you will get the plot on the screen, again showing that iron is really an outlier. So let us now go back and do some more manipulation. We want to take a closer look at the other values, and because iron is an outlier, we really do not want the axis to extend beyond this point.
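The column arithmetic and the plot-to-PDF sequence described above might look like the following. This is a sketch under stated assumptions: the data frame stands in for the session's file, the column names are made up for illustration, and the values are approximate handbook numbers in GPa, g/cm^3 and degrees Celsius, so the specific modulus here is on a different numerical scale from the one quoted in the session.

```r
library(ggplot2)

# Stand-in data; column names are assumptions and values are approximate.
elements <- data.frame(
  element = c("Aluminium", "Copper", "Tungsten", "Cadmium", "Gold"),
  crystal_structure = c("FCC", "FCC", "BCC", "HCP", "FCC"),
  density = c(2.70, 8.96, 19.25, 8.65, 19.30),   # g/cm^3
  youngs_modulus = c(70, 120, 411, 50, 79),      # GPa
  melting_point = c(660, 1085, 3422, 321, 1064)  # degrees Celsius
)

# Arithmetic on columns is element-wise, so this gives one value per row.
elements$specific_modulus <- elements$youngs_modulus / elements$density

# Ordering is optional here; it only changes the row order, not the plot.
x <- elements[order(elements$specific_modulus), ]

p <- ggplot(x, aes(x = specific_modulus, y = melting_point,
                   colour = crystal_structure)) +
  geom_point() +
  geom_text(aes(label = element), hjust = 0, vjust = 0)

# Open a PDF device, draw into it, then close it so the file is complete.
out_pdf <- file.path(tempdir(), "specific_modulus.pdf")
pdf(out_pdf)
print(p)
dev.off()
```

Calling `print(p)` again after `dev.off()` in an interactive session draws the same plot on the screen device.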
So we are going to change the range of the x axis when we make the plot. Let us go back and do it. The reading and ordering of the data have been done already, so we do not need those commands again; let us start with ggplot. What we want is the same plot of specific modulus against melting point, colored according to crystal structure, but now we restrict the x axis to 0.05 and do not go beyond it. The geometry is a point and the labeling is according to element. So everything is the same as earlier, except that we have introduced a new limit for the x range.

When you do that, R gives you a warning message. It says that one row containing missing values was removed, and it says so twice, once for the points and once for the labels. That is because the data had the iron point and the iron label, and R is telling you that when you rescaled the x axis you left out one data point. This is very useful. If you do not pay attention and just give some x or y limit, and in the process you leave out some data, it is better to know that you have left it out, and that is what R is telling you with the warning. In this case we wanted that point to be left out, so it is fine, but in other cases you should not leave out data points inadvertently. So it is always a good idea to read the warning messages.

Now, because we have expanded the x axis in the relevant regime, you can see that the data has spread out. Previously we thought that all these points were at the same level, but now you can see that cadmium and tungsten are actually slightly apart, and that tungsten and gold are probably at the same level. So this gives you a better picture and a closer look at the data, and ggplot allows you to do that just by adding one more layer.
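A small sketch of the axis restriction and its warning follows. The data are hypothetical placeholders, chosen only so that one point ("E") lies far beyond the others; the 0.05 cutoff is the one mentioned above. Counting the rows beyond the limit before plotting is an added check, not something from the session, and the exact wording of the warning depends on the ggplot2 version.

```r
library(ggplot2)

# Hypothetical values: point "E" plays the role of the outlier.
x <- data.frame(
  element = c("A", "B", "C", "D", "E"),
  specific_modulus = c(0.010, 0.020, 0.030, 0.045, 0.160),
  melting_point = c(300, 650, 1100, 1500, 1540)
)

x_max <- 0.05

# Know in advance how many rows the new limit will leave out.
n_dropped <- sum(x$specific_modulus > x_max)

p <- ggplot(x, aes(x = specific_modulus, y = melting_point)) +
  geom_point() +
  geom_text(aes(label = element), hjust = 0, vjust = 0) +
  xlim(0, x_max)

# Drawing the plot is what raises the warnings, so draw to a null device
# and collect the warning text instead of letting it scroll past.
msgs <- character(0)
withCallingHandlers(
  {
    pdf(NULL)
    print(p)
    invisible(dev.off())
  },
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(n_dropped)
print(msgs)
```

If `n_dropped` agrees with what the warnings report, the omission was intentional; if it does not, some point has been left out by accident.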
So that is what we have done, and this is what is shown here also. Now let us take another data set. It is called specific strength, and here is the data. I generated this data file from the data table given on one of the Wikipedia pages about specific strength. It has materials like concrete, rubber, copper, polypropylene and so on. As you can see, they are of different material types: this one is a composite, this one is an elastomer, this one is a metallic material, this one is a polymeric material. Some of them are fibers: spider silk, for example, is a natural fiber, and silicon carbide fiber is a synthetic fiber, and there are glass materials and so on. I have even put iron whiskers with the synthetic fibers.

Then there is the density of these materials and the specific strength, that is, strength divided by density. As you know, strength is the resistance that a material offers to permanent or plastic deformation. So we want to know how these materials perform in terms of specific strength. As you can see, the numbers change a lot: some are in the thousands, some in the hundreds, some in the tens, and some may even be in the ones. So there is a large range. By the way, when the numbers for some of these materials were given as a range, I have just taken the average as the value, to make life easy for us when plotting.

In the density column also, the numbers change from point something to some larger number like 8 point something, so there are about two orders of magnitude there. For the specific strength it is more, about three orders of magnitude. Because the numbers vary so much, it is difficult to get them all on the same plot unless you do some manipulation, which is to plot them on a logarithmic scale.
So that is what we are going to do. Let us start by loading this data. We load the specific strength .csv file, order it, and store it as x. If you now look at x, it has all the data: the material, the type, the density and the specific strength.

Now we want to use ggplot, and we are also going to use the scales library, because we want to change the x and y axes to a logarithmic scale. So let us plot and see what happens. As usual, you have to tell ggplot what the data is and what the aesthetics are: here it is density versus specific strength. We are going to color according to material type, with metallic materials in one color, glass in another, concrete in another, and so on. For the coordinate transformation, both x and y are taken as logarithmic axes, so we are plotting on a logarithmic scale. The geometry is of course points, and the labeling is done according to the name of the material, which is taken from the data.

So let us do this. Now you can see that it is obviously a logarithmic scale, going 1, 2 and up to 10. The density does not change as much, from about 0.7 upward, whereas the specific strength varies over a much larger range, and you can see that on the plot. The coloring is done according to material type, composite, elastomeric, glass, metallic and so on, so we can zoom in and see.
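One way to get such a log-log property chart is sketched below. It uses `scale_x_log10()` and `scale_y_log10()` from ggplot2 rather than the coordinate transformation described in the session, so treat it as a swapped-in alternative. The data frame is a stand-in: the column names are assumptions and the numbers are rounded, illustrative values, not the contents of the session's file.

```r
library(ggplot2)

# Illustrative stand-in data (column names and values are assumptions).
x <- data.frame(
  material = c("Concrete", "Rubber", "Copper", "Polypropylene",
               "Balsa", "Spider silk", "Glass fibre"),
  type = c("Composite", "Elastomer", "Metal", "Polymer",
           "Wood", "Fibre", "Fibre"),
  density = c(2.3, 0.92, 8.9, 0.90, 0.14, 1.3, 2.6),
  specific_strength = c(5, 16, 25, 60, 500, 1000, 1300)
)

# How many orders of magnitude does the y variable span?
decades <- log10(max(x$specific_strength) / min(x$specific_strength))

# Colour by material type, label by name, and put both axes on a log scale.
p <- ggplot(x, aes(x = density, y = specific_strength, colour = type)) +
  geom_point() +
  geom_text(aes(label = material), hjust = 0, vjust = 0) +
  scale_x_log10() +
  scale_y_log10()

print(decades)
```

Computing the span in decades first is a quick way to decide whether a linear axis would squash most of the points into one corner.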
So you can see that there are different colors, and the moment you put the plot in colors you can see that all the polymers are clustering together, all the metallic materials are clustering together, and all the fibers are clustering together. The fibers are of different types, glass fiber, basalt fiber, silicon carbide, spider silk and iron whisker, so they are different kinds of material, but in terms of their structure, geometry or morphology they are alike, and you can see that they also cluster together. So there is an effect of the material type and an effect of the form the material is in, and both come out very nicely in these property charts.

The other thing that comes out very nicely is that wood is really an outlier. Its density is very low, but its specific strength is comparable to, or even more than, that of metals. In terms of density there is a huge difference between the two, but wood does have a very good specific strength. This kind of information is very useful, and it is nice to have if you are trying to choose a material for a particular engineering problem. That is what this plot does.

We are going to do some more plots. That was a log plot; now I am going to make it logarithm to the base 10, which is possible. Everything is the same except that the transformation is to logarithm to the base 10, and you can see that the axis now reads 100, 1000 and so on, so it is on the logarithm to the base 10.
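A short check of why the base of the logarithm changes only the axis labels and not the arrangement of the points: the natural log and the base-10 log of any positive number differ by the same constant factor. The values below are illustrative placeholders.

```r
# Illustrative specific-strength values spanning a few orders of magnitude.
specific_strength <- c(5, 16, 25, 60, 500, 1000, 1300)

natural <- log(specific_strength)    # natural logarithm
base10  <- log10(specific_strength)  # logarithm to the base 10

# The ratio is the same for every value: it equals log(10).
ratio <- natural / base10
print(ratio)
print(log(10))

# Relative spacing between points is therefore identical in both cases.
spacing_natural <- diff(natural) / diff(range(natural))
spacing_base10  <- diff(base10) / diff(range(base10))
print(all.equal(spacing_natural, spacing_base10))
```

The practical difference is readability: on a base-10 axis the ticks fall at powers of ten, which are easier to read off than powers of e.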
So we can do the next exercise, which is along the same lines. Once we have this plot, let us change the x limit. Why do we want to do that? As we did earlier, we want to remove some of the outliers and take a closer look at the rest of the data. Because of balsa we have this large range, and there is nothing much happening in that part of the plot, so let us remove it and look at the remaining data more closely.

Again R tells you that you are missing a point, which we already know. It also tells you that a scale for x was already present, which of course it was, and that it is adding another scale for x according to what we gave, so we do not have to worry about that. Now you can see that things are spread out a little, and you can again see the greens clustering together, the blues clustering together, and so on. So it is really nice to have this kind of property chart.

So that is the figure we have generated, and that brings us to the end of this session. What we have learned is that you can read data, manipulate it before plotting, plot the manipulated data, and do an analysis. This pretty much brings us to the end of the introduction to R. We will have a summary session and complete this module. Thank you.