Welcome to Dealing with Materials Data. In this course we are going to look at the collection, analysis and interpretation of materials data. We are in module 1, which is an introduction to R, and we are learning how to get data into R and manipulate it, specifically how to plot it. In this process I now want to talk about the plotting libraries that are available. R can do plotting on its own, as we have seen: we can use the plot command. But sometimes it is useful to use some of the other libraries that are available in R. They are very powerful and they give you a lot of control over how to go about plotting. Specifically, we are going to use the ggplot2 library, so in this session we are going to learn a little bit about ggplot2.

There are many libraries for plotting, such as grid, lattice and so on, and some of the R textbooks do describe these libraries. In ggplot2, the gg stands for grammar of graphics. A grammar of graphics identifies the components of a plot and the rules for putting these components together. It is exactly like the grammar of a spoken language. There we say that these are the components, like subject, verb and object, and there is a grammatical rule for combining them into a meaningful sentence. Once you learn the rule, a great many sentences can be built in this fashion. In the same way, the grammar of graphics identifies the different components for making plots and finds a way of putting them together, so that every graph or plot you see can be constructed from these two things, namely the individual components and the rules for putting them together.
So that is what the ggplot2 library is based on: the philosophy of the grammar of graphics. The components of a plot are, for example, the data. We have been working with elements, which is a data frame, so that is the data. Then there is the geometry of the plot. We had density versus melting point, and wherever there was a specific density against a melting point we were putting a point, so the geometry is a scatter plot, a point plot. And then there is the aesthetics, that is the axes, the labeling, the color, naming the plot and so on. There are many things that can be done, and the philosophy in ggplot2 is that we build the plot layer by layer. We take the data, we put in the geometry so that the plot itself is in, then we name the axes, then we label the points, then we give colors to them and so on and so forth. Each layer that you add is added using a plus symbol. That is how ggplot2 works.

Of course there is lots of help available for using ggplot2. There is also a book called Grammar of Graphics, which some of you might be interested in. For using ggplot2 I have referred to the book by Irizarry, which is freely available and has a nice chapter on ggplot2, so I strongly recommend that you take a look at it. There is also a cheat sheet that is available online, and I want to show you that cheat sheet here. It is called Data Visualization with ggplot2. As you can see, the basic idea is that you have data, you have the geometry, like what is x and what is y, and you have a coordinate system, and the plot is putting them together. There are many different things that you can do, and this cheat sheet gives all those commands. Of course there is one more way: you can take the data, do some statistical analysis, and then take the geometry and coordinate system and plot.
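As a rough sketch of the "data, then statistic, then geometry" route, the R code below draws a histogram and a cumulative distribution. The small `elements` data frame and its column names are assumptions made for illustration (approximate handbook densities), not the 15-row table used in the lecture.

```r
library(ggplot2)

# Illustrative stand-in data: approximate densities in kg/m^3.
# This is NOT the lecture's data set; names and values are assumptions.
elements <- data.frame(
  Name    = c("Mg", "Al", "Zn", "Cd", "Be", "Fe", "Cu", "Au", "W"),
  Density = c(1740, 2700, 7140, 8650, 1850, 7870, 8960, 19300, 19250)
)

# geom_histogram() first bins the densities (the statistical step)
# and then draws the counts as bars.
p_hist <- ggplot(elements, aes(x = Density)) +
  geom_histogram(bins = 5)

# stat_ecdf() computes the empirical cumulative distribution
# before drawing it as a step line.
p_ecdf <- ggplot(elements, aes(x = Density)) +
  stat_ecdf()

print(p_hist)
print(p_ecdf)
```

In both cases the raw data is not plotted directly; ggplot2 computes a summary and then applies the geometry to it.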
A statistical step like this is needed, for example, if you want to do a histogram plot or a cumulative distribution. In those cases we cannot just take the data and plot it; we have to do some analysis first, and ggplot can do that also. So it is very useful to have this cheat sheet downloaded and stored, and the chapter by Irizarry also gives a link to the cheat sheet online. This is what is helpful if you want to get some help with ggplot2.

So this is an example of how ggplot works. Of course we have to load the library ggplot2. Then we say, let us do a plot, and this is the data: elements is the data. And this is the aesthetics, that is, which is the x-axis and which is the y-axis. The third column of elements is the x-axis, the fourth column is the y-axis, and the coloring should be done according to the second column. That is what we have said, and then we have added a layer. What is the layer? The layer says that at every x, y you have to put a point. The geometry is basically a scatter plot, a point plot. Like this you can go on adding more layers for the range, for labeling the points, for labeling the x and y axes and so on and so forth. We are going to learn about all that using our example.

So let us do that. As we did earlier, let us open R and first get the data in place. So I copy the data, as we did earlier. It is a data frame with 4 columns; we have named the columns and we have given the data for those columns. Now the data is loaded, and you can see there are 15 observations and 4 variables. It is always a good idea to check that everything is in place: it is a data frame, 15 observations, 4 variables, etc. Now let us do the plotting. To do that we are going to use ggplot, and this first command is the one we just saw. We load the library ggplot2 and then we say: take the data elements, that is the data frame; in the aesthetics, x is the third column, that is the density, and y is the fourth column, that is the melting point.
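A minimal sketch of this first command could look like the following. The lecture's actual 15-row table is not reproduced in the transcript, so the data frame here is a hypothetical 9-row stand-in with approximate handbook values, and the column names `Name`, `Structure`, `Density` and `MeltingPoint` are assumptions. The lecture refers to the columns by position (second, third, fourth); the sketch uses names instead.

```r
library(ggplot2)

# Hypothetical stand-in for the lecture's 15-row data frame.
# Values are approximate: density in kg/m^3, melting point in K.
elements <- data.frame(
  Name         = c("Mg", "Al", "Zn", "Cd", "Be", "Fe", "Cu", "Au", "W"),
  Structure    = factor(c("HCP", "FCC", "HCP", "HCP", "HCP",
                          "BCC", "FCC", "FCC", "BCC")),
  Density      = c(1740, 2700, 7140, 8650, 1850, 7870, 8960, 19300, 19250),
  MeltingPoint = c(923, 933, 693, 594, 1560, 1811, 1358, 1337, 3695)
)

# Check that everything is in place: observations, variables, types.
str(elements)

# Data and aesthetics first, then one layer: a point at every (x, y).
p <- ggplot(elements,
            aes(x = Density, y = MeltingPoint, color = Structure)) +
  geom_point()

print(p)
```

Because `Structure` is a factor with three levels, ggplot2 picks one color per level and draws the legend on its own.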
These data points have to be colored, and the color is according to the factor, which has 3 levels: BCC, FCC and HCP. That is what the color should be, and the geometry is that it should be a scatter plot. So you can see that it is the density versus the melting point, and the points are colored. You can already see that, unlike the earlier case, the color scheme and the legend are done automatically. It says that red is BCC, green, I think, is FCC, and blue is HCP. It also gives you this legend so you can easily identify what they are. That is what is shown here, and the colors are much clearer here: this is red, green and blue.

Let us do the next one. In this case I want to add one more element: I want to name the points. So I am going to add one more layer, and the layer is this. We are going to put a label; the label is from the data elements, and we take the element name from that data and use that as the label. The layer also has hjust and vjust, which are the horizontal and vertical justification, that is, where these labels should be placed. So you can see magnesium, and I do not know what was here, maybe aluminum or something, then zinc, cadmium, beryllium. Unlike the earlier case, I did not have to say explicitly that the labels should also be color coded according to the points; that is done here automatically. It happens because we have already built in the layer where we said the color should be according to this factor, so ggplot uses that information and applies it consistently. It is nice that way, very intuitive and very clear.

Now we can do one more thing. You can see that some of these labels are cut off, so let us say that I want to change the range. I can add one more layer: the x limit is 1000 to 22000. So I add that layer and run the whole command again.
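The labeled plot with the wider x range might be sketched as below. The data frame is again a hypothetical stand-in with assumed column names, and the `hjust` and `vjust` values are illustrative guesses, since the transcript does not state the values used in the lecture.

```r
library(ggplot2)

# Hypothetical stand-in data (approximate values, assumed column names).
elements <- data.frame(
  Name         = c("Mg", "Al", "Zn", "Cd", "Be", "Fe", "Cu", "Au", "W"),
  Structure    = factor(c("HCP", "FCC", "HCP", "HCP", "HCP",
                          "BCC", "FCC", "FCC", "BCC")),
  Density      = c(1740, 2700, 7140, 8650, 1850, 7870, 8960, 19300, 19250),
  MeltingPoint = c(923, 933, 693, 594, 1560, 1811, 1358, 1337, 3695)
)

p <- ggplot(elements,
            aes(x = Density, y = MeltingPoint, color = Structure)) +
  xlim(1000, 22000) +                      # widen the x range so labels fit
  geom_point() +                           # a point at every (x, y)
  geom_text(aes(label = Name),             # label each point with its name
            hjust = -0.3, vjust = 0.5)     # illustrative placement values

print(p)
```

The text layer inherits the `color` aesthetic from the `ggplot()` call, which is why the labels come out in the same colors as their points without any extra instruction.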
So this is the ggplot command: we say that it is column 3 versus column 4, colored according to the factor, and we have given an x limit, so the axis will go from 1000 to 22000 and we will see these names very clearly. Then there is the point geometry, and then the labeling is done. You can see that the range has now changed, and gold and tungsten are clearly readable. What we are doing is the same as before: we say what the data is and what the aesthetics are, that is, what x and y are and what the color is. Then we add layer by layer: first we say change the x range, then we say put the data points, and then we say label the data points. You can do this layer by layer, and that is the advantage of ggplot.

Of course you can also add axis labels. We have done an x label and a y label earlier, and in one of these sessions we will do that here also. It is just one more layer: you add another plus and say labs, I think that is for labels, then x is this, y is this, title is this and so on and so forth. You can take a look at the ggplot help itself, or the cheat sheet, or Irizarry's book. The help itself gives you some of these links for learning about ggplot2. We are going to use ggplot2 extensively in this course, so I recommend that you install this package, and we will learn how to use it for more plots. Thank you.
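The labels layer mentioned above can be sketched as follows. `labs()` is the ggplot2 function for axis labels and titles; the data frame, column names, units and label strings are assumptions made for this illustration.

```r
library(ggplot2)

# Hypothetical stand-in data (approximate values, assumed column names).
elements <- data.frame(
  Name         = c("Mg", "Al", "Zn", "Cd", "Be", "Fe", "Cu", "Au", "W"),
  Structure    = factor(c("HCP", "FCC", "HCP", "HCP", "HCP",
                          "BCC", "FCC", "FCC", "BCC")),
  Density      = c(1740, 2700, 7140, 8650, 1850, 7870, 8960, 19300, 19250),
  MeltingPoint = c(923, 933, 693, 594, 1560, 1811, 1358, 1337, 3695)
)

p <- ggplot(elements,
            aes(x = Density, y = MeltingPoint, color = Structure)) +
  geom_point() +
  labs(x     = "Density (kg/m^3)",         # x-axis label
       y     = "Melting point (K)",        # y-axis label
       title = "Melting point versus density",
       color = "Crystal structure")        # legend title

print(p)
```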