Welcome. This is a course on dealing with materials data. We are going to discuss all aspects of dealing with materials data: collection of data, analysis of data and interpretation of data, though not necessarily in that order. This course has two parts. One is the statistical concepts associated with data analysis and interpretation; the other is the hands-on part, which we will do using the R programming language. For this course I will be teaching the R part. We will take data from materials science and engineering as much as possible and deal with it using the R programming language, and the concepts that are needed to deal with the data will be taught by the other instructor, Professor Hina Gokhale.
So, let us start with the first module. This module is an introduction to R: we want to introduce the R programming language through a hands-on tutorial, and for examples we will use materials data as far as possible.
There is plenty of reading material for R available. The first is the R project website, www.r-project.org. This website has plenty of information about the R project itself, and the R manuals are available at cran.r-project.org under the manuals.html page. There are many references; some of them are very exhaustive and some are meant for developers, but what I have in mind specifically is "An Introduction to R", which is available at cran.r-project.org. There are also many free books of excellent quality. If you go to the manuals.html page, for example, there is a link to user-generated manuals, where many people have written short introductions or guides on how to get specific things done using R. Some of them are of excellent quality and they are also freely available.
For example, there is a book by Rafael Irizarry called Introduction to Data Science, and you can download this book for free. If you feel like it, you can also pay the author for taking the pains to write this book, but if it is not in your budget at this moment, you can freely download the book and use it. So, there is plenty of reading material available online, and some of it is from authoritative sources like the R project website or the R manuals. I strongly recommend that you use some of this reading material in addition to the tutorial introduction that you are going to get from these modules.
There are also plenty of other R resources. The R documentation is one of the best resources, and it is always available: in the R console you can just ask for help and get the required help, and I am going to show you how to do that in this module. There are also lots of online forums and discussion boards, such as Stack Overflow, where lots of details are discussed. Of course, these forums are useful once you have some familiarity with R, so that you can understand the kind of questions being asked and the solutions that are given. If you have that familiarity, they are very useful: they can answer specific questions and problems that you have, even if they are not part of this course or of any course. And of course, there are lots of books on data science and R. We are not going to use any of them in this course, but those of you who are more data-science minded and are interested in applying these concepts to materials science and engineering should take a look at some of these books also.
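The console help mentioned above can be reached in a few ways. The lines below are a minimal sketch using functions that ship with base R; the choice of print as the topic and of "standard deviation" as the search phrase is only for illustration.

```r
# Open the help page for a function whose name you know.
help("print")

# The question mark is a shorthand for help().
?print

# Search the installed documentation when you do not know the function name.
help.search("standard deviation")
```

The first two commands show the same page; the third lists help pages whose titles or keywords match the phrase.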
So, having said that there are plenty of resources and reading material available, what is it that we want to do? In this module, I want to give a hands-on tutorial introduction to R, and I have students of materials science and engineering in mind. This is because, if you look at the R manuals and the user-generated manuals, many of them are prepared for specific fields. For example, there are people who have looked at the social sciences, or geography, or biology, and at how to solve problems in these areas in terms of analyzing the data, interpreting the data, presenting the data, describing the data graphically and so on. Tutorials are written for those fields, but there is nothing that uses materials science and engineering problems as the example case.
So, that is the first aim of this module: to give a hands-on tutorial introduction. It is a tutorial introduction because it is by no means a complete introduction to R. Like I said, there is lots of reading material and other material available, and you should take the help of that material. This module will just give you whatever is required to solve the problems that we have at this point and show how to do it using R. It will be hands-on: as much as possible, when I am teaching the course, I will open an R console or an RStudio window and type the commands, so that you can have your laptop or computer with R open and work along with these tutorials. So, you will have hands-on experience with R.
The emphasis is on materials data. There are very few courses that use materials data sets as examples or discuss the statistical concepts with examples from materials science and engineering.
So, we are going to emphasize this part, and it also adds a lot of value: collecting these data, putting them in one place and using them as examples will help you analyze your own data when you generate it, because you will be familiar with similar data and with the kind of analysis that you can do on it. So, the emphasis is (a) on a hands-on tutorial introduction and (b) on using materials data.
Like I said, this module is by no means complete. There are several online courses, including NPTEL MOOC courses. There are two or three of them from IIT Bombay itself, from IIT Kanpur and from IIT Madras. So, it will be useful for those of you interested in learning more about R to go to some of these courses. In addition, there are also courses available at Coursera and edX, and I strongly recommend that, as you are learning R in this course in a hands-on tutorial way, you also try some of these other courses and bring the expertise and knowledge of R that you gain from them to your own data analysis for materials science and engineering.
So, what is R and why are we using R? R is a programming language specifically meant for statistical analysis and data visualization, and this is the part I want to emphasize. You will find that programming languages like Python, for example, can also do most of what we are doing, but our emphasis is on statistical analysis, interpretation and visualization of data, and R is a programming language specifically meant for that. R is an interpreted language and not a compiled one. By that we mean that you can open an R console and give commands, and you will immediately get the computation done and the answer given to you.
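As a small illustration of what "interpreted" means here, each line below is evaluated as soon as it is entered at the console, and the answer (shown after the hash marks) comes back at once. This is only a sketch; the hardness readings are made-up numbers, not data from the course.

```r
# Each line is evaluated as soon as it is entered; there is no compile step.
2 + 3
## [1] 5

sqrt(16)
## [1] 4

# Hypothetical hardness readings (made-up numbers, for illustration only).
hardness <- c(200, 210, 190)
mean(hardness)
## [1] 200
```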
If you are writing a program in the C programming language, for example, you have to compile it, get an executable, and run that executable to get the answer that you are seeking. That is not needed here. In this sense R is more like GNU Octave, or MATLAB if you have used it, or Scilab. There are several such languages available; they are easy to work with and they are also very powerful, as you will see.
Another advantage of R is that it is available for all the major operating systems, specifically Linux, Mac and Windows. In this course I am going to use the Linux operating system, but if you have a Mac or Windows you will easily be able to use R on those operating systems also.
There is a nice integrated development environment (IDE) called RStudio, and I am going to be using RStudio also in this course. So, I am going to use both the R console and RStudio, to give you a flavor of both and to tell you how to deal with both. Of the two, RStudio is more powerful and more complete. It has several panes, which I will show you. R is just the console, so it is minimalist, but it is equally powerful: it can do the things that RStudio can do. For example, I have prepared these slides using RStudio. RStudio can work with things like LaTeX and produce presentations, and I am also preparing the documentation for what I am teaching using RStudio. So, you will see the documents also prepared using TeX, and all of this is done using RStudio. It is really an integrated development environment: you can use it as a text editor, you can use it to prepare documents, you can use it to prepare scripts, and you can use it to prepare the script and the document together and then decide to use only the script or generate only the document. All sorts of possibilities are there, and you will see some examples of that.
We are not going to spend too much time on how to do that but, like I said, there is enough material for you to go and explore and learn once you are familiar with this. Specifically, in this course we are going to use R and RStudio on the Linux operating system; I am using Ubuntu 18.04, the long-term support version.
There is one more resource which is very useful. It is called Spoken Tutorials, and you can look online for the Spoken Tutorials for R. This is also maintained at IIT Bombay, by Professor Kannan Moudgalya and his group. There you will find, for example, how to install R and how to use it. These are spoken tutorials in the sense that you will see the instructions being given to you. I strongly recommend that you also utilize this resource: look it up and learn more about R, RStudio and other aspects such as installation, which is not part of what we are going to do. I am going to assume that you have R and RStudio installed on your operating system and that you are a little bit familiar with how to work with them, and so I am just going to start.
So, what is the first thing we want to do? Here is a quote from Alice in Wonderland: The White Rabbit put on his spectacles. "Where shall I begin, please your Majesty?" he asked. "Begin at the beginning," the King said gravely, "and go on till you come to the end: then stop." So, what does that mean? For the first session, which I want to call the A to Z of an R session, we want to create a directory, invoke R from that directory, check the version of R, write the first program, which is the Hello World program, and then quit R. So, you go to a directory, open R, do something, close and come out. That is the first session that we want to do. It has a beginning, how to invoke R; it has something in the middle, how to write your first program; and it also tells you how to end the R session.
So, this is the very first introductory session I want to have. Let us do that. Like I said, I have also prepared the notes for these sessions. They are written like this and, like I said, they are prepared using R itself, using RStudio. They will be available to you on the MOOC website. But now let us start doing this.
So, what is the first thing we want to do? We want to prepare a directory. I want to make a directory called dealing with materials data. The command in Linux to make a directory is mkdir. Of course, if you are using a Windows machine, or even on Linux if you are in the graphical desktop, you can right click, create a new folder and name it DWMD, and so on. So, let us go to this directory, and we want to invoke R from it. Invoking R, or getting an R console, means simply typing R and pressing enter. When you enter, you see the R version that I am using, 3.6.1. I strongly recommend that you also use this version, so that what I am doing and what you see are compatible. This version, 3.6.1, is called "Action of the Toes". So, it tells you something about R, and we are ready.
We want to write the first program, and the first program is of course to print hello world. So, let us do that. As you can see, it is rather straightforward to print. It is almost like English: print hello world means print hello world. So, we say print, then there is a parenthesis, and what you want to print is given within quote marks. Let us also put a bang, an exclamation mark, so it reads print("Hello World!"). Let us enter, and R gives the answer immediately: Hello World! You said print hello world, so it has printed hello world. And this is how it will appear in the documentation also. In the notes, for example, the command that I want to give is shown first, exactly as it is typed.
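The middle of this session can be sketched as follows, in the style the notes use: the command as typed, followed by R's answer after hash marks. The two version lines use objects that are part of base R; their output depends on your installation, so no answer is shown for them here.

```r
# Check which version of R is running (output depends on your installation).
R.version.string
R.version$nickname

# The first program.
print("Hello World!")
## [1] "Hello World!"
```

The directory itself is created outside R, for example with mkdir in a Linux terminal, before R is started from inside it.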
The command is shown as typed, and the answer that R returns is given with this hash mark in these notes. That is how I have described everything in these notes: the commands will be shown, and sometimes the answers that R generates will also be shown, so you can do it for yourself and confirm that you are getting the same answer. And the marker that you see, the greater-than symbol (>), is the prompt. It is never typed; it is already there.
Now that we are done, we want to quit, and of course R is very helpful: it tells you how to quit. Type q, that is for quit, and it is like a function, so there is a parenthesis but there is no input: q(). You press enter, and typically R asks whether you want to save the workspace image. You can say yes, you can say no, and you can say cancel. If you say cancel, then R does not quit. If you say yes, then it will save the workspace image and quit. If you say no, it will not save. If it saves the workspace image, R by default loads it the next time you open R from that directory, or you can load it yourself manually. And if you already have a saved image, you can also remove that saved information. That is done using the command unlink, with the file name within quote marks: unlink(".RData"), that is, dot, capital R, capital D, a, t, a. You have to be very careful with capital and small letters, because the commands are sensitive to capitalization. You need to use exactly capital R and capital D, and the other letters are small.
So, let us say no; I am going to quit, and it quits R. This is the session on the A to Z of an R session. But suppose you had said yes and the workspace was stored. When I open R the next time by typing R, it does not show anything from before, because we did not save. Now, let us do this again. Let us say print hello world. So, I have printed hello world, I say quit, and it asks whether to save the workspace image. I say yes.
Now, if I type R, it remembers my previous commands, like this. If you do not want this to happen, you say unlink(".RData"). Then R will not have the information from the previous session. So, let us do this, then quit and say no. So, this is the first session: we opened the session, we wrote the hello world program and we quit the R session. Thank you.
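The save-and-remove steps from this session can be sketched as below. This is a minimal sketch under some assumptions: it uses save.image() to stand in for answering yes to q(), it works in a temporary directory so that no real workspace file is touched, and the variable names (old_dir, greeting, saved, removed) are made up for illustration. In an interactive session, answering yes typically also writes the command history to a separate .Rhistory file, which this sketch does not touch.

```r
# Work in a temporary directory so that no real workspace file is touched.
old_dir <- setwd(tempdir())

greeting <- "Hello World!"    # an object to store in the workspace
save.image()                  # writes .RData, as answering "y" to q() would
saved <- file.exists(".RData")

unlink(".RData")              # delete the saved workspace image
removed <- !file.exists(".RData")

setwd(old_dir)                # go back to where we started

# To end a session without being asked about saving:
# q(save = "no")
```

Here saved and removed record whether the image file existed after saving and was gone after unlink(). The q() call is left as a comment so that running the sketch does not close your session.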