Welcome to the course on dealing with materials data. It is a course on the collection, analysis, and interpretation of data from materials science and engineering. I am Hina Gokhale.

I am Guru Rajan.

I have a background in statistics, and I worked with metallurgists during my 25-year career at DMRL.

I am interested in modeling microstructural evolution, covering both phase-transformation-induced and deformation-induced microstructures. So we use both atomistic and continuum models, like Monte Carlo, molecular dynamics, and phase field models, for our studies. My teaching interests also include modeling and analysis, simulation and optimization, data analysis and interpretation, computational lab, and mathematical methods.

My interest, because of my background, remains in disseminating the knowledge of statistics so that it becomes useful in today's world of machine learning and artificial intelligence. Both of these actually hinge on statistical techniques, and therefore the emphasis in this course is on covering the statistics that is useful for analyzing materials data.

So this course consists of two parts. One is on the theoretical aspects, which will be covered by Hina, and the other is on the practical aspects, for which we will use R programming. I will do the R tutorial parts. It is very important to learn the statistical basics even before one starts experiments or simulations. The techniques that you will learn are agnostic to where the data came from. However, knowing these techniques is essential even before you start collecting data. So we will discuss those aspects, and we will show how to take the data and analyze it using R.

So what we are planning to do is cover some basics of statistics, because that is essential before we move on to the applications. When we come to the applications, we will be talking about all the statistical techniques through case studies or specific examples.
So we will first cover the basics of statistics up to hypothesis testing. When we move on to regression analysis, analysis of variance, and design of experiments, we will be talking in terms of data and case studies.

So one of the things you will realize if you go into the literature is that data is very scarce. Data in the form in which you need it is especially difficult to come by, even when there are many data sets. For example, something that we are going to discuss: there are about 65 data sets on the Hall-Petch relationship for different materials, but you will see that each data series consists of 4 or 5 points. Statistically speaking, these are not really good, but that is what one has to live with, and so we will use these data sets to practice how to do the analysis. However, we hope this course will also set you on the path to collecting larger sets of data and curating them in such a way that people can do analysis on them.

Actually, Guru, I would like to add that my experience has been that metallurgists are very happy with only 3 data points. There are reasons for it, because collecting this data and conducting those experiments is a very expensive and time-consuming task. That is why this is a course on dealing with materials data: we are aware that the data is scarce, the data sets are small, and still you would like to get the maximum information out of them. I would also like to add that the times are changing. People are collecting data from the literature, as you just said, and that is how the size and variety of the data are increasing. Past data from manufacturing units is also being made available to management for making decisions. That decision-making procedure is where machine learning and artificial intelligence come into the picture.
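As a small illustration of the earlier point about Hall-Petch data series with only 4 or 5 points, here is a minimal sketch of a straight-line fit of yield strength against the inverse square root of grain size, with a 95% confidence interval on the slope. It is written in Python purely for illustration (the course tutorials themselves use R). The five data points are made up for this example and do not come from any of the data sets mentioned, and the names `fit_line`, `grain_size_um`, and `yield_mpa` are hypothetical.

```python
import math

# Hypothetical, made-up data for illustration only (not from any real data set):
# grain size d in micrometres and yield strength in MPa.
grain_size_um = [5.0, 10.0, 20.0, 50.0, 100.0]
yield_mpa = [310.0, 248.0, 205.0, 160.0, 142.0]


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, se_b)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope
    a = y_mean - b * x_mean            # intercept
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    return a, b, se_b


# Hall-Petch form: sigma_y = sigma_0 + k * d**(-1/2)
x = [d ** -0.5 for d in grain_size_um]
sigma_0, k, se_k = fit_line(x, yield_mpa)

# Two-sided 95% t critical value for n - 2 = 3 degrees of freedom.
T_CRIT_DF3 = 3.182
ci_low = k - T_CRIT_DF3 * se_k
ci_high = k + T_CRIT_DF3 * se_k

print(f"sigma_0 = {sigma_0:.1f} MPa")
print(f"k = {k:.1f} MPa*um^0.5, 95% CI [{ci_low:.1f}, {ci_high:.1f}]")
```

With five points there are only three residual degrees of freedom, so the t critical value (about 3.18) is much larger than the large-sample value of about 1.96. This is one concrete way in which short data series widen the uncertainty on the fitted slope.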
So now, as a metallurgist or as a materials scientist, you must be aware that there is a materials initiative in the United States in which the plan is to develop new materials using microstructural data, that is, the structure-property relationship, and to explore this relationship through data. For that you again need a good background in statistics, one which is much more advanced. But in this present course, as we mentioned earlier, we will show you which analysis can be done where, we will give you examples of it, and we will also show you how to collect the data in a meaningful way, if you have to collect it.

So one of the questions that might come up is: should I do this course? If you are a starting researcher, a master's student or a PhD student, and you are going to get into research in materials science and engineering, then this is the course that is tuned for you. However, having said that, we live in a world which is filled with data, and the techniques of data analysis and the ways of understanding or dealing with data are common to all fields of science and engineering, or even the humanities, for example. So these techniques will be of use to everybody, but we are going to emphasize data from materials science and engineering. So it is specifically tailored for those early-stage researchers who want to understand data better, so that they can generate better data and come to better conclusions.

So I would like to suggest that before they get into research, it is nice if they go through this course. It is designed in such a way that even if you take an individual unit, that will also be useful. If you look only at the programming with the R language, that would also be useful.
If you are getting confused, or you are getting results which are not to your liking, then of course you have to go through the sessions on the theory of statistics, so that you know what might have gone wrong and where, which assumptions are not being met, and so on.

You know, I feel that the statistical parts are the most important ones, because I have seen that there are lots of tutorials online in R on how to do this or that, how to do principal component analysis, for example. So there is a tutorial, and in 5 minutes you will probably take the data and do it. But once you have the result or the graph in front of you, there are questions. Is what you did meaningful? How do you understand this result now? Should you do anything else, or should you pick some other technique? Given the data you have and the assumptions and approximations that are made, is it meaningful to interpret the results in a specific way? All of these are dealt with in the other strand of the course, so I believe that is the most important part. It is very easy to give a tutorial: you take the tutorial, replace the data with another data file, and do the analysis. But more meaningful analysis can emerge only if you also know the basics and the fundamentals behind what we are doing.

Well, Guru, that was my technique, or rather a trick, to attract people, because these days people get more interested in the programming language than in actually going through the theory. But yes, I agree. As I said, in today's world of data analytics and data science, one has to go through data analysis. When things go fine, the programming language is beautiful. When things start going wrong, you do not know where to go, and that is where we would like to offer this course, so that you know exactly what is meant, what the assumptions are, and how things are derived.
So, to conclude, welcome to the course on dealing with materials data. Do keep in touch with us: if you find anything that is not correct, or you think there could be better ways of doing something, or you have some data that you would like to share with us, please let us know.

Yes, please do. Welcome to the course.