course on dealing with materials data. The present session is the fifth in the series on special random variables, in which we are talking about certain special distribution functions that we encounter very commonly in our day-to-day experimental work. Let us quickly review. We first introduced some discrete random variables: the discrete uniform random variable, then Bernoulli trials and three distributions derived from Bernoulli trials, for which we gave a 3D APFIM example. Then we introduced continuous random variables, starting with the continuous uniform distribution, and then we talked in detail about the normal distribution. We also discussed some distributions that are derived from the normal distribution. They largely do not occur naturally; however, they are useful for carrying out certain inference on random variables that originate from the normal distribution. In the last session we discussed the central limit theorem, which showed how the normal distribution is a kind of all-pervading distribution. It says that if you take a large number of independent random variables coming from an identical distribution with finite variance and take their average, then as n tends to infinity the average roughly follows a normal distribution with the common mean and with variance equal to the common sigma square divided by n. Or, to put it correctly, the average minus the mean, divided by its standard deviation, which is sigma over root n, follows a standard normal distribution. Now we come to some distributions that are of particular importance to metallurgical and materials engineering, although in general they are useful for engineering data.
The two distributions that we wish to introduce now are the log normal distribution and the Weibull distribution, and then we will follow up with the exponential distribution, which is a special case. Why this special treatment for distributions coming from metallurgical engineering data? Well, the normal distribution is widely applied, and the central limit theorem makes it an almost all-pervading distribution. However, it has one limitation with respect to engineering data. In the normal distribution the random variable varies between minus infinity and plus infinity, which means it can take any value on the real line, while data from metallurgy and materials science generally take on only positive values. Now, you may assume the distribution to be normal and carry out your inference. Sometimes you then encounter a very absurd result, and it may come down to the fact that you have assumed normality, which allows all values from negative infinity to positive infinity. The other thing is that metallurgical and materials science data in particular are generally seen to be skewed. Their histograms are not beautifully symmetric and bell shaped; they generally represent a skewed distribution. Also, to apply the central limit theorem when the distribution is skewed, you need at least 30 data points, and at times even more if the skewness is very high. Such data availability is very rare in materials science and particularly in metallurgical engineering, because it is very expensive to generate such data. Therefore it is important that we learn about certain distributions that are defined over the positive part of the real line, that is, for positive random variables, and we will also see how they have been used and applied in the area of metallurgy and materials science.
So first we introduce what is known as the log normal distribution. As the name says, if X is a positive random variable whose natural log follows a normal distribution, then we say that X follows a log normal distribution. The pdf of such a distribution is given in this manner. You see that the exponential part is the same as in the normal distribution, with x replaced by ln x, and the normalizing part is also the same as in the normal distribution. However, the denominator is multiplied by x, because the derivative of ln x is 1 over x, and that is why x appears here. Here also there are two parameters, mu and sigma. Mu is the mean value of ln X and sigma square is the variance of ln X; therefore mu varies between minus infinity and plus infinity, and sigma square is always positive. The notation we use is X is distributed as LN(mu, sigma square); there are two parameters to it. Here I have shown you some plots which I have picked up from Wikipedia; the reference is given here. The mean value mu is kept at 0 and the sigma of the normal part of the distribution is varied through 1, 0.5 and 0.25. Remember that this is sigma and not sigma square, so if you try to regenerate the plots, please make sure that you use sigma. You can see that as sigma changes the shape of the curve changes: as sigma becomes larger, the curve becomes more and more skewed. Also please notice that the skew is always towards the right; log normal distributions are always skewed towards the right.
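For anyone who wants to regenerate curves like the ones just described, here is a minimal sketch in plain Python. It assumes the standard log normal pdf, with mu and sigma being the mean and standard deviation of ln X; the function names are made up for this illustration. It checks numerically that the pdf integrates to about 1 and uses the closed-form skewness, which depends only on sigma, to confirm that the skewness grows with sigma and is always positive.

```python
import math

def lognormal_pdf(x, mu=0.0, sigma=1.0):
    """pdf of X when ln X ~ Normal(mu, sigma^2); zero for x <= 0."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    # Same exponential as the normal pdf in ln x, with an extra 1/x factor.
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def lognormal_skewness(sigma):
    """Closed-form skewness of the log normal; it does not depend on mu."""
    w = math.exp(sigma ** 2)
    return (w + 2.0) * math.sqrt(w - 1.0)

def integrate(f, a, b, n=100_000):
    """Simple midpoint rule, good enough for a sanity check."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

areas = {}
for sigma in (1.0, 0.5, 0.25):  # sigma, not sigma squared
    areas[sigma] = integrate(lambda x: lognormal_pdf(x, 0.0, sigma), 0.0, 60.0)
    print(sigma, round(areas[sigma], 4), round(lognormal_skewness(sigma), 3))
```

The upper integration limit of 60 is an arbitrary cut-off chosen for this sketch; the neglected tail is very small for the three sigma values used.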
To be clear about the effect of sigma: when sigma is large the skewness is large, and as sigma becomes smaller the skewness decreases. You can see that at sigma equal to 0.25 the curve is almost like a normal distribution curve, but if you look at the tail you can make out that it is still a skewed distribution. The expected value is given thus; I leave it for you to derive it. The variance is expressed in this manner. Exponential of mu is actually the median value of the distribution. Here mu is 0, so exponential of mu is 1, and every curve plotted here has this same median value; they are all centred around the median. The median is often denoted by L sub 50, and sometimes mu in all the above equations is replaced by ln L50, because L50 is exponential of mu and therefore mu is the natural log of L50. Now please remember that sigma square decides the shape of the distribution, and therefore here sigma is called the shape parameter. Unlike in the normal distribution, it is not a scale parameter; it is a shape parameter. If you want to introduce a location parameter, that is a third parameter, and it gives the three parameter log normal distribution, in which x is replaced by x minus psi. Your assumption now is not x greater than 0 but x greater than psi; mu is between minus infinity and infinity, sigma square is positive, and the form of the distribution is this. It is denoted as X is distributed as log normal (psi, mu, sigma square).
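The slide expressions are not reproduced in this transcript. For reference, the standard textbook forms of the quantities mentioned above are collected below; the notation is chosen to match the symbols used in this session and may differ in detail from the slides.

```latex
\begin{align*}
f(x;\mu,\sigma) &= \frac{1}{x\,\sigma\sqrt{2\pi}}
   \exp\!\left[-\frac{(\ln x-\mu)^{2}}{2\sigma^{2}}\right], && x>0,\\[4pt]
E[X] &= \exp\!\left(\mu+\frac{\sigma^{2}}{2}\right),\\[4pt]
\operatorname{Var}[X] &= \left(e^{\sigma^{2}}-1\right)\exp\!\left(2\mu+\sigma^{2}\right),\\[4pt]
L_{50} &= e^{\mu} \quad\Longleftrightarrow\quad \mu=\ln L_{50},\\[4pt]
f(x;\psi,\mu,\sigma) &= \frac{1}{(x-\psi)\,\sigma\sqrt{2\pi}}
   \exp\!\left[-\frac{\bigl(\ln(x-\psi)-\mu\bigr)^{2}}{2\sigma^{2}}\right], && x>\psi.
\end{align*}
```

The mean follows from the moment generating function of the normal variable ln X evaluated at 1, and the variance from the same function evaluated at 2, which is one route to the derivation left as an exercise.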
Let us look at some of these pdf plots, which show the effect of the location parameter psi. In the first graph psi is 0, and you see that the x values are larger than 0, so the curve starts from the origin, rises and then falls. When you make psi equal to 2, the whole graph is shifted by 2 units, so it starts from 2 and follows exactly the same trend. That is why psi is called a location parameter: it shifts the location of the pdf. The log normal distribution is most commonly used to estimate particle size distributions. It is also heavily used to represent data such as the height and weight of human beings. Generally height and weight are modelled with a normal distribution, but as I said, height and weight cannot take a negative value at all. They are positive random variables, and therefore the distribution of height and weight is often estimated very well with the log normal distribution. I also find that density is very well modelled as a log normal distribution. It also gives a good model for the distribution of fracture toughness, yield strength and so on. So it has many specific applications in the field of metallurgy and materials science. Now we move on to the next distribution, which is the Weibull distribution. The Weibull distribution is named after the Swedish engineer and scientist Waloddi Weibull. In 1939 he demonstrated for the first time a distribution that can represent the breaking strength of materials, and he must have been so excited by this finding that in 1951 he wrote a paper titled "A Statistical Distribution Function of Wide Applicability".
I have given the reference, and I strongly recommend that everyone should go through this paper, because it is a very interesting account of how a physical phenomenon can be looked into and used to derive a statistical distribution. It is a very beautiful derivation, only about two paragraphs long, but very inspiring and very interesting. In this paper he showed that it is not just the breaking strength of materials, such as the yield strength of a Bofors steel or the fatigue life of an St-37 steel, that follows this distribution. He found that the fibre strength of Indian cotton, the height of males born in the British Isles, and the size of certain beans produced on a certain farm all tend to follow the Weibull distribution. It appears that Rosin and Rammler had also come up with a very similar distribution in 1933 in connection with particle size distribution. However, the distribution became much better known when Weibull put it forward in his 1951 paper, and that paper has been heavily cited. So, once again, I have given the full reference; please go through this paper. Now, if X is a positive random variable and it has a pdf that looks like this, then it is said to follow the Weibull distribution. Here we have two parameters, C and alpha. C is called the shape parameter and alpha is called the scale parameter. The cumulative distribution function has a much easier expression, which is given here: 1 minus exponential of minus (x over alpha) to the power C. The notation generally used is X is distributed as Weibull (alpha, C): first comes the scale parameter and then comes the shape parameter.
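Since the slide is not reproduced here, the two parameter Weibull pdf and CDF in their usual textbook form, written with the scale parameter alpha and the shape parameter C used in this session, are as follows. Differentiating the CDF with respect to x gives back the pdf, which is a quick consistency check.

```latex
\begin{align*}
f(x;\alpha,C) &= \frac{C}{\alpha}\left(\frac{x}{\alpha}\right)^{C-1}
   \exp\!\left[-\left(\frac{x}{\alpha}\right)^{C}\right], && x>0,\ \alpha>0,\ C>0,\\[4pt]
F(x;\alpha,C) &= 1-\exp\!\left[-\left(\frac{x}{\alpha}\right)^{C}\right], && x>0.
\end{align*}
```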
Please look at this particular expression and note that if you take the log twice, that is, the log of minus the log of 1 minus the CDF, you get a linear relationship in ln x: ln(-ln(1 - F(x))) equals C ln x minus C ln alpha. The slope is C and the intercept is minus C ln alpha, and this will be one of the ways of estimating the parameters when we come to that. We will discuss it in more detail at that time. Again I have taken some plots of the Weibull distribution from Wikipedia. Please remember that the notation there is slightly different: the scale parameter is given as lambda and the shape parameter is given as k. Here again the distributions are shown for different shape parameters, and please note that this distribution is not just skewed; it can be skewed in either direction. That is the beauty of the Weibull distribution. Here you see a kind of exponential curve. Here, for shape parameter equal to 1.5, you see a very typical right skewed distribution. However, when you take k equal to 5, it looks like a symmetric curve, but notice the tails and you realize that it is slowly becoming a left skewed distribution; the longer tail is now on the left side. With the log normal distribution you can fit the data only if they are positively skewed, but the Weibull distribution gives you an open field: data that are skewed in either direction, or that have an exponential kind of shape, can also be modelled with a Weibull distribution. The Weibull distribution also has a variant called the three parameter Weibull distribution, in exactly the same manner as the log normal distribution. There is a location parameter psi, with psi greater than 0, and for the random variable X greater than psi this is the expression of the pdf.
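The double-log linearization noted earlier in this part of the session can be sketched as follows. This is only an illustration under stated assumptions: it uses exact CDF values from a Weibull distribution with made-up parameters (alpha = 2, C = 1.5) rather than experimental data, and the helper names are hypothetical. With real data one would have to replace F by some empirical estimate, which is part of the estimation discussion deferred to later.

```python
import math

def weibull_cdf(x, alpha, c):
    """Two parameter Weibull CDF: 1 - exp(-(x/alpha)^c) for x > 0."""
    return 1.0 - math.exp(-((x / alpha) ** c)) if x > 0.0 else 0.0

def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Assumed "true" parameters, chosen only for this illustration.
alpha_true, c_true = 2.0, 1.5
xs = [0.25 * k for k in range(1, 21)]  # x = 0.25, 0.50, ..., 5.00

# Weibull-plot coordinates: u = ln x, v = ln(-ln(1 - F(x))).
u = [math.log(x) for x in xs]
v = [math.log(-math.log(1.0 - weibull_cdf(x, alpha_true, c_true))) for x in xs]

slope, intercept = fit_line(u, v)
c_hat = slope                               # slope is the shape parameter C
alpha_hat = math.exp(-intercept / slope)    # intercept is -C * ln(alpha)
print(c_hat, alpha_hat)
```

Because the points lie exactly on a straight line in these coordinates, the fitted slope and intercept recover the assumed shape and scale.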
Please note that the only change in the expression is that x has been replaced by x minus psi in all places. The CDF is generated in the same way, and in this situation the notation is X is distributed as Weibull (psi, alpha, C), meaning location, scale and shape. How does the three parameter Weibull differ? This one is the two parameter Weibull, because psi is equal to 0, and here I have taken psi equal to 1. You can see that since I have taken the shape parameter to be 8, it is a left skewed distribution, and the location has simply shifted: with psi equal to 0 we need x greater than 0, so the curve starts from 0 onwards, whereas here it starts from 1 onwards. Otherwise it is the same curve, simply shifted. I also want you to see from the graph that as the value of the shape parameter increases, the shape changes from right skewness to left skewness. So here we have specifically shown a left skewed distribution arising out of the Weibull, and how the location parameter shifts the location of the distribution. Now there is one distribution, the exponential distribution, which is a very special case of the Weibull distribution where the shape parameter is 1. Please recall the red curve for shape parameter 1; that special distribution is called the exponential distribution. The random variable X is greater than 0, the pdf has this form, and the notation is X is distributed as exponential (alpha). You can derive the CDF; I have purposely left it out. Please derive the CDF of the exponential distribution; it will be a good algebraic exercise for you. Also please note, before going any further to the memoryless property, that you can have a two parameter exponential distribution.
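The change from right skewness to left skewness as the shape parameter increases, and the reduction to the exponential distribution at shape parameter 1, can both be checked with a short sketch. It assumes the standard Weibull moment formula E[X^r] = alpha^r * Gamma(1 + r/C); the function names are made up for this illustration.

```python
import math

def weibull_pdf(x, alpha, c):
    """Two parameter Weibull pdf with scale alpha and shape c, for x > 0."""
    if x <= 0.0:
        return 0.0
    t = x / alpha
    return (c / alpha) * t ** (c - 1.0) * math.exp(-(t ** c))

def exponential_pdf(x, alpha):
    """Exponential pdf written with the same scale parameter alpha."""
    return math.exp(-x / alpha) / alpha if x > 0.0 else 0.0

def weibull_skewness(c):
    """Skewness from the raw moments Gamma(1 + r/c); the scale cancels out."""
    g1 = math.gamma(1.0 + 1.0 / c)
    g2 = math.gamma(1.0 + 2.0 / c)
    g3 = math.gamma(1.0 + 3.0 / c)
    variance = g2 - g1 ** 2
    third_central = g3 - 3.0 * g1 * g2 + 2.0 * g1 ** 3
    return third_central / variance ** 1.5

# Shape values used in the plots discussed in this session.
for c in (1.0, 1.5, 5.0, 8.0):
    print(c, weibull_skewness(c))
```

A positive value means the longer tail is on the right and a negative value means it is on the left, so the sign change between shape 1.5 and shape 5 is the behaviour seen in the plots.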
Here you have only one parameter, which is alpha. Now you can introduce a psi which is greater than 0 and say that X greater than psi follows a two parameter exponential distribution, where you replace x by x minus psi. Then you will have X distributed as exponential (psi, alpha): first psi and then alpha. The exponential distribution has a wonderful property called the memoryless property. What does it mean? It says that if X is distributed as exponential (alpha), then the probability that X is greater than t1 plus t2, given that X is already greater than t1, is the same as the probability that X is greater than t2. This is a very important and very interesting property, so let us understand it. Suppose this is a time line; this is time t1 and this is time t1 plus t2, and this interval is t2. You already know that the random variable is larger than t1. If you now ask for the probability that it is larger than t1 plus t2, given that it is already larger than t1, the answer depends only on the further time t2. Whatever happened in the range up to t1 has been forgotten; all that matters is that it has to go a further time t2. That is why we say it has lost the memory that it was already larger than t1, and it behaves as if it only needs to be larger than t2. This proof also I leave to all of you. It is a very interesting proof; you should try it, it is not very difficult. So with this let us summarize what we did today. We considered distributions of random variables that take on only positive values. Why did we do that? Because in materials science and metallurgical engineering most of the data we encounter are positive in nature.
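As a brief aside on the memoryless property stated above: with the exponential CDF F(t) = 1 - exp(-t/alpha), the probability of exceeding t is exp(-t/alpha), so the conditional probability is exp(-(t1 + t2)/alpha) divided by exp(-t1/alpha), which is exp(-t2/alpha). The sketch below checks this numerically and, for contrast, shows that a Weibull with shape 2 does not have the property. The parameter values and function names are assumptions made only for this illustration.

```python
import math

def weibull_survival(t, alpha, c=1.0):
    """P(X > t) = exp(-(t/alpha)^c); c = 1 gives the exponential case."""
    return math.exp(-((t / alpha) ** c)) if t > 0.0 else 1.0

def conditional_survival(t1, t2, alpha, c=1.0):
    """P(X > t1 + t2 | X > t1) = P(X > t1 + t2) / P(X > t1)."""
    return weibull_survival(t1 + t2, alpha, c) / weibull_survival(t1, alpha, c)

alpha, t1, t2 = 4.0, 3.0, 2.0  # arbitrary values for the check

# Exponential (c = 1): the conditional probability ignores t1.
exp_conditional = conditional_survival(t1, t2, alpha)
exp_fresh = weibull_survival(t2, alpha)

# Weibull with c = 2: having survived t1 changes the answer.
wei_conditional = conditional_survival(t1, t2, alpha, c=2.0)
wei_fresh = weibull_survival(t2, alpha, c=2.0)

print(exp_conditional, exp_fresh)
print(wei_conditional, wei_fresh)
```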
You can assume normality and carry out the inference, but at times you encounter certain absurdities in your results, and that is likely to arise because you have assumed a distribution whose random variable is supposed to take both positive and negative values, while in reality you have a random variable which is only positive. So we have specifically introduced distributions that take on only positive values and have great application in the field of metallurgy and materials science. We introduced the log normal distribution, in its two parameter and three parameter forms, the Weibull distribution, also in its two parameter and three parameter forms, and then the exponential distribution as a special case of the Weibull distribution. I would like every one of you to know that these are not the only distributions of a positive random variable that are useful to metallurgy and materials science. There are many such distributions: one of them is the Birnbaum-Saunders distribution, there is the gamma distribution, there is the beta distribution, and there are many more. But for now we have introduced the most commonly used distributions in the field of materials science and metallurgical engineering. With this we complete the sessions on introducing the special random variables. Let us have a quick look back. We started off by saying that a random variable is a function from the probability space to the real line, and for such a random variable we talked about its expected value, its variance and so on. Then we introduced the special random variables, which follow very specific distributions.
We talked about discrete random variables of a special kind, where we had the uniform distribution, Bernoulli trials, the binomial distribution, the geometric distribution, the negative binomial distribution and the hypergeometric distribution, and through 3D APFIM and other examples we showed that they are all useful to us in the field of metallurgy and materials science. Then we came to the continuous distributions, where we talked about the continuous uniform distribution and showed that it plays a very central role in random number generation. Then we introduced the normal distribution. We had a long discussion on it, because through the central limit theorem we saw that it is a largely all-pervading distribution if you have a large amount of data. We also got introduced to distributions derived from the normal distribution, such as the chi square distribution, the t distribution and the F distribution. We have not seen their applications directly, but this is a preparation for the future work that we are going to do in estimation of parameters and testing of hypotheses. Then of course we talked about the central limit theorem, and then we introduced the distributions that are useful to metallurgical engineering and materials science, because most of the time the random variable there takes on only positive values. We introduced the commonly used distributions of this kind, the log normal distribution and the Weibull distribution, and we introduced the exponential distribution, which has the memoryless property, as a special case of the Weibull distribution. Thank you.