Hello and welcome to the course on dealing with materials data. Over the previous two sessions we introduced the random variable and its expectation. Let us continue; first we will review what we have done so far. We introduced the expectation of a random variable X. For a discrete random variable we defined the expected value as E[X] = Σ_{i=1}^∞ x_i f(x_i), where f is the probability mass function, and for a continuous random variable we defined it as E[X] = ∫_{−∞}^{∞} x f(x) dx, where f is the probability density function. We then defined the kth raw moment of X as the expected value of X^k, and we defined the moment generating function M_X(t) = E[e^{tX}]. This function has the property, as I have mentioned at the bottom, that its kth derivative with respect to t, evaluated at t = 0, gives the kth moment: d^k M_X(t)/dt^k |_{t=0} = E[X^k]. That is why it is called the moment generating function. Along with that we defined two important measures, the coefficient of skewness and the coefficient of kurtosis. Recall that skewness tells you whether your distribution is positively skewed, negatively skewed, or symmetric, while kurtosis tells you whether the peak is sharper or flatter than that of a normal curve: if it is sharper the kurtosis will be more than 3, if it is flatter it will be less than 3, and for the absolutely symmetric bell-shaped normal curve it is exactly 3. We also introduced the expected value of a function of a random variable X: in the discrete case E[g(X)] = Σ_{i=1}^∞ g(x_i) f(x_i), and in the continuous case E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx.

In this particular session we would like to introduce what are known as joint random variables and their joint cumulative distribution function, the joint probability mass function in case they are discrete, and the joint probability density function in case they are continuous. Then we will introduce the marginal distribution function and the conditional distribution. We will also define the covariance and correlation between two such random variables, and finally we will give an example in terms of estimated Paris coefficients.

Let us move on. Joint random variables occur very naturally in our day-to-day life. I have given a few examples here. Whether a person has lung cancer and whether that person is a smoker are correlated events, and they vary together. Similarly, in the world of metallurgy and materials science, the fracture toughness of an alloy and the fatigue life of the alloy are closely related, and they seem to vary together in some sense.
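Before moving on, here is a minimal sketch of these review definitions in Python. The values and probabilities are made up for illustration, not from the lecture; the point is only that for a discrete random variable the moments, skewness, kurtosis, and the MGF property all reduce to simple sums.

```python
import numpy as np

# Hypothetical discrete random variable: values and probability mass
# function. These numbers are illustrative only.
x = np.array([0.0, 1.0, 2.0, 3.0])
f = np.array([0.1, 0.4, 0.3, 0.2])   # must sum to 1

def raw_moment(k):
    """kth raw moment E[X^k] = sum_i x_i^k f(x_i)."""
    return np.sum(x**k * f)

mean = raw_moment(1)
var = raw_moment(2) - mean**2

# Central moments give the coefficients of skewness and kurtosis.
mu3 = np.sum((x - mean)**3 * f)
mu4 = np.sum((x - mean)**4 * f)
skewness = mu3 / var**1.5
kurtosis = mu4 / var**2          # equals 3 for a normal distribution

def mgf(t):
    """Moment generating function M_X(t) = E[e^{tX}]."""
    return np.sum(np.exp(t * x) * f)

# The first derivative of the MGF at t = 0 (here approximated by a
# central finite difference) reproduces E[X].
h = 1e-5
print(mean, (mgf(h) - mgf(-h)) / (2 * h))
print(var, skewness, kurtosis)
```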
Similarly, the height of an adult male or female and the country of residence also vary together in some way. In all of the above, though we have two different random variables, they do vary together in a certain sense. Such variables are called joint random variables. Right now we are going to discuss two joint random variables, but remember that it is not necessary to have only two; we may have more. For example, you may have three joint random variables, or four, but the theory is going to be more or less the same. So we are going to start with two jointly distributed random variables, and the general case is left to you for understanding, because it is a simple generalization.

Let us write the pair in a bracket as (X, Y) to show the jointness between the two. If (X, Y) is a joint random variable, then it has a cumulative distribution function. Remember, I said earlier that with any random variable, whether continuous or discrete, there is one entity always attached to it, the cumulative distribution function. It comes from the definition of the random variable itself, because it comes from the probability space. Here we define the cumulative distribution function of (X, Y) as F_{X,Y}(x, y) = P(X ≤ x, Y ≤ y). The marginal CDF of X is defined as F_X(x) = F_{X,Y}(x, ∞), which means that Y takes all possible values. Similarly, the marginal distribution function of Y is defined as F_Y(y) = F_{X,Y}(∞, y), where X takes all possible values; that is why it is shown with an infinity. We will go into the detailed definition of this in the future.

If (X, Y) is a discrete joint random variable, then its probability mass function is defined as f(x_i, y_j) = P(X = x_i, Y = y_j), where i = 1, 2, 3, … and j = 1, 2, 3, …. Therefore the cumulative distribution function of the discrete joint random variable (X, Y) at (a, b) is the double sum F(a, b) = Σ_{x_i ≤ a} Σ_{y_j ≤ b} f(x_i, y_j) over the joint probability mass function. The continuous case is now very obvious to you: the PDF of a continuous joint random variable (X, Y) is a function f(x, y) such that P((X, Y) ∈ C) = ∬_C f(x, y) dx dy for any region C. In particular, P(X ∈ A, Y ∈ B) is the double integral over B for Y and over A for X, ∫_B ∫_A f(x, y) dx dy, and then the CDF can easily be defined as F(a, b) = ∫_{−∞}^{b} ∫_{−∞}^{a} f(x, y) dx dy.
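To make the discrete definitions concrete, here is a small Python sketch with a made-up joint PMF (the numbers are mine, not from the lecture). It evaluates the joint CDF by the double sum above and obtains the marginals by summing out the other variable.

```python
import numpy as np

# Hypothetical joint PMF of a discrete pair (X, Y); rows index the
# values of X, columns the values of Y. Numbers are illustrative only.
x_vals = np.array([0, 1, 2])
y_vals = np.array([0, 1])
pmf = np.array([[0.10, 0.15],
                [0.25, 0.20],
                [0.10, 0.20]])   # entries sum to 1

def joint_cdf(a, b):
    """F(a, b) = sum over x_i <= a and y_j <= b of f(x_i, y_j)."""
    mask_x = x_vals <= a
    mask_y = y_vals <= b
    return pmf[np.ix_(mask_x, mask_y)].sum()

# Marginals: sum out the other variable, i.e. let it take all values.
f_X = pmf.sum(axis=1)   # marginal PMF of X
f_Y = pmf.sum(axis=0)   # marginal PMF of Y

print(joint_cdf(1, 0))   # P(X <= 1, Y <= 0) = 0.10 + 0.25
print(f_X, f_Y)
```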
Let us define the marginal distribution with more clarity, along with the conditional distribution function or conditional probability mass function. Both cases are shown here. The marginal of a random variable X is obtained by integrating out, or summing out, all possible values of the other joint random variable. Here the random variables are X and Y, so if you are looking for the marginal of X, in the continuous case you integrate over Y, and in the discrete case you sum over all values y_j: f_X(x_i) = Σ_j f(x_i, y_j). The conditional PMF is defined just as we defined conditional probability: it is the joint distribution function or joint probability mass function divided by the marginal of the variable you are conditioning on. Conditioning on Y gives f_{X|Y}(x_i | y_j) = f(x_i, y_j) / f_Y(y_j). In both cases you can make out how to define it if the variables are continuous and if they are discrete; it is very similar in nature.

The expected values are also defined accordingly. For a discrete joint random variable, E[X] = Σ_{i=1}^∞ Σ_{j=1}^∞ x_i f(x_i, y_j); that is, you multiply the value of X by the joint probability mass function and sum over both indices. Otherwise you integrate fully over both Y and X: E[X] = ∬ x f(x, y) dx dy. Similarly, for E[XY] you put x_i y_j in the discrete sum, or xy in the continuous integral. You can now make out what the expected value of g(X) is, or of another function h(X, Y); they can all be derived by this same method.

There are certain coefficients we are interested in: one is the covariance of X and Y, and the other is the correlation coefficient between X and Y. Please recall that we did the same thing in descriptive statistics; there we dealt with data, while here we are dealing not with any specific data but with general random variables, which can take any value. In the future you will call the data values sample values, while these are the actual random variables. The covariance between X and Y is defined as Cov(X, Y) = E[XY] − E[X] E[Y], and the correlation coefficient is defined as ρ(X, Y) = Cov(X, Y) / √(Var(X) Var(Y)).

Now we say that X and Y are two independent random variables if and only if the joint density function, or probability mass function, is the product of the two separate marginals of X and Y: f_{X,Y}(x, y) = f_X(x) f_Y(y). This is true both for a continuous random variable and for a discrete random variable, so in this slide I am not distinguishing between the two. In the same way, the cumulative distribution function of the joint pair (X, Y) is the product of the two cumulative distribution functions: F_{X,Y}(x, y) = F_X(x) F_Y(y). If this happens, then X and Y are called independent of each other, and if X and Y are independent, then this also happens. That is why the condition is written "iff", with a double f, meaning if and only if: both statements imply each other.

Let us take an example of a joint random variable which we come across in materials science and metallurgy.
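Continuing with the same hypothetical joint PMF from the previous sketch, the conditional PMF, covariance, and correlation coefficient follow directly from these definitions; again, the numbers are illustrative, not lecture data.

```python
import numpy as np

# Same illustrative joint PMF as before (hypothetical values).
x_vals = np.array([0.0, 1.0, 2.0])
y_vals = np.array([0.0, 1.0])
pmf = np.array([[0.10, 0.15],
                [0.25, 0.20],
                [0.10, 0.20]])

f_Y = pmf.sum(axis=0)

# Conditional PMF of X given Y = y_j: joint divided by the marginal
# of Y; column j holds f(x_i | y_j) and sums to 1.
f_X_given_Y = pmf / f_Y

# Expected values computed from the joint PMF.
E_X  = np.sum(x_vals[:, None] * pmf)
E_Y  = np.sum(y_vals[None, :] * pmf)
E_XY = np.sum(np.outer(x_vals, y_vals) * pmf)

cov = E_XY - E_X * E_Y
var_X = np.sum(x_vals[:, None]**2 * pmf) - E_X**2
var_Y = np.sum(y_vals[None, :]**2 * pmf) - E_Y**2
rho = cov / np.sqrt(var_X * var_Y)

print(f_X_given_Y)
print(cov, rho)   # Cov(X, Y) and the correlation coefficient
```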
Let us take the Paris relationship for the crack growth rate per fatigue cycle under linear elastic fracture mechanics, given by the equation da/dN = C (ΔK)^m, where a is the crack length and N is the number of stress cycles. So da/dN, the rate of growth of the fatigue crack per fatigue cycle, equals a constant Paris coefficient C times the stress intensity factor range ΔK raised to the power m. Now, we experimentally generated seven such crack growth rate curves. You will learn in the future that this equation can be fitted by what is called a log transformation followed by linear regression, and from it you can find the values of log C, the logarithm of the first Paris coefficient, and m. These are the seven data points we have. I have not shown the labels properly, so let me write them down here so that it is clear to all of us: this column is log C and this column is m; that information was missing. With this data, if you try to find the covariance of log C and m, it is −0.107. The variance of log C turns out to be 0.2, the variance of m turns out to be 0.061, and the correlation coefficient turns out to be −0.97. It means that they are very closely correlated, but negatively: when m increases, log C decreases. In fact, this pair is known to be distributed as a bivariate normal distribution. We have not yet introduced distribution functions and special distributions such as the normal distribution, but just for your information, the bivariate normal is a two-variable normal distribution, and here the two random variables are log C and m.

Please note that although the Paris coefficients, as they are called, are constants, we have to remember that when you estimate them from different crack growth curves, that is, when each experiment generates a different crack growth curve, they come out different, because each estimate becomes a random variable. From each curve you get some value of C and some value of m, and these values tend to vary; therefore they appear to be random variables, and here I am showing that the estimated values are, in some sense, random variables. Remember, the Paris equation does not say that C and m are random; but when you actually perform the experiment there is a random error in it, which gets reflected in these different values of log C and m. They become random because each experiment is a random experiment, and therefore these are random manifestations of log C and m. That is why they are highly correlated, as we can expect: they are not supposed to be random, but from the random experiments we are getting different values. So whenever you perform an experiment there is always a little randomness in it, and that gets reflected in the estimated values of the coefficients. This is what I am showing: they are jointly distributed, because the covariance is not 0; therefore the two random variables are not independent, and so they are dependent, jointly distributed random variables.

Let us quickly summarize. We introduced joint random variables; we introduced the joint cumulative distribution function, the probability mass function in the case of a discrete joint random variable, and the probability density function in the case of a continuous joint random variable.
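As a sketch of how such estimates arise, the following Python simulates the procedure: each hypothetical experiment yields a noisy crack growth curve, a log-log linear fit recovers (log C, m), and the scatter of the seven estimates gives their covariance and correlation. The true coefficients, the noise model, and the ΔK range are all assumptions for illustration; they are not the lecture's seven data points.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_paris(delta_K, dadN):
    """Fit log10(da/dN) = log10(C) + m*log10(dK) by linear regression,
    returning the estimates (log10_C, m)."""
    m, log_C = np.polyfit(np.log10(delta_K), np.log10(dadN), 1)
    return log_C, m

# Simulate seven hypothetical crack growth experiments. True
# coefficients and noise level are made-up values for illustration.
true_C, true_m = 1e-11, 3.0
estimates = []
for _ in range(7):
    dK = np.linspace(10, 50, 20)   # stress intensity range, say MPa*sqrt(m)
    dadN = true_C * dK**true_m * rng.lognormal(0.0, 0.2, dK.size)
    estimates.append(fit_paris(dK, dadN))

log_C_hat, m_hat = np.array(estimates).T

# Covariance matrix and correlation of the seven (log C, m) estimates.
cov = np.cov(log_C_hat, m_hat)
rho = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
print(cov[0, 1], rho)   # expect a strong negative correlation
```

The strong negative correlation between the intercept (log C) and the slope (m) is exactly what the lecture's estimates show: in a log-log fit over a positive ΔK range, a steeper slope must be compensated by a lower intercept.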
We defined the marginal distribution as well as the conditional distribution; we defined the measures of covariance and correlation coefficient in the case of two joint random variables; and we gave an example of joint random variables obtained as Paris coefficients in several random experiments generating Paris curves. Thank you.