course on dealing with materials data. In this session, we will introduce what is known as a random variable and its expectation in statistics. First, a brief background. When you perform an experiment, it has several possible outcomes, and the set of all possible outcomes of an experiment is called the sample space. Any subset of the sample space is called an event, and the probability of occurrence of an event E is written P(E). This probability is defined by three properties: first, P(E) lies between 0 and 1; second, the probability of the whole sample space is 1; and third, if E1, E2, ..., En are mutually exclusive events, then the probability of their union is the sum of their individual probabilities. From these it can be derived that if A and B are not mutually exclusive events, then P(A ∪ B) = P(A) + P(B) − P(A ∩ B). Please remember that because A and B are not mutually exclusive, we would otherwise have counted the common outcomes twice, so we subtract the probability of the intersection once. I think you already have this background, so let us move on.

In this lecture, I will first motivate why we need the definition of a random variable, and then define it. Every random variable has a quantity attached to it called the cumulative distribution function. We will then define two kinds of random variables, discrete and continuous: for a discrete random variable we will define the probability mass function, and for a continuous random variable we will define the probability density function. At the end, we will work through some examples. So, let us move on. We perform many experiments, and the result of an experiment may or may not be numerical.
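The inclusion-exclusion rule above can be checked with a minimal Python sketch; the fair six-sided die and the two particular events are assumptions chosen only for illustration, not from the lecture:

```python
from fractions import Fraction

# Sample space: outcomes of rolling a fair six-sided die (an assumed
# example, chosen only to illustrate the rule).
omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event under the uniform measure on omega."""
    return Fraction(len(event & omega), len(omega))

A = {2, 4, 6}   # "even number"
B = {4, 5, 6}   # "greater than 3"

# P(A U B) computed directly, and via P(A) + P(B) - P(A n B).
direct = prob(A | B)
incl_excl = prob(A) + prob(B) - prob(A & B)
print(direct, incl_excl)   # 2/3 2/3
```

Both computations agree, because the outcomes {4, 6} common to A and B are counted once rather than twice.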
For example, if you are performing a casting experiment to cast a metal form, sometimes the casting forms successfully, and sometimes it simply does not form because something went wrong in the experiment. In that case the experiment results in success or failure, so it is not necessary that every experiment yields numeric data. However, having a numeric outcome, or mapping the outcome into a numeric value, is of immense help and is desirable, because that way we are able to do a lot of analysis on the data. This desirable mapping, from the outcome of a random experiment to a real number, is what we call a random variable. Let us formally define it: a random variable is a function mapping a sample space Ω, with probability measure P, to the set of real numbers R. The notation from now on is that capital letters such as X, Y, Z denote random variables, and the actual numerical values they take are denoted by the corresponding small letters such as x, y, z. So, as I said, in the case of the metal form, if it is a successful experiment we say the random variable X takes the value 1, and if it is a failure we say X takes the value 0. The yield strength of a superalloy, or of any alloy or metal, is a random variable mapped onto itself, because yield strength is already numeric. The efficiency of a process is also a random variable: for example, if you take the ratio of the actual yield of the process to the theoretical yield of the process and call it the efficiency, then that too is a random variable.
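The idea that a random variable is literally a function on the sample space can be sketched in a few lines of Python; the casting outcomes and yield figures below are assumed values for illustration:

```python
# A random variable is just a function from the sample space to the
# real numbers. The outcome labels and yield figures are assumed
# values for illustration.

sample_space = ["success", "failure"]

def X(outcome):
    """Indicator random variable for the casting experiment."""
    return 1 if outcome == "success" else 0

def efficiency(actual_yield, theoretical_yield):
    """Efficiency = actual yield / theoretical yield, itself a random
    variable because the actual yield varies from run to run."""
    return actual_yield / theoretical_yield

print([X(w) for w in sample_space])   # [1, 0]
print(efficiency(9.0, 10.0))          # 0.9
```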
As I said before, because a random variable is mapped from the sample space Ω to the real line, there is always a quantity attached to it called the cumulative distribution function. If X is a random variable defined on a probability space (Ω, P), then the probability that X ≤ x, written F(x), is called the cumulative distribution function, or CDF in short, of the random variable X. Please note that x is a small letter and F is a capital letter: small x is a value that X takes. The notation X ~ F signifies that X is distributed with CDF F.

Let us move on. There are two types of random variables: one is countable and the other is uncountable, or continuous. What do we mean by that? Consider an experiment in which you count the number of defective items that come out of a processing unit. The outcomes are numbers, such as two defectives, three defectives, and so on. These are countable, and such a random variable, called a discrete random variable, has a quantity attached to it called the probability mass function. Yield strength, on the other hand, is a continuous random variable: it measures a certain quantity, and measurements are generally continuous. Attached to a continuous random variable is a probability density function. What is a probability mass function? If X is a discrete random variable, it takes distinct values x1, x2, x3, ..., xn, and so on and so forth; remember that it is discrete, so it takes a countable number of values.
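The CDF of a discrete random variable can be sketched directly from its probability mass function; the distribution below for the number of defectives is an assumed example, not from the lecture:

```python
from fractions import Fraction as F

# CDF of a discrete random variable: F(b) = P(X <= b). The pmf below,
# for the number of defective items from a processing unit, is an
# assumed distribution used only for illustration.
pmf = {0: F(1, 2), 1: F(3, 10), 2: F(3, 20), 3: F(1, 20)}

def cdf(b):
    """Sum the probability mass function over all values x <= b."""
    return sum(p for x, p in pmf.items() if x <= b)

print(cdf(-1))   # 0   (no mass below 0)
print(cdf(1))    # 4/5 (= 1/2 + 3/10)
print(cdf(10))   # 1   (the whole sample space)
```

Note how the CDF starts at 0, rises in steps at each value xi, and reaches 1.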
Call the set of all values that X takes A. Then the probability mass function is a function, small f, from A to [0, 1], defined as f(xi) = P(X = xi), and f(x) = 0 if X does not take the value x. You can see that the summation over i from 1 to infinity of f(xi) has to be 1, because the probability that the complete sample space occurs is 1; in other words, the sum of all the values that a probability mass function takes is 1. The CDF of the discrete random variable X, F(b) = P(X ≤ b), can then be written in terms of the probability mass function as the summation of f(xi) over all xi ≤ b.

Now the probability density function. Suppose X is a continuous random variable and capital F is its cumulative distribution function. The probability density function, small f, is defined as a function that takes only non-negative values and whose integral over all possible values of x is 1; that is, the integral from −∞ to ∞ of f(x) dx is 1, or in other words the area under the curve f(x) is 1. The probability that a < X ≤ b is the integral from a to b of f(x) dx, and hence F(a), the cumulative distribution function of X at a, which is the probability that X ≤ a, is the integral from −∞ to a of f(x) dx, where f(x) is the probability density function. From this you can say that the derivative of the cumulative distribution function with respect to its argument gives you the probability density function in the case of a continuous random variable.
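The relationship F'(x) = f(x) can be verified numerically; the density f(x) = 2x on [0, 1], with CDF x², is an assumed example chosen for simplicity:

```python
# Numerical check that F'(x) = f(x) for a continuous random variable,
# using the assumed density f(x) = 2x on [0, 1], whose CDF is x**2.

def f(x):
    """Probability density function (assumed example)."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def F(x):
    """Cumulative distribution function of the same variable."""
    if x < 0.0:
        return 0.0
    if x > 1.0:
        return 1.0
    return x * x

# Central-difference approximation of dF/dx at an interior point.
x, h = 0.6, 1e-6
dFdx = (F(x + h) - F(x - h)) / (2.0 * h)
print(dFdx, f(x))   # both close to 1.2
```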
Here is a very simple example. Suppose X is a continuous random variable whose probability density function takes the value c(2x − x²) if x is between 0 and 1, and 0 otherwise; find c. The solution follows from the property that the integral of the probability density function over the complete range of x, which is here 0 to 1 because f(x) is 0 otherwise, is 1. By simple algebra, the integral of (2x − x²) from 0 to 1 is 1 − 1/3 = 2/3, so c(2/3) = 1 and therefore c = 3/2.

With this, let us summarize what we have covered today. We introduced the probability space, which is a sample space with a probability measure on it. On the probability space we defined a random variable, which is a mapping from the probability space to the real line. With every random variable, whether discrete or continuous, there is a cumulative distribution function attached, F(x) = P(X ≤ x). There are two types of random variables that we encounter in real life: one is discrete and the other is continuous. The discrete random variable is countable in nature, and attached to it is a probability mass function defined by f(xi) = P(X = xi), where xi (for i = 1, 2, 3, ...) is a value that the random variable X can take.
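A quick numerical check of the normalization in the worked example: since the integral of (2x − x²) from 0 to 1 is 1 − 1/3 = 2/3, the constant must satisfy c(2/3) = 1, giving c = 3/2. The midpoint-rule integrator below is a simple sketch, not a library routine:

```python
# Numerical check of the worked example: f(x) = c*(2x - x**2) on [0, 1]
# integrates to 1 when c = 3/2, because the integral of (2x - x**2)
# from 0 to 1 is 1 - 1/3 = 2/3.

def integrate(g, a, b, n=100_000):
    """Midpoint-rule numerical integration (a simple sketch)."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

c = 1.5
area = integrate(lambda x: c * (2.0 * x - x * x), 0.0, 1.0)
print(area)   # approximately 1.0
```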
The continuous random variable takes on uncountably many values, and therefore it has attached to it a probability density function, defined so that the probability that a < X ≤ b is the integral from a to b of f(x) dx. The density f(x) is non-negative, and if you integrate it over the complete range of x the result is 1; that is another of its properties. If you take the integral from −∞ up to a value b, it gives you the CDF, the cumulative distribution function, of the continuous random variable X. Therefore, in the case of a continuous random variable, the first derivative of the cumulative distribution function with respect to its argument gives you the probability density function. Thank you.