Welcome to the course on dealing with materials data. In today's session we are going to talk about certain special random variables. Let us first review what we have done so far. We introduced the formal definition of probability through the notions of sample space and events: for any random experiment, the collection of all its possible outcomes is called the sample space, and any subset of the sample space is called an event. Then we introduced the three postulates of probability, which say that the probability of any event is non-negative, the probability of the complete event, that is the sample space itself, is 1, and for mutually exclusive events, finitely or countably many of them, the probability of their union is the sum of their individual probabilities. We also introduced conditional probability: if, instead of the whole sample space, you have knowledge that one particular event has occurred, and in the light of that knowledge you look at the probability of other events, that is conditional probability. From it we derived Bayes' theorem and briefly discussed its importance in today's life. We then went further and introduced the random variable, which is a real-valued function defined on a probability space, that is, a sample space together with a probability measure, mapping outcomes to the real line. A random variable can be discrete or continuous depending on the values it takes: if it takes countably many values, finite or infinite, it is called discrete, and if it takes values in a continuum of real numbers, it is called a continuous random variable. With every random variable there is an associated cumulative distribution function, which is the probability that the random variable takes a value less than or equal to a specified value x.
A discrete random variable has a probability mass function attached to it, and a continuous random variable has a probability density function. This time what we want to do is introduce certain special discrete random variables, or discrete distributions, which we come across frequently in our studies of materials data; similarly, some random variables will be continuous, and we will cover the corresponding continuous distributions later. While covering each distribution, we will briefly state the attached sample space, because it is very important to know where the distribution is defined, give the mean and variance formulae (and hence the standard deviation), since these are the two basic parameters one would like to know to describe data, and point out any other special features as they arise; wherever possible we will work out an example. So let us move on. The first distribution we would like to talk about is the discrete uniform distribution, the simplest of the distributions. If you are rolling a die, then we all know that there are 6 possible outcomes, and if the die is fair, the probability of any particular outcome, whether face 2, face 3, or face 4 turns up, is 1/6. This also comes from the postulates: if you are looking for the probability that the die shows 2, there is exactly one favourable outcome among 6 equally likely faces, so the probability is 1/6, as in the example given here.
In general, if the sample space consists of n equally likely outcomes x1, x2, ..., xn, then the probability mass function is f(x) = 1/n for each of these values. The next thing we would like to talk about is Bernoulli trials, which are very important. Any experiment is called a Bernoulli trial if it has exactly 2 possible outcomes: it could be success or failure, defective or non-defective, or the answer yes or no. So whenever you have an experiment which throws up only 2 answers, it is a Bernoulli trial. Tossing a coin is a Bernoulli trial; so is rolling a die if you call the outcome a success when you get a number between 1 and 3 and a failure when you get 4, 5, or 6. In that case the sample space is just {0, 1}, because there are only 2 outcomes; generally we say success is 1 and failure is 0, but this is a convention. Let the probability that a Bernoulli trial is successful be p. Then the probability mass function is f(x) = p^x (1 - p)^(1 - x) for x = 0 or 1, the mean of the distribution is mu = p, and the variance is p(1 - p). I am going to leave the proof to you to try out for yourselves; tossing a coin is a very simple example. Next, there are distributions which come out of Bernoulli trials: how you conduct the trials, and how you count them, gives rise to different distributions. The first one we would like to consider is the binomial distribution, in which you perform n identical Bernoulli trials and you want to know the probability that there will be exactly x successes.
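The Bernoulli mean and variance formulas stated above can be checked by direct enumeration over the two-point sample space. Here is a small Python sketch of mine, not from the lecture, with p = 0.3 chosen arbitrarily:

```python
# Bernoulli PMF: f(x) = p^x * (1-p)^(1-x) for x in {0, 1}
def bernoulli_pmf(x, p):
    return p**x * (1 - p)**(1 - x)

p = 0.3
support = (0, 1)
mean = sum(x * bernoulli_pmf(x, p) for x in support)
var = sum((x - mean)**2 * bernoulli_pmf(x, p) for x in support)
# mean comes out as p, and var as p*(1-p), matching the stated formulas
```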
Another way to count is the number of trials you make before you encounter the first success; this is called the geometric distribution. And then you can perform Bernoulli trials until you get k successes; this is called the negative binomial distribution (why the adjective 'negative', and why still 'binomial', we will see when we look into the distribution). So first let us go to the binomial distribution. Consider n identical Bernoulli trials you have conducted, of which exactly x have resulted in a success, with probability of success p. The sample space is {0, 1, ..., n}: there can be no successes, or anything up to exactly n successes. The probability mass function is P(X = x) = C(n, x) p^x (1 - p)^(n - x). Please note that p^x is the probability of the x successful trials, (1 - p)^(n - x) is the probability of the n - x unsuccessful trials, and C(n, x) counts the different ways the successes can occur in a sequence of n trials. I will again leave it to you to show that the mean for such a distribution is np and the variance is np(1 - p); writing q = 1 - p, this is also written sigma^2 = npq. A few things we need to notice. Number 1: why is it called a binomial distribution? Because the C(n, x) are binomial coefficients; please recall the expansion of (a + b)^n. Number 2: with mu = np and sigma^2 = np(1 - p), the variance is directly proportional to the mean; whenever you see data in which the variance is proportional to the mean, there is a chance that you are actually dealing with Bernoulli trials.
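As a quick numerical check of the formulas above (my own illustration, with n = 10 and p = 0.4 chosen arbitrarily), the binomial PMF can be enumerated in Python and the mean and variance compared against np and np(1 - p):

```python
import math

# Binomial PMF: P(X = x) = C(n, x) * p^x * (1-p)^(n-x)
def binom_pmf(x, n, p):
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.4
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                         # sums to 1 over the sample space
mean = sum(x * f for x, f in enumerate(pmf))             # equals n*p = 4.0
var = sum((x - mean)**2 * f for x, f in enumerate(pmf))  # equals n*p*(1-p) = 2.4
```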
So if you come across data with this property, variance proportional to the mean, it is very specific to binomial data. The notation is X ~ B(n, p), which says that the random variable X follows a binomial distribution with the two parameters n and p; remember that small x denotes a value that the random variable X takes. Here is an example we will work out. Suppose a production unit has probability 0.1 of producing a defective unit, and a batch of 100 units is chosen randomly; what is the probability that exactly 6 defective units will be found? This becomes a case of 100 independent and identical Bernoulli trials with probability of a defective equal to 0.1, so the probability that X takes the value 6 is C(100, 6) (0.1)^6 (0.9)^94, which you can work out using a simple calculator: it comes to about 0.059. Please remember this probability, 0.059, because we are going to work out the same problem later using a Poisson approximation; but that is for the future, so let us continue with the distributions arising from Bernoulli trials. Now we go to the next distribution, the geometric distribution. Here Bernoulli trials are performed until the first success, which happens on the x-th trial. The first success can happen on the first trial, the second trial, the third, or any later trial, so the sample space S = {1, 2, 3, ...} is an infinite set. The probability of success is again p, as in Bernoulli trials, and the geometric probability is P(X = x) = p (1 - p)^(x - 1): the x-th trial is the success, preceded by x - 1 unsuccessful trials. The mean is 1/p and sigma^2 is (1 - p)/p^2; this also you are welcome to work out by yourself.
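Both calculations in this part, the defective-units probability and the geometric mean, are easy to reproduce numerically; here is a Python sketch of mine, not the lecturer's:

```python
import math

# Exact binomial probability for the worked example:
# n = 100, p = 0.1, P(X = 6) = C(100, 6) * 0.1^6 * 0.9^94
p_defect = math.comb(100, 6) * 0.1**6 * 0.9**94
# p_defect is approximately 0.0596, the 0.059 quoted in the lecture

# Geometric mean: E(X) = 1/p, checked by truncating the infinite
# sum of x * p * (1-p)^(x-1); the tail beyond x = 2000 is negligible
p = 0.1
mean_geom = sum(x * p * (1 - p)**(x - 1) for x in range(1, 2001))
# mean_geom is approximately 1/p = 10
```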
Now we come to the negative binomial distribution. We again consider a sequence of independent Bernoulli trials, and the random variable X will denote the number of the trial on which the n-th success occurs. Please remember that in the geometric distribution we said the first success occurs on the x-th trial; here we say that the n-th success occurs on the x-th trial, where n is a fixed number. In that case we say that X follows a negative binomial distribution. Therefore the values that X can take are S = {n, n + 1, n + 2, ...}. It is a Bernoulli trial, so we still take the probability of success to be p, and the probability that the n-th success arrives on trial x is P(X = x) = C(x - 1, n - 1) p^n (1 - p)^(x - n). The reasoning: the first n - 1 successes must have come somewhere in the first x - 1 trials, which can happen in C(x - 1, n - 1) ways; the n successes contribute p^n; and the remaining x - n unsuccessful trials contribute (1 - p)^(x - n). Why it is called negative binomial is reasoned out as follows. If, instead of the trials, you count the unsuccessful trials, then the total number of failures after x trials is y = x - n, and you can show that P(Y = y) = C(n + y - 1, y) p^n (1 - p)^y: out of the n + y - 1 trials before the final success, y are unsuccessful. Basically you replace x - n by y and get this formula straight from the previous one.
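To make the change of variable concrete, here is a short Python check of mine (with n = 3 successes and p = 0.4 as arbitrary values) that the two forms of the PMF agree and that E(Y) = n(1 - p)/p:

```python
import math

# P(X = x): the n-th success occurs on trial x, for x = n, n+1, ...
def negbin_pmf_x(x, n, p):
    return math.comb(x - 1, n - 1) * p**n * (1 - p)**(x - n)

# P(Y = y): y = x - n failures before the n-th success
def negbin_pmf_y(y, n, p):
    return math.comb(n + y - 1, y) * p**n * (1 - p)**y

n, p = 3, 0.4
# The two parameterisations agree under the substitution x = y + n
agree = all(math.isclose(negbin_pmf_x(y + n, n, p), negbin_pmf_y(y, n, p))
            for y in range(100))

# E(Y) = n*(1-p)/p = 4.5 here, checked by a truncated sum
mean_y = sum(y * negbin_pmf_y(y, n, p) for y in range(500))
```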
Now, if you simplify the binomial coefficient C(n + y - 1, y), it turns out to equal (-1)^y C(-n, y). These are binomial coefficients with a negative integer, and therefore this distribution is called the negative binomial distribution. Please do not worry about going into so many details; if you can understand what exactly the phenomenon is, the rest will follow by itself. As long as you understand that there is a sequence of independent Bernoulli trials in which the random variable X denotes the number of trials we have to go through until we arrive at the n-th success, n being a fixed number, then X follows a negative binomial distribution, and the rest of the mathematics follows. In this case it is easier to calculate the expected value and variance of Y, from which, Y being X - n, you can of course work out the expected value and variance of X. It turns out that E(Y) = n(1 - p)/p and Var(Y) = n(1 - p)/p^2. Now we move to another very commonly used distribution, the Poisson distribution. This is a random variable which takes the values 0, 1, 2, 3, and so on, and has the probability mass function p(x) = e^(-lambda) lambda^x / x!, where the parameter lambda takes only positive values; such a random variable is said to follow the Poisson distribution and is denoted X ~ Poisson(lambda). I think I have forgotten to mention the notation for the earlier distributions, so let me do it here for your information. If X follows a geometric distribution, we write X ~ Geometric(p); it has only the one parameter p.
If X follows a negative binomial distribution, the notation is X ~ NegativeBinomial(n, p); remember the parameters are the same as in the binomial, namely n and p. Please note this. Now we move on. So the Poisson distribution is denoted X ~ Poisson(lambda). Its mean is lambda and its variance is also lambda. This is the speciality of this distribution, that the mean and the variance are exactly the same, and it can be one of the characteristics you use when trying to identify a distribution from data. Where does this distribution occur? You see, we started with the uniform distribution, then Bernoulli trials and the distributions derived from Bernoulli trials; each had some genesis. The genesis of the Poisson distribution is the occurrence of a very rare event: an event which occurs only after a long period of time, and with a very small probability, generally tends to follow a Poisson distribution. For example, in a well-edited book, the number of typographical errors on a page can follow a Poisson distribution. The number of defective parts produced in a day, if your system is well defined and runs in a quality-assured way so that the possibility of a defective part is very low, tends to follow a Poisson distribution. The number of accidents that take place at a particular crossroad in a day is, fortunately, a very rare event, and again follows a Poisson distribution. The Poisson distribution is also derived, with this same philosophy, from the binomial. So there is a relationship between the binomial and Poisson distributions, and it is this.
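The equality of the Poisson mean and variance is easy to verify numerically. A small sketch of my own, using lambda = 4 as an arbitrary value and truncating the infinite support:

```python
import math

# Poisson PMF: P(X = x) = e^(-lam) * lam^x / x!
def poisson_pmf(x, lam):
    return math.exp(-lam) * lam**x / math.factorial(x)

lam = 4.0
xs = range(100)  # truncation; the tail mass beyond 100 is negligible for lam = 4
mean = sum(x * poisson_pmf(x, lam) for x in xs)
var = sum((x - mean)**2 * poisson_pmf(x, lam) for x in xs)
# both mean and var come out equal to lam, the signature of the Poisson
```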
Assume that X follows a binomial distribution with parameters n and p; that is, the probability of success is p and you have conducted n independent Bernoulli trials. Now let the number of trials n tend to infinity and the probability of occurrence p tend to 0 in such a way that np, which you remember is the average of the binomial distribution, remains constant. Then the distribution of X tends to a Poisson distribution whose parameter lambda is that constant average of the binomial distribution. Now you can see why we may not want to call p the 'probability of success' here: we do not want the success probability tending to 0, so a number of times we simply say 'the probability', meaning the probability of occurrence of an error or a defective or some such event. There is a slight change in the reading of p, because otherwise it sounds very funny to say that you have so many trials and essentially no successes; that is not what we are trying to say. This relationship is used to approximate the binomial distribution by the Poisson distribution when n is large, p is getting smaller, and np remains constant. I will show you how it can be proved mathematically. Take X ~ B(n, p) with np constant, say lambda, and start with the binomial probability P(X = x) = C(n, x) p^x (1 - p)^(n - x). Replacing p by lambda/n, this rearranges into the product [n(n - 1)...(n - x + 1)/n^x] (lambda^x / x!) (1 - lambda/n)^n (1 - lambda/n)^(-x). As n tends to infinity, the first factor tends to 1, lambda^x / x! comes out as it is, the factor (1 - lambda/n)^(-x) in the denominator tends to 1, and (1 - lambda/n)^n tends to e^(-lambda).
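This limit can also be watched numerically. In this sketch of mine (with lambda = 5 and x = 3 chosen arbitrarily), the binomial probability with p = lambda/n approaches the Poisson probability as n grows:

```python
import math

def binom_pmf(x, n, p):
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    return math.exp(-lam) * lam**x / math.factorial(x)

lam, x = 5.0, 3
# Binomial probabilities at p = lam/n for increasing n
approximations = {n: binom_pmf(x, n, lam / n) for n in (10, 100, 1000, 10000)}
limit = poisson_pmf(x, lam)
# the binomial values creep towards the Poisson value as n increases
```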
So as n tends to infinity, the binomial probability turns into the Poisson mass function, and it is proved that when you perform binomial trials with n tending to infinity and p tending to 0, the random variable tends to a Poisson distribution with the constant np as its average value. Here is an example. We consider the same example as for the binomial: 100 randomly chosen items, each defective with probability 0.1, and we want the probability of exactly 6 defective items. Here lambda = np = 100 x 0.1 = 10, and using the Poisson approximation to the binomial, P(X = 6) = e^(-10) 10^6 / 6! = 0.063. Please recall that the same value worked out exactly had been 0.059; instead we got the approximated value 0.063, which is almost the same. In many problems it is much easier to solve through the Poisson method than to evaluate a coefficient like C(100, 6): computing e^(-10) 10^6 / 6! is much simpler. The next distribution I would like to talk about is the hypergeometric distribution. This one does not come out of Bernoulli trials; it is quite different. But remember that when we consider the binomial distribution, we are choosing n items with replacement, while here you are choosing the n items without replacement; this is the main difference. So here we consider a total of N items, of which m are defective. Then n items are chosen without replacement, and the probability that exactly x of the n are defective gives rise to the hypergeometric distribution.
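The exact-versus-approximate comparison for this example can be reproduced in a couple of lines; a sketch of my own, not from the lecture:

```python
import math

# Exact binomial: n = 100, p = 0.1, P(X = 6) = C(100, 6) * 0.1^6 * 0.9^94
exact = math.comb(100, 6) * 0.1**6 * 0.9**94        # about 0.0596

# Poisson approximation with lambda = n*p = 10:
# P(X = 6) = e^(-10) * 10^6 / 6!
approx = math.exp(-10) * 10**6 / math.factorial(6)  # about 0.0631
```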
So your sample space is {0, 1, 2, ..., m}; there was a slip on the slide, the upper limit is m, not k (more precisely, X cannot exceed min(n, m)). The probability that you have chosen exactly x defective items is P(X = x) = C(m, x) C(N - m, n - x) / C(N, n): you choose x of the m defectives, the remaining n - x items from the other N - m, and divide by the total number of ways of choosing the n items out of N. This comes straight from probability theory. In these circumstances the mean value is n m / N, which you can see is np with p = m/N, and the variance is sigma^2 = n (m/N)(1 - m/N)(N - n)/(N - 1). As I said, this arises because we are sampling without replacement, whereas the binomial distribution arises with replacement. Now, where is the application of all these distributions in materials science? We said we are dealing with materials science data, right? So here it comes; the full answers I would like you to find out, but here is the case of three-dimensional atom probe field ion microscopy. This is the machine, and the purpose is this: you take any specimen, metallic or otherwise, which has two kinds of atoms in it, prepare a very thin needle, like the tip of a pencil, and put it in the microscope. The atoms are field-evaporated from the surface and detected on a detector, which identifies the chemical nature of each detected ion by time-of-flight mass spectrometry: depending on how much time the ion takes to fly to the detector, the instrument identifies the atom. You can refer to the paper cited here; it is very beautifully described that if this is your whole specimen, the probed volume will be a piece of that.
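Here is a Python sketch of the hypergeometric PMF introduced above (my own, with N = 50, m = 10, n = 12 as arbitrary values), checking that it sums to 1 and that the mean is n m / N:

```python
import math

# Hypergeometric PMF: n items drawn without replacement from N items,
# m of which are defective; P(X = x) = C(m, x) * C(N-m, n-x) / C(N, n)
def hypergeom_pmf(x, N, m, n):
    return math.comb(m, x) * math.comb(N - m, n - x) / math.comb(N, n)

N, m, n = 50, 10, 12
support = range(0, min(n, m) + 1)
pmf = [hypergeom_pmf(x, N, m, n) for x in support]

total = sum(pmf)                                 # sums to 1 over the support
mean = sum(x * f for x, f in zip(support, pmf))  # equals n*m/N = 12*10/50 = 2.4
```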
So suppose the atoms of type A are the dark ones, and your main purpose is to find the proportion P of A atoms in the specimen. What you get is a probed volume which contains a total of m atoms, of which J are of type A, and the proportion P of A atoms is what you are looking for. But what really comes through the detector is only the detected atoms: n atoms in total, of which i are of type A, so you only know the proportion P0 = i/n of A atoms among the detected atoms. This paper actually describes how to estimate the variation in this value; P0 is the best estimator for the true proportion P. These are the questions: if you can answer them, you have all the knowledge you need of discrete distributions in this particular area. Let us take them one after the other. First: n is known a priori, and i, the observed number of A atoms, is a random variable observed on the n atoms detected from a probed volume containing a total of m atoms, J of them of type A. This distribution of i is hypergeometric. Second: if m atoms evaporated before n atoms got detected, with detection probability Q, what will be the distribution of m? It is like conducting Bernoulli trials until you come up to the n-th success, so this is going to lead to a negative binomial distribution. Third: assuming that P is the proportion in the specimen, and given that the probed volume contains m atoms of which J are A atoms, what is the distribution of J? I leave that to you. And finally, given all that, what is the distribution of i? You see, we have already stated one distribution for the random variable i; I am asking for its distribution again, taking all the randomness into account. It is a good exercise in conditional probability, the negative binomial distribution, and the hypergeometric distribution. You are welcome to refer to this particular paper.
It has the solution, and it will give you the idea that all these distributions have some role or other to play in the field of materials science data analytics. So let us summarize quickly. We introduced special discrete distributions in today's lecture. We introduced the discrete uniform distribution, a very basic kind of distribution. Then we introduced Bernoulli trials, and three distributions arising out of Bernoulli trials: the binomial distribution, the geometric distribution, and the negative binomial distribution. Then we introduced the Poisson distribution and its relationship with the binomial: in some sense, as the number of trials increases to a very large value and p tends to 0, with the average of the binomial distribution tending to a constant value lambda, the binomial tends to become a Poisson distribution. Then the hypergeometric distribution: the binomial distribution arises from sampling with replacement, while the hypergeometric distribution arises from sampling without replacement. Finally, we took the example of 3D AP FIM, raised a few questions, and gave you the reference to the paper, to show that all these distributions do occur in the world of materials science and materials engineering. Thank you.